Changes between Initial Version and Version 1 of AdvancedTopics


Ignore:
Timestamp:
06/07/26 21:46:39 (10 days ago)
Author:
231103
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • AdvancedTopics

    v1 v1  
     1Advanced Topics
     2
     3== Chosen Topic
     4
     5For the Advanced Topics phase of the project, we chose to implement a ''Data Warehouse'' together with several ''Data Cubes''.
     6
     7The main motivation behind this choice was our interest in analytical databases and business intelligence systems. While our original relational database was designed for day-to-day operational activities of a library, a data warehouse allows us to analyze historical data and generate meaningful statistics and insights.
     8
     9Our library management system already contained large amounts of transactional data related to rentals, memberships, libraries, resources, and events, making it a suitable candidate for data warehousing techniques.
     10
     11== Inspiration and Motivation
     12
     13During the course we learned that operational databases and analytical databases serve different purposes.
     14
     15The operational database is optimized for transactions such as:
     16
     17 * Creating memberships
     18 * Borrowing resources
     19 * Returning resources
     20 * Registering users for events
     21
     22However, answering analytical questions such as:
     23
     24 * Which library has the highest number of rentals?
     25 * Which books are the most popular?
     26 * What is the average rental duration?
     27 * Which event types attract the most participants?
     28 * How often are resources returned late?
     29
     30can become expensive and complicated on a transactional database.
     31
     32For this reason we decided to design and implement a dedicated data warehouse that stores historical and aggregated information optimized for reporting and analysis.
     33
     34== Data Warehouse Design
     35
     36The warehouse was built around two major business processes:
     37
     38 * Resource rentals
     39 * Library events
     40
     41For each process we created transactional fact tables that store detailed events and several aggregated fact tables that serve as data cubes for analytical processing.
     42
     43The warehouse follows a dimensional modeling approach consisting of fact tables and dimension tables.
     44
     45== Rental Analytics
     46
     47The original system contains a ''Rental'' table where every borrowing transaction is recorded.
     48
     49Using this information we created a rental-oriented warehouse model that enables analysis of:
     50
     51 * Rental frequency
     52 * Penalties
     53 * Late returns
     54 * Rental duration
     55 * Library performance
     56 * Resource popularity
     57
     58=== Rental Warehouse Schema
     59
     60[[Image(rental_warehouse_schema.png, width=900)]]
     61
     62''Figure 1: Rental data warehouse schema.''
     63
     64The schema contains:
     65
     66 * Dimension tables describing members, libraries, editions, and dates.
     67 * A transactional fact table storing individual rental activities.
     68 * Multiple aggregated fact tables used as analytical data cubes.
     69
     70=== Rental Dimensions
     71
     72The following dimensions were created:
     73
     74 * '''dim_member''' – information about members and membership plans.
     75 * '''dim_library''' – information about libraries and locations.
     76 * '''dim_edition''' – information about editions and resources.
     77 * '''dim_date''' – calendar dimension used for time-based analysis.
     78
     79Several dimensions implement Slowly Changing Dimensions (SCD Type 2) in order to preserve historical changes.
     80
     81=== Rental Fact Tables
     82
     83The rental warehouse contains the following fact tables:
     84
     85 * '''fact_rental''' – transactional fact table containing individual rental records.
     86 * '''fact_rental_monthly''' – monthly aggregation by member, library, edition, and date.
     87 * '''fact_rental_library_monthly''' – monthly aggregation by library.
     88 * '''fact_rental_library_edition_monthly''' – monthly aggregation by library and edition.
     89
     90These tables provide the foundation for OLAP-style analysis and reporting.
     91
     92== Event Analytics
     93
     94Besides rental information, our system also stores information about library events and user registrations.
     95
     96To support event-related analytics we designed a separate warehouse model focused on event participation and popularity.
     97
     98The warehouse enables analysis of:
     99
     100 * Event attendance
     101 * Event popularity
     102 * Capacity utilization
     103 * Fully booked events
     104 * Registration behaviour
     105 * Event type trends over time
     106
     107=== Event Warehouse Schema
     108
     109[[Image(event_warehouse_schema.png, width=900)]]
     110
     111''Figure 2: Event data warehouse schema.''
     112
     113=== Event Dimensions
     114
     115The event warehouse reuses existing dimensions and introduces:
     116
     117 * '''dim_event_type''' – event category dimension.
     118
     119=== Event Fact Tables
     120
     121 * '''fact_event''' – transactional fact table containing detailed event statistics.
     122 * '''fact_event_type_monthly''' – monthly analytics by event type.
     123 * '''fact_event_type_library_monthly''' – monthly analytics by event type and library.
     124
     125== Data Cubes
     126
     127Based on the fact tables we implemented several analytical cubes.
     128
     129=== Rental Cube
     130
     131Dimensions:
     132
     133 * Date
     134 * Library
     135 * Edition
     136 * Member
     137
     138Measures:
     139
     140 * Rental count
     141 * Total penalties
     142 * Average penalties
     143 * Average rental duration
     144 * Overdue rate
     145 * Average return delay
     146
     147=== Library Performance Cube
     148
     149Dimensions:
     150
     151 * Date
     152 * Library
     153
     154Measures:
     155
     156 * Rental count
     157 * Total penalties
     158 * Average rental duration
     159 * Overdue rate
     160
     161=== Event Participation Cube
     162
     163Dimensions:
     164
     165 * Date
     166 * Event Type
     167 * Library
     168
     169Measures:
     170
     171 * Event count
     172 * Total registrations
     173 * Capacity utilization
     174 * Average fill rate
     175 * Fully booked events
     176 * Average registration lead time
     177
     178== ETL Process
     179
     180The warehouse is populated through a series of ETL procedures implemented in PostgreSQL.
     181
     182The ETL process performs:
     183
     184 * Extraction of data from the operational database.
     185 * Transformation of source records into dimensional structures.
     186 * Handling of Slowly Changing Dimensions.
     187 * Loading of transactional fact tables.
     188 * Aggregation of data into analytical fact tables and cubes.
     189
     190== Conclusion
     191
     192The implementation of a Data Warehouse and Data Cubes extended the functionality of our library management system beyond transactional processing.
     193
     194The warehouse enables efficient analytical queries, supports historical reporting, and provides valuable insights regarding rentals, members, libraries, resources, and events.