wiki:AdvancedTopics

Advanced Topics

Chosen Topic

For the Advanced Topics phase of the project, we chose to implement a Data Warehouse together with several Data Cubes.

The main motivation behind this choice was our interest in analytical databases and business intelligence systems. While our original relational database was designed for day-to-day operational activities of a library, a data warehouse allows us to analyze historical data and generate meaningful statistics and insights.

Our library management system already contained large amounts of transactional data related to rentals, memberships, libraries, resources, and events, making it a suitable candidate for data warehousing techniques.

Inspiration and Motivation

During the course we learned that operational databases and analytical databases serve different purposes.

The operational database is optimized for transactions such as:

  • Creating memberships
  • Borrowing resources
  • Returning resources
  • Registering users for events

However, answering analytical questions such as:

  • Which library has the highest number of rentals?
  • Which books are the most popular?
  • What is the average rental duration?
  • Which event types attract the most participants?
  • How often are resources returned late?

can become expensive and complicated on a transactional database.

For this reason we decided to design and implement a dedicated data warehouse that stores historical and aggregated information optimized for reporting and analysis.

Data Warehouse Design

The warehouse was built around two major business processes:

  • Resource rentals
  • Library events

For each process we created transactional fact tables that store detailed events and several aggregated fact tables that serve as data cubes for analytical processing.

The warehouse follows a dimensional modeling approach consisting of fact tables and dimension tables.

Rental Analytics

The original system contains a Rental table where every borrowing transaction is recorded.

Using this information we created a rental-oriented warehouse model that enables analysis of:

  • Rental frequency
  • Penalties
  • Late returns
  • Rental duration
  • Library performance
  • Resource popularity

Rental Warehouse Schema

Figure 1: Rental data warehouse schema.

The schema contains:

  • Dimension tables describing members, libraries, editions, and dates.
  • A transactional fact table storing individual rental activities.
  • Multiple aggregated fact tables used as analytical data cubes.

Rental Dimensions

The following dimensions were created:

  • dim_member – information about members and membership plans.
  • dim_library – information about libraries and locations.
  • dim_edition – information about editions and resources.
  • dim_date – calendar dimension used for time-based analysis.

Several dimensions implement Slowly Changing Dimensions (SCD Type 2) in order to preserve historical changes.

Rental Fact Tables

The rental warehouse contains the following fact tables:

  • fact_rental – transactional fact table containing individual rental records.
  • fact_rental_monthly – monthly aggregation by member, library, edition, and date.
  • fact_rental_library_monthly – monthly aggregation by library.
  • fact_rental_library_edition_monthly – monthly aggregation by library and edition.

These tables provide the foundation for OLAP-style analysis and reporting.

Event Analytics

Besides rental information, our system also stores information about library events and user registrations.

To support event-related analytics we designed a separate warehouse model focused on event participation and popularity.

The warehouse enables analysis of:

  • Event attendance
  • Event popularity
  • Capacity utilization
  • Fully booked events
  • Registration behaviour
  • Event type trends over time

Event Warehouse Schema

Figure 2: Event data warehouse schema.

Event Dimensions

The event warehouse reuses existing dimensions and introduces:

  • dim_event_type – event category dimension.

Event Fact Tables

  • fact_event – transactional fact table containing detailed event statistics.
  • fact_event_type_monthly – monthly analytics by event type.
  • fact_event_type_library_monthly – monthly analytics by event type and library.

Data Cubes

Based on the fact tables we implemented several analytical cubes.

Rental Cube

Dimensions:

  • Date
  • Library
  • Edition
  • Member

Measures:

  • Rental count
  • Total penalties
  • Average penalties
  • Average rental duration
  • Overdue rate
  • Average return delay

Library Performance Cube

Dimensions:

  • Date
  • Library

Measures:

  • Rental count
  • Total penalties
  • Average rental duration
  • Overdue rate

Event Participation Cube

Dimensions:

  • Date
  • Event Type
  • Library

Measures:

  • Event count
  • Total registrations
  • Capacity utilization
  • Average fill rate
  • Fully booked events
  • Average registration lead time

ETL Process

The warehouse is populated through a series of ETL procedures implemented in PostgreSQL.

The ETL process performs:

  • Extraction of data from the operational database.
  • Transformation of source records into dimensional structures.
  • Handling of Slowly Changing Dimensions.
  • Loading of transactional fact tables.
  • Aggregation of data into analytical fact tables and cubes.

Conclusion

The implementation of a Data Warehouse and Data Cubes extended the functionality of our library management system beyond transactional processing.

The warehouse enables efficient analytical queries, supports historical reporting, and provides valuable insights regarding rentals, members, libraries, resources, and events.

Last modified 10 days ago Last modified on 06/07/26 21:54:33

Attachments (2)

Download all attachments as: .zip

Note: See TracWiki for help on using the wiki.