wiki:OtherTopics

Version 2 (modified by 221511, 10 days ago) ( diff )

made the tests on the report queries from phase 6 and expanded the other explanations

Other Topics

SQL Performance

Performance analysis was conducted against the full complex report queries from Phase P6 (ComplexReports). The tests were run on the local development database after loading 200,000 synthetic reservations spread uniformly across 2 years (2024-2026). The small seed dataset (26 rows) cannot reveal index usefulness because sequential scans are optimal at that scale.

The testing method: each P6 report query is executed with EXPLAIN (ANALYZE, BUFFERS) before adding new indexes and after. We compare execution time, scan type, and buffer usage.

Scenario 1: Phase P6 Report 1 — Resource Utilization (quarterly)

Query source: the full "Report 1: Resource Utilization and Demand Analysis" from ComplexReports. Uses 5 CTEs: quarter_bounds, quarter_days, resource_availability, reservation_stats, popular_day, peak_hour. Window function RANK() for demand ranking. Filters by a quarterly date range.

Proposed index:

CREATE INDEX idx_reservations_start_resource_status
    ON reservations (start_time, resource_id, status);

Before index (excerpt of plan):

HashAggregate  (actual rows=189 loops=1)
 ->  Nested Loop (actual rows=10383 loops=1)
       ->  Bitmap Heap Scan on reservations rv_1 (actual rows=10383 loops=1)
             Recheck Cond: start_time >= q_start AND start_time < q_end
             Filter: status IN ('approved','completed')
             Rows Removed by Filter: 15537
             Heap Blocks: exact=364
             ->  Bitmap Index Scan on idx_reservations_times
Execution Time: 181.324 ms

After index:

 ->  Index Only Scan using idx_reservations_start_resource_status
       on reservations rv_1
       Index Cond: start_time >= q_start AND start_time < q_end
       Heap Fetches: 0
Execution Time: 82.307 ms

Analysis: The new composite index includes status, so the planner can filter by status inside the index without touching the heap. The popular_day and peak_hour CTEs both switched from Bitmap Heap Scan with "Rows Removed by Filter: 15537" per iteration to a clean Index Only Scan. Heap Fetches is 0.

Result: 181 ms -> 82 ms (55% faster). Index Only Scan confirmed in the plan.

Scenario 2: Phase P6 Report 3 — User Activity and Behavior Analysis

Query source: the full "Report 3: User Activity and Behavior Analysis" from ComplexReports. Uses 2 CTEs (user_stats, favorite_resource), DISTINCT ON, RANK(), COUNT(*) FILTER (WHERE ...) for per-status counts, and aggregates over the entire reservations table grouped by user_id.

Proposed index:

CREATE INDEX idx_reservations_user_status_resource
    ON reservations (user_id, status, resource_id)
    INCLUDE (start_time, end_time);

The INCLUDE clause adds start_time and end_time as index payload (not used for ordering) so the query can compute AVG(end_time - start_time) and SUM(...) FILTER (...) without heap access.

Before index:

Sort Method: external merge  Disk: 9032kB
 ->  Seq Scan on reservations rv  (actual rows=200026 loops=1)
Execution Time: 279.957 ms

After index:

Incremental Sort
  Presorted Key: user_id
  Full-sort Groups: 12  Sort Method: quicksort  Average Memory: 29kB
  Pre-sorted Groups: 12  Sort Method: quicksort  Average Memory: 1821kB
  ->  Index Only Scan using idx_reservations_user_status_resource
        Heap Fetches: 0
Execution Time: 175.602 ms

Analysis: The sequential scan is replaced by an Index Only Scan. Disk-based external merge sort (9032kB to disk) is eliminated; because the index is sorted by user_id, PostgreSQL does an Incremental Sort entirely in memory. Heap Fetches: 0 confirms the INCLUDE columns are doing their job.

Result: 280 ms -> 176 ms (37% faster). Disk sort eliminated; Index Only Scan used.

Scenario 3: Phase P6 Report 4 — Administrator Approval Workload

Query source: the full "Report 4: Administrator Approval Workload and Bottleneck Analysis" from ComplexReports. Uses 3 CTEs (admin_stats, pending_stats, workload_share), CROSS JOIN to attach the single-row pending_stats to every admin, and SUM(...) OVER () to compute workload share percentage.

Proposed index:

CREATE INDEX idx_reservations_approver_status
    ON reservations (approved_by, status, resource_id, user_id)
    WHERE approved_by IS NOT NULL;

This is a partial covering index: the WHERE clause excludes rows with no approver (roughly 30% of the data), and the listed columns are everything the admin_stats CTE needs to aggregate.

Before index:

Sort Method: external merge  Disk: 4568kB
 ->  Seq Scan on reservations rv_1
     Filter: approved_by IS NOT NULL
Execution Time: 166.804 ms

After index:

 ->  Index Only Scan using idx_reservations_approver_status
       Heap Fetches: 0
Execution Time: 129.167 ms

Analysis: Sequential scan replaced by Index Only Scan. Disk-based external merge sort (4568kB) eliminated. The partial index is smaller than a full one because it excludes the 60,000 rows with approved_by IS NULL.

Result: 167 ms -> 129 ms (23% faster). Index Only Scan used; partial index avoids indexing irrelevant rows.

Performance Summary

Phase P6 ReportBeforeAfterGainKey change
Report 1 — Resource Utilization (quarterly)181 ms82 ms55%Composite index enables Index Only Scan for CTE aggregation
Report 3 — User Activity Analysis280 ms176 ms37%INCLUDE covering index removes 9 MB disk sort and heap fetches
Report 4 — Admin Workload Analysis167 ms129 ms23%Partial covering index replaces seq scan, removes 4.5 MB disk sort

Note on Reports That Do Not Benefit from Indexes

Phase P6 Report 2 (Monthly Reservation Trends) without any date filter scans and aggregates the entire reservations table grouped by month. A full-table aggregation cannot be sped up by indexes alone — the planner would still have to read every row. For that report the correct optimization is either (a) to add a date filter (e.g. "last 12 months"), at which point the idx_reservations_start_resource_status index cuts execution time by about 10%, or (b) to pre-compute monthly summaries in a materialized view that is refreshed periodically. The same applies to any dashboard query that aggregates over the whole history.

Security Measures

SQL Injection Prevention

All database queries in the prototype use parameterized queries via psycopg2's %s placeholders. User input is never concatenated into SQL strings.

Example from uc_reserve.py:

cur.execute("""
    INSERT INTO reservations
        (start_time, end_time, status, purpose, created_at, user_id, resource_id)
    VALUES (%s, %s, 'pending', %s, CURRENT_TIMESTAMP, %s, %s)
    RETURNING reservation_id, status, created_at
""", (start_dt, end_dt, purpose, user_id, rid))

psycopg2 sends the query template and parameters separately to PostgreSQL. The server parses the template first and only then binds the parameters, so injection is impossible regardless of the input content.

Password Hashing

Passwords are never stored in plain text. The prototype uses bcrypt with a random per-password salt:

# Registration (uc_users.py)
hashed = bcrypt.hashpw(password.encode(), bcrypt.gensalt()).decode()

# Login verification (main.py)
if bcrypt.checkpw(password.encode(), row[3].encode()):
    # authenticated

bcrypt is a deliberately slow hashing algorithm with a configurable work factor, which resists brute-force attacks. The per-password salt defeats rainbow table attacks.

Role-Based Access Control

Access control is enforced at two independent layers:

Application layer (main.py): the main menu exposes only the actions appropriate for the logged-in user's role. Students can browse; Teaching Staff can also reserve; Administrators see approval, analytics, and user management.

if role == "Teaching Staff":
    options.append("Make a Resource Reservation")
if role == "Administrator":
    options.append("Approve or Reject Reservations")
    options.append("View Resource Usage Analytics")
    options.append("Register a New User")

Database layer (Phase P7 trigger trg_check_approver_is_admin): enforces that the approved_by column can only reference users whose type_name is 'Administrator'. Even if the application layer is bypassed (for example, by a direct SQL client), the database rejects any attempt to record a non-admin as the approver.

IF v_type_name != 'Administrator' THEN
    RAISE EXCEPTION 'Only administrators can approve reservations. User % is "%"',
        NEW.approved_by, v_type_name;
END IF;

Connection Security

Database connections use a connection pool (from Phase P8) with explicit transaction management. This provides:

  • Resource-exhaustion protection: at most 10 concurrent connections, so a runaway process cannot consume all database connection slots.
  • Automatic cleanup: connections are always returned to the pool via context managers, preventing connection leaks that could otherwise reach the server limit.
  • Transaction isolation: write operations use get_transaction(), which rolls back automatically on any exception and therefore cannot leave the database in a partially-written state.
Note: See TracWiki for help on using the wiki.