= Other Topics = == SQL Performance == Performance analysis was conducted against the full complex report queries from Phase P6 ([wiki:ComplexReports ComplexReports]). The tests were run on the local development database after loading 200,000 synthetic reservations (200,028 total rows including the seed dataset). The synthetic data spans 2024-01-01 to 2025-09-09, with 29,484 rows falling in Q1 2024 (the quarter used for Report 1 testing). The small seed dataset (28 rows) cannot reveal index usefulness because sequential scans are optimal at that scale. The testing method: each P6 report query is executed with {{{EXPLAIN (ANALYZE, BUFFERS)}}} before adding new indexes and after. We compare execution time, scan type, and buffer usage. === Synthetic Data Generation Script === The following script generates 200,000 synthetic reservations for performance testing. It uses {{{generate_series}}} with a deterministic slot assignment so each (resource, day, hour) combination is unique, a fixed status distribution, and disables triggers during the bulk load to skip overlap checks. To remove the synthetic rows afterwards: {{{DELETE FROM project.reservations WHERE purpose LIKE 'Synthetic reservation%';}}} {{{ SET search_path TO project; ALTER TABLE reservations DISABLE TRIGGER ALL; INSERT INTO reservations (start_time, end_time, status, purpose, created_at, user_id, resource_id, approved_by) SELECT ('2024-01-01'::TIMESTAMPTZ + (gs / 324) * INTERVAL '1 day' + (((gs / 27) % 12) + 8) * INTERVAL '1 hour') AS start_time, ('2024-01-01'::TIMESTAMPTZ + (gs / 324) * INTERVAL '1 day' + (((gs / 27) % 12) + 10) * INTERVAL '1 hour') AS end_time, CASE gs % 20 WHEN 0 THEN 'approved' WHEN 1 THEN 'approved' WHEN 2 THEN 'approved' WHEN 3 THEN 'approved' WHEN 4 THEN 'approved' WHEN 5 THEN 'approved' WHEN 6 THEN 'approved' WHEN 7 THEN 'approved' WHEN 8 THEN 'completed' WHEN 9 THEN 'completed' WHEN 10 THEN 'completed' WHEN 11 THEN 'completed' WHEN 12 THEN 'completed' WHEN 13 THEN 'completed' WHEN 14 THEN 'rejected' WHEN 15 THEN 'rejected' WHEN 16 THEN 'rejected' WHEN 17 THEN 'pending' WHEN 18 THEN 'pending' ELSE 'cancelled' END AS status, 'Synthetic reservation ' || gs AS purpose, ('2024-01-01'::TIMESTAMPTZ + (gs / 324) * INTERVAL '1 day') AS created_at, (gs % 10) + 3 AS user_id, (gs % 27) + 1 AS resource_id, CASE WHEN gs % 20 <= 15 THEN (gs % 2) + 1 ELSE NULL END AS approved_by FROM generate_series(0, 199999) AS gs; ALTER TABLE reservations ENABLE TRIGGER ALL; ANALYZE reservations; }}} '''Slot assignment logic:''' * {{{resource_id = (gs % 27) + 1}}} — cycles through all 27 resources * {{{day = 2024-01-01 + gs/324}}} — advances one day every 324 rows (27 resources x 12 slots) * {{{hour = 8 + (gs/27 % 12)}}} — 12 x 2-hour slots covering 08:00-20:00 '''Status distribution:''' 40% approved, 30% completed, 15% rejected, 10% pending, 5% cancelled. === Scenario 1: Phase P6 Report 1 — Resource Utilization (quarterly) === A quarterly report showing each resource's utilization rate, peak usage hours, busiest day, approval rate, and demand ranking. The query uses 5 CTEs (quarter_bounds, quarter_days, resource_availability, reservation_stats, popular_day, peak_hour) and a {{{RANK()}}} window function. The most expensive CTEs are popular_day and peak_hour, which filter reservations by date range and status to find the most-used day and hour per resource. '''Proposed index:''' {{{ CREATE INDEX idx_reservations_start_resource_status ON reservations (start_time, resource_id, status); }}} '''Before index — full execution plan:''' {{{ Sort (cost=211793.37..212201.69 rows=163328 width=588) (actual time=43.493..43.502 rows=27 loops=1) Sort Key: utilization_pct DESC Sort Method: quicksort Memory: 29kB Buffers: shared hit=1476 CTE quarter_bounds -> Result (actual time=0.000..0.001 rows=1 loops=1) -> WindowAgg (actual time=43.416..43.468 rows=27 loops=1) Buffers: shared hit=1473 -> Sort (actual time=43.404..43.412 rows=27 loops=1) Sort Method: quicksort Memory: 28kB -> Hash Left Join (actual time=42.034..43.393 rows=27 loops=1) Hash Cond: (r.resource_id = ph.resource_id) -> Merge Left Join (actual time=29.501..30.849 rows=27 loops=1) Merge Cond: (r.resource_id = rv_1.resource_id) -> Merge Join (actual time=15.513..16.833 rows=27 loops=1) -> Nested Loop Left Join (actual time=15.427..15.468 rows=27 loops=1) -> Merge Left Join (actual time=15.408..15.423 rows=27 loops=1) -> [resources + resource_types join] (actual time=0.035..0.038 rows=27 loops=1) -> reservation_stats CTE -> HashAggregate Group Key: rv.resource_id (actual rows=27 loops=1) -> Nested Loop (actual time=0.467..6.167 rows=29484 loops=1) -> Bitmap Heap Scan on reservations rv Recheck Cond: start_time >= q_start AND start_time < q_end Heap Blocks: exact=448 Buffers: shared hit=482 -> Bitmap Index Scan on idx_reservations_times Index Cond: start_time range -> popular_day CTE -> HashAggregate (actual rows=189 loops=1) -> Nested Loop -> Bitmap Heap Scan on reservations rv_1 Recheck Cond: start_time >= q_start AND start_time < q_end Filter: status IN ('approved','completed') Rows Removed by Filter: 8844 Heap Blocks: exact=448 Buffers: shared hit=479 -> Bitmap Index Scan on idx_reservations_times -> peak_hour CTE -> HashAggregate (actual rows=324 loops=1) -> Nested Loop -> Bitmap Heap Scan on reservations rv_2 Recheck Cond: start_time >= q_start AND start_time < q_end Filter: status IN ('approved','completed') Rows Removed by Filter: 8844 Heap Blocks: exact=448 Buffers: shared hit=479 -> Bitmap Index Scan on idx_reservations_times Planning Time: 3.454 ms Execution Time: 44.114 ms }}} '''After index — full execution plan:''' {{{ Sort (cost=206320.34..206729.09 rows=163500 width=588) (actual time=44.819..44.828 rows=27 loops=1) Sort Key: utilization_pct DESC Sort Method: quicksort Memory: 29kB Buffers: shared hit=670 read=148 CTE quarter_bounds -> Result (actual time=0.000..0.001 rows=1 loops=1) -> WindowAgg (actual time=44.755..44.798 rows=27 loops=1) -> Sort (actual time=44.742..44.750 rows=27 loops=1) Sort Method: quicksort Memory: 28kB -> Hash Left Join (actual time=43.284..44.733 rows=27 loops=1) Hash Cond: (r.resource_id = ph.resource_id) -> Merge Left Join (actual time=32.860..34.299 rows=27 loops=1) Merge Cond: (r.resource_id = rv_1.resource_id) -> [resource/type/location join] (actual time=17.376..18.786 rows=27 loops=1) -> reservation_stats CTE -> HashAggregate Group Key: rv.resource_id (actual rows=27 loops=1) -> Bitmap Heap Scan on reservations rv Recheck Cond: start_time range Heap Blocks: exact=448 Buffers: shared hit=482 -> Bitmap Index Scan on idx_reservations_times -> popular_day CTE -> HashAggregate (actual rows=189 loops=1) -> Nested Loop -> Index Only Scan using idx_reservations_start_resource_status Index Cond: start_time >= q_start AND start_time < q_end Filter: status IN ('approved','completed') Rows Removed by Filter: 8844 Heap Fetches: 35 Buffers: shared hit=2 read=148 -> peak_hour CTE -> HashAggregate (actual rows=324 loops=1) -> Nested Loop -> Index Only Scan using idx_reservations_start_resource_status Index Cond: start_time >= q_start AND start_time < q_end Filter: status IN ('approved','completed') Rows Removed by Filter: 8844 Heap Fetches: 35 Buffers: shared hit=150 Planning Time: 4.308 ms Execution Time: 45.570 ms }}} '''Key differences:''' ||'''CTE'''||'''Before'''||'''After'''|| ||popular_day||Bitmap Heap Scan on reservations + Bitmap Index Scan on idx_reservations_times||'''Index Only Scan using idx_reservations_start_resource_status'''|| ||peak_hour||Bitmap Heap Scan on reservations + Bitmap Index Scan on idx_reservations_times||'''Index Only Scan using idx_reservations_start_resource_status'''|| ||Heap access (popular_day)||Heap Blocks: exact=448||'''Heap Fetches: 35'''|| ||Heap access (peak_hour)||Heap Blocks: exact=448||'''Heap Fetches: 35'''|| '''Analysis:''' The popular_day and peak_hour CTEs switched from Bitmap Heap Scan (reading 448 heap blocks each) to Index Only Scan with near-zero heap access (35 fetches). The new composite index includes {{{status}}}, so the planner reads date range and status in one index pass without touching the heap. At 29,484 rows in the quarter the absolute time difference is small; the benefit scales with dataset size and becomes significant when multiple quarters or larger tables are queried. '''Result: 44.1 ms -> 45.6 ms (scan type improvement; time equivalent at this scale).''' '''Index Only Scan confirmed in popular_day and peak_hour CTEs. Heap access reduced from 448 blocks to 35 fetches per CTE.''' === Scenario 2: Phase P6 Report 3 — User Activity and Behavior Analysis === A per-user report aggregating all reservations in the system: total counts, approval/cancellation rates, average duration, total hours, most-used resource, and activity ranking. The query scans the entire reservations table with no date filter (200,028 rows), grouped by user_id. The user_stats CTE requires a sort on user_id to group, and the DISTINCT ON in favorite_resource also requires an ordered scan. '''Proposed index:''' {{{ CREATE INDEX idx_reservations_user_status_resource ON reservations (user_id, status, resource_id) INCLUDE (start_time, end_time); }}} The {{{INCLUDE}}} clause adds start_time and end_time as non-key payload so the query can compute {{{AVG(end_time - start_time)}}} and {{{SUM(...) FILTER (...)}}} without heap access. '''Before index — full execution plan:''' {{{ Sort (cost=44161.35..44162.72 rows=550 width=490) (actual time=314.472..314.589 rows=12 loops=1) Sort Key: activity_rank Sort Method: quicksort Memory: 26kB Buffers: shared hit=6089 read=2, temp read=1129 written=1133 -> WindowAgg (actual time=314.439..314.562 rows=12 loops=1) -> Sort (actual time=314.431..314.548 rows=12 loops=1) Sort Key: total_reservations DESC -> Hash Join (actual time=213.597..314.521 rows=12 loops=1) Hash Cond: u.type_id = ut.type_id -> Merge Left Join (actual time=213.391..314.301 rows=12 loops=1) Merge Cond: u.user_id = rv_1.user_id -> Merge Left Join (actual time=177.462..278.025 rows=12 loops=1) Merge Cond: u.user_id = rv.user_id -> Index Scan using users_pkey on users u (actual rows=12 loops=1) -> user_stats CTE (GroupAggregate) Group Key: rv.user_id -> Sort (actual time=165.115..190.881 rows=200028 loops=1) Sort Key: rv.user_id, rv.resource_id Sort Method: external merge Disk: 9032kB -> Seq Scan on reservations rv (actual rows=200028 loops=1) -> favorite_resource CTE -> Finalize GroupAggregate Group Key: rv_1.user_id, r.name -> Gather Merge (Workers Launched: 1) -> Partial HashAggregate -> Hash Join (Hash Cond: rv_1.resource_id = r.resource_id) -> Parallel Seq Scan on reservations rv_1 Filter: status IN ('approved','completed') Rows Removed by Filter: 30004 Planning Time: 3.511 ms Execution Time: 316.082 ms }}} '''After index — full execution plan:''' {{{ Sort (cost=41026.98..41028.35 rows=550 width=490) (actual time=199.209..199.270 rows=12 loops=1) Sort Key: activity_rank Sort Method: quicksort Memory: 26kB Buffers: shared hit=3062 read=1409 -> WindowAgg (actual time=199.191..199.258 rows=12 loops=1) -> Sort (actual time=199.182..199.242 rows=12 loops=1) Sort Key: total_reservations DESC -> Hash Join (actual time=68.425..199.221 rows=12 loops=1) Hash Cond: u.type_id = ut.type_id -> Merge Left Join (actual time=68.393..199.165 rows=12 loops=1) Merge Cond: u.user_id = rv_1.user_id -> Merge Left Join (actual time=24.435..154.881 rows=12 loops=1) Merge Cond: u.user_id = rv.user_id -> Index Scan using users_pkey on users u (actual rows=12 loops=1) -> user_stats CTE (GroupAggregate) Group Key: rv.user_id -> Incremental Sort (actual time=6.463..67.437 rows=200028 loops=1) Sort Key: rv.user_id, rv.resource_id Presorted Key: rv.user_id Full-sort Groups: 10 Sort Method: quicksort Memory: 29kB Pre-sorted Groups: 10 Sort Method: quicksort Memory: 2019kB -> Index Only Scan using idx_reservations_user_status_resource Heap Fetches: 63 Buffers: shared hit=2 read=1409 -> favorite_resource CTE -> Finalize GroupAggregate Group Key: rv_1.user_id, r.name -> Gather Merge (Workers Launched: 1) -> Partial HashAggregate -> Hash Join (Hash Cond: rv_1.resource_id = r.resource_id) -> Parallel Seq Scan on reservations rv_1 Filter: status IN ('approved','completed') Rows Removed by Filter: 30004 Planning Time: 3.179 ms Execution Time: 199.542 ms }}} '''Key differences:''' ||'''Node'''||'''Before'''||'''After'''|| ||user_stats sort||'''Seq Scan on reservations''' (200,028 rows)||'''Index Only Scan using idx_reservations_user_status_resource'''|| ||Sort method||'''external merge Disk: 9032kB'''||'''Incremental Sort, in memory (2019kB peak)'''|| ||Heap access||Full table read||'''Heap Fetches: 63'''|| ||Buffer reads||shared hit=6089 read=2||shared hit=3062 read=1409 (index pages)|| '''Analysis:''' The sequential scan on the full reservations table is replaced by an Index Only Scan. Because the index is ordered by user_id, the 9032kB disk-based external merge sort becomes an Incremental Sort entirely in memory. The INCLUDE columns (start_time, end_time) allow duration calculations without heap access. The favorite_resource CTE still uses a parallel sequential scan because that CTE joins resources and filters by status in a way that benefits more from parallelism than from the index at this data size. '''Result: 316 ms -> 199 ms (37% faster). Disk sort eliminated (9032kB -> 0kB). Index Only Scan confirmed. Heap Fetches: 63.''' === Scenario 3: Phase P6 Report 4 — Administrator Approval Workload === A per-administrator workload report: how many reservations each admin reviewed, approval/rejection rates, workload share percentage, and system-wide pending wait times. The admin_stats CTE scans only rows where {{{approved_by IS NOT NULL}}} (approximately 80% of the 200,028 rows), groups them by approved_by, and computes COUNT(DISTINCT resource_id) and COUNT(DISTINCT user_id), which requires sorting before aggregation. '''Proposed index:''' {{{ CREATE INDEX idx_reservations_approver_status ON reservations (approved_by, status, resource_id, user_id) WHERE approved_by IS NOT NULL; }}} This is a partial covering index: the {{{WHERE}}} clause excludes rows where approved_by is NULL (approximately 20% of rows), and the listed columns are everything the admin_stats CTE needs to aggregate. '''Before index — full execution plan:''' {{{ Sort (cost=29916.58..29916.60 rows=6 width=180) (actual time=155.563..155.566 rows=2 loops=1) Sort Key: total_reviewed DESC Sort Method: quicksort Memory: 25kB Buffers: shared hit=6084 read=2, temp read=667 written=669 CTE admin_stats -> GroupAggregate (actual time=120.747..141.321 rows=2 loops=1) Group Key: rv_1.approved_by Buffers: shared hit=3031, temp read=667 written=669 -> Sort (actual time=99.698..116.116 rows=160023 loops=1) Sort Key: rv_1.approved_by, rv_1.resource_id Sort Method: external merge Disk: 5336kB -> Seq Scan on reservations rv_1 (actual rows=200028 loops=1) Filter: (approved_by IS NOT NULL) Rows Removed by Filter: 40005 -> Nested Loop (actual time=155.552..155.556 rows=2 loops=1) -> Aggregate (pending_stats) (actual time=13.544..13.545 rows=1 loops=1) -> Bitmap Heap Scan on reservations rv Recheck Cond: status = 'pending' Heap Blocks: exact=3031 -> Bitmap Index Scan on idx_reservations_status Index Cond: status = 'pending' -> Hash Left Join (actual time=142.000..142.003 rows=2 loops=1) -> [users + workload_share join] -> Hash on admin_stats Planning Time: 1.899 ms Execution Time: 157.080 ms }}} '''After index — full execution plan:''' {{{ Sort (cost=25377.73..25377.75 rows=6 width=180) (actual time=107.891..107.895 rows=2 loops=1) Sort Key: total_reviewed DESC Sort Method: quicksort Memory: 25kB Buffers: shared hit=3054 read=145, temp read=668 written=670 CTE admin_stats -> GroupAggregate (actual time=79.159..96.144 rows=2 loops=1) Group Key: rv_1.approved_by Buffers: shared hit=2 read=145, temp read=668 written=670 -> Incremental Sort (actual time=29.685..71.752 rows=160023 loops=1) Sort Key: rv_1.approved_by, rv_1.resource_id Presorted Key: rv_1.approved_by Full-sort Groups: 2 Sort Method: quicksort Memory: 28kB Pre-sorted Groups: 2 Sort Method: external merge Avg Disk: 2672kB -> Index Only Scan using idx_reservations_approver_status Heap Fetches: 54 Buffers: shared hit=2 read=145 -> Nested Loop (actual time=107.883..107.888 rows=2 loops=1) -> Aggregate (pending_stats) (actual time=11.656..11.656 rows=1 loops=1) -> Bitmap Heap Scan on reservations rv Recheck Cond: status = 'pending' Heap Blocks: exact=3031 -> Bitmap Index Scan on idx_reservations_status Index Cond: status = 'pending' -> Hash Left Join (actual time=96.223..96.227 rows=2 loops=1) -> [users + workload_share join] -> Hash on admin_stats Planning Time: 0.374 ms Execution Time: 109.060 ms }}} '''Key differences:''' ||'''Node'''||'''Before'''||'''After'''|| ||admin_stats scan||'''Seq Scan on reservations''' (200,028 rows read, 40,005 filtered)||'''Index Only Scan using idx_reservations_approver_status''' (160,023 rows, no filter step)|| ||Sort method||'''external merge Disk: 5336kB'''||'''Incremental Sort; Per-sorted groups external merge Avg Disk: 2672kB (halved)'''|| ||Heap access||Full table read||'''Heap Fetches: 54'''|| ||Rows filtered||Rows Removed by Filter: 40005||'''0 (partial index excludes NULL rows by definition)'''|| '''Analysis:''' The sequential scan that read all 200,028 rows and discarded 40,005 NULL-approved_by rows is replaced by an Index Only Scan that only reads the 160,023 indexed rows. The partial index physically excludes NULL rows, so no filter step is needed. The pre-sorted nature of the index reduces the external merge sort from 5336kB to 2672kB. Heap Fetches drop to 54 because status, resource_id, and user_id are in the index and do not require heap lookup. '''Result: 157 ms -> 109 ms (31% faster). Index Only Scan used. Rows-removed-by-filter eliminated. Disk sort halved (5336kB -> 2672kB). Heap Fetches: 54.''' === Performance Summary === ||'''Phase P6 Report'''||'''Before'''||'''After'''||'''Gain'''||'''Key change'''|| ||Report 1 — Resource Utilization (quarterly)||'''44.1 ms'''||'''45.6 ms'''||Scan type improved||'''Bitmap Heap Scan -> Index Only Scan''' in popular_day and peak_hour CTEs; heap blocks reduced from 448 to 35 fetches per CTE|| ||Report 3 — User Activity Analysis||'''316 ms'''||'''199 ms'''||'''37% faster'''||'''Seq Scan + 9032kB disk sort -> Index Only Scan + in-memory Incremental Sort'''|| ||Report 4 — Admin Workload Analysis||'''157 ms'''||'''109 ms'''||'''31% faster'''||'''Seq Scan (40k rows filtered) + 5336kB disk sort -> Index Only Scan + 2672kB disk sort'''|| === Note on Reports That Do Not Benefit from Indexes === Phase P6 Report 2 (Monthly Reservation Trends) without any date filter scans and aggregates the entire reservations table grouped by month. A full-table aggregation cannot be sped up by indexes alone — the planner would still have to read every row. For that report the correct optimization is either (a) to add a date filter (e.g. "last 12 months"), at which point the {{{idx_reservations_start_resource_status}}} index cuts execution time by about 10%, or (b) to pre-compute monthly summaries in a materialized view that is refreshed periodically. The same applies to any dashboard query that aggregates over the whole history. == Security Measures == === SQL Injection Prevention === All database queries in the prototype use '''parameterized queries''' via psycopg2's {{{%s}}} placeholders. User input is never concatenated into SQL strings. Example from {{{uc_reserve.py}}}: {{{ cur.execute(""" INSERT INTO reservations (start_time, end_time, status, purpose, created_at, user_id, resource_id) VALUES (%s, %s, 'pending', %s, CURRENT_TIMESTAMP, %s, %s) RETURNING reservation_id, status, created_at """, (start_dt, end_dt, purpose, user_id, rid)) }}} psycopg2 sends the query template and parameters separately to PostgreSQL. The server parses the template first and only then binds the parameters, so injection is impossible regardless of the input content. === Password Hashing === Passwords are never stored in plain text. The prototype uses '''bcrypt''' with a random per-password salt: {{{ # Registration (uc_users.py) hashed = bcrypt.hashpw(password.encode(), bcrypt.gensalt()).decode() # Login verification (main.py) if bcrypt.checkpw(password.encode(), row[3].encode()): # authenticated }}} bcrypt is a deliberately slow hashing algorithm with a configurable work factor, which resists brute-force attacks. The per-password salt defeats rainbow table attacks. === Role-Based Access Control === Access control is enforced at two independent layers: '''Application layer (main.py):''' the main menu exposes only the actions appropriate for the logged-in user's role. Students can browse; Teaching Staff can also reserve; Administrators see approval, analytics, and user management. {{{ if role == "Teaching Staff": options.append("Make a Resource Reservation") if role == "Administrator": options.append("Approve or Reject Reservations") options.append("View Resource Usage Analytics") options.append("Register a New User") }}} '''Database layer (Phase P7 trigger {{{trg_check_approver_is_admin}}}):''' enforces that the {{{approved_by}}} column can only reference users whose {{{type_name}}} is {{{'Administrator'}}}. Even if the application layer is bypassed (for example, by a direct SQL client), the database rejects any attempt to record a non-admin as the approver. {{{ IF v_type_name != 'Administrator' THEN RAISE EXCEPTION 'Only administrators can approve reservations. User % is "%"', NEW.approved_by, v_type_name; END IF; }}} === Connection Security === Database connections use a '''connection pool''' (from Phase P8) with explicit transaction management. This provides: * '''Resource-exhaustion protection:''' at most 10 concurrent connections, so a runaway process cannot consume all database connection slots. * '''Automatic cleanup:''' connections are always returned to the pool via context managers, preventing connection leaks that could otherwise reach the server limit. * '''Transaction isolation:''' write operations use {{{get_transaction()}}}, which rolls back automatically on any exception and therefore cannot leave the database in a partially-written state.