wiki:P9Performance

Version 1 (modified by 193284, 4 days ago) ( diff )

--

Phase 9 – Database Performance

Overview

Database performance ensures fast, scalable queries as the Wedding Planner system grows. Even in a medium-size database, the number of guests, events and bookings can increase quickly, which leads to slower reports and long waiting times if queries are not optimized.

In this phase we explain practical ways to improve performance in PostgreSQL using indexes, query optimization and query plan analysis (EXPLAIN / EXPLAIN ANALYZE). All examples are connected to our Wedding Planner entities such as weddings, guests, events and RSVPs.

What we cover in this phase

  • Why and when indexes are needed
  • Types of indexes (single-column, composite, partial)
  • Query optimization techniques (filters, joins, selecting only needed columns)
  • Performance analysis using EXPLAIN and EXPLAIN ANALYZE
  • Seq Scan vs Index Scan comparison using real examples
  • Performance examples with statuses: PLANNED / CONFIRMED / CANCELLED

Implementation & Examples

1) Indexes in PostgreSQL

Indexes allow PostgreSQL to find rows faster, without scanning the entire table (full table scan). They are especially useful when:

  • filtering using WHERE
  • joining tables using JOIN
  • sorting results using ORDER BY
  • grouping results using GROUP BY

Basic syntax:

CREATE INDEX idx_name ON table_name(column);

Example 1: Index for guests per wedding

This index improves performance when retrieving all guests for a specific wedding.

Tables affected:

  • guest

SQL:

CREATE INDEX idx_guest_wedding ON guest(wedding_id);

Reason: Guest lists are frequently accessed in the system (wedding dashboard, RSVP pages, attendance lists). Without an index, PostgreSQL may scan the whole guest table.

Example 2: Composite partial index for active events

Instead of indexing all rows (including cancelled ones), we index only active events (PLANNED or CONFIRMED). This makes the index smaller and faster.

Tables affected:

  • event

SQL:

CREATE INDEX idx_event_wedding_status_date_time
ON event (wedding_id, date, start_time)
WHERE status IN ('PLANNED', 'CONFIRMED');

Benefits:

  • smaller index -> faster lookups
  • skips CANCELLED events
  • improves performance for scheduling queries and event timelines

---

2) Query Optimization

Query optimization means writing SQL in a way that helps PostgreSQL return results efficiently, with minimal scanned rows and minimal memory usage.

Key principles:

  • Push filters early (WHERE conditions reduce data before joins)
  • Join in the right order (smaller filtered datasets joined first)
  • Use indexes effectively (WHERE and JOIN should match indexed columns)
  • Avoid SELECT * (select only the needed columns)
  • Use FILTER() in aggregates (cleaner and often faster than CASE WHEN)

Example: Optimized RSVP summary report

This query returns a summary per wedding and counts confirmed/cancelled responses efficiently.

SELECT
    w.wedding_id,
    COUNT(g.guest_id) AS total_guests,
    COUNT(r.status) FILTER (WHERE r.status = 'CONFIRMED') AS confirmed,
    COUNT(r.status) FILTER (WHERE r.status = 'CANCELLED') AS cancelled
FROM wedding w
JOIN guest g ON w.wedding_id = g.wedding_id
LEFT JOIN event_rsvp r ON g.guest_id = r.guest_id
GROUP BY w.wedding_id;

Why this is optimized:

  • FILTER avoids multiple CASE expressions
  • joins only needed tables
  • works better together with indexes (e.g., idx_guest_wedding)

3) Performance Analysis (EXPLAIN / EXPLAIN ANALYZE)

EXPLAIN shows how PostgreSQL plans to execute a query (scan type, join strategy, estimated costs). EXPLAIN ANALYZE executes the query and provides real execution time and row statistics.

Most important things to look for:

  • Seq Scan (slow when table grows)
  • Index Scan (fast, uses index)
  • join algorithms (Hash Join / Nested Loop)
  • execution time before vs after optimization

---

Example: Before/After Index (Seq Scan vs Index Scan)

Before index (Sequential Scan):

EXPLAIN ANALYZE
SELECT *
FROM event
WHERE wedding_id = 3
  AND status IN ('PLANNED', 'CONFIRMED');

Mini output (example):

Seq Scan on event  (cost=0.00..1450.00 rows=120 width=48)
  Filter: ((wedding_id = 3) AND (status = ANY '{PLANNED,CONFIRMED}'::text[]))
Execution Time: 12.334 ms

Explanation:

  • Seq Scan means PostgreSQL scans the entire event table
  • this becomes very slow when events grow in number

After creating partial index:

EXPLAIN ANALYZE
SELECT *
FROM event
WHERE wedding_id = 3
  AND status IN ('PLANNED', 'CONFIRMED');

Mini output (example):

Index Scan using idx_event_wedding_status_date_time on event
  (cost=0.25..12.50 rows=20 width=48)
  Index Cond: (wedding_id = 3)
Execution Time: 0.423 ms

Result:

  • Seq Scan -> Index Scan
  • execution time reduced significantly
  • confirms the index is being used correctly

---

Example: JOIN performance analysis

This query demonstrates how PostgreSQL executes joins when generating reports.

EXPLAIN ANALYZE
SELECT w.wedding_id, COUNT(g.guest_id) AS total_guests
FROM wedding w
JOIN guest g ON w.wedding_id = g.wedding_id
GROUP BY w.wedding_id;

What to look for:

  • join type (Hash Join / Nested Loop)
  • whether the guest index is used (idx_guest_wedding)
  • number of scanned rows

---

Conclusion

Database performance is essential for fast response times and scalable reports in Wedding Planner. Indexes reduce full table scans and enable PostgreSQL to use Index Scan instead of Seq Scan when filtering weddings, guests, and events.

Optimized SQL queries reduce unnecessary work, scanned rows, and memory usage. EXPLAIN ANALYZE is the key tool that confirms whether our indexes and query improvements are actually applied in PostgreSQL. With proper index design and optimized queries, the system remains responsive even as the database grows.

Note: See TracWiki for help on using the wiki.