| Version 9 (modified by , 12 days ago) ( diff ) |
|---|
Performance
1. Adding mock data for bookings
To properly benchmark our database, we first generate a realistic dataset of 1000000 bookings and related entities which.
-- 1. Insert 1,000,000 mock bookings
ALTER TABLE project.bookings DISABLE TRIGGER trg_booking_validation;
INSERT INTO project.bookings (status, date_from, date_to, address, owner_id, sitter_id)
SELECT
(ARRAY['Pending', 'Confirmed', 'Completed', 'Canceled', 'Expired', 'Reviewed'])[floor(random()*6)+1] AS status,
t.dt AS date_from,
t.dt + (random() * INTERVAL '7 days') AS date_to,
'Mock Address ' || gs AS address,
o.user_id AS owner_id,
s.user_id AS sitter_id
FROM generate_series(1, 1000000) gs
JOIN LATERAL (
SELECT user_id FROM project.pet_owners ORDER BY random() LIMIT 1
) o ON true
JOIN LATERAL (
SELECT user_id FROM project.pet_sitters ORDER BY random() LIMIT 1
) s ON true
JOIN LATERAL (
SELECT CURRENT_DATE - (random() * INTERVAL '365 days') AS dt
) t ON true;
ALTER TABLE project.bookings ENABLE TRIGGER trg_booking_validation;
-- 2. Insert relevant mock payments
INSERT INTO project.payments (booking_id, amount, payment_type)
SELECT
b.booking_id,
floor(random() * 100 + 20)::int AS amount,
(ARRAY['Card', 'Cash', 'Bank Transfer'])[floor(random()*3)+1] AS payment_type
FROM project.bookings b
WHERE NOT EXISTS (
SELECT 1 FROM project.payments p WHERE p.booking_id = b.booking_id
);
-- 3. Insert mock reviews for completed bookings
INSERT INTO project.reviews (booking_id, rating, comment)
SELECT
b.booking_id,
floor(random()*5)+1 AS rating,
'Mock Review ' || b.booking_id AS comment
FROM project.bookings b
WHERE b.status = 'Completed'
AND random() > 0.3
AND NOT EXISTS (
SELECT 1 FROM project.reviews r WHERE r.booking_id = b.booking_id
);
-- 4. Insert mock pets
INSERT INTO project.pets (owner_id, name, age, pettype_id)
SELECT
po.user_id,
'Mock Pet ' || po.user_id || '-' || gs AS name,
floor(random()*15)+1 AS age,
pt.pettype_id
FROM project.pet_owners po
CROSS JOIN LATERAL generate_series(1, floor(random()*3 + 1)::int) gs
JOIN LATERAL (
SELECT pettype_id FROM project.pet_types ORDER BY random() LIMIT 1
) pt ON true;
-- 5. Link random services to bookings
INSERT INTO project.booking_services (booking_id, service_id)
SELECT
b.booking_id,
s.service_id
FROM project.bookings b
JOIN LATERAL (
SELECT service_id FROM project.services ORDER BY random() LIMIT 1
) s ON true
WHERE NOT EXISTS (
SELECT 1 FROM project.booking_services bs WHERE bs.booking_id = b.booking_id
);
Benchmark query:
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM project.bookings WHERE sitter_id = (SELECT user_id FROM project.pet_sitters LIMIT 1) ORDER BY date_from DESC;
Execution without indexes:
Gather Merge (cost=81029.64..195548.38 rows=967764 width=147) (actual time=102.092..113.433 rows=2 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=22335
InitPlan 1
-> Limit (cost=0.00..0.02 rows=1 width=90) (actual time=6.105..6.106 rows=1 loops=1)
Buffers: shared hit=1
-> Seq Scan on pet_sitters (cost=0.00..16.90 rows=690 width=90) (actual time=0.028..0.028 rows=1 loops=1)
Buffers: shared hit=1
-> Sort (cost=80029.57..80836.04 rows=322588 width=147) (actual time=65.204..65.206 rows=0 loops=4)
Sort Key: bookings.date_from DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=22334
Worker 0: Sort Method: quicksort Memory: 25kB
Worker 1: Sort Method: quicksort Memory: 25kB
Worker 2: Sort Method: quicksort Memory: 25kB
-> Parallel Seq Scan on bookings (cost=0.00..26255.35 rows=322588 width=147) (actual time=41.358..65.056 rows=0 loops=4)
Filter: ((sitter_id)::text = ((InitPlan 1).col1)::text)
Rows Removed by Filter: 250005
Buffers: shared hit=22223
Planning:
Buffers: shared hit=139 read=1
Planning Time: 1.125 ms
JIT:
Functions: 12
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 1.536 ms (Deform 0.703 ms), Inlining 0.000 ms, Optimization 1.491 ms, Emission 21.760 ms, Total 24.788 ms
Execution Time: 148.520 ms
Average execution time (10 attempts): 145.812ms
Next, we add this index:
CREATE INDEX idx_bookings_sitter_date ON project.bookings (sitter_id, date_from DESC);
(Note, for the exact same performance reasons, we also created this index to optimize the queries when Pet Owners view their own bookings)
CREATE INDEX idx_bookings_owner_date ON project.bookings (owner_id, date_from DESC);
Execution with indexes:
Index Scan using idx_bookings_sitter_date on bookings (cost=0.45..72505.21 rows=1000022 width=147) (actual time=0.101..0.104 rows=2 loops=1)
Index Cond: ((sitter_id)::text = ((InitPlan 1).col1)::text)
Buffers: shared hit=2 read=3
InitPlan 1
-> Limit (cost=0.00..0.02 rows=1 width=90) (actual time=0.040..0.040 rows=1 loops=1)
Buffers: shared hit=1
-> Seq Scan on pet_sitters (cost=0.00..16.90 rows=690 width=90) (actual time=0.038..0.038 rows=1 loops=1)
Buffers: shared hit=1
Planning:
Buffers: shared hit=124 read=5
Planning Time: 0.969 ms
Execution Time: 0.132 ms
Average execution time (10 attempts): 0.125ms
Because the execution time has been massively lowered by bypassing the expensive sequential scan and memory sort, we keep this index.
2. Calculate average sitter rating
Benchmark query:
EXPLAIN (ANALYZE, BUFFERS) SELECT AVG(r.rating) FROM project.reviews r JOIN project.bookings b ON r.booking_id = b.booking_id WHERE b.sitter_id = (SELECT user_id FROM project.pet_sitters LIMIT 1);
Execution without indexes:
Finalize Aggregate (cost=32123.42..32123.43 rows=1 width=32) (actual time=106.390..116.499 rows=1 loops=1)
Buffers: shared hit=24792
InitPlan 1
-> Limit (cost=0.00..0.02 rows=1 width=90) (actual time=0.013..0.014 rows=1 loops=1)
Buffers: shared hit=1
-> Seq Scan on pet_sitters (cost=0.00..16.90 rows=690 width=90) (actual time=0.011..0.012 rows=1 loops=1)
Buffers: shared hit=1
-> Gather (cost=32123.08..32123.39 rows=3 width=32) (actual time=104.969..116.470 rows=4 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=24792
-> Partial Aggregate (cost=31123.08..31123.09 rows=1 width=32) (actual time=75.769..75.773 rows=1 loops=4)
Buffers: shared hit=24791
-> Parallel Hash Join (cost=3926.81..31028.95 rows=37651 width=4) (actual time=62.399..75.767 rows=0 loops=4)
Hash Cond: ((b.booking_id)::text = (r.booking_id)::text)
Buffers: shared hit=24791
-> Parallel Seq Scan on bookings b (cost=0.00..26255.35 rows=322588 width=37) (actual time=40.115..53.479 rows=0 loops=4)
Filter: ((sitter_id)::text = ((InitPlan 1).col1)::text)
Rows Removed by Filter: 250005
Buffers: shared hit=22223
-> Parallel Hash (cost=3068.58..3068.58 rows=68658 width=41) (actual time=21.453..21.454 rows=29180 loops=4)
Buckets: 131072 Batches: 1 Memory Usage: 10208kB
Buffers: shared hit=2382
-> Parallel Seq Scan on reviews r (cost=0.00..3068.58 rows=68658 width=41) (actual time=0.026..7.819 rows=29180 loops=4)
Buffers: shared hit=2382
Planning:
Buffers: shared hit=165 read=2
Planning Time: 1.200 ms
Execution Time: 116.627 ms
Average execution time (10 attempts): 115.143ms
We add this index:
CREATE INDEX idx_reviews_booking_id ON project.reviews (booking_id);
Execution with indexes:
Finalize Aggregate (cost=32123.42..32123.43 rows=1 width=32) (actual time=102.850..113.265 rows=1 loops=1)
Buffers: shared hit=24792
InitPlan 1
-> Limit (cost=0.00..0.02 rows=1 width=90) (actual time=0.014..0.016 rows=1 loops=1)
Buffers: shared hit=1
-> Seq Scan on pet_sitters (cost=0.00..16.90 rows=690 width=90) (actual time=0.013..0.013 rows=1 loops=1)
Buffers: shared hit=1
-> Gather (cost=32123.08..32123.39 rows=3 width=32) (actual time=101.457..113.235 rows=4 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=24792
-> Partial Aggregate (cost=31123.08..31123.09 rows=1 width=32) (actual time=72.199..72.202 rows=1 loops=4)
Buffers: shared hit=24791
-> Parallel Hash Join (cost=3926.81..31028.95 rows=37651 width=4) (actual time=59.642..72.196 rows=0 loops=4)
Hash Cond: ((b.booking_id)::text = (r.booking_id)::text)
Buffers: shared hit=24791
-> Parallel Seq Scan on bookings b (cost=0.00..26255.35 rows=322588 width=37) (actual time=37.082..49.436 rows=0 loops=4)
Filter: ((sitter_id)::text = ((InitPlan 1).col1)::text)
Rows Removed by Filter: 250005
Buffers: shared hit=22223
-> Parallel Hash (cost=3068.58..3068.58 rows=68658 width=41) (actual time=21.936..21.937 rows=29180 loops=4)
Buckets: 131072 Batches: 1 Memory Usage: 10240kB
Buffers: shared hit=2382
-> Parallel Seq Scan on reviews r (cost=0.00..3068.58 rows=68658 width=41) (actual time=0.020..7.757 rows=29180 loops=4)
Buffers: shared hit=2382
Planning:
Buffers: shared hit=78 read=6
Planning Time: 1.082 ms
Execution Time: 113.336 ms
Average execution time (10 attempts): 114.288ms
There is practically no change in the execution time. Because the index is not utilized for this query and only adds overhead during INSERT and UPDATE operations, we will not keep this index.
DROP INDEX IF EXISTS project.idx_reviews_booking_id;
3. Sitter Performance Analytics (Phase 6 Query)
Benchmark query:
EXPLAIN (ANALYZE, BUFFERS)
WITH params AS (
SELECT
(CURRENT_DATE - INTERVAL '1 year')::DATE AS start_date,
CURRENT_DATE::DATE AS end_date
),
sitter_stats AS (
SELECT
b.sitter_id,
COUNT(b.booking_id) AS total_bookings,
COUNT(b.booking_id) FILTER (WHERE b.status = 'Completed') AS completed_bookings,
COUNT(b.booking_id) FILTER (WHERE b.status IN ('Canceled', 'Rejected')) AS missed_bookings
FROM project.bookings b
JOIN params p ON b.date_from >= p.start_date AND b.date_from < p.end_date
GROUP BY b.sitter_id
),
sitter_reports AS (
SELECT
b.sitter_id,
SUM(pay.amount) AS total_revenue
FROM project.bookings b
JOIN project.payments pay ON b.booking_id = pay.booking_id
JOIN params p ON b.date_from >= p.start_date AND b.date_from < p.end_date
WHERE b.status = 'Completed'
GROUP BY b.sitter_id
),
sitter_ratings AS (
SELECT
b.sitter_id,
AVG(r.rating)::numeric(10,2) AS avg_rating,
COUNT(r.review_id) AS total_reviews
FROM project.bookings b
JOIN project.reviews r ON b.booking_id = r.booking_id
JOIN params p ON b.date_from >= p.start_date AND b.date_from < p.end_date
GROUP BY b.sitter_id
)
SELECT
u.user_id, u.username, u.first_name, u.last_name,
COALESCE(ss.total_bookings, 0) AS total_bookings,
COALESCE(ss.completed_bookings, 0) AS completed_bookings,
COALESCE(ss.missed_bookings, 0) AS missed_bookings,
COALESCE(sr.avg_rating, 0) AS avg_rating,
COALESCE(sr.total_reviews, 0) AS total_reviews,
COALESCE(sf.total_revenue, 0) AS total_revenue,
DENSE_RANK() OVER (
ORDER BY (
COALESCE(sf.total_revenue, 0) * 0.5
+ COALESCE(ss.completed_bookings, 0) * 10
+ COALESCE(sr.avg_rating, 0) * 15
- COALESCE(ss.missed_bookings, 0) * 5
) DESC
) AS sitter_rank
FROM project.users u
JOIN project.pet_sitters ps ON u.user_id = ps.user_id
LEFT JOIN sitter_stats ss ON ss.sitter_id = ps.user_id
LEFT JOIN sitter_reports sf ON sf.sitter_id = ps.user_id
LEFT JOIN sitter_ratings sr ON sr.sitter_id = ps.user_id
WHERE COALESCE(ss.total_bookings, 0) > 0
ORDER BY sitter_rank
LIMIT 10;
Execution without indexes:
Limit (cost=154247.04..154247.06 rows=10 width=176) (actual time=2600.384..2611.703 rows=4 loops=1)
Buffers: shared hit=735904 read=1
CTE params
-> Result (cost=0.00..0.02 rows=1 width=8) (actual time=0.012..0.013 rows=1 loops=1)
-> Sort (cost=154247.02..154247.59 rows=230 width=176) (actual time=2544.398..2555.714 rows=4 loops=1)
Sort Key: (dense_rank() OVER (?))
Sort Method: quicksort Memory: 26kB
Buffers: shared hit=735904 read=1
-> WindowAgg (cost=154232.29..154242.05 rows=230 width=176) (actual time=2544.338..2555.668 rows=4 loops=1)
Buffers: shared hit=735901 read=1
-> Sort (cost=154232.27..154232.85 rows=230 width=152) (actual time=2544.310..2555.625 rows=4 loops=1)
Sort Key: ((((((COALESCE(sf.total_revenue, '0'::bigint))::numeric * 0.5) + ((COALESCE(ss.completed_bookings, '0'::bigint) * 10))::numeric) + (COALESCE(sr.avg_rating, '0'::numeric) * '15'::numeric)) - ((COALESCE(ss.missed_bookings, '0'::bigint) * 5))::numeric)) DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=735901 read=1
-> Hash Join (cost=154197.60..154223.25 rows=230 width=152) (actual time=2544.282..2555.608 rows=4 loops=1)
Hash Cond: ((u.user_id)::text = (ps.user_id)::text)
Buffers: shared hit=735898 read=1
-> Seq Scan on users u (cost=0.00..15.80 rows=580 width=64) (actual time=0.041..0.045 rows=12 loops=1)
Buffers: shared read=1
-> Hash (cost=154194.72..154194.72 rows=230 width=146) (actual time=2544.166..2555.480 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=735898
-> Merge Left Join (cost=154190.07..154194.72 rows=230 width=146) (actual time=2544.131..2555.465 rows=4 loops=1)
Merge Cond: ((ps.user_id)::text = (sr.sitter_id)::text)
Buffers: shared hit=735898
-> Merge Left Join (cost=109447.58..109451.64 rows=230 width=122) (actual time=2328.303..2328.329 rows=4 loops=1)
Merge Cond: ((ps.user_id)::text = (sf.sitter_id)::text)
Buffers: shared hit=711107
-> Merge Left Join (cost=48939.75..48943.22 rows=230 width=114) (actual time=724.264..724.279 rows=4 loops=1)
Merge Cond: ((ps.user_id)::text = (ss.sitter_id)::text)
Filter: (COALESCE(ss.total_bookings, '0'::bigint) > 0)
Buffers: shared hit=22224
-> Sort (cost=49.44..51.16 rows=690 width=90) (actual time=0.018..0.021 rows=4 loops=1)
Sort Key: ps.user_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=1
-> Seq Scan on pet_sitters ps (cost=0.00..16.90 rows=690 width=90) (actual time=0.008..0.009 rows=4 loops=1)
Buffers: shared hit=1
-> Sort (cost=48890.31..48890.32 rows=1 width=61) (actual time=724.208..724.212 rows=4 loops=1)
Sort Key: ss.sitter_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=22223
-> Subquery Scan on ss (cost=48890.28..48890.30 rows=1 width=61) (actual time=724.190..724.196 rows=4 loops=1)
Buffers: shared hit=22223
-> HashAggregate (cost=48890.28..48890.29 rows=1 width=61) (actual time=724.182..724.187 rows=4 loops=1)
Group Key: b.sitter_id
Batches: 1 Memory Usage: 24kB
Buffers: shared hit=22223
-> Nested Loop (cost=0.00..47223.57 rows=111114 width=83) (actual time=0.044..359.696 rows=1000017 loops=1)
Join Filter: ((b.date_from >= p.start_date) AND (b.date_from < p.end_date))
Rows Removed by Join Filter: 5
Buffers: shared hit=22223
-> CTE Scan on params p (cost=0.00..0.02 rows=1 width=8) (actual time=0.015..0.017 rows=1 loops=1)
-> Seq Scan on bookings b (cost=0.00..32223.22 rows=1000022 width=87) (actual time=0.013..106.436 rows=1000022 loops=1)
Buffers: shared hit=22223
-> Sort (cost=60507.84..60507.84 rows=1 width=45) (actual time=1604.016..1604.022 rows=2 loops=1)
Sort Key: sf.sitter_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=688883
-> Subquery Scan on sf (cost=60507.81..60507.83 rows=1 width=45) (actual time=1603.989..1603.995 rows=2 loops=1)
Buffers: shared hit=688883
-> HashAggregate (cost=60507.81..60507.82 rows=1 width=45) (actual time=1603.982..1603.986 rows=2 loops=1)
Group Key: b_1.sitter_id
Batches: 1 Memory Usage: 24kB
Buffers: shared hit=688883
-> Nested Loop (cost=0.42..60413.23 rows=18915 width=41) (actual time=0.129..1550.156 rows=166665 loops=1)
Buffers: shared hit=688883
-> Nested Loop (cost=0.00..37276.85 rows=18915 width=74) (actual time=0.043..232.987 rows=166665 loops=1)
Join Filter: ((b_1.date_from >= p_1.start_date) AND (b_1.date_from < p_1.end_date))
Rows Removed by Join Filter: 3
Buffers: shared hit=22223
-> CTE Scan on params p_1 (cost=0.00..0.02 rows=1 width=8) (actual time=0.001..0.003 rows=1 loops=1)
-> Seq Scan on bookings b_1 (cost=0.00..34723.28 rows=170237 width=78) (actual time=0.024..189.504 rows=166668 loops=1)
Filter: ((status)::text = 'Completed'::text)
Rows Removed by Filter: 833354
Buffers: shared hit=22223
-> Index Scan using payments_booking_id_key on payments pay (cost=0.42..1.22 rows=1 width=41) (actual time=0.007..0.007 rows=1 loops=166665)
Index Cond: ((booking_id)::text = (b_1.booking_id)::text)
Buffers: shared hit=666660
-> Sort (cost=44742.49..44742.49 rows=1 width=61) (actual time=215.803..227.106 rows=2 loops=1)
Sort Key: sr.sitter_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=24791
-> Subquery Scan on sr (cost=44742.45..44742.48 rows=1 width=61) (actual time=215.773..227.079 rows=2 loops=1)
Buffers: shared hit=24791
-> HashAggregate (cost=44742.45..44742.47 rows=1 width=61) (actual time=215.766..227.070 rows=2 loops=1)
Group Key: b_2.sitter_id
Batches: 1 Memory Usage: 24kB
Buffers: shared hit=24791
-> Nested Loop (cost=4926.81..44645.18 rows=12969 width=78) (actual time=68.121..195.697 rows=116717 loops=1)
Join Filter: ((b_2.date_from >= p_2.start_date) AND (b_2.date_from < p_2.end_date))
Rows Removed by Join Filter: 2
Buffers: shared hit=24791
-> CTE Scan on params p_2 (cost=0.00..0.02 rows=1 width=8) (actual time=0.001..0.004 rows=1 loops=1)
-> Gather (cost=4926.81..42894.38 rows=116719 width=82) (actual time=68.087..176.699 rows=116719 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=24791
-> Parallel Hash Join (cost=3926.81..30222.48 rows=37651 width=82) (actual time=36.517..159.380 rows=29180 loops=4)
Hash Cond: ((b_2.booking_id)::text = (r.booking_id)::text)
Buffers: shared hit=24791
-> Parallel Seq Scan on bookings b_2 (cost=0.00..25448.88 rows=322588 width=78) (actual time=0.012..34.092 rows=250006 loops=4)
Buffers: shared hit=22223
-> Parallel Hash (cost=3068.58..3068.58 rows=68658 width=78) (actual time=35.713..35.714 rows=29180 loops=4)
Buckets: 131072 Batches: 1 Memory Usage: 13856kB
Buffers: shared hit=2382
-> Parallel Seq Scan on reviews r (cost=0.00..3068.58 rows=68658 width=78) (actual time=9.315..16.953 rows=29180 loops=4)
Buffers: shared hit=2382
Planning:
Buffers: shared hit=227 read=5
Planning Time: 3.010 ms
JIT:
Functions: 115
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 9.518 ms (Deform 4.386 ms), Inlining 0.000 ms, Optimization 4.262 ms, Emission 89.334 ms, Total 103.115 ms
Execution Time: 2619.146 ms
Execution time: 2619.146 ms
We attempt to optimize this analytical query by introducing multiple foreign key indexes:
CREATE INDEX idx_bookings_date_sitter ON project.bookings (date_from, sitter_id); CREATE INDEX idx_bookings_date_owner ON project.bookings (date_from, owner_id); CREATE INDEX idx_payments_booking_id ON project.payments (booking_id); CREATE INDEX idx_booking_services_booking_id ON project.booking_services (booking_id); CREATE INDEX idx_pets_owner_id ON project.pets (owner_id);
Execution with indexes:
Limit (cost=110101.62..110101.65 rows=10 width=176) (actual time=3500.078..3500.100 rows=4 loops=1)
Buffers: shared hit=738392, temp read=6439 written=6439
CTE params
-> Result (cost=0.00..0.02 rows=1 width=8) (actual time=0.027..0.028 rows=1 loops=1)
-> Sort (cost=110101.60..110102.18 rows=230 width=176) (actual time=3442.513..3442.532 rows=4 loops=1)
Sort Key: (dense_rank() OVER (?))
Sort Method: quicksort Memory: 26kB
Buffers: shared hit=738392, temp read=6439 written=6439
-> WindowAgg (cost=110086.87..110096.63 rows=230 width=176) (actual time=3442.451..3442.485 rows=4 loops=1)
Buffers: shared hit=738389, temp read=6439 written=6439
-> Sort (cost=110086.86..110087.43 rows=230 width=152) (actual time=3442.434..3442.453 rows=4 loops=1)
Sort Key: ((((((COALESCE(sf.total_revenue, '0'::bigint))::numeric * 0.5) + ((COALESCE(ss.completed_bookings, '0'::bigint) * 10))::numeric) + (COALESCE(sr.avg_rating, '0'::numeric) * '15'::numeric)) - ((COALESCE(ss.missed_bookings, '0'::bigint) * 5))::numeric)) DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=738389, temp read=6439 written=6439
-> Hash Join (cost=110052.18..110077.83 rows=230 width=152) (actual time=3442.406..3442.435 rows=4 loops=1)
Hash Cond: ((u.user_id)::text = (ps.user_id)::text)
Buffers: shared hit=738386, temp read=6439 written=6439
-> Seq Scan on users u (cost=0.00..15.80 rows=580 width=64) (actual time=0.015..0.020 rows=12 loops=1)
Buffers: shared hit=1
-> Hash (cost=110049.30..110049.30 rows=230 width=146) (actual time=3442.349..3442.367 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=738385, temp read=6439 written=6439
-> Merge Left Join (cost=110044.65..110049.30 rows=230 width=146) (actual time=3442.306..3442.342 rows=4 loops=1)
Merge Cond: ((ps.user_id)::text = (sr.sitter_id)::text)
Buffers: shared hit=738385, temp read=6439 written=6439
-> Merge Left Join (cost=72317.28..72321.34 rows=230 width=122) (actual time=2386.271..2386.298 rows=4 loops=1)
Merge Cond: ((ps.user_id)::text = (sf.sitter_id)::text)
Buffers: shared hit=712889
-> Merge Left Join (cost=28260.39..28263.86 rows=230 width=114) (actual time=797.385..797.401 rows=4 loops=1)
Merge Cond: ((ps.user_id)::text = (ss.sitter_id)::text)
Filter: (COALESCE(ss.total_bookings, '0'::bigint) > 0)
Buffers: shared hit=23115
-> Sort (cost=49.44..51.16 rows=690 width=90) (actual time=0.024..0.026 rows=4 loops=1)
Sort Key: ps.user_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=1
-> Seq Scan on pet_sitters ps (cost=0.00..16.90 rows=690 width=90) (actual time=0.013..0.014 rows=4 loops=1)
Buffers: shared hit=1
-> Sort (cost=28210.95..28210.96 rows=1 width=61) (actual time=797.322..797.327 rows=4 loops=1)
Sort Key: ss.sitter_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=23114
-> Subquery Scan on ss (cost=28210.92..28210.94 rows=1 width=61) (actual time=797.304..797.311 rows=4 loops=1)
Buffers: shared hit=23114
-> HashAggregate (cost=28210.92..28210.93 rows=1 width=61) (actual time=797.297..797.302 rows=4 loops=1)
Group Key: b.sitter_id
Batches: 1 Memory Usage: 24kB
Buffers: shared hit=23114
-> Nested Loop (cost=1543.34..26544.21 rows=111114 width=83) (actual time=43.375..439.128 rows=1000017 loops=1)
Buffers: shared hit=23114
-> CTE Scan on params p (cost=0.00..0.02 rows=1 width=8) (actual time=0.030..0.033 rows=1 loops=1)
-> Bitmap Heap Scan on bookings b (cost=1543.34..25433.05 rows=111114 width=87) (actual time=43.324..251.336 rows=1000017 loops=1)
Recheck Cond: ((date_from >= p.start_date) AND (date_from < p.end_date))
Heap Blocks: exact=22223
Buffers: shared hit=23114
-> Bitmap Index Scan on idx_bookings_date_owner (cost=0.00..1515.57 rows=111114 width=0) (actual time=38.673..38.674 rows=1000017 loops=1)
Index Cond: ((date_from >= p.start_date) AND (date_from < p.end_date))
Buffers: shared hit=891
-> Sort (cost=44056.89..44056.90 rows=1 width=45) (actual time=1588.861..1588.866 rows=2 loops=1)
Sort Key: sf.sitter_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=689774
-> Subquery Scan on sf (cost=44056.86..44056.88 rows=1 width=45) (actual time=1588.837..1588.844 rows=2 loops=1)
Buffers: shared hit=689774
-> HashAggregate (cost=44056.86..44056.87 rows=1 width=45) (actual time=1588.829..1588.834 rows=2 loops=1)
Group Key: b_1.sitter_id
Batches: 1 Memory Usage: 24kB
Buffers: shared hit=689774
-> Nested Loop (cost=1520.72..43962.29 rows=18915 width=41) (actual time=41.086..1537.591 rows=166665 loops=1)
Buffers: shared hit=689774
-> Nested Loop (cost=1520.29..25876.96 rows=18915 width=74) (actual time=41.016..285.706 rows=166665 loops=1)
Buffers: shared hit=23114
-> CTE Scan on params p_1 (cost=0.00..0.02 rows=1 width=8) (actual time=0.001..0.003 rows=1 loops=1)
-> Bitmap Heap Scan on bookings b_1 (cost=1520.29..25687.79 rows=18915 width=78) (actual time=41.004..254.671 rows=166665 loops=1)
Recheck Cond: ((date_from >= p_1.start_date) AND (date_from < p_1.end_date))
Filter: ((status)::text = 'Completed'::text)
Rows Removed by Filter: 833352
Heap Blocks: exact=22223
Buffers: shared hit=23114
-> Bitmap Index Scan on idx_bookings_date_owner (cost=0.00..1515.57 rows=111114 width=0) (actual time=36.387..36.387 rows=1000017 loops=1)
Index Cond: ((date_from >= p_1.start_date) AND (date_from < p_1.end_date))
Buffers: shared hit=891
-> Index Scan using idx_payments_booking_id on payments pay (cost=0.42..0.96 rows=1 width=41) (actual time=0.007..0.007 rows=1 loops=166665)
Index Cond: ((booking_id)::text = (b_1.booking_id)::text)
Buffers: shared hit=666660
-> Sort (cost=37727.37..37727.37 rows=1 width=61) (actual time=1056.010..1056.015 rows=2 loops=1)
Sort Key: sr.sitter_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=25496, temp read=6439 written=6439
-> Subquery Scan on sr (cost=37727.33..37727.36 rows=1 width=61) (actual time=1055.985..1055.994 rows=2 loops=1)
Buffers: shared hit=25496, temp read=6439 written=6439
-> HashAggregate (cost=37727.33..37727.35 rows=1 width=61) (actual time=1055.977..1055.984 rows=2 loops=1)
Group Key: b_2.sitter_id
Batches: 1 Memory Usage: 24kB
Buffers: shared hit=25496, temp read=6439 written=6439
-> Hash Join (cost=8033.52..37630.07 rows=12969 width=78) (actual time=121.902..1017.507 rows=116717 loops=1)
Hash Cond: ((b_2.booking_id)::text = (r.booking_id)::text)
Buffers: shared hit=25496, temp read=6439 written=6439
-> Nested Loop (cost=1543.34..26544.21 rows=111114 width=74) (actual time=40.177..418.927 rows=1000017 loops=1)
Buffers: shared hit=23114
-> CTE Scan on params p_2 (cost=0.00..0.02 rows=1 width=8) (actual time=0.001..0.003 rows=1 loops=1)
-> Bitmap Heap Scan on bookings b_2 (cost=1543.34..25433.05 rows=111114 width=78) (actual time=40.156..222.658 rows=1000017 loops=1)
Recheck Cond: ((date_from >= p_2.start_date) AND (date_from < p_2.end_date))
Heap Blocks: exact=22223
Buffers: shared hit=23114
-> Bitmap Index Scan on idx_bookings_date_owner (cost=0.00..1515.57 rows=111114 width=0) (actual time=35.548..35.548 rows=1000017 loops=1)
Index Cond: ((date_from >= p_2.start_date) AND (date_from < p_2.end_date))
Buffers: shared hit=891
-> Hash (cost=3549.19..3549.19 rows=116719 width=78) (actual time=81.514..81.515 rows=116719 loops=1)
Buckets: 131072 Batches: 2 Memory Usage: 7251kB
Buffers: shared hit=2382, temp written=702
-> Seq Scan on reviews r (cost=0.00..3549.19 rows=116719 width=78) (actual time=0.025..30.472 rows=116719 loops=1)
Buffers: shared hit=2382
Planning:
Buffers: shared hit=668
Planning Time: 5.315 ms
JIT:
Functions: 81
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 6.369 ms (Deform 2.765 ms), Inlining 0.000 ms, Optimization 2.281 ms, Emission 55.664 ms, Total 64.314 ms
Execution Time: 3540.806 ms
Execution time: 3540.806 ms
Instead of improving performance, adding these indexes actually increased the execution time by nearly 1 second. The indexes are also used in Scenario 4 below.
4. Highest Paying Customers Analytics (Phase 6 Query)
Benchmark query:
EXPLAIN (ANALYZE, BUFFERS)
WITH params AS (
SELECT
(CURRENT_DATE - INTERVAL '1 year')::DATE AS start_date,
CURRENT_DATE::DATE AS end_date
),
owner_reports AS (
SELECT
b.owner_id,
SUM(pay.amount) AS total_profit_generated,
COUNT(DISTINCT b.booking_id) AS successful_bookings
FROM project.bookings b
JOIN project.payments pay ON b.booking_id = pay.booking_id
JOIN params p ON b.date_from >= p.start_date AND b.date_from < p.end_date
WHERE b.status = 'Completed'
GROUP BY b.owner_id
),
service_counts AS (
SELECT
b.owner_id,
s.type AS service_type,
COUNT(bs.service_id) AS times_booked,
ROW_NUMBER() OVER(PARTITION BY b.owner_id ORDER BY COUNT(bs.service_id) DESC) as rank_num
FROM project.bookings b
JOIN project.booking_services bs ON b.booking_id = bs.booking_id
JOIN project.services s ON bs.service_id = s.service_id
JOIN params p ON b.date_from >= p.start_date AND b.date_from < p.end_date
GROUP BY b.owner_id, s.type
),
favorite_service AS (
SELECT owner_id, service_type AS top_interest
FROM service_counts
WHERE rank_num = 1
),
pet_portfolio AS (
SELECT owner_id, COUNT(pet_id) AS registered_pets
FROM project.pets
GROUP BY owner_id
)
SELECT
u.user_id, u.first_name, u.last_name,
COALESCE(ofin.successful_bookings, 0) AS successful_bookings,
COALESCE(pp.registered_pets, 0) AS total_pets,
COALESCE(fs.top_interest, 'Unknown') AS top_interest,
COALESCE(ofin.total_profit_generated, 0) AS total_profit_generated,
DENSE_RANK() OVER (
ORDER BY
COALESCE(ofin.total_profit_generated, 0) DESC,
COALESCE(ofin.successful_bookings, 0) DESC
) AS customer_rank
FROM project.users u
JOIN project.pet_owners po ON u.user_id = po.user_id
LEFT JOIN owner_reports ofin ON po.user_id = ofin.owner_id
LEFT JOIN favorite_service fs ON po.user_id = fs.owner_id
LEFT JOIN pet_portfolio pp ON po.user_id = pp.owner_id
WHERE COALESCE(ofin.successful_bookings, 0) > 0
ORDER BY customer_rank
LIMIT 10;
Execution without indexes:
Limit (cost=170067.13..170067.15 rows=10 width=115) (actual time=5322.701..5322.720 rows=2 loops=1)
Buffers: shared hit=726269, temp read=26853 written=26858
CTE params
-> Result (cost=0.00..0.02 rows=1 width=8) (actual time=0.013..0.014 rows=1 loops=1)
-> Sort (cost=170067.11..170067.68 rows=230 width=115) (actual time=5264.981..5264.999 rows=2 loops=1)
Sort Key: (dense_rank() OVER (?))
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=726269, temp read=26853 written=26858
-> WindowAgg (cost=170057.56..170062.14 rows=230 width=115) (actual time=5264.923..5264.951 rows=2 loops=1)
Buffers: shared hit=726266, temp read=26853 written=26858
-> Sort (cost=170057.54..170058.11 rows=230 width=349) (actual time=5264.888..5264.905 rows=2 loops=1)
Sort Key: (COALESCE((sum(pay.amount)), '0'::bigint)) DESC, (COALESCE((count(DISTINCT b.booking_id)), '0'::bigint)) DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=726266, temp read=26853 written=26858
-> Hash Join (cost=169840.97..170048.52 rows=230 width=349) (actual time=5049.927..5264.888 rows=2 loops=1)
Hash Cond: ((po.user_id)::text = (u.user_id)::text)
Buffers: shared hit=726266, temp read=26853 written=26858
-> Merge Left Join (cost=169817.92..170024.86 rows=230 width=388) (actual time=5049.843..5264.798 rows=2 loops=1)
Merge Cond: ((po.user_id)::text = (pets.owner_id)::text)
Buffers: shared hit=726265, temp read=26853 written=26858
-> Merge Left Join (cost=169816.17..170021.77 rows=230 width=380) (actual time=5049.710..5264.655 rows=2 loops=1)
Merge Cond: ((po.user_id)::text = (service_counts.owner_id)::text)
Buffers: shared hit=726264, temp read=26853 written=26858
-> Merge Left Join (cost=61806.32..61998.99 rows=230 width=106) (actual time=2266.934..2481.844 rows=2 loops=1)
Merge Cond: ((po.user_id)::text = (b.owner_id)::text)
Filter: (COALESCE((count(DISTINCT b.booking_id)), '0'::bigint) > 0)
Rows Removed by Filter: 5
Buffers: shared hit=688884, temp read=1858 written=1863
-> Sort (cost=49.44..51.16 rows=690 width=90) (actual time=0.040..0.044 rows=7 loops=1)
Sort Key: po.user_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=1
-> Seq Scan on pet_owners po (cost=0.00..16.90 rows=690 width=90) (actual time=0.027..0.028 rows=7 loops=1)
Buffers: shared hit=1
-> Materialize (cost=61756.88..61946.08 rows=2 width=53) (actual time=2266.857..2481.755 rows=2 loops=1)
Buffers: shared hit=688883, temp read=1858 written=1863
-> GroupAggregate (cost=61756.88..61946.05 rows=2 width=53) (actual time=2266.852..2481.746 rows=2 loops=1)
Group Key: b.owner_id
Buffers: shared hit=688883, temp read=1858 written=1863
-> Sort (cost=61756.88..61804.17 rows=18915 width=78) (actual time=2266.784..2442.950 rows=166665 loops=1)
Sort Key: b.owner_id, b.booking_id
Sort Method: external merge Disk: 14864kB
Buffers: shared hit=688883, temp read=1858 written=1863
-> Nested Loop (cost=0.42..60413.23 rows=18915 width=78) (actual time=0.185..1616.303 rows=166665 loops=1)
Buffers: shared hit=688883
-> Nested Loop (cost=0.00..37276.85 rows=18915 width=74) (actual time=0.057..274.855 rows=166665 loops=1)
Join Filter: ((b.date_from >= p.start_date) AND (b.date_from < p.end_date))
Rows Removed by Join Filter: 3
Buffers: shared hit=22223
-> CTE Scan on params p (cost=0.00..0.02 rows=1 width=8) (actual time=0.015..0.017 rows=1 loops=1)
-> Seq Scan on bookings b (cost=0.00..34723.28 rows=170237 width=78) (actual time=0.026..235.609 rows=166668 loops=1)
Filter: ((status)::text = 'Completed'::text)
Rows Removed by Filter: 833354
Buffers: shared hit=22223
-> Index Scan using payments_booking_id_key on payments pay (cost=0.42..1.22 rows=1 width=41) (actual time=0.008..0.008 rows=1 loops=166665)
Index Cond: ((booking_id)::text = (b.booking_id)::text)
Buffers: shared hit=666660
-> Materialize (cost=108009.85..108022.19 rows=2 width=311) (actual time=2782.752..2782.780 rows=2 loops=1)
Buffers: shared hit=37380, temp read=24995 written=24995
-> Subquery Scan on service_counts (cost=108009.85..108022.18 rows=2 width=311) (actual time=2782.740..2782.766 rows=2 loops=1)
Filter: (service_counts.rank_num = 1)
Buffers: shared hit=37380, temp read=24995 written=24995
-> WindowAgg (cost=108009.85..108017.43 rows=380 width=327) (actual time=2782.732..2782.756 rows=2 loops=1)
Run Condition: (row_number() OVER (?) <= 1)
Buffers: shared hit=37380, temp read=24995 written=24995
-> Sort (cost=108009.83..108010.78 rows=380 width=319) (actual time=2782.709..2782.717 rows=5 loops=1)
Sort Key: b_1.owner_id, (count(bs.service_id)) DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=37380, temp read=24995 written=24995
-> HashAggregate (cost=107989.75..107993.55 rows=380 width=319) (actual time=2782.681..2782.692 rows=5 loops=1)
Group Key: b_1.owner_id, s.type
Batches: 1 Memory Usage: 37kB
Buffers: shared hit=37380, temp read=24995 written=24995
-> Hash Join (cost=50037.77..107156.39 rows=111114 width=348) (actual time=792.873..2447.981 rows=1000017 loops=1)
Hash Cond: ((bs.service_id)::text = (s.service_id)::text)
Buffers: shared hit=37380, temp read=24995 written=24995
-> Hash Join (cost=50023.50..106843.94 rows=111114 width=74) (actual time=792.796..2158.768 rows=1000017 loops=1)
Hash Cond: ((bs.booking_id)::text = (b_1.booking_id)::text)
Buffers: shared hit=37379, temp read=24995 written=24995
-> Seq Scan on booking_services bs (cost=0.00..25156.22 rows=1000022 width=74) (actual time=0.034..182.963 rows=1000022 loops=1)
Buffers: shared hit=15156
-> Hash (cost=47223.57..47223.57 rows=111114 width=74) (actual time=789.943..789.945 rows=1000017 loops=1)
Buckets: 131072 (originally 131072) Batches: 16 (originally 2) Memory Usage: 7487kB
Buffers: shared hit=22223, temp written=10750
-> Nested Loop (cost=0.00..47223.57 rows=111114 width=74) (actual time=0.034..347.150 rows=1000017 loops=1)
Join Filter: ((b_1.date_from >= p_1.start_date) AND (b_1.date_from < p_1.end_date))
Rows Removed by Join Filter: 5
Buffers: shared hit=22223
-> CTE Scan on params p_1 (cost=0.00..0.02 rows=1 width=8) (actual time=0.002..0.004 rows=1 loops=1)
-> Seq Scan on bookings b_1 (cost=0.00..32223.22 rows=1000022 width=78) (actual time=0.010..102.527 rows=1000022 loops=1)
Buffers: shared hit=22223
-> Hash (cost=11.90..11.90 rows=190 width=364) (actual time=0.044..0.045 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=1
-> Seq Scan on services s (cost=0.00..11.90 rows=190 width=364) (actual time=0.031..0.033 rows=4 loops=1)
Buffers: shared hit=1
-> GroupAggregate (cost=1.75..2.15 rows=23 width=98) (actual time=0.093..0.104 rows=6 loops=1)
Group Key: pets.owner_id
Buffers: shared hit=1
-> Sort (cost=1.75..1.81 rows=23 width=180) (actual time=0.068..0.071 rows=16 loops=1)
Sort Key: pets.owner_id
Sort Method: quicksort Memory: 26kB
Buffers: shared hit=1
-> Seq Scan on pets (cost=0.00..1.23 rows=23 width=180) (actual time=0.030..0.034 rows=16 loops=1)
Buffers: shared hit=1
-> Hash (cost=15.80..15.80 rows=580 width=51) (actual time=0.057..0.058 rows=12 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=1
-> Seq Scan on users u (cost=0.00..15.80 rows=580 width=51) (actual time=0.033..0.036 rows=12 loops=1)
Buffers: shared hit=1
Planning:
Buffers: shared hit=570
Planning Time: 4.934 ms
JIT:
Functions: 85
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 6.167 ms (Deform 2.815 ms), Inlining 0.000 ms, Optimization 2.209 ms, Emission 55.891 ms, Total 64.267 ms
Execution Time: 5367.798 ms
Execution time: 5367.798 ms
Execution with the indexes we created in Scenario 3:
Limit (cost=133535.97..133535.99 rows=10 width=115) (actual time=11115.289..11115.307 rows=2 loops=1)
Buffers: shared hit=4705758 read=7205, temp read=1858 written=1863
CTE params
-> Result (cost=0.00..0.02 rows=1 width=8) (actual time=0.013..0.014 rows=1 loops=1)
-> Sort (cost=133535.95..133536.52 rows=230 width=115) (actual time=11060.413..11060.429 rows=2 loops=1)
Sort Key: (dense_rank() OVER (?))
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=4705758 read=7205, temp read=1858 written=1863
-> WindowAgg (cost=133526.40..133530.98 rows=230 width=115) (actual time=11060.340..11060.367 rows=2 loops=1)
Buffers: shared hit=4705755 read=7205, temp read=1858 written=1863
-> Sort (cost=133526.38..133526.95 rows=230 width=349) (actual time=11060.310..11060.326 rows=2 loops=1)
Sort Key: (COALESCE((sum(pay.amount)), '0'::bigint)) DESC, (COALESCE((count(DISTINCT b.booking_id)), '0'::bigint)) DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=4705755 read=7205, temp read=1858 written=1863
-> Hash Join (cost=133310.05..133517.35 rows=230 width=349) (actual time=10845.768..11060.317 rows=2 loops=1)
Hash Cond: ((po.user_id)::text = (u.user_id)::text)
Buffers: shared hit=4705755 read=7205, temp read=1858 written=1863
-> Merge Left Join (cost=133287.00..133493.70 rows=230 width=388) (actual time=10845.686..11060.231 rows=2 loops=1)
Merge Cond: ((po.user_id)::text = (pets.owner_id)::text)
Buffers: shared hit=4705754 read=7205, temp read=1858 written=1863
-> Merge Left Join (cost=133285.52..133491.12 rows=230 width=380) (actual time=10845.563..11060.100 rows=2 loops=1)
Merge Cond: ((po.user_id)::text = (service_counts.owner_id)::text)
Buffers: shared hit=4705753 read=7205, temp read=1858 written=1863
-> Merge Left Join (cost=45355.37..45548.05 rows=230 width=106) (actual time=2279.274..2493.782 rows=2 loops=1)
Merge Cond: ((po.user_id)::text = (b.owner_id)::text)
Filter: (COALESCE((count(DISTINCT b.booking_id)), '0'::bigint) > 0)
Rows Removed by Filter: 5
Buffers: shared hit=689775, temp read=1858 written=1863
-> Sort (cost=49.44..51.16 rows=690 width=90) (actual time=0.048..0.053 rows=7 loops=1)
Sort Key: po.user_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=1
-> Seq Scan on pet_owners po (cost=0.00..16.90 rows=690 width=90) (actual time=0.036..0.038 rows=7 loops=1)
Buffers: shared hit=1
-> Materialize (cost=45305.94..45495.13 rows=2 width=53) (actual time=2279.186..2493.682 rows=2 loops=1)
Buffers: shared hit=689774, temp read=1858 written=1863
-> GroupAggregate (cost=45305.94..45495.11 rows=2 width=53) (actual time=2279.181..2493.674 rows=2 loops=1)
Group Key: b.owner_id
Buffers: shared hit=689774, temp read=1858 written=1863
-> Sort (cost=45305.94..45353.23 rows=18915 width=78) (actual time=2279.104..2454.786 rows=166665 loops=1)
Sort Key: b.owner_id, b.booking_id
Sort Method: external merge Disk: 14864kB
Buffers: shared hit=689774, temp read=1858 written=1863
-> Nested Loop (cost=1520.72..43962.29 rows=18915 width=78) (actual time=43.262..1631.393 rows=166665 loops=1)
Buffers: shared hit=689774
-> Nested Loop (cost=1520.29..25876.96 rows=18915 width=74) (actual time=43.200..340.897 rows=166665 loops=1)
Buffers: shared hit=23114
-> CTE Scan on params p (cost=0.00..0.02 rows=1 width=8) (actual time=0.020..0.022 rows=1 loops=1)
-> Bitmap Heap Scan on bookings b (cost=1520.29..25687.79 rows=18915 width=78) (actual time=43.171..311.549 rows=166665 loops=1)
Recheck Cond: ((date_from >= p.start_date) AND (date_from < p.end_date))
Filter: ((status)::text = 'Completed'::text)
Rows Removed by Filter: 833352
Heap Blocks: exact=22223
Buffers: shared hit=23114
-> Bitmap Index Scan on idx_bookings_date_owner (cost=0.00..1515.57 rows=111114 width=0) (actual time=38.503..38.504 rows=1000017 loops=1)
Index Cond: ((date_from >= p.start_date) AND (date_from < p.end_date))
Buffers: shared hit=891
-> Index Scan using idx_payments_booking_id on payments pay (cost=0.42..0.96 rows=1 width=41) (actual time=0.007..0.007 rows=1 loops=166665)
Index Cond: ((booking_id)::text = (b.booking_id)::text)
Buffers: shared hit=666660
-> Materialize (cost=87930.14..87942.48 rows=2 width=311) (actual time=8566.266..8566.290 rows=2 loops=1)
Buffers: shared hit=4015978 read=7205
-> Subquery Scan on service_counts (cost=87930.14..87942.47 rows=2 width=311) (actual time=8566.261..8566.283 rows=2 loops=1)
Filter: (service_counts.rank_num = 1)
Buffers: shared hit=4015978 read=7205
-> WindowAgg (cost=87930.14..87937.72 rows=380 width=327) (actual time=8566.254..8566.274 rows=2 loops=1)
Run Condition: (row_number() OVER (?) <= 1)
Buffers: shared hit=4015978 read=7205
-> Sort (cost=87930.12..87931.07 rows=380 width=319) (actual time=8566.230..8566.236 rows=5 loops=1)
Sort Key: b_1.owner_id, (count(bs.service_id)) DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=4015978 read=7205
-> HashAggregate (cost=87910.04..87913.84 rows=380 width=319) (actual time=8566.200..8566.209 rows=5 loops=1)
Group Key: b_1.owner_id, s.type
Batches: 1 Memory Usage: 37kB
Buffers: shared hit=4015978 read=7205
-> Hash Join (cost=1558.04..87076.68 rows=111114 width=348) (actual time=41.196..8143.541 rows=1000017 loops=1)
Hash Cond: ((bs.service_id)::text = (s.service_id)::text)
Buffers: shared hit=4015978 read=7205
-> Nested Loop (cost=1543.77..86764.23 rows=111114 width=74) (actual time=41.122..7819.032 rows=1000017 loops=1)
Buffers: shared hit=4015977 read=7205
-> Nested Loop (cost=1543.34..26544.21 rows=111114 width=74) (actual time=41.009..461.885 rows=1000017 loops=1)
Buffers: shared hit=23114
-> CTE Scan on params p_1 (cost=0.00..0.02 rows=1 width=8) (actual time=0.001..0.004 rows=1 loops=1)
-> Bitmap Heap Scan on bookings b_1 (cost=1543.34..25433.05 rows=111114 width=78) (actual time=40.988..259.041 rows=1000017 loops=1)
Recheck Cond: ((date_from >= p_1.start_date) AND (date_from < p_1.end_date))
Heap Blocks: exact=22223
Buffers: shared hit=23114
-> Bitmap Index Scan on idx_bookings_date_owner (cost=0.00..1515.57 rows=111114 width=0) (actual time=35.914..35.914 rows=1000017 loops=1)
Index Cond: ((date_from >= p_1.start_date) AND (date_from < p_1.end_date))
Buffers: shared hit=891
-> Index Scan using idx_booking_services_booking_id on booking_services bs (cost=0.42..0.53 rows=1 width=74) (actual time=0.007..0.007 rows=1 loops=1000017)
Index Cond: ((booking_id)::text = (b_1.booking_id)::text)
Buffers: shared hit=3992863 read=7205
-> Hash (cost=11.90..11.90 rows=190 width=364) (actual time=0.042..0.043 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=1
-> Seq Scan on services s (cost=0.00..11.90 rows=190 width=364) (actual time=0.030..0.031 rows=4 loops=1)
Buffers: shared hit=1
-> GroupAggregate (cost=1.48..1.76 rows=16 width=98) (actual time=0.096..0.105 rows=6 loops=1)
Group Key: pets.owner_id
Buffers: shared hit=1
-> Sort (cost=1.48..1.52 rows=16 width=180) (actual time=0.060..0.062 rows=16 loops=1)
Sort Key: pets.owner_id
Sort Method: quicksort Memory: 26kB
Buffers: shared hit=1
-> Seq Scan on pets (cost=0.00..1.16 rows=16 width=180) (actual time=0.029..0.034 rows=16 loops=1)
Buffers: shared hit=1
-> Hash (cost=15.80..15.80 rows=580 width=51) (actual time=0.046..0.047 rows=12 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=1
-> Seq Scan on users u (cost=0.00..15.80 rows=580 width=51) (actual time=0.030..0.033 rows=12 loops=1)
Buffers: shared hit=1
Planning:
Buffers: shared hit=706 read=6 dirtied=2
Planning Time: 5.877 ms
JIT:
Functions: 84
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 5.747 ms (Deform 2.377 ms), Inlining 0.000 ms, Optimization 2.047 ms, Emission 53.180 ms, Total 60.973 ms
Execution Time: 11158.152 ms
Execution time: 11158.152 ms
Conclusion: The execution time increased from 5.3 seconds to 11.1 seconds. Creating foreign key indexes actually decreased the query performance.
Because Sequential Scans and Hash Joins are better for processing millions of rows in large analytical reporting workloads, we drop these trap indexes to return to the previous performance state:
DROP INDEX project.idx_booking_services_booking_id; DROP INDEX project.idx_payments_booking_id; DROP INDEX project.idx_bookings_date_sitter; DROP INDEX project.idx_bookings_date_owner;
Security
1. Password Security (BCrypt)
User passwords are encrypted before being stored in the database using BCryptPasswordEncoder. This ensures that even in the event of a database breach, plain-text passwords remain unreadable and are protected against brute-force and dictionary attacks. Authentication is securely handled by verifying the raw input against the stored BCrypt hash within the UserService.
Implementation in UserService.java:
private final BCryptPasswordEncoder passwordEncoder = new BCryptPasswordEncoder();
@Transactional(readOnly = true)
public User authenticate(String username, String password) {
User user = userRepository.findByUsername(username).orElse(null);
if (user != null && passwordEncoder.matches(password, user.getPassword())) {
return user;
}
return null;
}
@Transactional
public User registerUser(String username, String password, String firstName, String lastName, String email, String role) {
// ...
newUser.setUsername(username);
newUser.setPassword(passwordEncoder.encode(password));
newUser.setFirstName(firstName);
// ...
}
2. SQL Injection Prevention
The application utilizes Spring Data JPA for all database interactions. By relying on JPA's built in repository methods, all user inputs are automatically treated as parameters rather than executable SQL strings. This protects the application from SQL Injection attacks.
Example of safe repository usage in UserService.java:
@Transactional(readOnly = true)
public User authenticate(String username, String password) {
User user = userRepository.findByUsername(username).orElse(null);
// ...
}
3. Role based access control (RBAC)
The application implements strict endpoint protection based on user roles (Admin, PetOwner, PetSitter). Controller routes verify the identity and specific subtype of the authenticated user via the HttpSession. If an unauthorized role attempts to access restricted areas, they are immediately redirected, preventing privilege escalation.
Implementation example in AdminController.java:
@GetMapping("/admin/users")
public String showAllUsers(HttpSession session, Model model) {
User user = (User) session.getAttribute("loggedInUser");
if (user == null || user instanceof PetOwner || user instanceof PetSitter) {
return "redirect:/dashboard";
}
List<User> users = userRepository.findAll();
model.addAttribute("users", users);
// ...
return "admin-users";
}
Not applicable
During the security analysis phase, two common web security measures were evaluated but found to be inapplicable for our specific architecture:
JWTs
JSON Web Tokens are designed for Stateless REST APIs (for example, when using a completely separate React/Vue frontend).
Because our application uses a Server Side Rendering - SSR architecture returning HTML views directly from controllers, we manage state using the built-in HttpSession. Session based authentication is the standard for this specific architecture and provides excellent security. Implementing JWTs would require an unnecessary and highly complex rewrite of our entire authentication system, moving away from SSR best practices.
CORS
Because our frontend and backend are served together from the exact same Spring Boot instance, they share the exact same origin. Therefore, cross origin requests do not occur, making CORS configuration unnecessary for our application.
