| Version 7 (modified by , 12 days ago) ( diff ) |
|---|
Performance
1. Adding mock data for bookings
To properly benchmark our database, we first generate a realistic dataset of 1000000 bookings and related entities which.
-- 1. Insert 1,000,000 mock bookings
ALTER TABLE project.bookings DISABLE TRIGGER trg_booking_validation;
INSERT INTO project.bookings (status, date_from, date_to, address, owner_id, sitter_id)
SELECT
(ARRAY['Pending', 'Confirmed', 'Completed', 'Canceled', 'Expired', 'Reviewed'])[floor(random()*6)+1] AS status,
t.dt AS date_from,
t.dt + (random() * INTERVAL '7 days') AS date_to,
'Mock Address ' || gs AS address,
o.user_id AS owner_id,
s.user_id AS sitter_id
FROM generate_series(1, 1000000) gs
JOIN LATERAL (
SELECT user_id FROM project.pet_owners ORDER BY random() LIMIT 1
) o ON true
JOIN LATERAL (
SELECT user_id FROM project.pet_sitters ORDER BY random() LIMIT 1
) s ON true
JOIN LATERAL (
SELECT CURRENT_DATE - (random() * INTERVAL '365 days') AS dt
) t ON true;
ALTER TABLE project.bookings ENABLE TRIGGER trg_booking_validation;
-- 2. Insert relevant mock payments
INSERT INTO project.payments (booking_id, amount, payment_type)
SELECT
b.booking_id,
floor(random() * 100 + 20)::int AS amount,
(ARRAY['Card', 'Cash', 'Bank Transfer'])[floor(random()*3)+1] AS payment_type
FROM project.bookings b
WHERE NOT EXISTS (
SELECT 1 FROM project.payments p WHERE p.booking_id = b.booking_id
);
-- 3. Insert mock reviews for completed bookings
INSERT INTO project.reviews (booking_id, rating, comment)
SELECT
b.booking_id,
floor(random()*5)+1 AS rating,
'Mock Review ' || b.booking_id AS comment
FROM project.bookings b
WHERE b.status = 'Completed'
AND random() > 0.3
AND NOT EXISTS (
SELECT 1 FROM project.reviews r WHERE r.booking_id = b.booking_id
);
-- 4. Insert mock pets
INSERT INTO project.pets (owner_id, name, age, pettype_id)
SELECT
po.user_id,
'Mock Pet ' || po.user_id || '-' || gs AS name,
floor(random()*15)+1 AS age,
pt.pettype_id
FROM project.pet_owners po
CROSS JOIN LATERAL generate_series(1, floor(random()*3 + 1)::int) gs
JOIN LATERAL (
SELECT pettype_id FROM project.pet_types ORDER BY random() LIMIT 1
) pt ON true;
-- 5. Link random services to bookings
INSERT INTO project.booking_services (booking_id, service_id)
SELECT
b.booking_id,
s.service_id
FROM project.bookings b
JOIN LATERAL (
SELECT service_id FROM project.services ORDER BY random() LIMIT 1
) s ON true
WHERE NOT EXISTS (
SELECT 1 FROM project.booking_services bs WHERE bs.booking_id = b.booking_id
);
Benchmark query:
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM project.bookings WHERE sitter_id = (SELECT user_id FROM project.pet_sitters LIMIT 1) ORDER BY date_from DESC;
Execution without indexes:
Gather Merge (cost=81030.64..195549.38 rows=967764 width=146) (actual time=94.209..106.394 rows=2 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=22336
InitPlan 1
-> Limit (cost=0.00..0.02 rows=1 width=90) (actual time=3.367..3.369 rows=1 loops=1)
Buffers: shared hit=1
-> Seq Scan on pet_sitters (cost=0.00..16.90 rows=690 width=90) (actual time=0.015..0.016 rows=1 loops=1)
Buffers: shared hit=1
-> Sort (cost=80030.57..80837.04 rows=322588 width=146) (actual time=60.212..60.213 rows=0 loops=4)
Sort Key: bookings.date_from DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=22335
Worker 0: Sort Method: quicksort Memory: 25kB
Worker 1: Sort Method: quicksort Memory: 25kB
Worker 2: Sort Method: quicksort Memory: 25kB
-> Parallel Seq Scan on bookings (cost=0.00..26256.35 rows=322588 width=146) (actual time=37.611..60.082 rows=0 loops=4)
Filter: ((sitter_id)::text = ((InitPlan 1).col1)::text)
Rows Removed by Filter: 250005
Buffers: shared hit=22224
Planning Time: 0.168 ms
JIT:
Functions: 12
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 1.466 ms (Deform 0.675 ms), Inlining 0.000 ms, Optimization 1.430 ms, Emission 19.219 ms, Total 22.115 ms
Execution Time: 106.834 ms
Average execution time (10 attempts): 105.235ms
Next, we add this index:
CREATE INDEX idx_bookings_sitter_date ON project.bookings (sitter_id, date_from DESC);
(Note, for the exact same performance reasons, we also created this index to optimize the queries when Pet Owners view their own bookings)
CREATE INDEX idx_bookings_owner_date ON project.bookings (owner_id, date_from DESC);
Execution with indexes:
Index Scan using idx_bookings_sitter_date on bookings (cost=0.45..72503.52 rows=1000022 width=146) (actual time=0.066..0.068 rows=2 loops=1)
Index Cond: ((sitter_id)::text = ((InitPlan 1).col1)::text)
Buffers: shared hit=2 read=3
InitPlan 1
-> Limit (cost=0.00..0.02 rows=1 width=90) (actual time=0.012..0.013 rows=1 loops=1)
Buffers: shared hit=1
-> Seq Scan on pet_sitters (cost=0.00..16.90 rows=690 width=90) (actual time=0.011..0.011 rows=1 loops=1)
Buffers: shared hit=1
Planning:
Buffers: shared hit=39 read=2
Planning Time: 0.618 ms
Execution Time: 0.092 ms
Average execution time (10 attempts): 0.087ms
Because the execution time has been massively lowered by bypassing the expensive sequential scan and memory sort, we keep this index.
2. Calculate average sitter rating
Benchmark query:
EXPLAIN (ANALYZE, BUFFERS) SELECT AVG(r.rating) FROM project.reviews r JOIN project.bookings b ON r.booking_id = b.booking_id WHERE b.sitter_id = (SELECT user_id FROM project.pet_sitters LIMIT 1);
Execution without indexes:
Finalize Aggregate (cost=33210.02..33210.03 rows=1 width=32) (actual time=121.837..135.038 rows=1 loops=1)
Buffers: shared hit=25818
InitPlan 1
-> Limit (cost=0.00..0.02 rows=1 width=90) (actual time=0.014..0.015 rows=1 loops=1)
Buffers: shared hit=1
-> Seq Scan on pet_sitters (cost=0.00..16.90 rows=690 width=90) (actual time=0.012..0.013 rows=1 loops=1)
Buffers: shared hit=1
-> Gather (cost=33209.67..33209.98 rows=3 width=32) (actual time=119.802..135.016 rows=4 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=25818
-> Partial Aggregate (cost=32209.67..32209.68 rows=1 width=32) (actual time=90.870..90.873 rows=1 loops=4)
Buffers: shared hit=25817
-> Parallel Hash Join (cost=4971.92..32075.06 rows=53846 width=4) (actual time=77.089..90.867 rows=0 loops=4)
Hash Cond: ((b.booking_id)::text = (r.booking_id)::text)
Buffers: shared hit=25817
-> Parallel Seq Scan on bookings b (cost=0.00..26256.35 rows=322588 width=37) (actual time=41.349..55.119 rows=0 loops=4)
Filter: ((sitter_id)::text = ((InitPlan 1).col1)::text)
Rows Removed by Filter: 250005
Buffers: shared hit=22224
-> Parallel Hash (cost=4102.52..4102.52 rows=69552 width=41) (actual time=34.636..34.637 rows=41731 loops=4)
Buckets: 262144 Batches: 1 Memory Usage: 15200kB
Buffers: shared hit=3407
-> Parallel Seq Scan on reviews r (cost=0.00..4102.52 rows=69552 width=41) (actual time=0.026..11.172 rows=41731 loops=4)
Buffers: shared hit=3407
Planning:
Buffers: shared hit=16
Planning Time: 0.444 ms
Execution Time: 135.097 ms
Average execution time (10 attempts): 131.722ms
We add this index:
CREATE INDEX idx_reviews_booking_id ON project.reviews (booking_id);
Execution with indexes:
Finalize Aggregate (cost=33210.02..33210.03 rows=1 width=32) (actual time=125.303..138.277 rows=1 loops=1)
Buffers: shared hit=25818
InitPlan 1
-> Limit (cost=0.00..0.02 rows=1 width=90) (actual time=0.013..0.014 rows=1 loops=1)
Buffers: shared hit=1
-> Seq Scan on pet_sitters (cost=0.00..16.90 rows=690 width=90) (actual time=0.012..0.012 rows=1 loops=1)
Buffers: shared hit=1
-> Gather (cost=33209.67..33209.98 rows=3 width=32) (actual time=123.351..138.255 rows=4 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=25818
-> Partial Aggregate (cost=32209.67..32209.68 rows=1 width=32) (actual time=93.069..93.072 rows=1 loops=4)
Buffers: shared hit=25817
-> Parallel Hash Join (cost=4971.92..32075.06 rows=53846 width=4) (actual time=78.783..93.066 rows=0 loops=4)
Hash Cond: ((b.booking_id)::text = (r.booking_id)::text)
Buffers: shared hit=25817
-> Parallel Seq Scan on bookings b (cost=0.00..26256.35 rows=322588 width=37) (actual time=42.860..57.138 rows=0 loops=4)
Filter: ((sitter_id)::text = ((InitPlan 1).col1)::text)
Rows Removed by Filter: 250005
Buffers: shared hit=22224
-> Parallel Hash (cost=4102.52..4102.52 rows=69552 width=41) (actual time=34.827..34.828 rows=41731 loops=4)
Buckets: 262144 Batches: 1 Memory Usage: 15168kB
Buffers: shared hit=3407
-> Parallel Seq Scan on reviews r (cost=0.00..4102.52 rows=69552 width=41) (actual time=0.029..11.437 rows=41731 loops=4)
Buffers: shared hit=3407
Planning:
Buffers: shared hit=16
Planning Time: 0.554 ms
Execution Time: 138.346 ms
Average execution time (10 attempts): 138.346ms
There is practically no change in the execution time, and the query planner completely ignored the index because it determined that performing a Parallel Seq Scan and Parallel Hash Join was cheaper than doing individual index lookups on such a large dataset. Because the index is not utilized for this query and only adds complexity during INSERT and UPDATE operations, we will not keep this index.
3. Sitter Performance Analytics (Phase 6 Query)
Benchmark query:
EXPLAIN (ANALYZE, BUFFERS)
WITH params AS (
SELECT
(CURRENT_DATE - INTERVAL '1 year')::DATE AS start_date,
CURRENT_DATE::DATE AS end_date
),
sitter_stats AS (
SELECT
b.sitter_id,
COUNT(b.booking_id) AS total_bookings,
COUNT(b.booking_id) FILTER (WHERE b.status = 'Completed') AS completed_bookings,
COUNT(b.booking_id) FILTER (WHERE b.status IN ('Canceled', 'Rejected')) AS missed_bookings
FROM project.bookings b
JOIN params p ON b.date_from >= p.start_date AND b.date_from < p.end_date
GROUP BY b.sitter_id
),
sitter_reports AS (
SELECT
b.sitter_id,
SUM(pay.amount) AS total_revenue
FROM project.bookings b
JOIN project.payments pay ON b.booking_id = pay.booking_id
JOIN params p ON b.date_from >= p.start_date AND b.date_from < p.end_date
WHERE b.status = 'Completed'
GROUP BY b.sitter_id
),
sitter_ratings AS (
SELECT
b.sitter_id,
AVG(r.rating)::numeric(10,2) AS avg_rating,
COUNT(r.review_id) AS total_reviews
FROM project.bookings b
JOIN project.reviews r ON b.booking_id = r.booking_id
JOIN params p ON b.date_from >= p.start_date AND b.date_from < p.end_date
GROUP BY b.sitter_id
)
SELECT
u.user_id, u.username, u.first_name, u.last_name,
COALESCE(ss.total_bookings, 0) AS total_bookings,
COALESCE(ss.completed_bookings, 0) AS completed_bookings,
COALESCE(ss.missed_bookings, 0) AS missed_bookings,
COALESCE(sr.avg_rating, 0) AS avg_rating,
COALESCE(sr.total_reviews, 0) AS total_reviews,
COALESCE(sf.total_revenue, 0) AS total_revenue,
DENSE_RANK() OVER (
ORDER BY (
COALESCE(sf.total_revenue, 0) * 0.5
+ COALESCE(ss.completed_bookings, 0) * 10
+ COALESCE(sr.avg_rating, 0) * 15
- COALESCE(ss.missed_bookings, 0) * 5
) DESC
) AS sitter_rank
FROM project.users u
JOIN project.pet_sitters ps ON u.user_id = ps.user_id
LEFT JOIN sitter_stats ss ON ss.sitter_id = ps.user_id
LEFT JOIN sitter_reports sf ON sf.sitter_id = ps.user_id
LEFT JOIN sitter_ratings sr ON sr.sitter_id = ps.user_id
WHERE COALESCE(ss.total_bookings, 0) > 0
ORDER BY sitter_rank
LIMIT 10;
Execution without indexes:
Limit (cost=151645.74..151645.77 rows=10 width=176) (actual time=2620.994..2631.056 rows=4 loops=1)
Buffers: shared hit=740349
CTE params
-> Result (cost=0.00..0.02 rows=1 width=8) (actual time=0.024..0.025 rows=1 loops=1)
-> Sort (cost=151645.72..151646.30 rows=230 width=176) (actual time=2562.262..2572.322 rows=4 loops=1)
Sort Key: (dense_rank() OVER (?))
Sort Method: quicksort Memory: 26kB
Buffers: shared hit=740349
-> WindowAgg (cost=151631.00..151640.75 rows=230 width=176) (actual time=2562.200..2572.274 rows=4 loops=1)
Buffers: shared hit=740346
-> Sort (cost=151630.98..151631.55 rows=230 width=152) (actual time=2562.183..2572.243 rows=4 loops=1)
Sort Key: ((((((COALESCE(sf.total_revenue, '0'::bigint))::numeric * 0.5) + ((COALESCE(ss.completed_bookings, '0'::bigint) * 10))::numeric) + (COALESCE(sr.avg_rating, '0'::numeric) * '15'::numeric)) - ((COALESCE(ss.missed_bookings, '0'::bigint) * 5))::numeric)) DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=740346
-> Hash Join (cost=151596.30..151621.96 rows=230 width=152) (actual time=2562.155..2572.225 rows=4 loops=1)
Hash Cond: ((u.user_id)::text = (ps.user_id)::text)
Buffers: shared hit=740343
-> Seq Scan on users u (cost=0.00..15.80 rows=580 width=64) (actual time=0.015..0.018 rows=12 loops=1)
Buffers: shared hit=1
-> Hash (cost=151593.43..151593.43 rows=230 width=146) (actual time=2562.102..2572.160 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=740342
-> Merge Left Join (cost=151588.77..151593.43 rows=230 width=146) (actual time=2562.070..2572.145 rows=4 loops=1)
Merge Cond: ((ps.user_id)::text = (sr.sitter_id)::text)
Buffers: shared hit=740342
-> Merge Left Join (cost=99928.44..99932.50 rows=230 width=122) (actual time=2328.059..2328.084 rows=4 loops=1)
Merge Cond: ((ps.user_id)::text = (sf.sitter_id)::text)
Buffers: shared hit=714518
-> Merge Left Join (cost=43617.47..43620.94 rows=230 width=114) (actual time=779.217..779.233 rows=4 loops=1)
Merge Cond: ((ps.user_id)::text = (ss.sitter_id)::text)
Filter: (COALESCE(ss.total_bookings, '0'::bigint) > 0)
Buffers: shared hit=23035
-> Sort (cost=49.44..51.16 rows=690 width=90) (actual time=0.024..0.026 rows=4 loops=1)
Sort Key: ps.user_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=1
-> Seq Scan on pet_sitters ps (cost=0.00..16.90 rows=690 width=90) (actual time=0.014..0.015 rows=4 loops=1)
Buffers: shared hit=1
-> Sort (cost=43568.03..43568.04 rows=1 width=61) (actual time=779.154..779.159 rows=4 loops=1)
Sort Key: ss.sitter_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=23034
-> Subquery Scan on ss (cost=43568.00..43568.02 rows=1 width=61) (actual time=779.137..779.144 rows=4 loops=1)
Buffers: shared hit=23034
-> HashAggregate (cost=43568.00..43568.01 rows=1 width=61) (actual time=779.130..779.134 rows=4 loops=1)
Group Key: b.sitter_id
Batches: 1 Memory Usage: 24kB
Buffers: shared hit=23034
-> Nested Loop (cost=16900.42..41901.29 rows=111114 width=83) (actual time=44.143..421.383 rows=1000017 loops=1)
Buffers: shared hit=23034
-> CTE Scan on params p (cost=0.00..0.02 rows=1 width=8) (actual time=0.028..0.030 rows=1 loops=1)
-> Bitmap Heap Scan on bookings b (cost=16900.42..40790.13 rows=111114 width=87) (actual time=44.095..233.073 rows=1000017 loops=1)
Recheck Cond: ((date_from >= p.start_date) AND (date_from < p.end_date))
Heap Blocks: exact=22223
Buffers: shared hit=23034
-> Bitmap Index Scan on idx_bookings_owner_date (cost=0.00..16872.65 rows=111114 width=0) (actual time=39.433..39.433 rows=1000017 loops=1)
Index Cond: ((date_from >= p.start_date) AND (date_from < p.end_date))
Buffers: shared hit=811
-> Sort (cost=56310.97..56310.97 rows=1 width=45) (actual time=1548.817..1548.822 rows=2 loops=1)
Sort Key: sf.sitter_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=691483
-> Subquery Scan on sf (cost=56310.94..56310.96 rows=1 width=45) (actual time=1548.794..1548.800 rows=2 loops=1)
Buffers: shared hit=691483
-> HashAggregate (cost=56310.94..56310.95 rows=1 width=45) (actual time=1548.786..1548.791 rows=2 loops=1)
Group Key: b_1.sitter_id
Batches: 1 Memory Usage: 24kB
Buffers: shared hit=691483
-> Nested Loop (cost=0.42..56216.86 rows=18815 width=41) (actual time=0.103..1495.703 rows=167315 loops=1)
Buffers: shared hit=691483
-> Nested Loop (cost=0.00..37263.35 rows=18815 width=74) (actual time=0.043..225.930 rows=167315 loops=1)
Join Filter: ((b_1.date_from >= p_1.start_date) AND (b_1.date_from < p_1.end_date))
Rows Removed by Join Filter: 3
Buffers: shared hit=22223
-> CTE Scan on params p_1 (cost=0.00..0.02 rows=1 width=8) (actual time=0.001..0.003 rows=1 loops=1)
-> Seq Scan on bookings b_1 (cost=0.00..34723.28 rows=169337 width=78) (actual time=0.026..183.123 rows=167318 loops=1)
Filter: ((status)::text = 'Completed'::text)
Rows Removed by Filter: 832704
Buffers: shared hit=22223
-> Index Scan using payments_booking_id_key on payments pay (cost=0.42..1.01 rows=1 width=41) (actual time=0.007..0.007 rows=1 loops=167315)
Index Cond: ((booking_id)::text = (b_1.booking_id)::text)
Buffers: shared hit=669260
-> Sort (cost=51660.34..51660.34 rows=1 width=61) (actual time=233.986..244.032 rows=2 loops=1)
Sort Key: sr.sitter_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=25824
-> Subquery Scan on sr (cost=51660.30..51660.33 rows=1 width=61) (actual time=233.958..244.007 rows=2 loops=1)
Buffers: shared hit=25824
-> HashAggregate (cost=51660.30..51660.32 rows=1 width=61) (actual time=233.948..243.997 rows=2 loops=1)
Group Key: b_2.sitter_id
Batches: 1 Memory Usage: 24kB
Buffers: shared hit=25824
-> Nested Loop (cost=5983.61..51520.87 rows=18591 width=78) (actual time=76.620..198.922 rows=167316 loops=1)
Join Filter: ((b_2.date_from >= p_2.start_date) AND (b_2.date_from < p_2.end_date))
Rows Removed by Join Filter: 3
Buffers: shared hit=25824
-> CTE Scan on params p_2 (cost=0.00..0.02 rows=1 width=8) (actual time=0.001..0.003 rows=1 loops=1)
-> Gather (cost=5983.61..49011.08 rows=167318 width=82) (actual time=76.604..171.175 rows=167319 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=25824
-> Parallel Hash Join (cost=4983.61..31279.28 rows=53974 width=82) (actual time=45.331..167.054 rows=41830 loops=4)
Hash Cond: ((b_2.booking_id)::text = (r.booking_id)::text)
Buffers: shared hit=25824
-> Parallel Seq Scan on bookings b_2 (cost=0.00..25448.88 rows=322588 width=78) (actual time=0.017..27.723 rows=250006 loops=4)
Buffers: shared hit=22223
-> Parallel Hash (cost=4112.16..4112.16 rows=69716 width=78) (actual time=44.396..44.398 rows=41830 loops=4)
Buckets: 262144 Batches: 1 Memory Usage: 20448kB
Buffers: shared hit=3415
-> Parallel Seq Scan on reviews r (cost=0.00..4112.16 rows=69716 width=78) (actual time=9.212..20.518 rows=41830 loops=4)
Buffers: shared hit=3415
Planning:
Buffers: shared hit=611
Planning Time: 4.704 ms
JIT:
Functions: 116
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 9.344 ms (Deform 4.395 ms), Inlining 0.000 ms, Optimization 4.065 ms, Emission 91.907 ms, Total 105.316 ms
Execution Time: 2670.429 ms
Execution time: 2670.429 ms
We attempt to optimize this analytical query by introducing multiple foreign key indexes:
CREATE INDEX idx_bookings_date_sitter ON project.bookings (date_from, sitter_id); CREATE INDEX idx_bookings_date_owner ON project.bookings (date_from, owner_id); CREATE INDEX idx_payments_booking_id ON project.payments (booking_id); CREATE INDEX idx_booking_services_booking_id ON project.booking_services (booking_id); CREATE INDEX idx_pets_owner_id ON project.pets (owner_id);
Execution with indexes:
Limit (cost=113555.49..113555.51 rows=10 width=176) (actual time=3667.482..3667.503 rows=4 loops=1)
Buffers: shared hit=742025, temp read=10108 written=10108
CTE params
-> Result (cost=0.00..0.02 rows=1 width=8) (actual time=0.013..0.014 rows=1 loops=1)
-> Sort (cost=113555.47..113556.04 rows=230 width=176) (actual time=3609.923..3609.942 rows=4 loops=1)
Sort Key: (dense_rank() OVER (?))
Sort Method: quicksort Memory: 26kB
Buffers: shared hit=742025, temp read=10108 written=10108
-> WindowAgg (cost=113540.74..113550.50 rows=230 width=176) (actual time=3609.856..3609.894 rows=4 loops=1)
Buffers: shared hit=742022, temp read=10108 written=10108
-> Sort (cost=113540.72..113541.30 rows=230 width=152) (actual time=3609.838..3609.856 rows=4 loops=1)
Sort Key: ((((((COALESCE(sf.total_revenue, '0'::bigint))::numeric * 0.5) + ((COALESCE(ss.completed_bookings, '0'::bigint) * 10))::numeric) + (COALESCE(sr.avg_rating, '0'::numeric) * '15'::numeric)) - ((COALESCE(ss.missed_bookings, '0'::bigint) * 5))::numeric)) DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=742022, temp read=10108 written=10108
-> Hash Join (cost=113506.05..113531.70 rows=230 width=152) (actual time=3609.810..3609.839 rows=4 loops=1)
Hash Cond: ((u.user_id)::text = (ps.user_id)::text)
Buffers: shared hit=742019, temp read=10108 written=10108
-> Seq Scan on users u (cost=0.00..15.80 rows=580 width=64) (actual time=0.022..0.027 rows=12 loops=1)
Buffers: shared hit=1
-> Hash (cost=113503.17..113503.17 rows=230 width=146) (actual time=3609.749..3609.766 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=742018, temp read=10108 written=10108
-> Merge Left Join (cost=113498.52..113503.17 rows=230 width=146) (actual time=3609.711..3609.745 rows=4 loops=1)
Merge Cond: ((ps.user_id)::text = (sr.sitter_id)::text)
Buffers: shared hit=742018, temp read=10108 written=10108
-> Merge Left Join (cost=72271.51..72275.57 rows=230 width=122) (actual time=2380.858..2380.883 rows=4 loops=1)
Merge Cond: ((ps.user_id)::text = (sf.sitter_id)::text)
Buffers: shared hit=715489
-> Merge Left Join (cost=28260.39..28263.86 rows=230 width=114) (actual time=781.667..781.682 rows=4 loops=1)
Merge Cond: ((ps.user_id)::text = (ss.sitter_id)::text)
Filter: (COALESCE(ss.total_bookings, '0'::bigint) > 0)
Buffers: shared hit=23115
-> Sort (cost=49.44..51.16 rows=690 width=90) (actual time=0.018..0.020 rows=4 loops=1)
Sort Key: ps.user_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=1
-> Seq Scan on pet_sitters ps (cost=0.00..16.90 rows=690 width=90) (actual time=0.007..0.008 rows=4 loops=1)
Buffers: shared hit=1
-> Sort (cost=28210.95..28210.96 rows=1 width=61) (actual time=781.612..781.617 rows=4 loops=1)
Sort Key: ss.sitter_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=23114
-> Subquery Scan on ss (cost=28210.92..28210.94 rows=1 width=61) (actual time=781.595..781.601 rows=4 loops=1)
Buffers: shared hit=23114
-> HashAggregate (cost=28210.92..28210.93 rows=1 width=61) (actual time=781.588..781.593 rows=4 loops=1)
Group Key: b.sitter_id
Batches: 1 Memory Usage: 24kB
Buffers: shared hit=23114
-> Nested Loop (cost=1543.34..26544.21 rows=111114 width=83) (actual time=42.444..423.581 rows=1000017 loops=1)
Buffers: shared hit=23114
-> CTE Scan on params p (cost=0.00..0.02 rows=1 width=8) (actual time=0.016..0.018 rows=1 loops=1)
-> Bitmap Heap Scan on bookings b (cost=1543.34..25433.05 rows=111114 width=87) (actual time=42.408..230.648 rows=1000017 loops=1)
Recheck Cond: ((date_from >= p.start_date) AND (date_from < p.end_date))
Heap Blocks: exact=22223
Buffers: shared hit=23114
-> Bitmap Index Scan on idx_bookings_date_owner (cost=0.00..1515.57 rows=111114 width=0) (actual time=37.787..37.787 rows=1000017 loops=1)
Index Cond: ((date_from >= p.start_date) AND (date_from < p.end_date))
Buffers: shared hit=891
-> Sort (cost=44011.12..44011.12 rows=1 width=45) (actual time=1599.167..1599.172 rows=2 loops=1)
Sort Key: sf.sitter_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=692374
-> Subquery Scan on sf (cost=44011.09..44011.11 rows=1 width=45) (actual time=1599.144..1599.151 rows=2 loops=1)
Buffers: shared hit=692374
-> HashAggregate (cost=44011.09..44011.10 rows=1 width=45) (actual time=1599.136..1599.141 rows=2 loops=1)
Group Key: b_1.sitter_id
Batches: 1 Memory Usage: 24kB
Buffers: shared hit=692374
-> Nested Loop (cost=1520.69..43917.01 rows=18815 width=41) (actual time=40.913..1546.819 rows=167315 loops=1)
Buffers: shared hit=692374
-> Nested Loop (cost=1520.27..25875.93 rows=18815 width=74) (actual time=40.876..285.030 rows=167315 loops=1)
Buffers: shared hit=23114
-> CTE Scan on params p_1 (cost=0.00..0.02 rows=1 width=8) (actual time=0.001..0.003 rows=1 loops=1)
-> Bitmap Heap Scan on bookings b_1 (cost=1520.27..25687.76 rows=18815 width=78) (actual time=40.864..253.061 rows=167315 loops=1)
Recheck Cond: ((date_from >= p_1.start_date) AND (date_from < p_1.end_date))
Filter: ((status)::text = 'Completed'::text)
Rows Removed by Filter: 832702
Heap Blocks: exact=22223
Buffers: shared hit=23114
-> Bitmap Index Scan on idx_bookings_date_owner (cost=0.00..1515.57 rows=111114 width=0) (actual time=36.257..36.257 rows=1000017 loops=1)
Index Cond: ((date_from >= p_1.start_date) AND (date_from < p_1.end_date))
Buffers: shared hit=891
-> Index Scan using idx_payments_booking_id on payments pay (cost=0.42..0.96 rows=1 width=41) (actual time=0.007..0.007 rows=1 loops=167315)
Index Cond: ((booking_id)::text = (b_1.booking_id)::text)
Buffers: shared hit=669260
-> Sort (cost=41227.01..41227.02 rows=1 width=61) (actual time=1228.829..1228.835 rows=2 loops=1)
Sort Key: sr.sitter_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=26529, temp read=10108 written=10108
-> Subquery Scan on sr (cost=41226.98..41227.00 rows=1 width=61) (actual time=1228.801..1228.810 rows=2 loops=1)
Buffers: shared hit=26529, temp read=10108 written=10108
-> HashAggregate (cost=41226.98..41226.99 rows=1 width=61) (actual time=1228.793..1228.801 rows=2 loops=1)
Group Key: b_2.sitter_id
Batches: 1 Memory Usage: 24kB
Buffers: shared hit=26529, temp read=10108 written=10108
-> Hash Join (cost=10848.00..41087.54 rows=18591 width=78) (actual time=157.648..1174.825 rows=167316 loops=1)
Hash Cond: ((b_2.booking_id)::text = (r.booking_id)::text)
Buffers: shared hit=26529, temp read=10108 written=10108
-> Nested Loop (cost=1543.34..26544.21 rows=111114 width=74) (actual time=40.133..417.946 rows=1000017 loops=1)
Buffers: shared hit=23114
-> CTE Scan on params p_2 (cost=0.00..0.02 rows=1 width=8) (actual time=0.001..0.002 rows=1 loops=1)
-> Bitmap Heap Scan on bookings b_2 (cost=1543.34..25433.05 rows=111114 width=78) (actual time=40.113..221.102 rows=1000017 loops=1)
Recheck Cond: ((date_from >= p_2.start_date) AND (date_from < p_2.end_date))
Heap Blocks: exact=22223
Buffers: shared hit=23114
-> Bitmap Index Scan on idx_bookings_date_owner (cost=0.00..1515.57 rows=111114 width=0) (actual time=35.509..35.509 rows=1000017 loops=1)
Index Cond: ((date_from >= p_2.start_date) AND (date_from < p_2.end_date))
Buffers: shared hit=891
-> Hash (cost=5088.18..5088.18 rows=167318 width=78) (actual time=117.367..117.368 rows=167319 loops=1)
Buckets: 131072 Batches: 4 Memory Usage: 5507kB
Buffers: shared hit=3415, temp written=1501
-> Seq Scan on reviews r (cost=0.00..5088.18 rows=167318 width=78) (actual time=0.026..42.889 rows=167319 loops=1)
Buffers: shared hit=3415
Planning:
Buffers: shared hit=657
Planning Time: 4.960 ms
JIT:
Functions: 81
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 6.320 ms (Deform 2.740 ms), Inlining 0.000 ms, Optimization 2.235 ms, Emission 55.696 ms, Total 64.251 ms
Execution Time: 3709.157 ms
Execution time: 3709.157 ms
Instead of improving performance, adding these indexes actually increased the execution time by roughly 1 second, therefore we are not keeping the indexes.
4. Highest Paying Customers Analytics (Phase 6 Query)
Benchmark query:
EXPLAIN (ANALYZE, BUFFERS)
WITH params AS (
SELECT
(CURRENT_DATE - INTERVAL '1 year')::DATE AS start_date,
CURRENT_DATE::DATE AS end_date
),
owner_reports AS (
SELECT
b.owner_id,
SUM(pay.amount) AS total_profit_generated,
COUNT(DISTINCT b.booking_id) AS successful_bookings
FROM project.bookings b
JOIN project.payments pay ON b.booking_id = pay.booking_id
JOIN params p ON b.date_from >= p.start_date AND b.date_from < p.end_date
WHERE b.status = 'Completed'
GROUP BY b.owner_id
),
service_counts AS (
SELECT
b.owner_id,
s.type AS service_type,
COUNT(bs.service_id) AS times_booked,
ROW_NUMBER() OVER(PARTITION BY b.owner_id ORDER BY COUNT(bs.service_id) DESC) as rank_num
FROM project.bookings b
JOIN project.booking_services bs ON b.booking_id = bs.booking_id
JOIN project.services s ON bs.service_id = s.service_id
JOIN params p ON b.date_from >= p.start_date AND b.date_from < p.end_date
GROUP BY b.owner_id, s.type
),
favorite_service AS (
SELECT owner_id, service_type AS top_interest
FROM service_counts
WHERE rank_num = 1
),
pet_portfolio AS (
SELECT owner_id, COUNT(pet_id) AS registered_pets
FROM project.pets
GROUP BY owner_id
)
SELECT
u.user_id, u.first_name, u.last_name,
COALESCE(ofin.successful_bookings, 0) AS successful_bookings,
COALESCE(pp.registered_pets, 0) AS total_pets,
COALESCE(fs.top_interest, 'Unknown') AS top_interest,
COALESCE(ofin.total_profit_generated, 0) AS total_profit_generated,
DENSE_RANK() OVER (
ORDER BY
COALESCE(ofin.total_profit_generated, 0) DESC,
COALESCE(ofin.successful_bookings, 0) DESC
) AS customer_rank
FROM project.users u
JOIN project.pet_owners po ON u.user_id = po.user_id
LEFT JOIN owner_reports ofin ON po.user_id = ofin.owner_id
LEFT JOIN favorite_service fs ON po.user_id = fs.owner_id
LEFT JOIN pet_portfolio pp ON po.user_id = pp.owner_id
WHERE COALESCE(ofin.successful_bookings, 0) > 0
ORDER BY customer_rank
LIMIT 10;
Execution without indexes:
Limit (cost=161052.28..161052.30 rows=10 width=115) (actual time=5023.716..5023.735 rows=2 loops=1)
Buffers: shared hit=730190, temp read=26867 written=26872
CTE params
-> Result (cost=0.00..0.02 rows=1 width=8) (actual time=0.013..0.013 rows=1 loops=1)
-> Sort (cost=161052.26..161052.83 rows=230 width=115) (actual time=4971.843..4971.859 rows=2 loops=1)
Sort Key: (dense_rank() OVER (?))
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=730190, temp read=26867 written=26872
-> WindowAgg (cost=161042.71..161047.29 rows=230 width=115) (actual time=4971.808..4971.835 rows=2 loops=1)
Buffers: shared hit=730190, temp read=26867 written=26872
-> Sort (cost=161042.69..161043.26 rows=230 width=349) (actual time=4971.773..4971.789 rows=2 loops=1)
Sort Key: (COALESCE((sum(pay.amount)), '0'::bigint)) DESC, (COALESCE((count(DISTINCT b.booking_id)), '0'::bigint)) DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=730190, temp read=26867 written=26872
-> Hash Join (cost=160830.10..161033.66 rows=230 width=349) (actual time=4755.875..4971.782 rows=2 loops=1)
Hash Cond: ((po.user_id)::text = (u.user_id)::text)
Buffers: shared hit=730190, temp read=26867 written=26872
-> Merge Left Join (cost=160807.05..161010.01 rows=230 width=388) (actual time=4755.799..4971.700 rows=2 loops=1)
Merge Cond: ((po.user_id)::text = (pets.owner_id)::text)
Buffers: shared hit=730189, temp read=26867 written=26872
-> Merge Left Join (cost=160791.71..160990.09 rows=230 width=380) (actual time=4755.644..4971.535 rows=2 loops=1)
Merge Cond: ((po.user_id)::text = (service_counts.owner_id)::text)
Buffers: shared hit=730188, temp read=26867 written=26872
-> Merge Left Join (cost=57602.12..57793.76 rows=230 width=106) (actual time=2088.524..2304.386 rows=2 loops=1)
Merge Cond: ((po.user_id)::text = (b.owner_id)::text)
Filter: (COALESCE((count(DISTINCT b.booking_id)), '0'::bigint) > 0)
Rows Removed by Filter: 5
Buffers: shared hit=691484, temp read=1865 written=1870
-> Sort (cost=49.44..51.16 rows=690 width=90) (actual time=0.023..0.027 rows=7 loops=1)
Sort Key: po.user_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=1
-> Seq Scan on pet_owners po (cost=0.00..16.90 rows=690 width=90) (actual time=0.009..0.011 rows=7 loops=1)
Buffers: shared hit=1
-> Materialize (cost=57552.69..57740.86 rows=1 width=53) (actual time=2088.462..2304.313 rows=2 loops=1)
Buffers: shared hit=691483, temp read=1865 written=1870
-> GroupAggregate (cost=57552.69..57740.85 rows=1 width=53) (actual time=2088.458..2304.305 rows=2 loops=1)
Group Key: b.owner_id
Buffers: shared hit=691483, temp read=1865 written=1870
-> Sort (cost=57552.69..57599.73 rows=18815 width=78) (actual time=2088.388..2265.596 rows=167315 loops=1)
Sort Key: b.owner_id, b.booking_id
Sort Method: external merge Disk: 14920kB
Buffers: shared hit=691483, temp read=1865 written=1870
-> Nested Loop (cost=0.42..56216.86 rows=18815 width=78) (actual time=0.077..1445.651 rows=167315 loops=1)
Buffers: shared hit=691483
-> Nested Loop (cost=0.00..37263.35 rows=18815 width=74) (actual time=0.043..218.948 rows=167315 loops=1)
Join Filter: ((b.date_from >= p.start_date) AND (b.date_from < p.end_date))
Rows Removed by Join Filter: 3
Buffers: shared hit=22223
-> CTE Scan on params p (cost=0.00..0.02 rows=1 width=8) (actual time=0.015..0.017 rows=1 loops=1)
-> Seq Scan on bookings b (cost=0.00..34723.28 rows=169337 width=78) (actual time=0.015..182.152 rows=167318 loops=1)
Filter: ((status)::text = 'Completed'::text)
Rows Removed by Filter: 832704
Buffers: shared hit=22223
-> Index Scan using payments_booking_id_key on payments pay (cost=0.42..1.01 rows=1 width=41) (actual time=0.007..0.007 rows=1 loops=167315)
Index Cond: ((booking_id)::text = (b.booking_id)::text)
Buffers: shared hit=669260
-> Materialize (cost=103189.58..103195.74 rows=1 width=311) (actual time=2667.097..2667.120 rows=2 loops=1)
Buffers: shared hit=38704, temp read=25002 written=25002
-> Subquery Scan on service_counts (cost=103189.58..103195.74 rows=1 width=311) (actual time=2667.092..2667.114 rows=2 loops=1)
Filter: (service_counts.rank_num = 1)
Buffers: shared hit=38704, temp read=25002 written=25002
-> WindowAgg (cost=103189.58..103193.36 rows=190 width=327) (actual time=2667.085..2667.106 rows=2 loops=1)
Run Condition: (row_number() OVER (?) <= 1)
Buffers: shared hit=38704, temp read=25002 written=25002
-> Sort (cost=103189.56..103190.04 rows=190 width=319) (actual time=2667.062..2667.070 rows=5 loops=1)
Sort Key: b_1.owner_id, (count(bs.service_id)) DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=38704, temp read=25002 written=25002
-> HashAggregate (cost=103180.47..103182.37 rows=190 width=319) (actual time=2667.034..2667.043 rows=5 loops=1)
Group Key: b_1.owner_id, s.type
Batches: 1 Memory Usage: 40kB
Buffers: shared hit=38704, temp read=25002 written=25002
-> Hash Join (cost=44715.49..102347.12 rows=111114 width=348) (actual time=1301.306..2326.037 rows=1000017 loops=1)
Hash Cond: ((bs.service_id)::text = (s.service_id)::text)
Buffers: shared hit=38704, temp read=25002 written=25002
-> Hash Join (cost=44701.22..102034.66 rows=111114 width=74) (actual time=1301.240..2045.345 rows=1000017 loops=1)
Hash Cond: ((bs.booking_id)::text = (b_1.booking_id)::text)
Buffers: shared hit=38703, temp read=25002 written=25002
-> Seq Scan on booking_services bs (cost=0.00..25669.22 rows=1000022 width=74) (actual time=0.017..158.668 rows=1000022 loops=1)
Buffers: shared hit=15669
-> Hash (cost=41901.29..41901.29 rows=111114 width=74) (actual time=844.625..844.628 rows=1000017 loops=1)
Buckets: 131072 (originally 131072) Batches: 16 (originally 2) Memory Usage: 7550kB
Buffers: shared hit=23034, temp written=10744
-> Nested Loop (cost=16900.42..41901.29 rows=111114 width=74) (actual time=41.046..409.694 rows=1000017 loops=1)
Buffers: shared hit=23034
-> CTE Scan on params p_1 (cost=0.00..0.02 rows=1 width=8) (actual time=0.002..0.004 rows=1 loops=1)
-> Bitmap Heap Scan on bookings b_1 (cost=16900.42..40790.13 rows=111114 width=78) (actual time=41.023..223.873 rows=1000017 loops=1)
Recheck Cond: ((date_from >= p_1.start_date) AND (date_from < p_1.end_date))
Heap Blocks: exact=22223
Buffers: shared hit=23034
-> Bitmap Index Scan on idx_bookings_owner_date (cost=0.00..16872.65 rows=111114 width=0) (actual time=36.458..36.459 rows=1000017 loops=1)
Index Cond: ((date_from >= p_1.start_date) AND (date_from < p_1.end_date))
Buffers: shared hit=811
-> Hash (cost=11.90..11.90 rows=190 width=364) (actual time=0.037..0.038 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=1
-> Seq Scan on services s (cost=0.00..11.90 rows=190 width=364) (actual time=0.026..0.027 rows=4 loops=1)
Buffers: shared hit=1
-> GroupAggregate (cost=15.34..17.44 rows=120 width=98) (actual time=0.103..0.115 rows=7 loops=1)
Group Key: pets.owner_id
Buffers: shared hit=1
-> Sort (cost=15.34..15.64 rows=120 width=180) (actual time=0.078..0.081 rows=23 loops=1)
Sort Key: pets.owner_id
Sort Method: quicksort Memory: 27kB
Buffers: shared hit=1
-> Seq Scan on pets (cost=0.00..11.20 rows=120 width=180) (actual time=0.044..0.051 rows=23 loops=1)
Buffers: shared hit=1
-> Hash (cost=15.80..15.80 rows=580 width=51) (actual time=0.044..0.045 rows=12 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=1
-> Seq Scan on users u (cost=0.00..15.80 rows=580 width=51) (actual time=0.027..0.030 rows=12 loops=1)
Buffers: shared hit=1
Planning:
Buffers: shared hit=34
Planning Time: 1.751 ms
JIT:
Functions: 86
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 4.189 ms (Deform 1.818 ms), Inlining 0.000 ms, Optimization 1.556 ms, Emission 50.729 ms, Total 56.474 ms
Execution Time: 5033.000 ms
Execution time: 5033.000 ms
Execution with the indexes we created in Scenario 3:
Limit (cost=133693.17..133693.19 rows=10 width=115) (actual time=11494.356..11494.375 rows=2 loops=1)
Buffers: shared hit=4708355 read=7205, temp read=1865 written=1870
CTE params
-> Result (cost=0.00..0.02 rows=1 width=8) (actual time=0.013..0.014 rows=1 loops=1)
-> Sort (cost=133693.15..133693.72 rows=230 width=115) (actual time=11443.882..11443.898 rows=2 loops=1)
Sort Key: (dense_rank() OVER (?))
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=4708355 read=7205, temp read=1865 written=1870
-> WindowAgg (cost=133683.60..133688.18 rows=230 width=115) (actual time=11443.849..11443.877 rows=2 loops=1)
Buffers: shared hit=4708355 read=7205, temp read=1865 written=1870
-> Sort (cost=133683.58..133684.15 rows=230 width=349) (actual time=11443.814..11443.830 rows=2 loops=1)
Sort Key: (COALESCE((sum(pay.amount)), '0'::bigint)) DESC, (COALESCE((count(DISTINCT b.booking_id)), '0'::bigint)) DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=4708355 read=7205, temp read=1865 written=1870
-> Hash Join (cost=133474.23..133674.56 rows=230 width=349) (actual time=11228.070..11443.823 rows=2 loops=1)
Hash Cond: ((po.user_id)::text = (u.user_id)::text)
Buffers: shared hit=4708355 read=7205, temp read=1865 written=1870
-> Merge Left Join (cost=133451.18..133650.90 rows=230 width=388) (actual time=11227.999..11443.746 rows=2 loops=1)
Merge Cond: ((po.user_id)::text = (pets.owner_id)::text)
Buffers: shared hit=4708354 read=7205, temp read=1865 written=1870
-> Merge Left Join (cost=133449.43..133647.81 rows=230 width=380) (actual time=11227.862..11443.598 rows=2 loops=1)
Merge Cond: ((po.user_id)::text = (service_counts.owner_id)::text)
Buffers: shared hit=4708353 read=7205, temp read=1865 written=1870
-> Merge Left Join (cost=45302.27..45493.91 rows=230 width=106) (actual time=2137.209..2352.916 rows=2 loops=1)
Merge Cond: ((po.user_id)::text = (b.owner_id)::text)
Filter: (COALESCE((count(DISTINCT b.booking_id)), '0'::bigint) > 0)
Rows Removed by Filter: 5
Buffers: shared hit=692375, temp read=1865 written=1870
-> Sort (cost=49.44..51.16 rows=690 width=90) (actual time=0.021..0.025 rows=7 loops=1)
Sort Key: po.user_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=1
-> Seq Scan on pet_owners po (cost=0.00..16.90 rows=690 width=90) (actual time=0.009..0.011 rows=7 loops=1)
Buffers: shared hit=1
-> Materialize (cost=45252.84..45441.01 rows=1 width=53) (actual time=2137.153..2352.849 rows=2 loops=1)
Buffers: shared hit=692374, temp read=1865 written=1870
-> GroupAggregate (cost=45252.84..45441.00 rows=1 width=53) (actual time=2137.148..2352.840 rows=2 loops=1)
Group Key: b.owner_id
Buffers: shared hit=692374, temp read=1865 written=1870
-> Sort (cost=45252.84..45299.88 rows=18815 width=78) (actual time=2137.081..2313.817 rows=167315 loops=1)
Sort Key: b.owner_id, b.booking_id
Sort Method: external merge Disk: 14920kB
Buffers: shared hit=692374, temp read=1865 written=1870
-> Nested Loop (cost=1520.69..43917.01 rows=18815 width=78) (actual time=40.128..1500.355 rows=167315 loops=1)
Buffers: shared hit=692374
-> Nested Loop (cost=1520.27..25875.93 rows=18815 width=74) (actual time=40.077..281.535 rows=167315 loops=1)
Buffers: shared hit=23114
-> CTE Scan on params p (cost=0.00..0.02 rows=1 width=8) (actual time=0.016..0.018 rows=1 loops=1)
-> Bitmap Heap Scan on bookings b (cost=1520.27..25687.76 rows=18815 width=78) (actual time=40.052..253.659 rows=167315 loops=1)
Recheck Cond: ((date_from >= p.start_date) AND (date_from < p.end_date))
Filter: ((status)::text = 'Completed'::text)
Rows Removed by Filter: 832702
Heap Blocks: exact=22223
Buffers: shared hit=23114
-> Bitmap Index Scan on idx_bookings_date_owner (cost=0.00..1515.57 rows=111114 width=0) (actual time=35.462..35.463 rows=1000017 loops=1)
Index Cond: ((date_from >= p.start_date) AND (date_from < p.end_date))
Buffers: shared hit=891
-> Index Scan using idx_payments_booking_id on payments pay (cost=0.42..0.96 rows=1 width=41) (actual time=0.007..0.007 rows=1 loops=167315)
Index Cond: ((booking_id)::text = (b.booking_id)::text)
Buffers: shared hit=669260
-> Materialize (cost=88147.15..88153.31 rows=1 width=311) (actual time=9090.628..9090.652 rows=2 loops=1)
Buffers: shared hit=4015978 read=7205
-> Subquery Scan on service_counts (cost=88147.15..88153.31 rows=1 width=311) (actual time=9090.623..9090.645 rows=2 loops=1)
Filter: (service_counts.rank_num = 1)
Buffers: shared hit=4015978 read=7205
-> WindowAgg (cost=88147.15..88150.93 rows=190 width=327) (actual time=9090.616..9090.636 rows=2 loops=1)
Run Condition: (row_number() OVER (?) <= 1)
Buffers: shared hit=4015978 read=7205
-> Sort (cost=88147.13..88147.61 rows=190 width=319) (actual time=9090.592..9090.598 rows=5 loops=1)
Sort Key: b_1.owner_id, (count(bs.service_id)) DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=4015978 read=7205
-> HashAggregate (cost=88138.04..88139.94 rows=190 width=319) (actual time=9090.565..9090.573 rows=5 loops=1)
Group Key: b_1.owner_id, s.type
Batches: 1 Memory Usage: 40kB
Buffers: shared hit=4015978 read=7205
-> Hash Join (cost=1558.04..87304.69 rows=111114 width=348) (actual time=40.705..8657.362 rows=1000017 loops=1)
Hash Cond: ((bs.service_id)::text = (s.service_id)::text)
Buffers: shared hit=4015978 read=7205
-> Nested Loop (cost=1543.77..86992.23 rows=111114 width=74) (actual time=40.620..8294.445 rows=1000017 loops=1)
Buffers: shared hit=4015977 read=7205
-> Nested Loop (cost=1543.34..26544.21 rows=111114 width=74) (actual time=40.489..464.589 rows=1000017 loops=1)
Buffers: shared hit=23114
-> CTE Scan on params p_1 (cost=0.00..0.02 rows=1 width=8) (actual time=0.001..0.003 rows=1 loops=1)
-> Bitmap Heap Scan on bookings b_1 (cost=1543.34..25433.05 rows=111114 width=78) (actual time=40.468..260.041 rows=1000017 loops=1)
Recheck Cond: ((date_from >= p_1.start_date) AND (date_from < p_1.end_date))
Heap Blocks: exact=22223
Buffers: shared hit=23114
-> Bitmap Index Scan on idx_bookings_date_owner (cost=0.00..1515.57 rows=111114 width=0) (actual time=35.914..35.914 rows=1000017 loops=1)
Index Cond: ((date_from >= p_1.start_date) AND (date_from < p_1.end_date))
Buffers: shared hit=891
-> Index Scan using idx_booking_services_booking_id on booking_services bs (cost=0.42..0.53 rows=1 width=74) (actual time=0.007..0.007 rows=1 loops=1000017)
Index Cond: ((booking_id)::text = (b_1.booking_id)::text)
Buffers: shared hit=3992863 read=7205
-> Hash (cost=11.90..11.90 rows=190 width=364) (actual time=0.054..0.055 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=1
-> Seq Scan on services s (cost=0.00..11.90 rows=190 width=364) (actual time=0.041..0.043 rows=4 loops=1)
Buffers: shared hit=1
-> GroupAggregate (cost=1.75..2.15 rows=23 width=98) (actual time=0.109..0.122 rows=7 loops=1)
Group Key: pets.owner_id
Buffers: shared hit=1
-> Sort (cost=1.75..1.81 rows=23 width=180) (actual time=0.084..0.088 rows=23 loops=1)
Sort Key: pets.owner_id
Sort Method: quicksort Memory: 27kB
Buffers: shared hit=1
-> Seq Scan on pets (cost=0.00..1.23 rows=23 width=180) (actual time=0.049..0.056 rows=23 loops=1)
Buffers: shared hit=1
-> Hash (cost=15.80..15.80 rows=580 width=51) (actual time=0.046..0.047 rows=12 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=1
-> Seq Scan on users u (cost=0.00..15.80 rows=580 width=51) (actual time=0.030..0.033 rows=12 loops=1)
Buffers: shared hit=1
Planning:
Buffers: shared hit=163 read=6 dirtied=2
Planning Time: 2.821 ms
JIT:
Functions: 84
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 4.125 ms (Deform 1.641 ms), Inlining 0.000 ms, Optimization 1.547 ms, Emission 49.279 ms, Total 54.951 ms
Execution Time: 11502.974 ms
Execution time: 11502.974 ms
Conclusion: The execution time heavily degraded from ~5 seconds to ~11.5 seconds. This represents the "Nested Loop / Index Scan Trap". Creating foreign key indexes actually decreased the query performance.
Because Sequential Scans and Hash Joins are better at processing millions of rows in large analytical reporting, we drop these indexes:
DROP INDEX project.idx_booking_services_booking_id; DROP INDEX project.idx_bookings_date_sitter;
Security
1. Password Security (BCrypt)
User passwords are encrypted before being stored in the database using BCryptPasswordEncoder. This ensures that even in the event of a database breach, plain-text passwords remain unreadable and are protected against brute-force and dictionary attacks. Authentication is securely handled by verifying the raw input against the stored BCrypt hash within the UserService.
Implementation in UserService.java:
private final BCryptPasswordEncoder passwordEncoder = new BCryptPasswordEncoder();
@Transactional(readOnly = true)
public User authenticate(String username, String password) {
User user = userRepository.findByUsername(username).orElse(null);
if (user != null && passwordEncoder.matches(password, user.getPassword())) {
return user;
}
return null;
}
@Transactional
public User registerUser(String username, String password, String firstName, String lastName, String email, String role) {
// ...
newUser.setUsername(username);
newUser.setPassword(passwordEncoder.encode(password));
newUser.setFirstName(firstName);
// ...
}
2. SQL Injection Prevention
The application utilizes Spring Data JPA for all database interactions. By relying on JPA's built in repository methods, all user inputs are automatically treated as parameters rather than executable SQL strings. This protects the application from SQL Injection attacks.
Example of safe repository usage in UserService.java:
@Transactional(readOnly = true)
public User authenticate(String username, String password) {
User user = userRepository.findByUsername(username).orElse(null);
// ...
}
3. Role based access control (RBAC)
The application implements strict endpoint protection based on user roles (Admin, PetOwner, PetSitter). Controller routes verify the identity and specific subtype of the authenticated user via the HttpSession. If an unauthorized role attempts to access restricted areas, they are immediately redirected, preventing privilege escalation.
Implementation example in AdminController.java:
@GetMapping("/admin/users")
public String showAllUsers(HttpSession session, Model model) {
User user = (User) session.getAttribute("loggedInUser");
if (user == null || user instanceof PetOwner || user instanceof PetSitter) {
return "redirect:/dashboard";
}
List<User> users = userRepository.findAll();
model.addAttribute("users", users);
// ...
return "admin-users";
}
Not applicable
During the security analysis phase, two common web security measures were evaluated but found to be inapplicable for our specific architecture:
JWTs
JSON Web Tokens are designed for Stateless REST APIs (for example, when using a completely separate React/Vue frontend).
Because our application uses a Server Side Rendering - SSR architecture returning HTML views directly from controllers, we manage state using the built-in HttpSession. Session based authentication is the standard for this specific architecture and provides excellent security. Implementing JWTs would require an unnecessary and highly complex rewrite of our entire authentication system, moving away from SSR best practices.
CORS
Because our frontend and backend are served together from the exact same Spring Boot instance, they share the exact same origin. Therefore, cross origin requests do not occur, making CORS configuration unnecessary for our application.
