| Version 5 (modified by , 13 days ago) ( diff ) |
|---|
Other topics (Performance, Security, …)
Performance
1. Adding 1000000 mock rows to 'bookings' table
ALTER TABLE project.bookings DISABLE TRIGGER trg_booking_validation;
INSERT INTO project.bookings (status, date_from, date_to, address, owner_id, sitter_id)
SELECT
(ARRAY['Pending', 'Confirmed', 'Completed', 'Canceled', 'Expired', 'Reviewed'])[floor(random()*6)+1] AS status,
t.dt AS date_from,
t.dt + (random() * INTERVAL '7 days') AS date_to,
'Mock Address ' || gs AS address,
o.user_id AS owner_id,
s.user_id AS sitter_id
FROM generate_series(1, 1000000) gs
JOIN LATERAL (
SELECT user_id FROM project.pet_owners ORDER BY random() LIMIT 1
) o ON true
JOIN LATERAL (
SELECT user_id FROM project.pet_sitters ORDER BY random() LIMIT 1
) s ON true
JOIN LATERAL (
SELECT CURRENT_DATE - (random() * INTERVAL '365 days') AS dt
) t ON true;
ALTER TABLE project.bookings ENABLE TRIGGER trg_booking_validation;
Benchmark query:
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM project.bookings WHERE sitter_id = (SELECT user_id FROM project.pet_sitters LIMIT 1) ORDER BY date_from DESC;
Execution without indexes:
Gather Merge (cost=81030.64..195549.38 rows=967764 width=146) (actual time=94.209..106.394 rows=2 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=22336
InitPlan 1
-> Limit (cost=0.00..0.02 rows=1 width=90) (actual time=3.367..3.369 rows=1 loops=1)
Buffers: shared hit=1
-> Seq Scan on pet_sitters (cost=0.00..16.90 rows=690 width=90) (actual time=0.015..0.016 rows=1 loops=1)
Buffers: shared hit=1
-> Sort (cost=80030.57..80837.04 rows=322588 width=146) (actual time=60.212..60.213 rows=0 loops=4)
Sort Key: bookings.date_from DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=22335
Worker 0: Sort Method: quicksort Memory: 25kB
Worker 1: Sort Method: quicksort Memory: 25kB
Worker 2: Sort Method: quicksort Memory: 25kB
-> Parallel Seq Scan on bookings (cost=0.00..26256.35 rows=322588 width=146) (actual time=37.611..60.082 rows=0 loops=4)
Filter: ((sitter_id)::text = ((InitPlan 1).col1)::text)
Rows Removed by Filter: 250005
Buffers: shared hit=22224
Planning Time: 0.168 ms
JIT:
Functions: 12
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 1.466 ms (Deform 0.675 ms), Inlining 0.000 ms, Optimization 1.430 ms, Emission 19.219 ms, Total 22.115 ms
Execution Time: 106.834 ms
Average execution time (10 attempts): 105.235ms
Next, we add this index:
CREATE INDEX idx_bookings_sitter_date ON project.bookings (sitter_id, date_from DESC);
(Note, for the exact same performance reasons, we also created this index to optimize the queries when Pet Owners view their own bookings)
CREATE INDEX idx_bookings_owner_date ON project.bookings (owner_id, date_from DESC);
Execution with indexes:
Index Scan using idx_bookings_sitter_date on bookings (cost=0.45..72503.52 rows=1000022 width=146) (actual time=0.066..0.068 rows=2 loops=1)
Index Cond: ((sitter_id)::text = ((InitPlan 1).col1)::text)
Buffers: shared hit=2 read=3
InitPlan 1
-> Limit (cost=0.00..0.02 rows=1 width=90) (actual time=0.012..0.013 rows=1 loops=1)
Buffers: shared hit=1
-> Seq Scan on pet_sitters (cost=0.00..16.90 rows=690 width=90) (actual time=0.011..0.011 rows=1 loops=1)
Buffers: shared hit=1
Planning:
Buffers: shared hit=39 read=2
Planning Time: 0.618 ms
Execution Time: 0.092 ms
Average execution time (10 attempts): 0.087ms
Because the execution time has been massively lowered by bypassing the expensive sequential scan and memory sort, we keep this index.
*(Note: For the exact same performance reasons, we also created CREATE INDEX idx_bookings_owner_date ON project.bookings (owner_id, date_from DESC); to optimize the queries when Pet Owners view their own booking history).*
2. Calculate average sitter rating
Benchmark query:
EXPLAIN (ANALYZE, BUFFERS) SELECT AVG(r.rating) FROM project.reviews r JOIN project.bookings b ON r.booking_id = b.booking_id WHERE b.sitter_id = (SELECT user_id FROM project.pet_sitters LIMIT 1);
Execution without indexes:
Finalize Aggregate (cost=33210.02..33210.03 rows=1 width=32) (actual time=121.837..135.038 rows=1 loops=1)
Buffers: shared hit=25818
InitPlan 1
-> Limit (cost=0.00..0.02 rows=1 width=90) (actual time=0.014..0.015 rows=1 loops=1)
Buffers: shared hit=1
-> Seq Scan on pet_sitters (cost=0.00..16.90 rows=690 width=90) (actual time=0.012..0.013 rows=1 loops=1)
Buffers: shared hit=1
-> Gather (cost=33209.67..33209.98 rows=3 width=32) (actual time=119.802..135.016 rows=4 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=25818
-> Partial Aggregate (cost=32209.67..32209.68 rows=1 width=32) (actual time=90.870..90.873 rows=1 loops=4)
Buffers: shared hit=25817
-> Parallel Hash Join (cost=4971.92..32075.06 rows=53846 width=4) (actual time=77.089..90.867 rows=0 loops=4)
Hash Cond: ((b.booking_id)::text = (r.booking_id)::text)
Buffers: shared hit=25817
-> Parallel Seq Scan on bookings b (cost=0.00..26256.35 rows=322588 width=37) (actual time=41.349..55.119 rows=0 loops=4)
Filter: ((sitter_id)::text = ((InitPlan 1).col1)::text)
Rows Removed by Filter: 250005
Buffers: shared hit=22224
-> Parallel Hash (cost=4102.52..4102.52 rows=69552 width=41) (actual time=34.636..34.637 rows=41731 loops=4)
Buckets: 262144 Batches: 1 Memory Usage: 15200kB
Buffers: shared hit=3407
-> Parallel Seq Scan on reviews r (cost=0.00..4102.52 rows=69552 width=41) (actual time=0.026..11.172 rows=41731 loops=4)
Buffers: shared hit=3407
Planning:
Buffers: shared hit=16
Planning Time: 0.444 ms
Execution Time: 135.097 ms
Average execution time (10 attempts): 131.722ms
We add this index:
CREATE INDEX idx_reviews_booking_id ON project.reviews (booking_id);
Execution with indexes:
Finalize Aggregate (cost=33210.02..33210.03 rows=1 width=32) (actual time=125.303..138.277 rows=1 loops=1)
Buffers: shared hit=25818
InitPlan 1
-> Limit (cost=0.00..0.02 rows=1 width=90) (actual time=0.013..0.014 rows=1 loops=1)
Buffers: shared hit=1
-> Seq Scan on pet_sitters (cost=0.00..16.90 rows=690 width=90) (actual time=0.012..0.012 rows=1 loops=1)
Buffers: shared hit=1
-> Gather (cost=33209.67..33209.98 rows=3 width=32) (actual time=123.351..138.255 rows=4 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=25818
-> Partial Aggregate (cost=32209.67..32209.68 rows=1 width=32) (actual time=93.069..93.072 rows=1 loops=4)
Buffers: shared hit=25817
-> Parallel Hash Join (cost=4971.92..32075.06 rows=53846 width=4) (actual time=78.783..93.066 rows=0 loops=4)
Hash Cond: ((b.booking_id)::text = (r.booking_id)::text)
Buffers: shared hit=25817
-> Parallel Seq Scan on bookings b (cost=0.00..26256.35 rows=322588 width=37) (actual time=42.860..57.138 rows=0 loops=4)
Filter: ((sitter_id)::text = ((InitPlan 1).col1)::text)
Rows Removed by Filter: 250005
Buffers: shared hit=22224
-> Parallel Hash (cost=4102.52..4102.52 rows=69552 width=41) (actual time=34.827..34.828 rows=41731 loops=4)
Buckets: 262144 Batches: 1 Memory Usage: 15168kB
Buffers: shared hit=3407
-> Parallel Seq Scan on reviews r (cost=0.00..4102.52 rows=69552 width=41) (actual time=0.029..11.437 rows=41731 loops=4)
Buffers: shared hit=3407
Planning:
Buffers: shared hit=16
Planning Time: 0.554 ms
Execution Time: 138.346 ms
Average execution time (10 attempts): 138.346ms
There is practically no change in the execution time, and the query planner completely ignored the index because it determined that performing a Parallel Seq Scan and Parallel Hash Join was cheaper than doing individual index lookups on such a large dataset. Because the index is not utilized for this query and only adds complexity during INSERT and UPDATE operations, we will not keep this index.
3. Pet Sitter Leaderboard, Analytics
Benchmark query:
EXPLAIN (ANALYZE, BUFFERS)
WITH params AS (
SELECT
(CURRENT_DATE - INTERVAL '1 year') AS start_date,
CURRENT_DATE AS end_date
),
sitter_bookings AS (
SELECT
b.sitter_id,
COUNT(*) AS total_bookings,
COUNT(*) FILTER (WHERE b.status = 'Completed') AS completed_bookings,
COUNT(*) FILTER (WHERE b.status = 'Canceled') AS canceled_bookings
FROM project.bookings b
JOIN params p ON b.date_from >= p.start_date AND b.date_from <= p.end_date
GROUP BY b.sitter_id
),
sitter_reviews AS (
SELECT
b.sitter_id,
AVG(r.rating)::numeric(10,2) AS avg_rating,
COUNT(r.review_id) AS total_reviews
FROM project.reviews r
JOIN project.bookings b ON r.booking_id = b.booking_id
GROUP BY b.sitter_id
)
SELECT
u.username,
u.first_name,
u.last_name,
COALESCE(sb.completed_bookings, 0) AS completed,
COALESCE(sb.canceled_bookings, 0) AS canceled,
COALESCE(sr.avg_rating, 0) AS rating,
(
COALESCE(sb.completed_bookings, 0) * 5
+ COALESCE(sr.avg_rating, 0) * 10
- COALESCE(sb.canceled_bookings, 0) * 3
) AS activity_score,
DENSE_RANK() OVER (
ORDER BY (
COALESCE(sb.completed_bookings, 0) * 5
+ COALESCE(sr.avg_rating, 0) * 10
- COALESCE(sb.canceled_bookings, 0) * 3
) DESC
) AS leaderboard_rank
FROM project.pet_sitters ps
JOIN project.users u ON ps.user_id = u.user_id
LEFT JOIN sitter_bookings sb ON sb.sitter_id = ps.user_id
LEFT JOIN sitter_reviews sr ON sr.sitter_id = ps.user_id
ORDER BY leaderboard_rank
LIMIT 10;
Execution without indexes:
Limit (cost=67216.49..67216.51 rows=10 width=115) (actual time=403.788..412.560 rows=4 loops=1)
Buffers: shared hit=22278 read=25633
-> Sort (cost=67216.49..67218.21 rows=690 width=115) (actual time=403.786..412.557 rows=4 loops=1)
Sort Key: (dense_rank() OVER (?))
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=22278 read=25633
-> WindowAgg (cost=67177.44..67201.58 rows=690 width=115) (actual time=403.716..412.514 rows=4 loops=1)
Buffers: shared hit=22275 read=25633
-> Sort (cost=67177.43..67179.15 rows=690 width=91) (actual time=403.675..412.445 rows=4 loops=1)
Sort Key: (((((COALESCE((count(*) FILTER (WHERE ((b.status)::text = 'Completed'::text))), '0'::bigint) * 5))::numeric + (COALESCE(((avg(r.rating))::numeric(10,2)), '0'::numeric) * '10'::numeric)) - ((COALESCE((count(*) FILTER (WHERE ((b.status)::text = 'Canceled'::text))), '0'::bigint) * 3))::numeric)) DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=22275 read=25633
-> Hash Join (cost=67125.00..67144.89 rows=690 width=91) (actual time=403.585..412.387 rows=4 loops=1)
Hash Cond: ((ps.user_id)::text = (u.user_id)::text)
Buffers: shared hit=22272 read=25633
-> Merge Left Join (cost=67101.95..67107.94 rows=690 width=122) (actual time=403.503..412.297 rows=4 loops=1)
Merge Cond: ((ps.user_id)::text = (b_1.sitter_id)::text)
Buffers: shared hit=22272 read=25632
-> Merge Left Join (cost=34564.07..34567.93 rows=690 width=106) (actual time=212.894..213.031 rows=4 loops=1)
Merge Cond: ((ps.user_id)::text = (b.sitter_id)::text)
Buffers: shared hit=24 read=22225
-> Sort (cost=49.44..51.16 rows=690 width=90) (actual time=0.049..0.051 rows=4 loops=1)
Sort Key: ps.user_id
Sort Method: quicksort Memory: 25kB
Buffers: shared read=1
-> Seq Scan on pet_sitters ps (cost=0.00..16.90 rows=690 width=90) (actual time=0.021..0.022 rows=4 loops=1)
Buffers: shared read=1
-> Finalize GroupAggregate (cost=34514.63..34515.02 rows=1 width=61) (actual time=212.840..212.970 rows=4 loops=1)
Group Key: b.sitter_id
Buffers: shared hit=24 read=22224
-> Gather Merge (cost=34514.63..34514.99 rows=3 width=53) (actual time=212.822..212.945 rows=7 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=24 read=22224
-> Sort (cost=33514.59..33514.60 rows=1 width=53) (actual time=206.606..206.608 rows=2 loops=4)
Sort Key: b.sitter_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=24 read=22224
Worker 0: Sort Method: quicksort Memory: 25kB
Worker 1: Sort Method: quicksort Memory: 25kB
Worker 2: Sort Method: quicksort Memory: 25kB
-> Partial HashAggregate (cost=33514.57..33514.58 rows=1 width=53) (actual time=206.527..206.529 rows=2 loops=4)
Group Key: b.sitter_id
Batches: 1 Memory Usage: 24kB
Buffers: shared read=22224
Worker 0: Batches: 1 Memory Usage: 24kB
Worker 1: Batches: 1 Memory Usage: 24kB
Worker 2: Batches: 1 Memory Usage: 24kB
-> Parallel Seq Scan on bookings b (cost=0.00..29482.22 rows=322588 width=45) (actual time=0.070..120.157 rows=250004 loops=4)
Filter: ((date_from <= CURRENT_DATE) AND (date_from >= (CURRENT_DATE - '1 year'::interval)))
Rows Removed by Filter: 1
Buffers: shared read=22224
-> Finalize GroupAggregate (cost=32537.88..32538.27 rows=1 width=61) (actual time=190.603..199.257 rows=4 loops=1)
Group Key: b_1.sitter_id
Buffers: shared hit=22248 read=3407
-> Gather Merge (cost=32537.88..32538.24 rows=3 width=69) (actual time=190.572..199.221 rows=7 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=22248 read=3407
-> Sort (cost=31537.84..31537.84 rows=1 width=69) (actual time=184.406..184.410 rows=2 loops=4)
Sort Key: b_1.sitter_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=22248 read=3407
Worker 0: Sort Method: quicksort Memory: 25kB
Worker 1: Sort Method: quicksort Memory: 25kB
Worker 2: Sort Method: quicksort Memory: 25kB
-> Partial HashAggregate (cost=31537.82..31537.83 rows=1 width=69) (actual time=184.336..184.340 rows=2 loops=4)
Group Key: b_1.sitter_id
Batches: 1 Memory Usage: 24kB
Buffers: shared hit=22224 read=3407
Worker 0: Batches: 1 Memory Usage: 24kB
Worker 1: Batches: 1 Memory Usage: 24kB
Worker 2: Batches: 1 Memory Usage: 24kB
-> Parallel Hash Join (cost=4971.92..31268.59 rows=53846 width=41) (actual time=43.109..172.841 rows=41731 loops=4)
Hash Cond: ((b_1.booking_id)::text = (r.booking_id)::text)
Buffers: shared hit=22224 read=3407
-> Parallel Seq Scan on bookings b_1 (cost=0.00..25449.88 rows=322588 width=74) (actual time=0.019..27.882 rows=250006 loops=4)
Buffers: shared hit=22224
-> Parallel Hash (cost=4102.52..4102.52 rows=69552 width=41) (actual time=42.288..42.289 rows=41731 loops=4)
Buckets: 262144 Batches: 1 Memory Usage: 15168kB
Buffers: shared read=3407
-> Parallel Seq Scan on reviews r (cost=0.00..4102.52 rows=69552 width=41) (actual time=0.063..16.909 rows=41731 loops=4)
Buffers: shared read=3407
-> Hash (cost=15.80..15.80 rows=580 width=64) (actual time=0.049..0.050 rows=12 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
Buffers: shared read=1
-> Seq Scan on users u (cost=0.00..15.80 rows=580 width=64) (actual time=0.029..0.032 rows=12 loops=1)
Buffers: shared read=1
Planning:
Buffers: shared hit=402 read=29
Planning Time: 3.607 ms
Execution Time: 413.138 ms
Average execution time (10 attempts): 410.225ms
This execution time is perfectly acceptable for a heavy analytical reporting query. Because this query does not block user requests and is usually run as a background task or in an admin dashboard, we do not need to do more optimization with highly specific indexes that can slow down standard INSERT and UPDATE operations.
4. Search for available sitters with highest rankings
Benchmark query:
EXPLAIN (ANALYZE, BUFFERS)
WITH requested_dates AS (
SELECT
(CURRENT_DATE + INTERVAL '10 days')::date AS req_start,
(CURRENT_DATE + INTERVAL '15 days')::date AS req_end
),
unavailable_sitters AS (
SELECT DISTINCT b.sitter_id
FROM project.bookings b
JOIN requested_dates rd ON true
WHERE b.status IN ('Confirmed', 'Pending')
AND b.date_from <= rd.req_end
AND b.date_to >= rd.req_start
),
highly_rated_sitters AS (
SELECT b.sitter_id
FROM project.reviews r
JOIN project.bookings b ON r.booking_id = b.booking_id
GROUP BY b.sitter_id
HAVING AVG(r.rating) >= 4.0
)
SELECT
u.username,
u.first_name,
u.last_name,
u.email
FROM project.pet_sitters ps
JOIN project.users u ON ps.user_id = u.user_id
JOIN highly_rated_sitters hrs ON hrs.sitter_id = ps.user_id
LEFT JOIN unavailable_sitters us ON us.sitter_id = ps.user_id
WHERE us.sitter_id IS NULL
LIMIT 20;
Execution without indexes:
Limit (cost=66246.09..66295.07 rows=1 width=51) (actual time=279.739..287.004 rows=3 loops=1)
Buffers: shared hit=47892 read=2
-> Nested Loop Anti Join (cost=66246.09..66295.07 rows=1 width=51) (actual time=279.736..287.000 rows=3 loops=1)
Join Filter: ((b_1.sitter_id)::text = (ps.user_id)::text)
Buffers: shared hit=47892 read=2
-> Nested Loop (cost=32537.88..32586.84 rows=1 width=141) (actual time=184.771..188.940 rows=3 loops=1)
Join Filter: ((ps.user_id)::text = (u.user_id)::text)
Rows Removed by Join Filter: 20
Buffers: shared hit=25659 read=2
-> Nested Loop (cost=32537.88..32563.79 rows=1 width=127) (actual time=184.747..188.897 rows=3 loops=1)
Join Filter: ((ps.user_id)::text = (b.sitter_id)::text)
Rows Removed by Join Filter: 3
Buffers: shared hit=25657 read=1
-> Finalize GroupAggregate (cost=32537.88..32538.27 rows=1 width=37) (actual time=184.705..188.834 rows=3 loops=1)
Group Key: b.sitter_id
Filter: (avg(r.rating) >= 4.0)
Rows Removed by Filter: 1
Buffers: shared hit=25655
-> Gather Merge (cost=32537.88..32538.24 rows=3 width=69) (actual time=184.688..188.798 rows=7 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=25655
-> Sort (cost=31537.84..31537.84 rows=1 width=69) (actual time=178.566..178.571 rows=2 loops=4)
Sort Key: b.sitter_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=25655
Worker 0: Sort Method: quicksort Memory: 25kB
Worker 1: Sort Method: quicksort Memory: 25kB
Worker 2: Sort Method: quicksort Memory: 25kB
-> Partial HashAggregate (cost=31537.82..31537.83 rows=1 width=69) (actual time=178.495..178.500 rows=2 loops=4)
Group Key: b.sitter_id
Batches: 1 Memory Usage: 24kB
Buffers: shared hit=25631
Worker 0: Batches: 1 Memory Usage: 24kB
Worker 1: Batches: 1 Memory Usage: 24kB
Worker 2: Batches: 1 Memory Usage: 24kB
-> Parallel Hash Join (cost=4971.92..31268.59 rows=53846 width=41) (actual time=36.863..167.092 rows=41731 loops=4)
Hash Cond: ((b.booking_id)::text = (r.booking_id)::text)
Buffers: shared hit=25631
-> Parallel Seq Scan on bookings b (cost=0.00..25449.88 rows=322588 width=74) (actual time=0.019..27.928 rows=250006 loops=4)
Buffers: shared hit=22224
-> Parallel Hash (cost=4102.52..4102.52 rows=69552 width=41) (actual time=36.033..36.034 rows=41731 loops=4)
Buckets: 262144 Batches: 1 Memory Usage: 15200kB
Buffers: shared hit=3407
-> Parallel Seq Scan on reviews r (cost=0.00..4102.52 rows=69552 width=41) (actual time=0.021..11.538 rows=41731 loops=4)
Buffers: shared hit=3407
-> Seq Scan on pet_sitters ps (cost=0.00..16.90 rows=690 width=90) (actual time=0.014..0.014 rows=2 loops=3)
Buffers: shared hit=2 read=1
-> Seq Scan on users u (cost=0.00..15.80 rows=580 width=88) (actual time=0.009..0.009 rows=8 loops=3)
Buffers: shared hit=2 read=1
-> Unique (cost=33708.21..33708.22 rows=1 width=37) (actual time=31.654..32.684 rows=0 loops=3)
Buffers: shared hit=22233
-> Sort (cost=33708.21..33708.22 rows=1 width=37) (actual time=31.654..32.684 rows=0 loops=3)
Sort Key: b_1.sitter_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=22233
-> Gather (cost=1000.00..33708.20 rows=1 width=37) (actual time=94.936..98.026 rows=0 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=22233
-> Parallel Seq Scan on bookings b_1 (cost=0.00..32708.10 rows=1 width=37) (actual time=89.286..89.286 rows=0 loops=4)
Filter: (((status)::text = ANY ('{Confirmed,Pending}'::text[])) AND (date_from <= ((CURRENT_DATE + '15 days'::interval))::date) AND (date_to >= ((CURRENT_DATE + '10 days'::interval))::date))
Rows Removed by Filter: 250006
Buffers: shared hit=22233
Planning:
Buffers: shared hit=11 read=8
Planning Time: 1.269 ms
Execution Time: 287.147 ms
Average execution time (10 attempts): 283.421ms
We add this index to optimize the date overlapping:
CREATE INDEX idx_bookings_sitter_status_dates ON project.bookings (sitter_id, status, date_from, date_to);
Execution with indexes:
Limit (cost=32538.32..48741.33 rows=1 width=51) (actual time=192.798..202.516 rows=3 loops=1)
Buffers: shared hit=27463 read=901
-> Nested Loop Anti Join (cost=32538.32..48741.33 rows=1 width=51) (actual time=192.796..202.512 rows=3 loops=1)
Join Filter: ((b_1.sitter_id)::text = (ps.user_id)::text)
Buffers: shared hit=27463 read=901
-> Nested Loop (cost=32537.88..32586.84 rows=1 width=141) (actual time=180.755..185.162 rows=3 loops=1)
Join Filter: ((ps.user_id)::text = (u.user_id)::text)
Rows Removed by Join Filter: 20
Buffers: shared hit=25661
-> Nested Loop (cost=32537.88..32563.79 rows=1 width=127) (actual time=180.740..185.126 rows=3 loops=1)
Join Filter: ((ps.user_id)::text = (b.sitter_id)::text)
Rows Removed by Join Filter: 3
Buffers: shared hit=25658
-> Finalize GroupAggregate (cost=32537.88..32538.27 rows=1 width=37) (actual time=180.718..185.081 rows=3 loops=1)
Group Key: b.sitter_id
Filter: (avg(r.rating) >= 4.0)
Rows Removed by Filter: 1
Buffers: shared hit=25655
-> Gather Merge (cost=32537.88..32538.24 rows=3 width=69) (actual time=180.700..185.043 rows=7 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=25655
-> Sort (cost=31537.84..31537.84 rows=1 width=69) (actual time=174.919..174.925 rows=2 loops=4)
Sort Key: b.sitter_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=25655
Worker 0: Sort Method: quicksort Memory: 25kB
Worker 1: Sort Method: quicksort Memory: 25kB
Worker 2: Sort Method: quicksort Memory: 25kB
-> Partial HashAggregate (cost=31537.82..31537.83 rows=1 width=69) (actual time=174.853..174.857 rows=2 loops=4)
Group Key: b.sitter_id
Batches: 1 Memory Usage: 24kB
Buffers: shared hit=25631
Worker 0: Batches: 1 Memory Usage: 24kB
Worker 1: Batches: 1 Memory Usage: 24kB
Worker 2: Batches: 1 Memory Usage: 24kB
-> Parallel Hash Join (cost=4971.92..31268.59 rows=53846 width=41) (actual time=36.095..163.581 rows=41731 loops=4)
Hash Cond: ((b.booking_id)::text = (r.booking_id)::text)
Buffers: shared hit=25631
-> Parallel Seq Scan on bookings b (cost=0.00..25449.88 rows=322588 width=74) (actual time=0.019..27.140 rows=250006 loops=4)
Buffers: shared hit=22224
-> Parallel Hash (cost=4102.52..4102.52 rows=69552 width=41) (actual time=35.280..35.281 rows=41731 loops=4)
Buckets: 262144 Batches: 1 Memory Usage: 15136kB
Buffers: shared hit=3407
-> Parallel Seq Scan on reviews r (cost=0.00..4102.52 rows=69552 width=41) (actual time=0.022..11.047 rows=41731 loops=4)
Buffers: shared hit=3407
-> Seq Scan on pet_sitters ps (cost=0.00..16.90 rows=690 width=90) (actual time=0.008..0.008 rows=2 loops=3)
Buffers: shared hit=3
-> Seq Scan on users u (cost=0.00..15.80 rows=580 width=88) (actual time=0.006..0.006 rows=8 loops=3)
Buffers: shared hit=3
-> Unique (cost=0.44..16154.48 rows=1 width=37) (actual time=5.776..5.776 rows=0 loops=3)
Buffers: shared hit=1802 read=901
-> Index Only Scan using idx_bookings_sitter_status_dates on bookings b_1 (cost=0.44..16154.48 rows=1 width=37) (actual time=5.775..5.775 rows=0 loops=3)
Index Cond: ((status = ANY ('{Confirmed,Pending}'::text[])) AND (date_from <= ((CURRENT_DATE + '15 days'::interval))::date) AND (date_to >= ((CURRENT_DATE + '10 days'::interval))::date))
Heap Fetches: 0
Buffers: shared hit=1802 read=901
Planning:
Buffers: shared hit=49 read=1
Planning Time: 1.463 ms
Execution Time: 202.649 ms
Average execution time (10 attempts): 199.530ms
With this index, we eliminated the expensive sequential scan in exchange for a highly efficient index scan. The query is still bottlenecked by AVG(rating), this index provides a crucial performance improvement to the search feature, so we will keep it.
Security
1. Password Security (BCrypt)
User passwords are encrypted before being stored in the database using BCryptPasswordEncoder. This ensures that even in the event of a database breach, plain-text passwords remain unreadable and are protected against brute-force and dictionary attacks. Authentication is securely handled by verifying the raw input against the stored BCrypt hash within the UserService.
Implementation in UserService.java:
private final BCryptPasswordEncoder passwordEncoder = new BCryptPasswordEncoder();
@Transactional(readOnly = true)
public User authenticate(String username, String password) {
User user = userRepository.findByUsername(username).orElse(null);
if (user != null && passwordEncoder.matches(password, user.getPassword())) {
return user;
}
return null;
}
@Transactional
public User registerUser(String username, String password, String firstName, String lastName, String email, String role) {
// ...
newUser.setUsername(username);
newUser.setPassword(passwordEncoder.encode(password));
newUser.setFirstName(firstName);
// ...
}
2. SQL Injection Prevention
The application utilizes Spring Data JPA for all database interactions. By relying on JPA's built in repository methods, all user inputs are automatically treated as parameters rather than executable SQL strings. This protects the application from SQL Injection attacks.
Example of safe repository usage in UserService.java:
@Transactional(readOnly = true)
public User authenticate(String username, String password) {
User user = userRepository.findByUsername(username).orElse(null);
// ...
}
3. Role based access control (RBAC)
The application implements strict endpoint protection based on user roles (Admin, PetOwner, PetSitter). Controller routes verify the identity and specific subtype of the authenticated user via the HttpSession. If an unauthorized role attempts to access restricted areas, they are immediately redirected, preventing privilege escalation.
Implementation example in AdminController.java:
@GetMapping("/admin/users")
public String showAllUsers(HttpSession session, Model model) {
User user = (User) session.getAttribute("loggedInUser");
if (user == null || user instanceof PetOwner || user instanceof PetSitter) {
return "redirect:/dashboard";
}
List<User> users = userRepository.findAll();
model.addAttribute("users", users);
// ...
return "admin-users";
}
Not applicable
During the security analysis phase, two common web security measures were evaluated but found to be inapplicable for our specific architecture:
JWTs
JSON Web Tokens are designed for Stateless REST APIs (for example, when using a completely separate React/Vue frontend).
Because our application uses a Server Side Rendering - SSR architecture returning HTML views directly from controllers, we manage state using the built-in HttpSession. Session based authentication is the standard for this specific architecture and provides excellent security. Implementing JWTs would require an unnecessary and highly complex rewrite of our entire authentication system, moving away from SSR best practices.
CORS
Because our frontend and backend are served together from the exact same Spring Boot instance, they share the exact same origin. Therefore, cross origin requests do not occur, making CORS configuration unnecessary for our application.
