Other Topics
SQL Performance, adding Indexes
To analyze the performance of our queries, we consider several different use cases.
The testing approach is as follows:
- We first insert a large number of new records into the tables (around 1–2 million rows) so that an index, when applied correctly, becomes meaningful and beneficial. For a table with only a small number of records, indexing would clearly not provide a significant advantage.
- Before creating the index, we execute the query 10 times using EXPLAIN ANALYZE, calculate the average execution time, and record the query plan so that we can later compare it with the plan produced once the index is in place.
- After adding the index, we run the same query another 10 times and compare the results in terms of execution time and query plan (a minimal sketch of this pattern follows this list).
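The pattern used in every scenario below, shown with placeholder table, column, and index names (not the real schema):
-- baseline: run 10 times and average the reported Execution Time
EXPLAIN ANALYZE SELECT * FROM some_table WHERE some_column = 'X';
-- add the candidate index and refresh planner statistics
CREATE INDEX idx_some_table_some_column ON some_table (some_column);
ANALYZE some_table;
-- re-measure: run the same query another 10 times and compare execution time and plan
EXPLAIN ANALYZE SELECT * FROM some_table WHERE some_column = 'X';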
Scenario 1
Populating the appointments table with 1 million rows
ALTER TABLE appointments DISABLE TRIGGER trg_appointments_no_overlap;
INSERT INTO appointments (clinic_id, animal_id, responsible_owner_id, status, date_time, notes)
SELECT
c.clinic_id,
a.animal_id,
a.owner_id,
CASE
WHEN dt < now() THEN
(ARRAY['DONE','CANCELLED'])[floor(random()*2)+1]
ELSE
'CONFIRMED'
END,
dt,
'Auto generated'
FROM generate_series(1, 1000000) gs
JOIN LATERAL (
SELECT clinic_id FROM vet_clinics ORDER BY random() LIMIT 1
) c ON true
JOIN LATERAL (
SELECT animal_id, owner_id FROM animals ORDER BY random() LIMIT 1
) a ON true
JOIN LATERAL (
SELECT NOW() - (random() * INTERVAL '90 days') AS dt
) t ON true;
ALTER TABLE appointments ENABLE TRIGGER trg_appointments_no_overlap;
- The overlap trigger is disabled to make the bulk insert easier; once all rows have been inserted, it is enabled again.
Executing the query 10 times to see average execution time
EXPLAIN ANALYZE SELECT * FROM appointments WHERE status = 'CONFIRMED' AND date_time < now() - interval '45 minutes';
After the first execution
Gather (cost=1000.00..21156.46 rows=1 width=62) (actual time=0.646..81.896 rows=1 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on appointments (cost=0.00..20156.36 rows=1 width=62) (actual time=44.874..69.423 rows=0 loops=3)
Filter: (((status)::text = 'CONFIRMED'::text) AND (date_time < (now() - '00:45:00'::interval)))
Rows Removed by Filter: 333334
Planning Time: 0.726 ms
Execution Time: 81.930 ms
- Average execution time WITHOUT indexes
61.1925ms
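- The CREATE INDEX statement itself is not reproduced on this page; judging by the index name and the index condition in the plan below, it was presumably a composite index along these lines (a sketch, the exact definition may differ):
CREATE INDEX idx_appointments_status_date_time ON appointments (status, date_time);
With the index in place, the same query switches from a parallel sequential scan to an index scan: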
Index Scan using idx_appointments_status_date_time on appointments (cost=0.43..7.88 rows=1 width=62) (actual time=0.089..0.090 rows=1 loops=1)
  Index Cond: (((status)::text = 'CONFIRMED'::text) AND (date_time < (now() - '00:45:00'::interval)))
Planning Time: 1.020 ms
Execution Time: 0.122 ms
- Average execution time WITH indexes
0.093ms
Because the execution time improves dramatically with the index (from roughly 61 ms to 0.1 ms on average), we keep the index.
Scenario 2
Populating the listings table with 2 million rows
INSERT INTO listings (animal_id, owner_id, status, price, created_at)
SELECT
a.animal_id,
a.owner_id,
CASE
WHEN random() < 0.7 THEN 'DRAFT'
WHEN random() < 0.9 THEN 'ARCHIVED'
ELSE 'SOLD'
END,
(random() * 1000 + 50)::numeric(10,2),
NOW() - (random() * INTERVAL '120 days')
FROM generate_series(1, 2000000) gs
JOIN LATERAL (
SELECT animal_id, owner_id
FROM animals
ORDER BY random()
LIMIT 1
) a ON true;
Executing the query 10 times to see average execution time
EXPLAIN ANALYZE SELECT * FROM listings WHERE status = 'DRAFT' AND created_at < now() - interval '30 days';
After the first execution
Seq Scan on listings (cost=0.00..60619.18 rows=1058297 width=576) (actual time=0.014..453.562 rows=1051498 loops=1)
  Filter: (((status)::text = 'DRAFT'::text) AND (created_at < (now() - '30 days'::interval)))
  Rows Removed by Filter: 948511
Planning Time: 0.544 ms
Execution Time: 487.292 ms
- Average execution time WITHOUT indexes
494.8984ms
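- The index tried in this scenario is also not shown on this page; given the predicate on status and created_at, it was presumably something like the following (the name idx_listings_status_created_at is hypothetical):
CREATE INDEX idx_listings_status_created_at ON listings (status, created_at);
As the plan below shows, the planner ignores it and keeps the sequential scan: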
Seq Scan on listings (cost=0.00..60619.18 rows=1058332 width=576) (actual time=0.044..473.306 rows=1051534 loops=1)
  Filter: (((status)::text = 'DRAFT'::text) AND (created_at < (now() - '30 days'::interval)))
  Rows Removed by Filter: 948475
Planning Time: 0.092 ms
Execution Time: 507.535 ms
- Average execution time WITH indexes
500.327ms
There is no change in the execution time; the query does not even use the index, because roughly half the table matches the predicate and a sequential scan is cheaper. We will not keep this index (it is dropped again below).
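- Assuming the hypothetical definition sketched above, the unused index is removed with:
DROP INDEX idx_listings_status_created_at;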
Scenario 3
Executing with EXPLAIN ANALYZE
EXPLAIN ANALYZE
WITH params AS (
SELECT
TIMESTAMP '2025-01-01 00:00:00' AS start_ts,
TIMESTAMP '2026-01-01 00:00:00' AS end_ts
),
listings_by_user AS (
SELECT l.owner_id AS user_id, COUNT(*) AS listings_created
FROM listings l
JOIN params p ON l.created_at >= p.start_ts AND l.created_at < p.end_ts
GROUP BY l.owner_id
),
reviews_by_user AS (
SELECT r.reviewer_id AS user_id,
COUNT(*) AS reviews_left,
AVG(r.rating)::numeric(10,2) AS avg_rating_left
FROM reviews r
JOIN params p ON r.created_at >= p.start_ts AND r.created_at < p.end_ts
GROUP BY r.reviewer_id
),
appointments_by_user AS (
SELECT a.responsible_owner_id AS user_id,
COUNT(*) AS appointments_total,
COUNT(*) FILTER (WHERE a.status = 'DONE') AS appointments_done,
COUNT(*) FILTER (WHERE a.status = 'NO_SHOW') AS appointments_no_show,
COUNT(*) FILTER (WHERE a.status = 'CANCELLED') AS appointments_cancelled
FROM appointments a
JOIN params p ON a.date_time >= p.start_ts AND a.date_time < p.end_ts
GROUP BY a.responsible_owner_id
),
favorites_by_user AS (
SELECT f.client_id AS user_id, COUNT(*) AS favorites_saved_all_time
FROM favorite_listings f
GROUP BY f.client_id
)
SELECT
u.user_id, u.username, u.email, u.name, u.surname,
COALESCE(l.listings_created, 0) AS listings_created,
COALESCE(rv.reviews_left, 0) AS reviews_left,
COALESCE(rv.avg_rating_left, 0) AS avg_rating_left,
COALESCE(ap.appointments_total, 0) AS appointments_total,
COALESCE(ap.appointments_done, 0) AS appointments_done,
COALESCE(ap.appointments_no_show, 0) AS appointments_no_show,
COALESCE(ap.appointments_cancelled, 0) AS appointments_cancelled,
COALESCE(fv.favorites_saved_all_time, 0) AS favorites_saved_all_time,
(
COALESCE(l.listings_created, 0) * 5
+ COALESCE(rv.reviews_left, 0) * 3
+ COALESCE(ap.appointments_done, 0) * 2
+ COALESCE(fv.favorites_saved_all_time, 0)
- COALESCE(ap.appointments_no_show, 0) * 2
) AS activity_score,
DENSE_RANK() OVER (
ORDER BY
(
COALESCE(l.listings_created, 0) * 5
+ COALESCE(rv.reviews_left, 0) * 3
+ COALESCE(ap.appointments_done, 0) * 2
+ COALESCE(fv.favorites_saved_all_time, 0)
- COALESCE(ap.appointments_no_show, 0) * 2
) DESC,
COALESCE(l.listings_created, 0) DESC,
COALESCE(rv.reviews_left, 0) DESC
) AS activity_rank
FROM users u
LEFT JOIN listings_by_user l ON l.user_id = u.user_id
LEFT JOIN reviews_by_user rv ON rv.user_id = u.user_id
LEFT JOIN appointments_by_user ap ON ap.user_id = u.user_id
LEFT JOIN favorites_by_user fv ON fv.user_id = u.user_id
WHERE COALESCE(l.listings_created, 0)
+ COALESCE(rv.reviews_left, 0)
+ COALESCE(ap.appointments_total, 0)
+ COALESCE(fv.favorites_saved_all_time, 0) > 0
ORDER BY activity_rank
LIMIT 10;
Without indexes
Limit (cost=86391.14..86391.14 rows=2 width=2176) (actual time=538.362..538.369 rows=3 loops=1)
CTE params
-> Result (cost=0.00..0.01 rows=1 width=16) (actual time=0.001..0.001 rows=1 loops=1)
-> Sort (cost=86391.13..86391.13 rows=2 width=2176) (actual time=538.361..538.366 rows=3 loops=1)
Sort Key: (dense_rank() OVER (?))
Sort Method: quicksort Memory: 25kB
-> WindowAgg (cost=86391.03..86391.12 rows=2 width=2176) (actual time=538.326..538.334 rows=3 loops=1)
-> Sort (cost=86391.03..86391.04 rows=2 width=2152) (actual time=538.313..538.318 rows=3 loops=1)
" Sort Key: ((((((COALESCE(l.listings_created, '0'::bigint) * 5) + (COALESCE((count(*)), '0'::bigint) * 3)) + (COALESCE((count(*) FILTER (WHERE ((a.status)::text = 'DONE'::text))), '0'::bigint) * 2)) + COALESCE((count(*)), '0'::bigint)) - (COALESCE((count(*) FILTER (WHERE ((a.status)::text = 'NO_SHOW'::text))), '0'::bigint) * 2))) DESC, (COALESCE(l.listings_created, '0'::bigint)) DESC, (COALESCE((count(*)), '0'::bigint)) DESC"
Sort Method: quicksort Memory: 25kB
-> Merge Left Join (cost=86390.69..86391.02 rows=2 width=2152) (actual time=538.278..538.294 rows=3 loops=1)
Merge Cond: (u.user_id = a.responsible_owner_id)
" Filter: ((((COALESCE(l.listings_created, '0'::bigint) + COALESCE((count(*)), '0'::bigint)) + COALESCE((count(*)), '0'::bigint)) + COALESCE((count(*)), '0'::bigint)) > 0)"
Rows Removed by Filter: 7
-> Merge Left Join (cost=82934.17..82934.38 rows=5 width=2112) (actual time=538.223..538.236 rows=10 loops=1)
Merge Cond: (u.user_id = f.client_id)
-> Merge Left Join (cost=82933.14..82933.26 rows=5 width=2104) (actual time=538.151..538.161 rows=10 loops=1)
Merge Cond: (u.user_id = r.reviewer_id)
-> Merge Left Join (cost=82931.79..82931.84 rows=5 width=2080) (actual time=538.097..538.104 rows=10 loops=1)
Merge Cond: (u.user_id = l.user_id)
-> Sort (cost=1.11..1.12 rows=5 width=2072) (actual time=0.025..0.027 rows=10 loops=1)
Sort Key: u.user_id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on users u (cost=0.00..1.05 rows=5 width=2072) (actual time=0.012..0.014 rows=10 loops=1)
-> Sort (cost=82930.68..82930.69 rows=2 width=16) (actual time=538.069..538.070 rows=1 loops=1)
Sort Key: l.user_id
Sort Method: quicksort Memory: 25kB
-> Subquery Scan on l (cost=82930.63..82930.67 rows=2 width=16) (actual time=538.058..538.060 rows=1 loops=1)
-> HashAggregate (cost=82930.63..82930.65 rows=2 width=16) (actual time=538.057..538.058 rows=1 loops=1)
Group Key: l_1.owner_id
Batches: 1 Memory Usage: 24kB
-> Nested Loop (cost=0.00..81819.60 rows=222207 width=8) (actual time=0.094..529.559 rows=41024 loops=1)
Join Filter: ((l_1.created_at >= p.start_ts) AND (l_1.created_at < p.end_ts))
Rows Removed by Join Filter: 1958985
-> CTE Scan on params p (cost=0.00..0.02 rows=1 width=16) (actual time=0.001..0.003 rows=1 loops=1)
-> Seq Scan on listings l_1 (cost=0.00..51821.63 rows=1999863 width=16) (actual time=0.075..395.717 rows=2000009 loops=1)
-> GroupAggregate (cost=1.35..1.38 rows=1 width=32) (actual time=0.052..0.052 rows=0 loops=1)
Group Key: r.reviewer_id
-> Sort (cost=1.35..1.36 rows=1 width=12) (actual time=0.050..0.051 rows=0 loops=1)
Sort Key: r.reviewer_id
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=0.00..1.34 rows=1 width=12) (actual time=0.016..0.016 rows=0 loops=1)
Join Filter: ((r.created_at >= p_1.start_ts) AND (r.created_at < p_1.end_ts))
Rows Removed by Join Filter: 14
-> CTE Scan on params p_1 (cost=0.00..0.02 rows=1 width=16) (actual time=0.001..0.002 rows=1 loops=1)
-> Seq Scan on reviews r (cost=0.00..1.13 rows=13 width=20) (actual time=0.011..0.011 rows=14 loops=1)
-> GroupAggregate (cost=1.03..1.06 rows=2 width=16) (actual time=0.068..0.070 rows=2 loops=1)
Group Key: f.client_id
-> Sort (cost=1.03..1.03 rows=2 width=8) (actual time=0.064..0.065 rows=2 loops=1)
Sort Key: f.client_id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on favorite_listings f (cost=0.00..1.02 rows=2 width=8) (actual time=0.010..0.010 rows=2 loops=1)
-> Materialize (cost=3456.52..3456.57 rows=1 width=40) (actual time=0.049..0.050 rows=0 loops=1)
-> GroupAggregate (cost=3456.52..3456.55 rows=1 width=40) (actual time=0.044..0.045 rows=0 loops=1)
Group Key: a.responsible_owner_id
-> Sort (cost=3456.52..3456.52 rows=1 width=14) (actual time=0.043..0.044 rows=0 loops=1)
Sort Key: a.responsible_owner_id
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=3452.46..3456.51 rows=1 width=14) (actual time=0.034..0.035 rows=0 loops=1)
-> CTE Scan on params p_2 (cost=0.00..0.02 rows=1 width=16) (actual time=0.000..0.001 rows=1 loops=1)
-> Bitmap Heap Scan on appointments a (cost=3452.46..3456.48 rows=1 width=22) (actual time=0.030..0.030 rows=0 loops=1)
Recheck Cond: ((date_time >= p_2.start_ts) AND (date_time < p_2.end_ts))
-> Bitmap Index Scan on idx_appointments_clinic_date_time (cost=0.00..3452.46 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=1)
Index Cond: ((date_time >= p_2.start_ts) AND (date_time < p_2.end_ts))
Planning Time: 0.777 ms
Execution Time: 538.700 ms
- This is an acceptable time, so we don't need to optimize it.
Scenario 4
EXPLAIN ANALYZE
WITH
my_likes AS (
SELECT fl.listing_id
FROM favorite_listings fl
WHERE fl.client_id = 1
),
my_recent_likes AS (
SELECT fl.listing_id
FROM favorite_listings fl
JOIN listings l ON l.listing_id = fl.listing_id
WHERE fl.client_id = 1
ORDER BY l.created_at DESC
LIMIT 10
),
similar_users AS (
SELECT
fl2.client_id AS other_user_id,
COUNT(*) AS overlap_likes
FROM favorite_listings fl2
JOIN my_likes ml ON ml.listing_id = fl2.listing_id
WHERE fl2.client_id <> 1
GROUP BY fl2.client_id
HAVING COUNT(*) > 0
),
cf_candidates AS (
SELECT
fl.listing_id,
SUM(su.overlap_likes) AS cf_score,
COUNT(DISTINCT su.other_user_id) AS liked_by_similar_users
FROM similar_users su
JOIN favorite_listings fl ON fl.client_id = su.other_user_id
LEFT JOIN my_likes ml ON ml.listing_id = fl.listing_id
WHERE ml.listing_id IS NULL
GROUP BY fl.listing_id
),
content_candidates AS (
SELECT
l2.listing_id,
COUNT(*) AS content_score
FROM my_recent_likes r
JOIN listings l1 ON l1.listing_id = r.listing_id
JOIN animals a1 ON a1.animal_id = l1.animal_id
JOIN listings l2 ON l2.listing_id <> l1.listing_id
JOIN animals a2 ON a2.animal_id = l2.animal_id
LEFT JOIN my_likes ml ON ml.listing_id = l2.listing_id
WHERE ml.listing_id IS NULL
AND (
a2.species = a1.species
OR a2.breed = a1.breed
OR a2.located_name = a1.located_name
)
GROUP BY l2.listing_id
),
merged AS (
SELECT
COALESCE(cf.listing_id, cc.listing_id) AS listing_id,
COALESCE(cf.cf_score, 0) AS cf_score,
COALESCE(cf.liked_by_similar_users, 0) AS liked_by_similar_users,
COALESCE(cc.content_score, 0) AS content_score
FROM cf_candidates cf
FULL OUTER JOIN content_candidates cc
ON cc.listing_id = cf.listing_id
)
SELECT
l.listing_id,
a.name AS title,
a.species,
a.breed,
a.located_name AS location,
l.created_at,
m.cf_score,
m.liked_by_similar_users,
m.content_score,
(m.cf_score * 3 + m.content_score * 2) AS final_score
FROM merged m
JOIN listings l ON l.listing_id = m.listing_id
JOIN animals a ON a.animal_id = l.animal_id
WHERE l.status = 'ACTIVE'
AND l.owner_id <> 1
ORDER BY final_score DESC, l.created_at DESC
LIMIT 20;
Without indexes
Limit (cost=264852.48..264852.48 rows=1 width=2160) (actual time=7249.269..7249.523 rows=3 loops=1)
CTE my_likes
-> Seq Scan on favorite_listings fl_2 (cost=0.00..1.02 rows=1 width=8) (actual time=0.021..0.022 rows=1 loops=1)
Filter: (client_id = 1)
Rows Removed by Filter: 1
-> Sort (cost=264851.45..264851.46 rows=1 width=2160) (actual time=7008.341..7008.593 rows=3 loops=1)
" Sort Key: (((COALESCE(cf.cf_score, '0'::numeric) * '3'::numeric) + ((COALESCE((count(*)), '0'::bigint) * 2))::numeric)) DESC, l.created_at DESC"
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=218242.91..264851.44 rows=1 width=2160) (actual time=4594.816..7007.250 rows=3 loops=1)
Join Filter: (l.animal_id = a.animal_id)
Rows Removed by Join Filter: 9
-> Hash Join (cost=218242.91..264850.37 rows=1 width=72) (actual time=4594.763..7007.059 rows=3 loops=1)
" Hash Cond: (COALESCE(cf.listing_id, l2.listing_id) = l.listing_id)"
-> Hash Full Join (cost=172920.66..215833.91 rows=1407309 width=64) (actual time=2751.983..5316.014 rows=2000008 loops=1)
Hash Cond: (l2.listing_id = cf.listing_id)
-> HashAggregate (cost=172918.43..197986.12 rows=1407309 width=16) (actual time=2751.308..4818.192 rows=2000008 loops=1)
Group Key: l2.listing_id
Planned Partitions: 32 Batches: 33 Memory Usage: 8209kB Disk Usage: 63168kB
-> Hash Anti Join (cost=3.38..93757.30 rows=1407309 width=8) (actual time=1.187..1532.095 rows=2000008 loops=1)
Hash Cond: (l2.listing_id = ml.listing_id)
-> Hash Join (cost=3.35..75989.99 rows=1407310 width=8) (actual time=1.125..1117.188 rows=2000008 loops=1)
Hash Cond: (l2.animal_id = a2.animal_id)
Join Filter: (l2.listing_id <> l1.listing_id)
Rows Removed by Join Filter: 1
-> Seq Scan on listings l2 (cost=0.00..51821.63 rows=1999863 width=16) (actual time=0.146..543.935 rows=2000009 loops=1)
-> Hash (cost=3.32..3.32 rows=2 width=16) (actual time=0.831..0.857 rows=8 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Merge Join (cost=2.50..3.32 rows=2 width=16) (actual time=0.779..0.832 rows=8 loops=1)
Merge Cond: (l1.listing_id = r.listing_id)
-> Nested Loop (cost=0.57..364472.04 rows=4221933 width=16) (actual time=0.365..0.400 rows=9 loops=1)
Join Filter: (((a2.species)::text = (a1.species)::text) OR ((a2.breed)::text = (a1.breed)::text) OR ((a2.located_name)::text = (a1.located_name)::text))
Rows Removed by Join Filter: 1
-> Nested Loop (cost=0.57..244479.22 rows=1999863 width=1556) (actual time=0.330..0.346 rows=2 loops=1)
-> Index Scan using listings_pk on listings l1 (cost=0.43..199481.87 rows=1999863 width=16) (actual time=0.064..0.065 rows=2 loops=1)
-> Memoize (cost=0.14..0.16 rows=1 width=1556) (actual time=0.133..0.133 rows=1 loops=2)
Cache Key: l1.animal_id
Cache Mode: logical
Hits: 0 Misses: 2 Evictions: 0 Overflows: 0 Memory Usage: 1kB
-> Index Scan using animals_pk on animals a1 (cost=0.13..0.15 rows=1 width=1556) (actual time=0.124..0.125 rows=1 loops=2)
Index Cond: (animal_id = l1.animal_id)
-> Materialize (cost=0.00..1.04 rows=3 width=1556) (actual time=0.012..0.018 rows=5 loops=2)
-> Seq Scan on animals a2 (cost=0.00..1.03 rows=3 width=1556) (actual time=0.014..0.018 rows=9 loops=1)
-> Sort (cost=1.92..1.93 rows=1 width=8) (actual time=0.401..0.411 rows=1 loops=1)
Sort Key: r.listing_id
Sort Method: quicksort Memory: 25kB
-> Subquery Scan on r (cost=1.90..1.91 rows=1 width=8) (actual time=0.271..0.281 rows=1 loops=1)
-> Limit (cost=1.90..1.90 rows=1 width=16) (actual time=0.266..0.274 rows=1 loops=1)
-> Sort (cost=1.90..1.90 rows=1 width=16) (actual time=0.252..0.259 rows=1 loops=1)
Sort Key: l_1.created_at DESC
Sort Method: quicksort Memory: 25kB
-> Merge Join (cost=1.47..1.89 rows=1 width=16) (actual time=0.218..0.226 rows=1 loops=1)
Merge Cond: (l_1.listing_id = fl.listing_id)
-> Index Scan using listings_pk on listings l_1 (cost=0.43..199481.87 rows=1999863 width=16) (actual time=0.035..0.037 rows=2 loops=1)
-> Sort (cost=1.03..1.04 rows=1 width=8) (actual time=0.164..0.167 rows=1 loops=1)
Sort Key: fl.listing_id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on favorite_listings fl (cost=0.00..1.02 rows=1 width=8) (actual time=0.027..0.029 rows=1 loops=1)
Filter: (client_id = 1)
Rows Removed by Filter: 1
-> Hash (cost=0.02..0.02 rows=1 width=8) (actual time=0.014..0.015 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> CTE Scan on my_likes ml (cost=0.00..0.02 rows=1 width=8) (actual time=0.000..0.001 rows=1 loops=1)
-> Hash (cost=2.21..2.21 rows=1 width=48) (actual time=0.611..0.621 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
-> Subquery Scan on cf (cost=2.12..2.21 rows=1 width=48) (actual time=0.610..0.619 rows=0 loops=1)
-> GroupAggregate (cost=2.12..2.20 rows=1 width=48) (actual time=0.609..0.617 rows=0 loops=1)
Group Key: fl_1.listing_id
-> Nested Loop (cost=2.12..2.18 rows=1 width=24) (actual time=0.607..0.615 rows=0 loops=1)
Join Filter: (fl_1.client_id = fl2.client_id)
-> Merge Anti Join (cost=1.06..1.09 rows=1 width=16) (actual time=0.452..0.457 rows=1 loops=1)
Merge Cond: (fl_1.listing_id = ml_1.listing_id)
-> Sort (cost=1.03..1.03 rows=2 width=16) (actual time=0.346..0.348 rows=2 loops=1)
Sort Key: fl_1.listing_id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on favorite_listings fl_1 (cost=0.00..1.02 rows=2 width=16) (actual time=0.065..0.068 rows=2 loops=1)
-> Sort (cost=0.03..0.04 rows=1 width=8) (actual time=0.066..0.067 rows=1 loops=1)
Sort Key: ml_1.listing_id
Sort Method: quicksort Memory: 25kB
-> CTE Scan on my_likes ml_1 (cost=0.00..0.02 rows=1 width=8) (actual time=0.026..0.028 rows=1 loops=1)
-> HashAggregate (cost=1.06..1.07 rows=1 width=16) (actual time=0.150..0.152 rows=0 loops=1)
Group Key: fl2.client_id
Filter: (count(*) > 0)
Batches: 1 Memory Usage: 24kB
-> Nested Loop (cost=0.00..1.06 rows=1 width=8) (actual time=0.079..0.081 rows=0 loops=1)
Join Filter: (fl2.listing_id = ml_2.listing_id)
Rows Removed by Join Filter: 1
-> Seq Scan on favorite_listings fl2 (cost=0.00..1.02 rows=1 width=16) (actual time=0.012..0.017 rows=1 loops=1)
Filter: (client_id <> 1)
Rows Removed by Filter: 1
-> CTE Scan on my_likes ml_2 (cost=0.00..0.02 rows=1 width=8) (actual time=0.001..0.002 rows=1 loops=1)
-> Hash (cost=45322.24..45322.24 rows=1 width=24) (actual time=1436.407..1436.608 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Gather (cost=1000.00..45322.24 rows=1 width=24) (actual time=1420.646..1436.431 rows=4 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on listings l (cost=0.00..44322.14 rows=1 width=24) (actual time=1316.362..1367.278 rows=1 loops=3)
Filter: ((owner_id <> 1) AND ((status)::text = 'ACTIVE'::text))
Rows Removed by Filter: 666668
-> Seq Scan on animals a (cost=0.00..1.03 rows=3 width=2072) (actual time=0.019..0.019 rows=4 loops=3)
Planning Time: 11.722 ms
JIT:
Functions: 124
" Options: Inlining false, Optimization false, Expressions true, Deforming true"
" Timing: Generation 22.000 ms, Inlining 0.000 ms, Optimization 32.272 ms, Emission 243.487 ms, Total 297.758 ms"
Execution Time: 7756.372 ms
This is not an acceptable execution time, so we will optimize it with indexes.
- We add these indexes
CREATE INDEX idx_favorite_listings_client_listing ON favorite_listings (client_id, listing_id);
CREATE INDEX idx_favorite_listings_listing_client ON favorite_listings (listing_id, client_id);
CREATE INDEX idx_listings_status_owner_created ON listings (status, owner_id, created_at DESC);
CREATE INDEX idx_listings_animal ON listings (animal_id);
CREATE INDEX idx_animals_species ON animals (species);
CREATE INDEX idx_animals_breed ON animals (breed);
CREATE INDEX idx_animals_located_name ON animals (located_name);
With indexes
Limit (cost=24.22..24.23 rows=1 width=121) (actual time=0.419..0.427 rows=3 loops=1)
CTE my_likes
-> Seq Scan on favorite_listings fl_2 (cost=0.00..1.02 rows=1 width=8) (actual time=0.004..0.005 rows=1 loops=1)
Filter: (client_id = 1)
Rows Removed by Filter: 1
-> Sort (cost=23.20..23.20 rows=1 width=121) (actual time=0.418..0.425 rows=3 loops=1)
" Sort Key: (((COALESCE((sum((count(*)))), '0'::numeric) * '3'::numeric) + ((COALESCE((count(*)), '0'::bigint) * 2))::numeric)) DESC, l.created_at DESC"
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=7.22..23.19 rows=1 width=121) (actual time=0.381..0.405 rows=3 loops=1)
-> Nested Loop (cost=7.08..23.02 rows=1 width=72) (actual time=0.357..0.374 rows=3 loops=1)
-> Merge Full Join (cost=6.65..14.58 rows=1 width=64) (actual time=0.338..0.348 rows=3 loops=1)
Merge Cond: (fl.listing_id = l2.listing_id)
-> GroupAggregate (cost=2.55..10.42 rows=1 width=48) (actual time=0.071..0.074 rows=0 loops=1)
Group Key: fl.listing_id
-> Nested Loop (cost=2.55..10.40 rows=1 width=24) (actual time=0.070..0.072 rows=0 loops=1)
Join Filter: (fl.listing_id = l_1.listing_id)
-> Nested Loop (cost=2.12..2.18 rows=1 width=24) (actual time=0.070..0.072 rows=0 loops=1)
Join Filter: (fl.client_id = fl2.client_id)
-> Merge Anti Join (cost=1.06..1.09 rows=1 width=16) (actual time=0.061..0.062 rows=1 loops=1)
Merge Cond: (fl.listing_id = ml.listing_id)
-> Sort (cost=1.03..1.03 rows=2 width=16) (actual time=0.034..0.034 rows=2 loops=1)
Sort Key: fl.listing_id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on favorite_listings fl (cost=0.00..1.02 rows=2 width=16) (actual time=0.014..0.015 rows=2 loops=1)
-> Sort (cost=0.03..0.04 rows=1 width=8) (actual time=0.023..0.024 rows=1 loops=1)
Sort Key: ml.listing_id
Sort Method: quicksort Memory: 25kB
-> CTE Scan on my_likes ml (cost=0.00..0.02 rows=1 width=8) (actual time=0.009..0.010 rows=1 loops=1)
-> HashAggregate (cost=1.06..1.07 rows=1 width=16) (actual time=0.007..0.008 rows=0 loops=1)
Group Key: fl2.client_id
Batches: 1 Memory Usage: 24kB
-> Nested Loop (cost=0.00..1.06 rows=1 width=8) (actual time=0.006..0.006 rows=0 loops=1)
Join Filter: (fl2.listing_id = ml_1.listing_id)
Rows Removed by Join Filter: 1
-> Seq Scan on favorite_listings fl2 (cost=0.00..1.02 rows=1 width=16) (actual time=0.003..0.004 rows=1 loops=1)
Filter: (client_id <> 1)
Rows Removed by Filter: 1
-> CTE Scan on my_likes ml_1 (cost=0.00..0.02 rows=1 width=8) (actual time=0.000..0.001 rows=1 loops=1)
-> Index Scan using idx_listings_status_owner_created on listings l_1 (cost=0.43..8.20 rows=1 width=8) (never executed)
Index Cond: ((status)::text = 'ACTIVE'::text)
Filter: (owner_id <> 1)
-> GroupAggregate (cost=4.10..4.12 rows=1 width=16) (actual time=0.266..0.272 rows=3 loops=1)
Group Key: l2.listing_id
-> Sort (cost=4.10..4.11 rows=1 width=8) (actual time=0.262..0.265 rows=3 loops=1)
Sort Key: l2.listing_id
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=3.23..4.09 rows=1 width=8) (actual time=0.214..0.246 rows=3 loops=1)
Join Filter: (((a2.species)::text = (a1.species)::text) OR ((a2.breed)::text = (a1.breed)::text) OR ((a2.located_name)::text = (a1.located_name)::text))
-> Merge Join (cost=3.09..3.92 rows=1 width=36) (actual time=0.188..0.214 rows=3 loops=1)
Merge Cond: (l1.listing_id = r.listing_id)
-> Nested Loop (cost=0.85..229633.64 rows=2000898 width=44) (actual time=0.118..0.140 rows=4 loops=1)
Join Filter: (l2.listing_id <> l1.listing_id)
-> Index Scan using listings_pk on listings l1 (cost=0.43..199610.69 rows=2000900 width=16) (actual time=0.018..0.019 rows=2 loops=1)
-> Materialize (cost=0.43..9.44 rows=1 width=28) (actual time=0.049..0.058 rows=2 loops=2)
-> Nested Loop Anti Join (cost=0.43..9.44 rows=1 width=28) (actual time=0.090..0.107 rows=3 loops=1)
Join Filter: (ml_2.listing_id = l2.listing_id)
Rows Removed by Join Filter: 3
-> Nested Loop (cost=0.43..9.41 rows=1 width=28) (actual time=0.088..0.102 rows=4 loops=1)
Join Filter: (a2.animal_id = l2.animal_id)
Rows Removed by Join Filter: 9
-> Index Scan using idx_listings_status_owner_created on listings l2 (cost=0.43..8.20 rows=1 width=16) (actual time=0.082..0.085 rows=4 loops=1)
Index Cond: ((status)::text = 'ACTIVE'::text)
Filter: (owner_id <> 1)
Rows Removed by Filter: 1
-> Seq Scan on animals a2 (cost=0.00..1.09 rows=9 width=28) (actual time=0.002..0.002 rows=3 loops=4)
-> CTE Scan on my_likes ml_2 (cost=0.00..0.02 rows=1 width=8) (actual time=0.000..0.000 rows=1 loops=4)
-> Sort (cost=2.23..2.24 rows=1 width=8) (actual time=0.068..0.070 rows=1 loops=1)
Sort Key: r.listing_id
Sort Method: quicksort Memory: 25kB
-> Subquery Scan on r (cost=2.21..2.22 rows=1 width=8) (actual time=0.053..0.054 rows=1 loops=1)
-> Limit (cost=2.21..2.21 rows=1 width=16) (actual time=0.052..0.053 rows=1 loops=1)
-> Sort (cost=2.21..2.21 rows=1 width=16) (actual time=0.051..0.052 rows=1 loops=1)
Sort Key: l_2.created_at DESC
Sort Method: quicksort Memory: 25kB
-> Merge Join (cost=1.46..2.20 rows=1 width=16) (actual time=0.035..0.038 rows=1 loops=1)
Merge Cond: (l_2.listing_id = fl_1.listing_id)
-> Index Scan using listings_pk on listings l_2 (cost=0.43..199610.69 rows=2000900 width=16) (actual time=0.019..0.020 rows=2 loops=1)
-> Sort (cost=1.03..1.04 rows=1 width=8) (actual time=0.014..0.015 rows=1 loops=1)
Sort Key: fl_1.listing_id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on favorite_listings fl_1 (cost=0.00..1.02 rows=1 width=8) (actual time=0.004..0.004 rows=1 loops=1)
Filter: (client_id = 1)
Rows Removed by Filter: 1
-> Index Scan using animals_pk on animals a1 (cost=0.14..0.15 rows=1 width=28) (actual time=0.008..0.008 rows=1 loops=3)
Index Cond: (animal_id = l1.animal_id)
-> Index Scan using listings_pk on listings l (cost=0.43..8.45 rows=1 width=24) (actual time=0.007..0.007 rows=1 loops=3)
" Index Cond: (listing_id = COALESCE(fl.listing_id, l2.listing_id))"
-> Index Scan using animals_pk on animals a (cost=0.14..0.15 rows=1 width=33) (actual time=0.005..0.005 rows=1 loops=3)
Index Cond: (animal_id = l.animal_id)
Planning Time: 7.688 ms
Execution Time: 0.904 ms
- The indexes worked; the execution time is now acceptable (down from about 7.8 s to under 1 ms).
Security measures
SQL injection prevention
@Query(value = """
SELECT *
FROM get_top_active_users(:startTs, :endTs)
""", nativeQuery = true)
List<UserActivityRankingProjection> getTopActiveUsers(
@Param("startTs") LocalDateTime startTs,
@Param("endTs") LocalDateTime endTs
);
- Native queries are executed using named parameters such as :startTs and :endTs, which are bound by Spring Data JPA. This prevents SQL injection because input values are treated as parameters, not as executable SQL.
Storing passwords
- Password security is implemented using BCryptPasswordEncoder, which hashes user passwords before storing them in the database. This ensures that passwords are not stored in plain text and provides protection against brute-force and dictionary attacks.
@Bean
public PasswordEncoder passwordEncoder() {
return new BCryptPasswordEncoder();
}
CORS
- The application uses a CORS configuration to control which frontend clients are allowed to communicate with the backend API. In development, requests are allowed only from local frontend origins such as http://localhost:* (this will be changed to the correct URL of the hosted site). The configuration defines the permitted HTTP methods, including GET, POST, PUT, PATCH, DELETE, and OPTIONS.
@Bean
public CorsConfigurationSource corsConfigurationSource() {
    CorsConfiguration configuration = new CorsConfiguration();
    configuration.setAllowedOriginPatterns(List.of("http://localhost:*"));
    configuration.setAllowedMethods(Arrays.asList("GET", "POST", "PUT", "PATCH", "DELETE", "OPTIONS"));
    configuration.setAllowedHeaders(Arrays.asList("*"));
    configuration.setAllowCredentials(true);
    UrlBasedCorsConfigurationSource source = new UrlBasedCorsConfigurationSource();
    source.registerCorsConfiguration("/**", configuration);
    return source;
}
