Other Topics
SQL Performance, adding Indexes
To analyze the performance of our queries, we consider several different use cases.
The testing approach is as follows:
- We first insert a large number of new records into the tables (around 1–2 million rows) so that an index, when applied correctly, becomes meaningful and beneficial. For a table with only a small number of records, indexing would clearly not provide a significant advantage.
- Before creating the index, we execute the query 10 times using EXPLAIN ANALYZE, calculate the average execution time, and record the query plan so that we can later compare it with the plan produced once the index is in place.
- After adding the index, we run the same query another 10 times and compare the results in terms of execution time and query plan (a minimal sketch of this pattern follows this list).
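The pattern used in every scenario below, shown with placeholder table, column, and index names (not the real schema):
-- baseline: run 10 times and average the reported Execution Time
EXPLAIN ANALYZE SELECT * FROM some_table WHERE some_column = 'X';
-- add the candidate index and refresh planner statistics
CREATE INDEX idx_some_table_some_column ON some_table (some_column);
ANALYZE some_table;
-- re-measure: run the same query another 10 times and compare execution time and plan
EXPLAIN ANALYZE SELECT * FROM some_table WHERE some_column = 'X';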
Scenario 1
Populating the appointments table with 1 million rows
ALTER TABLE appointments DISABLE TRIGGER trg_appointments_no_overlap;
INSERT INTO appointments (clinic_id, animal_id, responsible_owner_id, status, date_time, notes)
SELECT
c.clinic_id,
a.animal_id,
a.owner_id,
CASE
WHEN dt < now() THEN
(ARRAY['DONE','CANCELLED'])[floor(random()*2)+1]
ELSE
'CONFIRMED'
END,
dt,
'Auto generated'
FROM generate_series(1, 1000000) gs
JOIN LATERAL (
SELECT clinic_id FROM vet_clinics ORDER BY random() LIMIT 1
) c ON true
JOIN LATERAL (
SELECT animal_id, owner_id FROM animals ORDER BY random() LIMIT 1
) a ON true
JOIN LATERAL (
SELECT NOW() - (random() * INTERVAL '90 days') AS dt
) t ON true;
ALTER TABLE appointments ENABLE TRIGGER trg_appointments_no_overlap;
- The overlap trigger is disabled to make the bulk insert easier; once all rows have been inserted, it is enabled again.
Executing the query 10 times to see average execution time
EXPLAIN ANALYZE SELECT * FROM appointments WHERE status = 'CONFIRMED' AND date_time < now() - interval '45 minutes';
After the first execution
Gather (cost=1000.00..21156.46 rows=1 width=62) (actual time=0.646..81.896 rows=1 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on appointments (cost=0.00..20156.36 rows=1 width=62) (actual time=44.874..69.423 rows=0 loops=3)
Filter: (((status)::text = 'CONFIRMED'::text) AND (date_time < (now() - '00:45:00'::interval)))
Rows Removed by Filter: 333334
Planning Time: 0.726 ms
Execution Time: 81.930 ms
- Average execution time WITHOUT indexes
61.1925ms
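- The CREATE INDEX statement itself is not reproduced on this page; judging by the index name and the index condition in the plan below, it was presumably a composite index along these lines (a sketch, the exact definition may differ):
CREATE INDEX idx_appointments_status_date_time ON appointments (status, date_time);
With the index in place, the same query switches from a parallel sequential scan to an index scan: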
Index Scan using idx_appointments_status_date_time on appointments (cost=0.43..7.88 rows=1 width=62) (actual time=0.089..0.090 rows=1 loops=1)
  Index Cond: (((status)::text = 'CONFIRMED'::text) AND (date_time < (now() - '00:45:00'::interval)))
Planning Time: 1.020 ms
Execution Time: 0.122 ms
- Average execution time WITH indexes
0.093ms
Because the execution time improves dramatically with the index (from roughly 61 ms to 0.1 ms on average), we keep the index.
Scenario 2
Populating the listings table with 2 million rows
INSERT INTO listings (animal_id, owner_id, status, price, created_at)
SELECT
a.animal_id,
a.owner_id,
CASE
WHEN random() < 0.7 THEN 'DRAFT'
WHEN random() < 0.9 THEN 'ARCHIVED'
ELSE 'SOLD'
END,
(random() * 1000 + 50)::numeric(10,2),
NOW() - (random() * INTERVAL '120 days')
FROM generate_series(1, 2000000) gs
JOIN LATERAL (
SELECT animal_id, owner_id
FROM animals
ORDER BY random()
LIMIT 1
) a ON true;
Executing the query 10 times to see average execution time
EXPLAIN ANALYZE SELECT * FROM listings WHERE status = 'DRAFT' AND created_at < now() - interval '30 days';
After the first execution
Seq Scan on listings (cost=0.00..60619.18 rows=1058297 width=576) (actual time=0.014..453.562 rows=1051498 loops=1)
  Filter: (((status)::text = 'DRAFT'::text) AND (created_at < (now() - '30 days'::interval)))
  Rows Removed by Filter: 948511
Planning Time: 0.544 ms
Execution Time: 487.292 ms
- Average execution time WITHOUT indexes
494.8984ms
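- The index tried in this scenario is also not shown on this page; given the predicate on status and created_at, it was presumably something like the following (the name idx_listings_status_created_at is hypothetical):
CREATE INDEX idx_listings_status_created_at ON listings (status, created_at);
As the plan below shows, the planner ignores it and keeps the sequential scan: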
Seq Scan on listings (cost=0.00..60619.18 rows=1058332 width=576) (actual time=0.044..473.306 rows=1051534 loops=1)
  Filter: (((status)::text = 'DRAFT'::text) AND (created_at < (now() - '30 days'::interval)))
  Rows Removed by Filter: 948475
Planning Time: 0.092 ms
Execution Time: 507.535 ms
- Average execution time WITH indexes
500.327ms
There is no change in the execution time; the query does not even use the index, because roughly half the table matches the predicate and a sequential scan is cheaper. We will not keep this index (it is dropped again below).
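- Assuming the hypothetical definition sketched above, the unused index is removed with:
DROP INDEX idx_listings_status_created_at;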
Scenario 3
Executing with EXPLAIN ANALYZE
EXPLAIN ANALYZE
WITH params AS (
SELECT
TIMESTAMP '2025-01-01 00:00:00' AS start_ts,
TIMESTAMP '2026-01-01 00:00:00' AS end_ts
),
listings_by_user AS (
SELECT l.owner_id AS user_id, COUNT(*) AS listings_created
FROM listings l
JOIN params p ON l.created_at >= p.start_ts AND l.created_at < p.end_ts
GROUP BY l.owner_id
),
reviews_by_user AS (
SELECT r.reviewer_id AS user_id,
COUNT(*) AS reviews_left,
AVG(r.rating)::numeric(10,2) AS avg_rating_left
FROM reviews r
JOIN params p ON r.created_at >= p.start_ts AND r.created_at < p.end_ts
GROUP BY r.reviewer_id
),
appointments_by_user AS (
SELECT a.responsible_owner_id AS user_id,
COUNT(*) AS appointments_total,
COUNT(*) FILTER (WHERE a.status = 'DONE') AS appointments_done,
COUNT(*) FILTER (WHERE a.status = 'NO_SHOW') AS appointments_no_show,
COUNT(*) FILTER (WHERE a.status = 'CANCELLED') AS appointments_cancelled
FROM appointments a
JOIN params p ON a.date_time >= p.start_ts AND a.date_time < p.end_ts
GROUP BY a.responsible_owner_id
),
favorites_by_user AS (
SELECT f.client_id AS user_id, COUNT(*) AS favorites_saved_all_time
FROM favorite_listings f
GROUP BY f.client_id
)
SELECT
u.user_id, u.username, u.email, u.name, u.surname,
COALESCE(l.listings_created, 0) AS listings_created,
COALESCE(rv.reviews_left, 0) AS reviews_left,
COALESCE(rv.avg_rating_left, 0) AS avg_rating_left,
COALESCE(ap.appointments_total, 0) AS appointments_total,
COALESCE(ap.appointments_done, 0) AS appointments_done,
COALESCE(ap.appointments_no_show, 0) AS appointments_no_show,
COALESCE(ap.appointments_cancelled, 0) AS appointments_cancelled,
COALESCE(fv.favorites_saved_all_time, 0) AS favorites_saved_all_time,
(
COALESCE(l.listings_created, 0) * 5
+ COALESCE(rv.reviews_left, 0) * 3
+ COALESCE(ap.appointments_done, 0) * 2
+ COALESCE(fv.favorites_saved_all_time, 0)
- COALESCE(ap.appointments_no_show, 0) * 2
) AS activity_score,
DENSE_RANK() OVER (
ORDER BY
(
COALESCE(l.listings_created, 0) * 5
+ COALESCE(rv.reviews_left, 0) * 3
+ COALESCE(ap.appointments_done, 0) * 2
+ COALESCE(fv.favorites_saved_all_time, 0)
- COALESCE(ap.appointments_no_show, 0) * 2
) DESC,
COALESCE(l.listings_created, 0) DESC,
COALESCE(rv.reviews_left, 0) DESC
) AS activity_rank
FROM users u
LEFT JOIN listings_by_user l ON l.user_id = u.user_id
LEFT JOIN reviews_by_user rv ON rv.user_id = u.user_id
LEFT JOIN appointments_by_user ap ON ap.user_id = u.user_id
LEFT JOIN favorites_by_user fv ON fv.user_id = u.user_id
WHERE COALESCE(l.listings_created, 0)
+ COALESCE(rv.reviews_left, 0)
+ COALESCE(ap.appointments_total, 0)
+ COALESCE(fv.favorites_saved_all_time, 0) > 0
ORDER BY activity_rank
LIMIT 10;
Without indexes
Limit (cost=86391.14..86391.14 rows=2 width=2176) (actual time=538.362..538.369 rows=3 loops=1)
CTE params
-> Result (cost=0.00..0.01 rows=1 width=16) (actual time=0.001..0.001 rows=1 loops=1)
-> Sort (cost=86391.13..86391.13 rows=2 width=2176) (actual time=538.361..538.366 rows=3 loops=1)
Sort Key: (dense_rank() OVER (?))
Sort Method: quicksort Memory: 25kB
-> WindowAgg (cost=86391.03..86391.12 rows=2 width=2176) (actual time=538.326..538.334 rows=3 loops=1)
-> Sort (cost=86391.03..86391.04 rows=2 width=2152) (actual time=538.313..538.318 rows=3 loops=1)
" Sort Key: ((((((COALESCE(l.listings_created, '0'::bigint) * 5) + (COALESCE((count(*)), '0'::bigint) * 3)) + (COALESCE((count(*) FILTER (WHERE ((a.status)::text = 'DONE'::text))), '0'::bigint) * 2)) + COALESCE((count(*)), '0'::bigint)) - (COALESCE((count(*) FILTER (WHERE ((a.status)::text = 'NO_SHOW'::text))), '0'::bigint) * 2))) DESC, (COALESCE(l.listings_created, '0'::bigint)) DESC, (COALESCE((count(*)), '0'::bigint)) DESC"
Sort Method: quicksort Memory: 25kB
-> Merge Left Join (cost=86390.69..86391.02 rows=2 width=2152) (actual time=538.278..538.294 rows=3 loops=1)
Merge Cond: (u.user_id = a.responsible_owner_id)
" Filter: ((((COALESCE(l.listings_created, '0'::bigint) + COALESCE((count(*)), '0'::bigint)) + COALESCE((count(*)), '0'::bigint)) + COALESCE((count(*)), '0'::bigint)) > 0)"
Rows Removed by Filter: 7
-> Merge Left Join (cost=82934.17..82934.38 rows=5 width=2112) (actual time=538.223..538.236 rows=10 loops=1)
Merge Cond: (u.user_id = f.client_id)
-> Merge Left Join (cost=82933.14..82933.26 rows=5 width=2104) (actual time=538.151..538.161 rows=10 loops=1)
Merge Cond: (u.user_id = r.reviewer_id)
-> Merge Left Join (cost=82931.79..82931.84 rows=5 width=2080) (actual time=538.097..538.104 rows=10 loops=1)
Merge Cond: (u.user_id = l.user_id)
-> Sort (cost=1.11..1.12 rows=5 width=2072) (actual time=0.025..0.027 rows=10 loops=1)
Sort Key: u.user_id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on users u (cost=0.00..1.05 rows=5 width=2072) (actual time=0.012..0.014 rows=10 loops=1)
-> Sort (cost=82930.68..82930.69 rows=2 width=16) (actual time=538.069..538.070 rows=1 loops=1)
Sort Key: l.user_id
Sort Method: quicksort Memory: 25kB
-> Subquery Scan on l (cost=82930.63..82930.67 rows=2 width=16) (actual time=538.058..538.060 rows=1 loops=1)
-> HashAggregate (cost=82930.63..82930.65 rows=2 width=16) (actual time=538.057..538.058 rows=1 loops=1)
Group Key: l_1.owner_id
Batches: 1 Memory Usage: 24kB
-> Nested Loop (cost=0.00..81819.60 rows=222207 width=8) (actual time=0.094..529.559 rows=41024 loops=1)
Join Filter: ((l_1.created_at >= p.start_ts) AND (l_1.created_at < p.end_ts))
Rows Removed by Join Filter: 1958985
-> CTE Scan on params p (cost=0.00..0.02 rows=1 width=16) (actual time=0.001..0.003 rows=1 loops=1)
-> Seq Scan on listings l_1 (cost=0.00..51821.63 rows=1999863 width=16) (actual time=0.075..395.717 rows=2000009 loops=1)
-> GroupAggregate (cost=1.35..1.38 rows=1 width=32) (actual time=0.052..0.052 rows=0 loops=1)
Group Key: r.reviewer_id
-> Sort (cost=1.35..1.36 rows=1 width=12) (actual time=0.050..0.051 rows=0 loops=1)
Sort Key: r.reviewer_id
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=0.00..1.34 rows=1 width=12) (actual time=0.016..0.016 rows=0 loops=1)
Join Filter: ((r.created_at >= p_1.start_ts) AND (r.created_at < p_1.end_ts))
Rows Removed by Join Filter: 14
-> CTE Scan on params p_1 (cost=0.00..0.02 rows=1 width=16) (actual time=0.001..0.002 rows=1 loops=1)
-> Seq Scan on reviews r (cost=0.00..1.13 rows=13 width=20) (actual time=0.011..0.011 rows=14 loops=1)
-> GroupAggregate (cost=1.03..1.06 rows=2 width=16) (actual time=0.068..0.070 rows=2 loops=1)
Group Key: f.client_id
-> Sort (cost=1.03..1.03 rows=2 width=8) (actual time=0.064..0.065 rows=2 loops=1)
Sort Key: f.client_id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on favorite_listings f (cost=0.00..1.02 rows=2 width=8) (actual time=0.010..0.010 rows=2 loops=1)
-> Materialize (cost=3456.52..3456.57 rows=1 width=40) (actual time=0.049..0.050 rows=0 loops=1)
-> GroupAggregate (cost=3456.52..3456.55 rows=1 width=40) (actual time=0.044..0.045 rows=0 loops=1)
Group Key: a.responsible_owner_id
-> Sort (cost=3456.52..3456.52 rows=1 width=14) (actual time=0.043..0.044 rows=0 loops=1)
Sort Key: a.responsible_owner_id
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=3452.46..3456.51 rows=1 width=14) (actual time=0.034..0.035 rows=0 loops=1)
-> CTE Scan on params p_2 (cost=0.00..0.02 rows=1 width=16) (actual time=0.000..0.001 rows=1 loops=1)
-> Bitmap Heap Scan on appointments a (cost=3452.46..3456.48 rows=1 width=22) (actual time=0.030..0.030 rows=0 loops=1)
Recheck Cond: ((date_time >= p_2.start_ts) AND (date_time < p_2.end_ts))
-> Bitmap Index Scan on idx_appointments_clinic_date_time (cost=0.00..3452.46 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=1)
Index Cond: ((date_time >= p_2.start_ts) AND (date_time < p_2.end_ts))
Planning Time: 0.777 ms
Execution Time: 538.700 ms
- This is an acceptable time, so we don't need to optimize it.
Scenario 4
EXPLAIN ANALYZE
WITH
my_likes AS (
SELECT fl.listing_id
FROM favorite_listings fl
WHERE fl.client_id = 1
),
my_recent_likes AS (
SELECT fl.listing_id
FROM favorite_listings fl
JOIN listings l ON l.listing_id = fl.listing_id
WHERE fl.client_id = 1
ORDER BY l.created_at DESC
LIMIT 10
),
similar_users AS (
SELECT
fl2.client_id AS other_user_id,
COUNT(*) AS overlap_likes
FROM favorite_listings fl2
JOIN my_likes ml ON ml.listing_id = fl2.listing_id
WHERE fl2.client_id <> 1
GROUP BY fl2.client_id
HAVING COUNT(*) > 0
),
cf_candidates AS (
SELECT
fl.listing_id,
SUM(su.overlap_likes) AS cf_score,
COUNT(DISTINCT su.other_user_id) AS liked_by_similar_users
FROM similar_users su
JOIN favorite_listings fl ON fl.client_id = su.other_user_id
LEFT JOIN my_likes ml ON ml.listing_id = fl.listing_id
WHERE ml.listing_id IS NULL
GROUP BY fl.listing_id
),
content_candidates AS (
SELECT
l2.listing_id,
COUNT(*) AS content_score
FROM my_recent_likes r
JOIN listings l1 ON l1.listing_id = r.listing_id
JOIN animals a1 ON a1.animal_id = l1.animal_id
JOIN listings l2 ON l2.listing_id <> l1.listing_id
JOIN animals a2 ON a2.animal_id = l2.animal_id
LEFT JOIN my_likes ml ON ml.listing_id = l2.listing_id
WHERE ml.listing_id IS NULL
AND (
a2.species = a1.species
OR a2.breed = a1.breed
OR a2.located_name = a1.located_name
)
GROUP BY l2.listing_id
),
merged AS (
SELECT
COALESCE(cf.listing_id, cc.listing_id) AS listing_id,
COALESCE(cf.cf_score, 0) AS cf_score,
COALESCE(cf.liked_by_similar_users, 0) AS liked_by_similar_users,
COALESCE(cc.content_score, 0) AS content_score
FROM cf_candidates cf
FULL OUTER JOIN content_candidates cc
ON cc.listing_id = cf.listing_id
)
SELECT
l.listing_id,
a.name AS title,
a.species,
a.breed,
a.located_name AS location,
l.created_at,
m.cf_score,
m.liked_by_similar_users,
m.content_score,
(m.cf_score * 3 + m.content_score * 2) AS final_score
FROM merged m
JOIN listings l ON l.listing_id = m.listing_id
JOIN animals a ON a.animal_id = l.animal_id
WHERE l.status = 'ACTIVE'
AND l.owner_id <> 1
ORDER BY final_score DESC, l.created_at DESC
LIMIT 20;
Without indexes
Limit (cost=264852.48..264852.48 rows=1 width=2160) (actual time=7249.269..7249.523 rows=3 loops=1)
CTE my_likes
-> Seq Scan on favorite_listings fl_2 (cost=0.00..1.02 rows=1 width=8) (actual time=0.021..0.022 rows=1 loops=1)
Filter: (client_id = 1)
Rows Removed by Filter: 1
-> Sort (cost=264851.45..264851.46 rows=1 width=2160) (actual time=7008.341..7008.593 rows=3 loops=1)
" Sort Key: (((COALESCE(cf.cf_score, '0'::numeric) * '3'::numeric) + ((COALESCE((count(*)), '0'::bigint) * 2))::numeric)) DESC, l.created_at DESC"
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=218242.91..264851.44 rows=1 width=2160) (actual time=4594.816..7007.250 rows=3 loops=1)
Join Filter: (l.animal_id = a.animal_id)
Rows Removed by Join Filter: 9
-> Hash Join (cost=218242.91..264850.37 rows=1 width=72) (actual time=4594.763..7007.059 rows=3 loops=1)
" Hash Cond: (COALESCE(cf.listing_id, l2.listing_id) = l.listing_id)"
-> Hash Full Join (cost=172920.66..215833.91 rows=1407309 width=64) (actual time=2751.983..5316.014 rows=2000008 loops=1)
Hash Cond: (l2.listing_id = cf.listing_id)
-> HashAggregate (cost=172918.43..197986.12 rows=1407309 width=16) (actual time=2751.308..4818.192 rows=2000008 loops=1)
Group Key: l2.listing_id
Planned Partitions: 32 Batches: 33 Memory Usage: 8209kB Disk Usage: 63168kB
-> Hash Anti Join (cost=3.38..93757.30 rows=1407309 width=8) (actual time=1.187..1532.095 rows=2000008 loops=1)
Hash Cond: (l2.listing_id = ml.listing_id)
-> Hash Join (cost=3.35..75989.99 rows=1407310 width=8) (actual time=1.125..1117.188 rows=2000008 loops=1)
Hash Cond: (l2.animal_id = a2.animal_id)
Join Filter: (l2.listing_id <> l1.listing_id)
Rows Removed by Join Filter: 1
-> Seq Scan on listings l2 (cost=0.00..51821.63 rows=1999863 width=16) (actual time=0.146..543.935 rows=2000009 loops=1)
-> Hash (cost=3.32..3.32 rows=2 width=16) (actual time=0.831..0.857 rows=8 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Merge Join (cost=2.50..3.32 rows=2 width=16) (actual time=0.779..0.832 rows=8 loops=1)
Merge Cond: (l1.listing_id = r.listing_id)
-> Nested Loop (cost=0.57..364472.04 rows=4221933 width=16) (actual time=0.365..0.400 rows=9 loops=1)
Join Filter: (((a2.species)::text = (a1.species)::text) OR ((a2.breed)::text = (a1.breed)::text) OR ((a2.located_name)::text = (a1.located_name)::text))
Rows Removed by Join Filter: 1
-> Nested Loop (cost=0.57..244479.22 rows=1999863 width=1556) (actual time=0.330..0.346 rows=2 loops=1)
-> Index Scan using listings_pk on listings l1 (cost=0.43..199481.87 rows=1999863 width=16) (actual time=0.064..0.065 rows=2 loops=1)
-> Memoize (cost=0.14..0.16 rows=1 width=1556) (actual time=0.133..0.133 rows=1 loops=2)
Cache Key: l1.animal_id
Cache Mode: logical
Hits: 0 Misses: 2 Evictions: 0 Overflows: 0 Memory Usage: 1kB
-> Index Scan using animals_pk on animals a1 (cost=0.13..0.15 rows=1 width=1556) (actual time=0.124..0.125 rows=1 loops=2)
Index Cond: (animal_id = l1.animal_id)
-> Materialize (cost=0.00..1.04 rows=3 width=1556) (actual time=0.012..0.018 rows=5 loops=2)
-> Seq Scan on animals a2 (cost=0.00..1.03 rows=3 width=1556) (actual time=0.014..0.018 rows=9 loops=1)
-> Sort (cost=1.92..1.93 rows=1 width=8) (actual time=0.401..0.411 rows=1 loops=1)
Sort Key: r.listing_id
Sort Method: quicksort Memory: 25kB
-> Subquery Scan on r (cost=1.90..1.91 rows=1 width=8) (actual time=0.271..0.281 rows=1 loops=1)
-> Limit (cost=1.90..1.90 rows=1 width=16) (actual time=0.266..0.274 rows=1 loops=1)
-> Sort (cost=1.90..1.90 rows=1 width=16) (actual time=0.252..0.259 rows=1 loops=1)
Sort Key: l_1.created_at DESC
Sort Method: quicksort Memory: 25kB
-> Merge Join (cost=1.47..1.89 rows=1 width=16) (actual time=0.218..0.226 rows=1 loops=1)
Merge Cond: (l_1.listing_id = fl.listing_id)
-> Index Scan using listings_pk on listings l_1 (cost=0.43..199481.87 rows=1999863 width=16) (actual time=0.035..0.037 rows=2 loops=1)
-> Sort (cost=1.03..1.04 rows=1 width=8) (actual time=0.164..0.167 rows=1 loops=1)
Sort Key: fl.listing_id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on favorite_listings fl (cost=0.00..1.02 rows=1 width=8) (actual time=0.027..0.029 rows=1 loops=1)
Filter: (client_id = 1)
Rows Removed by Filter: 1
-> Hash (cost=0.02..0.02 rows=1 width=8) (actual time=0.014..0.015 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> CTE Scan on my_likes ml (cost=0.00..0.02 rows=1 width=8) (actual time=0.000..0.001 rows=1 loops=1)
-> Hash (cost=2.21..2.21 rows=1 width=48) (actual time=0.611..0.621 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
-> Subquery Scan on cf (cost=2.12..2.21 rows=1 width=48) (actual time=0.610..0.619 rows=0 loops=1)
-> GroupAggregate (cost=2.12..2.20 rows=1 width=48) (actual time=0.609..0.617 rows=0 loops=1)
Group Key: fl_1.listing_id
-> Nested Loop (cost=2.12..2.18 rows=1 width=24) (actual time=0.607..0.615 rows=0 loops=1)
Join Filter: (fl_1.client_id = fl2.client_id)
-> Merge Anti Join (cost=1.06..1.09 rows=1 width=16) (actual time=0.452..0.457 rows=1 loops=1)
Merge Cond: (fl_1.listing_id = ml_1.listing_id)
-> Sort (cost=1.03..1.03 rows=2 width=16) (actual time=0.346..0.348 rows=2 loops=1)
Sort Key: fl_1.listing_id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on favorite_listings fl_1 (cost=0.00..1.02 rows=2 width=16) (actual time=0.065..0.068 rows=2 loops=1)
-> Sort (cost=0.03..0.04 rows=1 width=8) (actual time=0.066..0.067 rows=1 loops=1)
Sort Key: ml_1.listing_id
Sort Method: quicksort Memory: 25kB
-> CTE Scan on my_likes ml_1 (cost=0.00..0.02 rows=1 width=8) (actual time=0.026..0.028 rows=1 loops=1)
-> HashAggregate (cost=1.06..1.07 rows=1 width=16) (actual time=0.150..0.152 rows=0 loops=1)
Group Key: fl2.client_id
Filter: (count(*) > 0)
Batches: 1 Memory Usage: 24kB
-> Nested Loop (cost=0.00..1.06 rows=1 width=8) (actual time=0.079..0.081 rows=0 loops=1)
Join Filter: (fl2.listing_id = ml_2.listing_id)
Rows Removed by Join Filter: 1
-> Seq Scan on favorite_listings fl2 (cost=0.00..1.02 rows=1 width=16) (actual time=0.012..0.017 rows=1 loops=1)
Filter: (client_id <> 1)
Rows Removed by Filter: 1
-> CTE Scan on my_likes ml_2 (cost=0.00..0.02 rows=1 width=8) (actual time=0.001..0.002 rows=1 loops=1)
-> Hash (cost=45322.24..45322.24 rows=1 width=24) (actual time=1436.407..1436.608 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Gather (cost=1000.00..45322.24 rows=1 width=24) (actual time=1420.646..1436.431 rows=4 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on listings l (cost=0.00..44322.14 rows=1 width=24) (actual time=1316.362..1367.278 rows=1 loops=3)
Filter: ((owner_id <> 1) AND ((status)::text = 'ACTIVE'::text))
Rows Removed by Filter: 666668
-> Seq Scan on animals a (cost=0.00..1.03 rows=3 width=2072) (actual time=0.019..0.019 rows=4 loops=3)
Planning Time: 11.722 ms
JIT:
Functions: 124
" Options: Inlining false, Optimization false, Expressions true, Deforming true"
" Timing: Generation 22.000 ms, Inlining 0.000 ms, Optimization 32.272 ms, Emission 243.487 ms, Total 297.758 ms"
Execution Time: 7756.372 ms
This is not an acceptable execution time, so we will optimize it with indexes.
- We add these indexes
CREATE INDEX idx_favorite_listings_client_listing ON favorite_listings (client_id, listing_id);
CREATE INDEX idx_favorite_listings_listing_client ON favorite_listings (listing_id, client_id);
CREATE INDEX idx_listings_status_owner_created ON listings (status, owner_id, created_at DESC);
CREATE INDEX idx_listings_animal ON listings (animal_id);
CREATE INDEX idx_animals_species ON animals (species);
CREATE INDEX idx_animals_breed ON animals (breed);
CREATE INDEX idx_animals_located_name ON animals (located_name);
With indexes
Limit (cost=24.22..24.23 rows=1 width=121) (actual time=0.419..0.427 rows=3 loops=1)
CTE my_likes
-> Seq Scan on favorite_listings fl_2 (cost=0.00..1.02 rows=1 width=8) (actual time=0.004..0.005 rows=1 loops=1)
Filter: (client_id = 1)
Rows Removed by Filter: 1
-> Sort (cost=23.20..23.20 rows=1 width=121) (actual time=0.418..0.425 rows=3 loops=1)
" Sort Key: (((COALESCE((sum((count(*)))), '0'::numeric) * '3'::numeric) + ((COALESCE((count(*)), '0'::bigint) * 2))::numeric)) DESC, l.created_at DESC"
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=7.22..23.19 rows=1 width=121) (actual time=0.381..0.405 rows=3 loops=1)
-> Nested Loop (cost=7.08..23.02 rows=1 width=72) (actual time=0.357..0.374 rows=3 loops=1)
-> Merge Full Join (cost=6.65..14.58 rows=1 width=64) (actual time=0.338..0.348 rows=3 loops=1)
Merge Cond: (fl.listing_id = l2.listing_id)
-> GroupAggregate (cost=2.55..10.42 rows=1 width=48) (actual time=0.071..0.074 rows=0 loops=1)
Group Key: fl.listing_id
-> Nested Loop (cost=2.55..10.40 rows=1 width=24) (actual time=0.070..0.072 rows=0 loops=1)
Join Filter: (fl.listing_id = l_1.listing_id)
-> Nested Loop (cost=2.12..2.18 rows=1 width=24) (actual time=0.070..0.072 rows=0 loops=1)
Join Filter: (fl.client_id = fl2.client_id)
-> Merge Anti Join (cost=1.06..1.09 rows=1 width=16) (actual time=0.061..0.062 rows=1 loops=1)
Merge Cond: (fl.listing_id = ml.listing_id)
-> Sort (cost=1.03..1.03 rows=2 width=16) (actual time=0.034..0.034 rows=2 loops=1)
Sort Key: fl.listing_id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on favorite_listings fl (cost=0.00..1.02 rows=2 width=16) (actual time=0.014..0.015 rows=2 loops=1)
-> Sort (cost=0.03..0.04 rows=1 width=8) (actual time=0.023..0.024 rows=1 loops=1)
Sort Key: ml.listing_id
Sort Method: quicksort Memory: 25kB
-> CTE Scan on my_likes ml (cost=0.00..0.02 rows=1 width=8) (actual time=0.009..0.010 rows=1 loops=1)
-> HashAggregate (cost=1.06..1.07 rows=1 width=16) (actual time=0.007..0.008 rows=0 loops=1)
Group Key: fl2.client_id
Batches: 1 Memory Usage: 24kB
-> Nested Loop (cost=0.00..1.06 rows=1 width=8) (actual time=0.006..0.006 rows=0 loops=1)
Join Filter: (fl2.listing_id = ml_1.listing_id)
Rows Removed by Join Filter: 1
-> Seq Scan on favorite_listings fl2 (cost=0.00..1.02 rows=1 width=16) (actual time=0.003..0.004 rows=1 loops=1)
Filter: (client_id <> 1)
Rows Removed by Filter: 1
-> CTE Scan on my_likes ml_1 (cost=0.00..0.02 rows=1 width=8) (actual time=0.000..0.001 rows=1 loops=1)
-> Index Scan using idx_listings_status_owner_created on listings l_1 (cost=0.43..8.20 rows=1 width=8) (never executed)
Index Cond: ((status)::text = 'ACTIVE'::text)
Filter: (owner_id <> 1)
-> GroupAggregate (cost=4.10..4.12 rows=1 width=16) (actual time=0.266..0.272 rows=3 loops=1)
Group Key: l2.listing_id
-> Sort (cost=4.10..4.11 rows=1 width=8) (actual time=0.262..0.265 rows=3 loops=1)
Sort Key: l2.listing_id
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=3.23..4.09 rows=1 width=8) (actual time=0.214..0.246 rows=3 loops=1)
Join Filter: (((a2.species)::text = (a1.species)::text) OR ((a2.breed)::text = (a1.breed)::text) OR ((a2.located_name)::text = (a1.located_name)::text))
-> Merge Join (cost=3.09..3.92 rows=1 width=36) (actual time=0.188..0.214 rows=3 loops=1)
Merge Cond: (l1.listing_id = r.listing_id)
-> Nested Loop (cost=0.85..229633.64 rows=2000898 width=44) (actual time=0.118..0.140 rows=4 loops=1)
Join Filter: (l2.listing_id <> l1.listing_id)
-> Index Scan using listings_pk on listings l1 (cost=0.43..199610.69 rows=2000900 width=16) (actual time=0.018..0.019 rows=2 loops=1)
-> Materialize (cost=0.43..9.44 rows=1 width=28) (actual time=0.049..0.058 rows=2 loops=2)
-> Nested Loop Anti Join (cost=0.43..9.44 rows=1 width=28) (actual time=0.090..0.107 rows=3 loops=1)
Join Filter: (ml_2.listing_id = l2.listing_id)
Rows Removed by Join Filter: 3
-> Nested Loop (cost=0.43..9.41 rows=1 width=28) (actual time=0.088..0.102 rows=4 loops=1)
Join Filter: (a2.animal_id = l2.animal_id)
Rows Removed by Join Filter: 9
-> Index Scan using idx_listings_status_owner_created on listings l2 (cost=0.43..8.20 rows=1 width=16) (actual time=0.082..0.085 rows=4 loops=1)
Index Cond: ((status)::text = 'ACTIVE'::text)
Filter: (owner_id <> 1)
Rows Removed by Filter: 1
-> Seq Scan on animals a2 (cost=0.00..1.09 rows=9 width=28) (actual time=0.002..0.002 rows=3 loops=4)
-> CTE Scan on my_likes ml_2 (cost=0.00..0.02 rows=1 width=8) (actual time=0.000..0.000 rows=1 loops=4)
-> Sort (cost=2.23..2.24 rows=1 width=8) (actual time=0.068..0.070 rows=1 loops=1)
Sort Key: r.listing_id
Sort Method: quicksort Memory: 25kB
-> Subquery Scan on r (cost=2.21..2.22 rows=1 width=8) (actual time=0.053..0.054 rows=1 loops=1)
-> Limit (cost=2.21..2.21 rows=1 width=16) (actual time=0.052..0.053 rows=1 loops=1)
-> Sort (cost=2.21..2.21 rows=1 width=16) (actual time=0.051..0.052 rows=1 loops=1)
Sort Key: l_2.created_at DESC
Sort Method: quicksort Memory: 25kB
-> Merge Join (cost=1.46..2.20 rows=1 width=16) (actual time=0.035..0.038 rows=1 loops=1)
Merge Cond: (l_2.listing_id = fl_1.listing_id)
-> Index Scan using listings_pk on listings l_2 (cost=0.43..199610.69 rows=2000900 width=16) (actual time=0.019..0.020 rows=2 loops=1)
-> Sort (cost=1.03..1.04 rows=1 width=8) (actual time=0.014..0.015 rows=1 loops=1)
Sort Key: fl_1.listing_id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on favorite_listings fl_1 (cost=0.00..1.02 rows=1 width=8) (actual time=0.004..0.004 rows=1 loops=1)
Filter: (client_id = 1)
Rows Removed by Filter: 1
-> Index Scan using animals_pk on animals a1 (cost=0.14..0.15 rows=1 width=28) (actual time=0.008..0.008 rows=1 loops=3)
Index Cond: (animal_id = l1.animal_id)
-> Index Scan using listings_pk on listings l (cost=0.43..8.45 rows=1 width=24) (actual time=0.007..0.007 rows=1 loops=3)
" Index Cond: (listing_id = COALESCE(fl.listing_id, l2.listing_id))"
-> Index Scan using animals_pk on animals a (cost=0.14..0.15 rows=1 width=33) (actual time=0.005..0.005 rows=1 loops=3)
Index Cond: (animal_id = l.animal_id)
Planning Time: 7.688 ms
Execution Time: 0.904 ms
- The indexes worked; the execution time is now acceptable (down from about 7.8 s to under 1 ms).
Security measures
SQL injection prevention
@Query(value = """
SELECT *
FROM get_top_active_users(:startTs, :endTs)
""", nativeQuery = true)
List<UserActivityRankingProjection> getTopActiveUsers(
@Param("startTs") LocalDateTime startTs,
@Param("endTs") LocalDateTime endTs
);
- Native queries are executed using named parameters such as :startTs and :endTs, which are bound by Spring Data JPA. This prevents SQL injection because input values are treated as parameters, not as executable SQL.
Storing passwords
- Password security is implemented using BCryptPasswordEncoder, which hashes user passwords before storing them in the database. This ensures that passwords are not stored in plain text and provides protection against brute-force and dictionary attacks.
@Bean
public PasswordEncoder passwordEncoder() {
return new BCryptPasswordEncoder();
}
CORS
- The application uses a CORS configuration to control which frontend clients are allowed to communicate with the backend API. In development, requests are allowed only from local frontend origins such as http://localhost:* (this will be changed to the correct URL of the hosted site). The configuration defines the permitted HTTP methods, including GET, POST, PUT, PATCH, DELETE, and OPTIONS.
@Bean
public CorsConfigurationSource corsConfigurationSource() {
    CorsConfiguration configuration = new CorsConfiguration();
    configuration.setAllowedOriginPatterns(List.of("http://localhost:*"));
    configuration.setAllowedMethods(Arrays.asList("GET", "POST", "PUT", "PATCH", "DELETE", "OPTIONS"));
    configuration.setAllowedHeaders(Arrays.asList("*"));
    configuration.setAllowCredentials(true);
    UrlBasedCorsConfigurationSource source = new UrlBasedCorsConfigurationSource();
    source.registerCorsConfiguration("/**", configuration);
    return source;
}
