wiki:OtherTopics

Version 5 (modified by 181201, 13 days ago) ( diff )

--

Other topics (Performance, Security, …)

Performance

1. Adding 1000000 mock rows to 'bookings' table

ALTER TABLE project.bookings DISABLE TRIGGER trg_booking_validation;

INSERT INTO project.bookings (status, date_from, date_to, address, owner_id, sitter_id)
SELECT
    (ARRAY['Pending', 'Confirmed', 'Completed', 'Canceled', 'Expired', 'Reviewed'])[floor(random()*6)+1] AS status,
    t.dt AS date_from,
    t.dt + (random() * INTERVAL '7 days') AS date_to,
    'Mock Address ' || gs AS address,
    o.user_id AS owner_id,
    s.user_id AS sitter_id
FROM generate_series(1, 1000000) gs
JOIN LATERAL (
    SELECT user_id FROM project.pet_owners ORDER BY random() LIMIT 1
) o ON true
JOIN LATERAL (
    SELECT user_id FROM project.pet_sitters ORDER BY random() LIMIT 1
) s ON true
JOIN LATERAL (
    SELECT CURRENT_DATE - (random() * INTERVAL '365 days') AS dt
) t ON true;

ALTER TABLE project.bookings ENABLE TRIGGER trg_booking_validation;

Benchmark query:

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM project.bookings 
WHERE sitter_id = (SELECT user_id FROM project.pet_sitters LIMIT 1)
ORDER BY date_from DESC;

Execution without indexes:

Gather Merge  (cost=81030.64..195549.38 rows=967764 width=146) (actual time=94.209..106.394 rows=2 loops=1)
  Workers Planned: 3
  Workers Launched: 3
  Buffers: shared hit=22336
  InitPlan 1
    ->  Limit  (cost=0.00..0.02 rows=1 width=90) (actual time=3.367..3.369 rows=1 loops=1)
          Buffers: shared hit=1
          ->  Seq Scan on pet_sitters  (cost=0.00..16.90 rows=690 width=90) (actual time=0.015..0.016 rows=1 loops=1)
                Buffers: shared hit=1
  ->  Sort  (cost=80030.57..80837.04 rows=322588 width=146) (actual time=60.212..60.213 rows=0 loops=4)
        Sort Key: bookings.date_from DESC
        Sort Method: quicksort  Memory: 25kB
        Buffers: shared hit=22335
        Worker 0:  Sort Method: quicksort  Memory: 25kB
        Worker 1:  Sort Method: quicksort  Memory: 25kB
        Worker 2:  Sort Method: quicksort  Memory: 25kB
        ->  Parallel Seq Scan on bookings  (cost=0.00..26256.35 rows=322588 width=146) (actual time=37.611..60.082 rows=0 loops=4)
              Filter: ((sitter_id)::text = ((InitPlan 1).col1)::text)
              Rows Removed by Filter: 250005
              Buffers: shared hit=22224
Planning Time: 0.168 ms
JIT:
  Functions: 12
  Options: Inlining false, Optimization false, Expressions true, Deforming true
  Timing: Generation 1.466 ms (Deform 0.675 ms), Inlining 0.000 ms, Optimization 1.430 ms, Emission 19.219 ms, Total 22.115 ms
Execution Time: 106.834 ms

Average execution time (10 attempts): 105.235ms

Next, we add this index:

CREATE INDEX idx_bookings_sitter_date 
ON project.bookings (sitter_id, date_from DESC);

(Note, for the exact same performance reasons, we also created this index to optimize the queries when Pet Owners view their own bookings)

CREATE INDEX idx_bookings_owner_date 
ON project.bookings (owner_id, date_from DESC);

Execution with indexes:

Index Scan using idx_bookings_sitter_date on bookings  (cost=0.45..72503.52 rows=1000022 width=146) (actual time=0.066..0.068 rows=2 loops=1)
  Index Cond: ((sitter_id)::text = ((InitPlan 1).col1)::text)
  Buffers: shared hit=2 read=3
  InitPlan 1
    ->  Limit  (cost=0.00..0.02 rows=1 width=90) (actual time=0.012..0.013 rows=1 loops=1)
          Buffers: shared hit=1
          ->  Seq Scan on pet_sitters  (cost=0.00..16.90 rows=690 width=90) (actual time=0.011..0.011 rows=1 loops=1)
                Buffers: shared hit=1
Planning:
  Buffers: shared hit=39 read=2
Planning Time: 0.618 ms
Execution Time: 0.092 ms

Average execution time (10 attempts): 0.087ms

Because the execution time has been massively lowered by bypassing the expensive sequential scan and memory sort, we keep this index.

*(Note: For the exact same performance reasons, we also created CREATE INDEX idx_bookings_owner_date ON project.bookings (owner_id, date_from DESC); to optimize the queries when Pet Owners view their own booking history).*

2. Calculate average sitter rating

Benchmark query:

EXPLAIN (ANALYZE, BUFFERS)
SELECT AVG(r.rating) 
FROM project.reviews r
JOIN project.bookings b ON r.booking_id = b.booking_id
WHERE b.sitter_id = (SELECT user_id FROM project.pet_sitters LIMIT 1);

Execution without indexes:

Finalize Aggregate  (cost=33210.02..33210.03 rows=1 width=32) (actual time=121.837..135.038 rows=1 loops=1)
  Buffers: shared hit=25818
  InitPlan 1
    ->  Limit  (cost=0.00..0.02 rows=1 width=90) (actual time=0.014..0.015 rows=1 loops=1)
          Buffers: shared hit=1
          ->  Seq Scan on pet_sitters  (cost=0.00..16.90 rows=690 width=90) (actual time=0.012..0.013 rows=1 loops=1)
                Buffers: shared hit=1
  ->  Gather  (cost=33209.67..33209.98 rows=3 width=32) (actual time=119.802..135.016 rows=4 loops=1)
        Workers Planned: 3
        Workers Launched: 3
        Buffers: shared hit=25818
        ->  Partial Aggregate  (cost=32209.67..32209.68 rows=1 width=32) (actual time=90.870..90.873 rows=1 loops=4)
              Buffers: shared hit=25817
              ->  Parallel Hash Join  (cost=4971.92..32075.06 rows=53846 width=4) (actual time=77.089..90.867 rows=0 loops=4)
                    Hash Cond: ((b.booking_id)::text = (r.booking_id)::text)
                    Buffers: shared hit=25817
                    ->  Parallel Seq Scan on bookings b  (cost=0.00..26256.35 rows=322588 width=37) (actual time=41.349..55.119 rows=0 loops=4)
                          Filter: ((sitter_id)::text = ((InitPlan 1).col1)::text)
                          Rows Removed by Filter: 250005
                          Buffers: shared hit=22224
                    ->  Parallel Hash  (cost=4102.52..4102.52 rows=69552 width=41) (actual time=34.636..34.637 rows=41731 loops=4)
                          Buckets: 262144  Batches: 1  Memory Usage: 15200kB
                          Buffers: shared hit=3407
                          ->  Parallel Seq Scan on reviews r  (cost=0.00..4102.52 rows=69552 width=41) (actual time=0.026..11.172 rows=41731 loops=4)
                                Buffers: shared hit=3407
Planning:
  Buffers: shared hit=16
Planning Time: 0.444 ms
Execution Time: 135.097 ms

Average execution time (10 attempts): 131.722ms

We add this index:

CREATE INDEX idx_reviews_booking_id 
ON project.reviews (booking_id);

Execution with indexes:

Finalize Aggregate  (cost=33210.02..33210.03 rows=1 width=32) (actual time=125.303..138.277 rows=1 loops=1)
  Buffers: shared hit=25818
  InitPlan 1
    ->  Limit  (cost=0.00..0.02 rows=1 width=90) (actual time=0.013..0.014 rows=1 loops=1)
          Buffers: shared hit=1
          ->  Seq Scan on pet_sitters  (cost=0.00..16.90 rows=690 width=90) (actual time=0.012..0.012 rows=1 loops=1)
                Buffers: shared hit=1
  ->  Gather  (cost=33209.67..33209.98 rows=3 width=32) (actual time=123.351..138.255 rows=4 loops=1)
        Workers Planned: 3
        Workers Launched: 3
        Buffers: shared hit=25818
        ->  Partial Aggregate  (cost=32209.67..32209.68 rows=1 width=32) (actual time=93.069..93.072 rows=1 loops=4)
              Buffers: shared hit=25817
              ->  Parallel Hash Join  (cost=4971.92..32075.06 rows=53846 width=4) (actual time=78.783..93.066 rows=0 loops=4)
                    Hash Cond: ((b.booking_id)::text = (r.booking_id)::text)
                    Buffers: shared hit=25817
                    ->  Parallel Seq Scan on bookings b  (cost=0.00..26256.35 rows=322588 width=37) (actual time=42.860..57.138 rows=0 loops=4)
                          Filter: ((sitter_id)::text = ((InitPlan 1).col1)::text)
                          Rows Removed by Filter: 250005
                          Buffers: shared hit=22224
                    ->  Parallel Hash  (cost=4102.52..4102.52 rows=69552 width=41) (actual time=34.827..34.828 rows=41731 loops=4)
                          Buckets: 262144  Batches: 1  Memory Usage: 15168kB
                          Buffers: shared hit=3407
                          ->  Parallel Seq Scan on reviews r  (cost=0.00..4102.52 rows=69552 width=41) (actual time=0.029..11.437 rows=41731 loops=4)
                                Buffers: shared hit=3407
Planning:
  Buffers: shared hit=16
Planning Time: 0.554 ms
Execution Time: 138.346 ms

Average execution time (10 attempts): 138.346ms

There is practically no change in the execution time, and the query planner completely ignored the index because it determined that performing a Parallel Seq Scan and Parallel Hash Join was cheaper than doing individual index lookups on such a large dataset. Because the index is not utilized for this query and only adds complexity during INSERT and UPDATE operations, we will not keep this index.

3. Pet Sitter Leaderboard, Analytics

Benchmark query:

EXPLAIN (ANALYZE, BUFFERS)
WITH params AS (
    SELECT 
        (CURRENT_DATE - INTERVAL '1 year') AS start_date,
        CURRENT_DATE AS end_date
),
sitter_bookings AS (
    SELECT 
        b.sitter_id,
        COUNT(*) AS total_bookings,
        COUNT(*) FILTER (WHERE b.status = 'Completed') AS completed_bookings,
        COUNT(*) FILTER (WHERE b.status = 'Canceled') AS canceled_bookings
    FROM project.bookings b
    JOIN params p ON b.date_from >= p.start_date AND b.date_from <= p.end_date
    GROUP BY b.sitter_id
),
sitter_reviews AS (
    SELECT 
        b.sitter_id,
        AVG(r.rating)::numeric(10,2) AS avg_rating,
        COUNT(r.review_id) AS total_reviews
    FROM project.reviews r
    JOIN project.bookings b ON r.booking_id = b.booking_id
    GROUP BY b.sitter_id
)
SELECT 
    u.username,
    u.first_name,
    u.last_name,
    COALESCE(sb.completed_bookings, 0) AS completed,
    COALESCE(sb.canceled_bookings, 0) AS canceled,
    COALESCE(sr.avg_rating, 0) AS rating,
    (
        COALESCE(sb.completed_bookings, 0) * 5 
        + COALESCE(sr.avg_rating, 0) * 10 
        - COALESCE(sb.canceled_bookings, 0) * 3
    ) AS activity_score,
    DENSE_RANK() OVER (
        ORDER BY (
            COALESCE(sb.completed_bookings, 0) * 5 
            + COALESCE(sr.avg_rating, 0) * 10 
            - COALESCE(sb.canceled_bookings, 0) * 3
        ) DESC
    ) AS leaderboard_rank
FROM project.pet_sitters ps
JOIN project.users u ON ps.user_id = u.user_id
LEFT JOIN sitter_bookings sb ON sb.sitter_id = ps.user_id
LEFT JOIN sitter_reviews sr ON sr.sitter_id = ps.user_id
ORDER BY leaderboard_rank
LIMIT 10;

Execution without indexes:

Limit  (cost=67216.49..67216.51 rows=10 width=115) (actual time=403.788..412.560 rows=4 loops=1)
  Buffers: shared hit=22278 read=25633
  ->  Sort  (cost=67216.49..67218.21 rows=690 width=115) (actual time=403.786..412.557 rows=4 loops=1)
        Sort Key: (dense_rank() OVER (?))
        Sort Method: quicksort  Memory: 25kB
        Buffers: shared hit=22278 read=25633
        ->  WindowAgg  (cost=67177.44..67201.58 rows=690 width=115) (actual time=403.716..412.514 rows=4 loops=1)
              Buffers: shared hit=22275 read=25633
              ->  Sort  (cost=67177.43..67179.15 rows=690 width=91) (actual time=403.675..412.445 rows=4 loops=1)
                    Sort Key: (((((COALESCE((count(*) FILTER (WHERE ((b.status)::text = 'Completed'::text))), '0'::bigint) * 5))::numeric + (COALESCE(((avg(r.rating))::numeric(10,2)), '0'::numeric) * '10'::numeric)) - ((COALESCE((count(*) FILTER (WHERE ((b.status)::text = 'Canceled'::text))), '0'::bigint) * 3))::numeric)) DESC
                    Sort Method: quicksort  Memory: 25kB
                    Buffers: shared hit=22275 read=25633
                    ->  Hash Join  (cost=67125.00..67144.89 rows=690 width=91) (actual time=403.585..412.387 rows=4 loops=1)
                          Hash Cond: ((ps.user_id)::text = (u.user_id)::text)
                          Buffers: shared hit=22272 read=25633
                          ->  Merge Left Join  (cost=67101.95..67107.94 rows=690 width=122) (actual time=403.503..412.297 rows=4 loops=1)
                                Merge Cond: ((ps.user_id)::text = (b_1.sitter_id)::text)
                                Buffers: shared hit=22272 read=25632
                                ->  Merge Left Join  (cost=34564.07..34567.93 rows=690 width=106) (actual time=212.894..213.031 rows=4 loops=1)
                                      Merge Cond: ((ps.user_id)::text = (b.sitter_id)::text)
                                      Buffers: shared hit=24 read=22225
                                      ->  Sort  (cost=49.44..51.16 rows=690 width=90) (actual time=0.049..0.051 rows=4 loops=1)
                                            Sort Key: ps.user_id
                                            Sort Method: quicksort  Memory: 25kB
                                            Buffers: shared read=1
                                            ->  Seq Scan on pet_sitters ps  (cost=0.00..16.90 rows=690 width=90) (actual time=0.021..0.022 rows=4 loops=1)
                                                  Buffers: shared read=1
                                      ->  Finalize GroupAggregate  (cost=34514.63..34515.02 rows=1 width=61) (actual time=212.840..212.970 rows=4 loops=1)
                                            Group Key: b.sitter_id
                                            Buffers: shared hit=24 read=22224
                                            ->  Gather Merge  (cost=34514.63..34514.99 rows=3 width=53) (actual time=212.822..212.945 rows=7 loops=1)
                                                  Workers Planned: 3
                                                  Workers Launched: 3
                                                  Buffers: shared hit=24 read=22224
                                                  ->  Sort  (cost=33514.59..33514.60 rows=1 width=53) (actual time=206.606..206.608 rows=2 loops=4)
                                                        Sort Key: b.sitter_id
                                                        Sort Method: quicksort  Memory: 25kB
                                                        Buffers: shared hit=24 read=22224
                                                        Worker 0:  Sort Method: quicksort  Memory: 25kB
                                                        Worker 1:  Sort Method: quicksort  Memory: 25kB
                                                        Worker 2:  Sort Method: quicksort  Memory: 25kB
                                                        ->  Partial HashAggregate  (cost=33514.57..33514.58 rows=1 width=53) (actual time=206.527..206.529 rows=2 loops=4)
                                                              Group Key: b.sitter_id
                                                              Batches: 1  Memory Usage: 24kB
                                                              Buffers: shared read=22224
                                                              Worker 0:  Batches: 1  Memory Usage: 24kB
                                                              Worker 1:  Batches: 1  Memory Usage: 24kB
                                                              Worker 2:  Batches: 1  Memory Usage: 24kB
                                                              ->  Parallel Seq Scan on bookings b  (cost=0.00..29482.22 rows=322588 width=45) (actual time=0.070..120.157 rows=250004 loops=4)
                                                                    Filter: ((date_from <= CURRENT_DATE) AND (date_from >= (CURRENT_DATE - '1 year'::interval)))
                                                                    Rows Removed by Filter: 1
                                                                    Buffers: shared read=22224
                                ->  Finalize GroupAggregate  (cost=32537.88..32538.27 rows=1 width=61) (actual time=190.603..199.257 rows=4 loops=1)
                                      Group Key: b_1.sitter_id
                                      Buffers: shared hit=22248 read=3407
                                      ->  Gather Merge  (cost=32537.88..32538.24 rows=3 width=69) (actual time=190.572..199.221 rows=7 loops=1)
                                            Workers Planned: 3
                                            Workers Launched: 3
                                            Buffers: shared hit=22248 read=3407
                                            ->  Sort  (cost=31537.84..31537.84 rows=1 width=69) (actual time=184.406..184.410 rows=2 loops=4)
                                                  Sort Key: b_1.sitter_id
                                                  Sort Method: quicksort  Memory: 25kB
                                                  Buffers: shared hit=22248 read=3407
                                                  Worker 0:  Sort Method: quicksort  Memory: 25kB
                                                  Worker 1:  Sort Method: quicksort  Memory: 25kB
                                                  Worker 2:  Sort Method: quicksort  Memory: 25kB
                                                  ->  Partial HashAggregate  (cost=31537.82..31537.83 rows=1 width=69) (actual time=184.336..184.340 rows=2 loops=4)
                                                        Group Key: b_1.sitter_id
                                                        Batches: 1  Memory Usage: 24kB
                                                        Buffers: shared hit=22224 read=3407
                                                        Worker 0:  Batches: 1  Memory Usage: 24kB
                                                        Worker 1:  Batches: 1  Memory Usage: 24kB
                                                        Worker 2:  Batches: 1  Memory Usage: 24kB
                                                        ->  Parallel Hash Join  (cost=4971.92..31268.59 rows=53846 width=41) (actual time=43.109..172.841 rows=41731 loops=4)
                                                              Hash Cond: ((b_1.booking_id)::text = (r.booking_id)::text)
                                                              Buffers: shared hit=22224 read=3407
                                                              ->  Parallel Seq Scan on bookings b_1  (cost=0.00..25449.88 rows=322588 width=74) (actual time=0.019..27.882 rows=250006 loops=4)
                                                                    Buffers: shared hit=22224
                                                              ->  Parallel Hash  (cost=4102.52..4102.52 rows=69552 width=41) (actual time=42.288..42.289 rows=41731 loops=4)
                                                                    Buckets: 262144  Batches: 1  Memory Usage: 15168kB
                                                                    Buffers: shared read=3407
                                                                    ->  Parallel Seq Scan on reviews r  (cost=0.00..4102.52 rows=69552 width=41) (actual time=0.063..16.909 rows=41731 loops=4)
                                                                          Buffers: shared read=3407
                          ->  Hash  (cost=15.80..15.80 rows=580 width=64) (actual time=0.049..0.050 rows=12 loops=1)
                                Buckets: 1024  Batches: 1  Memory Usage: 10kB
                                Buffers: shared read=1
                                ->  Seq Scan on users u  (cost=0.00..15.80 rows=580 width=64) (actual time=0.029..0.032 rows=12 loops=1)
                                      Buffers: shared read=1
Planning:
  Buffers: shared hit=402 read=29
Planning Time: 3.607 ms
Execution Time: 413.138 ms

Average execution time (10 attempts): 410.225ms

This execution time is perfectly acceptable for a heavy analytical reporting query. Because this query does not block user requests and is usually run as a background task or in an admin dashboard, we do not need to do more optimization with highly specific indexes that can slow down standard INSERT and UPDATE operations.

4. Search for available sitters with highest rankings

Benchmark query:

EXPLAIN (ANALYZE, BUFFERS)
WITH requested_dates AS (
    SELECT 
        (CURRENT_DATE + INTERVAL '10 days')::date AS req_start,
        (CURRENT_DATE + INTERVAL '15 days')::date AS req_end
),
unavailable_sitters AS (
    SELECT DISTINCT b.sitter_id
    FROM project.bookings b
    JOIN requested_dates rd ON true
    WHERE b.status IN ('Confirmed', 'Pending')
      AND b.date_from <= rd.req_end 
      AND b.date_to >= rd.req_start
),
highly_rated_sitters AS (
    SELECT b.sitter_id
    FROM project.reviews r
    JOIN project.bookings b ON r.booking_id = b.booking_id
    GROUP BY b.sitter_id
    HAVING AVG(r.rating) >= 4.0
)
SELECT 
    u.username,
    u.first_name,
    u.last_name,
    u.email
FROM project.pet_sitters ps
JOIN project.users u ON ps.user_id = u.user_id
JOIN highly_rated_sitters hrs ON hrs.sitter_id = ps.user_id
LEFT JOIN unavailable_sitters us ON us.sitter_id = ps.user_id
WHERE us.sitter_id IS NULL
LIMIT 20;

Execution without indexes:

Limit  (cost=66246.09..66295.07 rows=1 width=51) (actual time=279.739..287.004 rows=3 loops=1)
  Buffers: shared hit=47892 read=2
  ->  Nested Loop Anti Join  (cost=66246.09..66295.07 rows=1 width=51) (actual time=279.736..287.000 rows=3 loops=1)
        Join Filter: ((b_1.sitter_id)::text = (ps.user_id)::text)
        Buffers: shared hit=47892 read=2
        ->  Nested Loop  (cost=32537.88..32586.84 rows=1 width=141) (actual time=184.771..188.940 rows=3 loops=1)
              Join Filter: ((ps.user_id)::text = (u.user_id)::text)
              Rows Removed by Join Filter: 20
              Buffers: shared hit=25659 read=2
              ->  Nested Loop  (cost=32537.88..32563.79 rows=1 width=127) (actual time=184.747..188.897 rows=3 loops=1)
                    Join Filter: ((ps.user_id)::text = (b.sitter_id)::text)
                    Rows Removed by Join Filter: 3
                    Buffers: shared hit=25657 read=1
                    ->  Finalize GroupAggregate  (cost=32537.88..32538.27 rows=1 width=37) (actual time=184.705..188.834 rows=3 loops=1)
                          Group Key: b.sitter_id
                          Filter: (avg(r.rating) >= 4.0)
                          Rows Removed by Filter: 1
                          Buffers: shared hit=25655
                          ->  Gather Merge  (cost=32537.88..32538.24 rows=3 width=69) (actual time=184.688..188.798 rows=7 loops=1)
                                Workers Planned: 3
                                Workers Launched: 3
                                Buffers: shared hit=25655
                                ->  Sort  (cost=31537.84..31537.84 rows=1 width=69) (actual time=178.566..178.571 rows=2 loops=4)
                                      Sort Key: b.sitter_id
                                      Sort Method: quicksort  Memory: 25kB
                                      Buffers: shared hit=25655
                                      Worker 0:  Sort Method: quicksort  Memory: 25kB
                                      Worker 1:  Sort Method: quicksort  Memory: 25kB
                                      Worker 2:  Sort Method: quicksort  Memory: 25kB
                                      ->  Partial HashAggregate  (cost=31537.82..31537.83 rows=1 width=69) (actual time=178.495..178.500 rows=2 loops=4)
                                            Group Key: b.sitter_id
                                            Batches: 1  Memory Usage: 24kB
                                            Buffers: shared hit=25631
                                            Worker 0:  Batches: 1  Memory Usage: 24kB
                                            Worker 1:  Batches: 1  Memory Usage: 24kB
                                            Worker 2:  Batches: 1  Memory Usage: 24kB
                                            ->  Parallel Hash Join  (cost=4971.92..31268.59 rows=53846 width=41) (actual time=36.863..167.092 rows=41731 loops=4)
                                                  Hash Cond: ((b.booking_id)::text = (r.booking_id)::text)
                                                  Buffers: shared hit=25631
                                                  ->  Parallel Seq Scan on bookings b  (cost=0.00..25449.88 rows=322588 width=74) (actual time=0.019..27.928 rows=250006 loops=4)
                                                        Buffers: shared hit=22224
                                                  ->  Parallel Hash  (cost=4102.52..4102.52 rows=69552 width=41) (actual time=36.033..36.034 rows=41731 loops=4)
                                                        Buckets: 262144  Batches: 1  Memory Usage: 15200kB
                                                        Buffers: shared hit=3407
                                                        ->  Parallel Seq Scan on reviews r  (cost=0.00..4102.52 rows=69552 width=41) (actual time=0.021..11.538 rows=41731 loops=4)
                                                              Buffers: shared hit=3407
                    ->  Seq Scan on pet_sitters ps  (cost=0.00..16.90 rows=690 width=90) (actual time=0.014..0.014 rows=2 loops=3)
                          Buffers: shared hit=2 read=1
              ->  Seq Scan on users u  (cost=0.00..15.80 rows=580 width=88) (actual time=0.009..0.009 rows=8 loops=3)
                    Buffers: shared hit=2 read=1
        ->  Unique  (cost=33708.21..33708.22 rows=1 width=37) (actual time=31.654..32.684 rows=0 loops=3)
              Buffers: shared hit=22233
              ->  Sort  (cost=33708.21..33708.22 rows=1 width=37) (actual time=31.654..32.684 rows=0 loops=3)
                    Sort Key: b_1.sitter_id
                    Sort Method: quicksort  Memory: 25kB
                    Buffers: shared hit=22233
                    ->  Gather  (cost=1000.00..33708.20 rows=1 width=37) (actual time=94.936..98.026 rows=0 loops=1)
                          Workers Planned: 3
                          Workers Launched: 3
                          Buffers: shared hit=22233
                          ->  Parallel Seq Scan on bookings b_1  (cost=0.00..32708.10 rows=1 width=37) (actual time=89.286..89.286 rows=0 loops=4)
                                Filter: (((status)::text = ANY ('{Confirmed,Pending}'::text[])) AND (date_from <= ((CURRENT_DATE + '15 days'::interval))::date) AND (date_to >= ((CURRENT_DATE + '10 days'::interval))::date))
                                Rows Removed by Filter: 250006
                                Buffers: shared hit=22233
Planning:
  Buffers: shared hit=11 read=8
Planning Time: 1.269 ms
Execution Time: 287.147 ms

Average execution time (10 attempts): 283.421ms

We add this index to optimize the date overlapping:

CREATE INDEX idx_bookings_sitter_status_dates
ON project.bookings (sitter_id, status, date_from, date_to);

Execution with indexes:

Limit  (cost=32538.32..48741.33 rows=1 width=51) (actual time=192.798..202.516 rows=3 loops=1)
  Buffers: shared hit=27463 read=901
  ->  Nested Loop Anti Join  (cost=32538.32..48741.33 rows=1 width=51) (actual time=192.796..202.512 rows=3 loops=1)
        Join Filter: ((b_1.sitter_id)::text = (ps.user_id)::text)
        Buffers: shared hit=27463 read=901
        ->  Nested Loop  (cost=32537.88..32586.84 rows=1 width=141) (actual time=180.755..185.162 rows=3 loops=1)
              Join Filter: ((ps.user_id)::text = (u.user_id)::text)
              Rows Removed by Join Filter: 20
              Buffers: shared hit=25661
              ->  Nested Loop  (cost=32537.88..32563.79 rows=1 width=127) (actual time=180.740..185.126 rows=3 loops=1)
                    Join Filter: ((ps.user_id)::text = (b.sitter_id)::text)
                    Rows Removed by Join Filter: 3
                    Buffers: shared hit=25658
                    ->  Finalize GroupAggregate  (cost=32537.88..32538.27 rows=1 width=37) (actual time=180.718..185.081 rows=3 loops=1)
                          Group Key: b.sitter_id
                          Filter: (avg(r.rating) >= 4.0)
                          Rows Removed by Filter: 1
                          Buffers: shared hit=25655
                          ->  Gather Merge  (cost=32537.88..32538.24 rows=3 width=69) (actual time=180.700..185.043 rows=7 loops=1)
                                Workers Planned: 3
                                Workers Launched: 3
                                Buffers: shared hit=25655
                                ->  Sort  (cost=31537.84..31537.84 rows=1 width=69) (actual time=174.919..174.925 rows=2 loops=4)
                                      Sort Key: b.sitter_id
                                      Sort Method: quicksort  Memory: 25kB
                                      Buffers: shared hit=25655
                                      Worker 0:  Sort Method: quicksort  Memory: 25kB
                                      Worker 1:  Sort Method: quicksort  Memory: 25kB
                                      Worker 2:  Sort Method: quicksort  Memory: 25kB
                                      ->  Partial HashAggregate  (cost=31537.82..31537.83 rows=1 width=69) (actual time=174.853..174.857 rows=2 loops=4)
                                            Group Key: b.sitter_id
                                            Batches: 1  Memory Usage: 24kB
                                            Buffers: shared hit=25631
                                            Worker 0:  Batches: 1  Memory Usage: 24kB
                                            Worker 1:  Batches: 1  Memory Usage: 24kB
                                            Worker 2:  Batches: 1  Memory Usage: 24kB
                                            ->  Parallel Hash Join  (cost=4971.92..31268.59 rows=53846 width=41) (actual time=36.095..163.581 rows=41731 loops=4)
                                                  Hash Cond: ((b.booking_id)::text = (r.booking_id)::text)
                                                  Buffers: shared hit=25631
                                                  ->  Parallel Seq Scan on bookings b  (cost=0.00..25449.88 rows=322588 width=74) (actual time=0.019..27.140 rows=250006 loops=4)
                                                        Buffers: shared hit=22224
                                                  ->  Parallel Hash  (cost=4102.52..4102.52 rows=69552 width=41) (actual time=35.280..35.281 rows=41731 loops=4)
                                                        Buckets: 262144  Batches: 1  Memory Usage: 15136kB
                                                        Buffers: shared hit=3407
                                                        ->  Parallel Seq Scan on reviews r  (cost=0.00..4102.52 rows=69552 width=41) (actual time=0.022..11.047 rows=41731 loops=4)
                                                              Buffers: shared hit=3407
                    ->  Seq Scan on pet_sitters ps  (cost=0.00..16.90 rows=690 width=90) (actual time=0.008..0.008 rows=2 loops=3)
                          Buffers: shared hit=3
              ->  Seq Scan on users u  (cost=0.00..15.80 rows=580 width=88) (actual time=0.006..0.006 rows=8 loops=3)
                    Buffers: shared hit=3
        ->  Unique  (cost=0.44..16154.48 rows=1 width=37) (actual time=5.776..5.776 rows=0 loops=3)
              Buffers: shared hit=1802 read=901
              ->  Index Only Scan using idx_bookings_sitter_status_dates on bookings b_1  (cost=0.44..16154.48 rows=1 width=37) (actual time=5.775..5.775 rows=0 loops=3)
                    Index Cond: ((status = ANY ('{Confirmed,Pending}'::text[])) AND (date_from <= ((CURRENT_DATE + '15 days'::interval))::date) AND (date_to >= ((CURRENT_DATE + '10 days'::interval))::date))
                    Heap Fetches: 0
                    Buffers: shared hit=1802 read=901
Planning:
  Buffers: shared hit=49 read=1
Planning Time: 1.463 ms
Execution Time: 202.649 ms

Average execution time (10 attempts): 199.530ms

With this index, we eliminated the expensive sequential scan in exchange for a highly efficient index scan. The query is still bottlenecked by AVG(rating), this index provides a crucial performance improvement to the search feature, so we will keep it.

Security

1. Password Security (BCrypt)

User passwords are encrypted before being stored in the database using BCryptPasswordEncoder. This ensures that even in the event of a database breach, plain-text passwords remain unreadable and are protected against brute-force and dictionary attacks. Authentication is securely handled by verifying the raw input against the stored BCrypt hash within the UserService.

Implementation in UserService.java:

    private final BCryptPasswordEncoder passwordEncoder = new BCryptPasswordEncoder();

    @Transactional(readOnly = true)
    public User authenticate(String username, String password) {
        User user = userRepository.findByUsername(username).orElse(null);
        if (user != null && passwordEncoder.matches(password, user.getPassword())) {
            return user;
        }
        return null;
    }

    @Transactional
    public User registerUser(String username, String password, String firstName, String lastName, String email, String role) {
        // ...
        newUser.setUsername(username);
        newUser.setPassword(passwordEncoder.encode(password));
        newUser.setFirstName(firstName);
        // ...
    }

2. SQL Injection Prevention

The application utilizes Spring Data JPA for all database interactions. By relying on JPA's built in repository methods, all user inputs are automatically treated as parameters rather than executable SQL strings. This protects the application from SQL Injection attacks.

Example of safe repository usage in UserService.java:

    @Transactional(readOnly = true)
    public User authenticate(String username, String password) {
        User user = userRepository.findByUsername(username).orElse(null);
        // ...
    }

3. Role based access control (RBAC)

The application implements strict endpoint protection based on user roles (Admin, PetOwner, PetSitter). Controller routes verify the identity and specific subtype of the authenticated user via the HttpSession. If an unauthorized role attempts to access restricted areas, they are immediately redirected, preventing privilege escalation.

Implementation example in AdminController.java:

    @GetMapping("/admin/users")
    public String showAllUsers(HttpSession session, Model model) {
        User user = (User) session.getAttribute("loggedInUser");
        
        if (user == null || user instanceof PetOwner || user instanceof PetSitter) {
            return "redirect:/dashboard";
        }
        
        List<User> users = userRepository.findAll();
        model.addAttribute("users", users);
        // ...
        return "admin-users";
    }

Not applicable

During the security analysis phase, two common web security measures were evaluated but found to be inapplicable for our specific architecture:

JWTs

JSON Web Tokens are designed for Stateless REST APIs (for example, when using a completely separate React/Vue frontend).

Because our application uses a Server Side Rendering - SSR architecture returning HTML views directly from controllers, we manage state using the built-in HttpSession. Session based authentication is the standard for this specific architecture and provides excellent security. Implementing JWTs would require an unnecessary and highly complex rewrite of our entire authentication system, moving away from SSR best practices.

CORS

Because our frontend and backend are served together from the exact same Spring Boot instance, they share the exact same origin. Therefore, cross origin requests do not occur, making CORS configuration unnecessary for our application.

Note: See TracWiki for help on using the wiki.