= Other topics (Performance, Security, …) == Performance == === Adding 1000000 mock rows to 'bookings' table === {{{ #!sql ALTER TABLE project.bookings DISABLE TRIGGER trg_booking_validation; INSERT INTO project.bookings (status, date_from, date_to, address, owner_id, sitter_id) SELECT (ARRAY['Pending', 'Confirmed', 'Completed', 'Canceled', 'Expired', 'Reviewed'])[floor(random()*6)+1] AS status, t.dt AS date_from, t.dt + (random() * INTERVAL '7 days') AS date_to, 'Mock Address ' || gs AS address, o.user_id AS owner_id, s.user_id AS sitter_id FROM generate_series(1, 1000000) gs JOIN LATERAL ( SELECT user_id FROM project.pet_owners ORDER BY random() LIMIT 1 ) o ON true JOIN LATERAL ( SELECT user_id FROM project.pet_sitters ORDER BY random() LIMIT 1 ) s ON true JOIN LATERAL ( SELECT CURRENT_DATE - (random() * INTERVAL '365 days') AS dt ) t ON true; ALTER TABLE project.bookings ENABLE TRIGGER trg_booking_validation; }}} Benchmark query: {{{ #!sql EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM project.bookings WHERE sitter_id = (SELECT user_id FROM project.pet_sitters LIMIT 1) ORDER BY date_from DESC; }}} Execution without indexes: {{{ Gather Merge (cost=81030.64..195549.38 rows=967764 width=146) (actual time=94.209..106.394 rows=2 loops=1) Workers Planned: 3 Workers Launched: 3 Buffers: shared hit=22336 InitPlan 1 -> Limit (cost=0.00..0.02 rows=1 width=90) (actual time=3.367..3.369 rows=1 loops=1) Buffers: shared hit=1 -> Seq Scan on pet_sitters (cost=0.00..16.90 rows=690 width=90) (actual time=0.015..0.016 rows=1 loops=1) Buffers: shared hit=1 -> Sort (cost=80030.57..80837.04 rows=322588 width=146) (actual time=60.212..60.213 rows=0 loops=4) Sort Key: bookings.date_from DESC Sort Method: quicksort Memory: 25kB Buffers: shared hit=22335 Worker 0: Sort Method: quicksort Memory: 25kB Worker 1: Sort Method: quicksort Memory: 25kB Worker 2: Sort Method: quicksort Memory: 25kB -> Parallel Seq Scan on bookings (cost=0.00..26256.35 rows=322588 width=146) (actual time=37.611..60.082 rows=0 loops=4) Filter: ((sitter_id)::text = ((InitPlan 1).col1)::text) Rows Removed by Filter: 250005 Buffers: shared hit=22224 Planning Time: 0.168 ms JIT: Functions: 12 Options: Inlining false, Optimization false, Expressions true, Deforming true Timing: Generation 1.466 ms (Deform 0.675 ms), Inlining 0.000 ms, Optimization 1.430 ms, Emission 19.219 ms, Total 22.115 ms Execution Time: 106.834 ms }}} '''Average execution time (10 attempts):''' 105.235ms Next, we add this index: {{{ #!sql CREATE INDEX idx_bookings_sitter_date ON project.bookings (sitter_id, date_from DESC); }}} Execution with indexes: {{{ Index Scan using idx_bookings_sitter_date on bookings (cost=0.45..72503.52 rows=1000022 width=146) (actual time=0.066..0.068 rows=2 loops=1) Index Cond: ((sitter_id)::text = ((InitPlan 1).col1)::text) Buffers: shared hit=2 read=3 InitPlan 1 -> Limit (cost=0.00..0.02 rows=1 width=90) (actual time=0.012..0.013 rows=1 loops=1) Buffers: shared hit=1 -> Seq Scan on pet_sitters (cost=0.00..16.90 rows=690 width=90) (actual time=0.011..0.011 rows=1 loops=1) Buffers: shared hit=1 Planning: Buffers: shared hit=39 read=2 Planning Time: 0.618 ms Execution Time: 0.092 ms }}} '''Average execution time (10 attempts):''' 0.087ms Because the execution time has been massively lowered by bypassing the expensive sequential scan and memory sort, we '''keep''' this index. == Calculate average sitter rating == Benchmark query: {{{ #!sql EXPLAIN (ANALYZE, BUFFERS) SELECT AVG(r.rating) FROM project.reviews r JOIN project.bookings b ON r.booking_id = b.booking_id WHERE b.sitter_id = (SELECT user_id FROM project.pet_sitters LIMIT 1); }}} Execution without indexes: {{{ Finalize Aggregate (cost=33210.02..33210.03 rows=1 width=32) (actual time=121.837..135.038 rows=1 loops=1) Buffers: shared hit=25818 InitPlan 1 -> Limit (cost=0.00..0.02 rows=1 width=90) (actual time=0.014..0.015 rows=1 loops=1) Buffers: shared hit=1 -> Seq Scan on pet_sitters (cost=0.00..16.90 rows=690 width=90) (actual time=0.012..0.013 rows=1 loops=1) Buffers: shared hit=1 -> Gather (cost=33209.67..33209.98 rows=3 width=32) (actual time=119.802..135.016 rows=4 loops=1) Workers Planned: 3 Workers Launched: 3 Buffers: shared hit=25818 -> Partial Aggregate (cost=32209.67..32209.68 rows=1 width=32) (actual time=90.870..90.873 rows=1 loops=4) Buffers: shared hit=25817 -> Parallel Hash Join (cost=4971.92..32075.06 rows=53846 width=4) (actual time=77.089..90.867 rows=0 loops=4) Hash Cond: ((b.booking_id)::text = (r.booking_id)::text) Buffers: shared hit=25817 -> Parallel Seq Scan on bookings b (cost=0.00..26256.35 rows=322588 width=37) (actual time=41.349..55.119 rows=0 loops=4) Filter: ((sitter_id)::text = ((InitPlan 1).col1)::text) Rows Removed by Filter: 250005 Buffers: shared hit=22224 -> Parallel Hash (cost=4102.52..4102.52 rows=69552 width=41) (actual time=34.636..34.637 rows=41731 loops=4) Buckets: 262144 Batches: 1 Memory Usage: 15200kB Buffers: shared hit=3407 -> Parallel Seq Scan on reviews r (cost=0.00..4102.52 rows=69552 width=41) (actual time=0.026..11.172 rows=41731 loops=4) Buffers: shared hit=3407 Planning: Buffers: shared hit=16 Planning Time: 0.444 ms Execution Time: 135.097 ms }}} '''Average execution time (10 attempts):''' 131.722ms We add this index: {{{ #!sql CREATE INDEX idx_reviews_booking_id ON project.reviews (booking_id); }}} Execution with indexes: {{{ Finalize Aggregate (cost=33210.02..33210.03 rows=1 width=32) (actual time=125.303..138.277 rows=1 loops=1) Buffers: shared hit=25818 InitPlan 1 -> Limit (cost=0.00..0.02 rows=1 width=90) (actual time=0.013..0.014 rows=1 loops=1) Buffers: shared hit=1 -> Seq Scan on pet_sitters (cost=0.00..16.90 rows=690 width=90) (actual time=0.012..0.012 rows=1 loops=1) Buffers: shared hit=1 -> Gather (cost=33209.67..33209.98 rows=3 width=32) (actual time=123.351..138.255 rows=4 loops=1) Workers Planned: 3 Workers Launched: 3 Buffers: shared hit=25818 -> Partial Aggregate (cost=32209.67..32209.68 rows=1 width=32) (actual time=93.069..93.072 rows=1 loops=4) Buffers: shared hit=25817 -> Parallel Hash Join (cost=4971.92..32075.06 rows=53846 width=4) (actual time=78.783..93.066 rows=0 loops=4) Hash Cond: ((b.booking_id)::text = (r.booking_id)::text) Buffers: shared hit=25817 -> Parallel Seq Scan on bookings b (cost=0.00..26256.35 rows=322588 width=37) (actual time=42.860..57.138 rows=0 loops=4) Filter: ((sitter_id)::text = ((InitPlan 1).col1)::text) Rows Removed by Filter: 250005 Buffers: shared hit=22224 -> Parallel Hash (cost=4102.52..4102.52 rows=69552 width=41) (actual time=34.827..34.828 rows=41731 loops=4) Buckets: 262144 Batches: 1 Memory Usage: 15168kB Buffers: shared hit=3407 -> Parallel Seq Scan on reviews r (cost=0.00..4102.52 rows=69552 width=41) (actual time=0.029..11.437 rows=41731 loops=4) Buffers: shared hit=3407 Planning: Buffers: shared hit=16 Planning Time: 0.554 ms Execution Time: 138.346 ms }}} '''Average execution time (10 attempts):''' 138.346ms There is practically no change in the execution time, and the query planner completely ignored the index because it determined that performing a Parallel Seq Scan and Parallel Hash Join was cheaper than doing individual index lookups on such a large dataset. Because the index is not utilized for this query and only adds complexity during `INSERT` and `UPDATE` operations, we will '''not''' keep this index.