Changes between Initial Version and Version 1 of OtherTopics


Ignore:
Timestamp:
06/01/26 00:42:06 (3 weeks ago)
Author:
181201
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • OtherTopics

    v1 v1  
     1= Other topics (Performance, Security, …)
     2
     3== Performance ==
     4
     5=== Adding 1000000 mock rows to 'bookings' table ===
     6
     7{{{
     8#!sql
     9ALTER TABLE project.bookings DISABLE TRIGGER trg_booking_validation;
     10
     11INSERT INTO project.bookings (status, date_from, date_to, address, owner_id, sitter_id)
     12SELECT
     13    (ARRAY['Pending', 'Confirmed', 'Completed', 'Canceled', 'Expired', 'Reviewed'])[floor(random()*6)+1] AS status,
     14    t.dt AS date_from,
     15    t.dt + (random() * INTERVAL '7 days') AS date_to,
     16    'Mock Address ' || gs AS address,
     17    o.user_id AS owner_id,
     18    s.user_id AS sitter_id
     19FROM generate_series(1, 1000000) gs
     20JOIN LATERAL (
     21    SELECT user_id FROM project.pet_owners ORDER BY random() LIMIT 1
     22) o ON true
     23JOIN LATERAL (
     24    SELECT user_id FROM project.pet_sitters ORDER BY random() LIMIT 1
     25) s ON true
     26JOIN LATERAL (
     27    SELECT CURRENT_DATE - (random() * INTERVAL '365 days') AS dt
     28) t ON true;
     29
     30ALTER TABLE project.bookings ENABLE TRIGGER trg_booking_validation;
     31}}}
     32
     33Benchmark query:
     34{{{
     35#!sql
     36EXPLAIN (ANALYZE, BUFFERS)
     37SELECT * FROM project.bookings
     38WHERE sitter_id = (SELECT user_id FROM project.pet_sitters LIMIT 1)
     39ORDER BY date_from DESC;
     40}}}
     41
     42Execution without indexes:
     43
     44{{{
     45Gather Merge  (cost=81030.64..195549.38 rows=967764 width=146) (actual time=94.209..106.394 rows=2 loops=1)
     46  Workers Planned: 3
     47  Workers Launched: 3
     48  Buffers: shared hit=22336
     49  InitPlan 1
     50    ->  Limit  (cost=0.00..0.02 rows=1 width=90) (actual time=3.367..3.369 rows=1 loops=1)
     51          Buffers: shared hit=1
     52          ->  Seq Scan on pet_sitters  (cost=0.00..16.90 rows=690 width=90) (actual time=0.015..0.016 rows=1 loops=1)
     53                Buffers: shared hit=1
     54  ->  Sort  (cost=80030.57..80837.04 rows=322588 width=146) (actual time=60.212..60.213 rows=0 loops=4)
     55        Sort Key: bookings.date_from DESC
     56        Sort Method: quicksort  Memory: 25kB
     57        Buffers: shared hit=22335
     58        Worker 0:  Sort Method: quicksort  Memory: 25kB
     59        Worker 1:  Sort Method: quicksort  Memory: 25kB
     60        Worker 2:  Sort Method: quicksort  Memory: 25kB
     61        ->  Parallel Seq Scan on bookings  (cost=0.00..26256.35 rows=322588 width=146) (actual time=37.611..60.082 rows=0 loops=4)
     62              Filter: ((sitter_id)::text = ((InitPlan 1).col1)::text)
     63              Rows Removed by Filter: 250005
     64              Buffers: shared hit=22224
     65Planning Time: 0.168 ms
     66JIT:
     67  Functions: 12
     68  Options: Inlining false, Optimization false, Expressions true, Deforming true
     69  Timing: Generation 1.466 ms (Deform 0.675 ms), Inlining 0.000 ms, Optimization 1.430 ms, Emission 19.219 ms, Total 22.115 ms
     70Execution Time: 106.834 ms
     71}}}
     72
     73'''Average execution time (10 attempts):''' 105.235ms
     74
     75Next, we add this index:
     76{{{
     77#!sql
     78CREATE INDEX idx_bookings_sitter_date
     79ON project.bookings (sitter_id, date_from DESC);
     80}}}
     81
     82Execution with indexes:
     83
     84{{{
     85Index Scan using idx_bookings_sitter_date on bookings  (cost=0.45..72503.52 rows=1000022 width=146) (actual time=0.066..0.068 rows=2 loops=1)
     86  Index Cond: ((sitter_id)::text = ((InitPlan 1).col1)::text)
     87  Buffers: shared hit=2 read=3
     88  InitPlan 1
     89    ->  Limit  (cost=0.00..0.02 rows=1 width=90) (actual time=0.012..0.013 rows=1 loops=1)
     90          Buffers: shared hit=1
     91          ->  Seq Scan on pet_sitters  (cost=0.00..16.90 rows=690 width=90) (actual time=0.011..0.011 rows=1 loops=1)
     92                Buffers: shared hit=1
     93Planning:
     94  Buffers: shared hit=39 read=2
     95Planning Time: 0.618 ms
     96Execution Time: 0.092 ms
     97}}}
     98
     99'''Average execution time (10 attempts):''' 0.087ms
     100
     101Because the execution time has been massively lowered by bypassing the expensive sequential scan and memory sort, we '''keep''' this index.
     102
     103== Calculate average sitter rating ==
     104
     105Benchmark query:
     106{{{
     107#!sql
     108EXPLAIN (ANALYZE, BUFFERS)
     109SELECT AVG(r.rating)
     110FROM project.reviews r
     111JOIN project.bookings b ON r.booking_id = b.booking_id
     112WHERE b.sitter_id = (SELECT user_id FROM project.pet_sitters LIMIT 1);
     113}}}
     114
     115Execution without indexes:
     116
     117{{{
     118Finalize Aggregate  (cost=33210.02..33210.03 rows=1 width=32) (actual time=121.837..135.038 rows=1 loops=1)
     119  Buffers: shared hit=25818
     120  InitPlan 1
     121    ->  Limit  (cost=0.00..0.02 rows=1 width=90) (actual time=0.014..0.015 rows=1 loops=1)
     122          Buffers: shared hit=1
     123          ->  Seq Scan on pet_sitters  (cost=0.00..16.90 rows=690 width=90) (actual time=0.012..0.013 rows=1 loops=1)
     124                Buffers: shared hit=1
     125  ->  Gather  (cost=33209.67..33209.98 rows=3 width=32) (actual time=119.802..135.016 rows=4 loops=1)
     126        Workers Planned: 3
     127        Workers Launched: 3
     128        Buffers: shared hit=25818
     129        ->  Partial Aggregate  (cost=32209.67..32209.68 rows=1 width=32) (actual time=90.870..90.873 rows=1 loops=4)
     130              Buffers: shared hit=25817
     131              ->  Parallel Hash Join  (cost=4971.92..32075.06 rows=53846 width=4) (actual time=77.089..90.867 rows=0 loops=4)
     132                    Hash Cond: ((b.booking_id)::text = (r.booking_id)::text)
     133                    Buffers: shared hit=25817
     134                    ->  Parallel Seq Scan on bookings b  (cost=0.00..26256.35 rows=322588 width=37) (actual time=41.349..55.119 rows=0 loops=4)
     135                          Filter: ((sitter_id)::text = ((InitPlan 1).col1)::text)
     136                          Rows Removed by Filter: 250005
     137                          Buffers: shared hit=22224
     138                    ->  Parallel Hash  (cost=4102.52..4102.52 rows=69552 width=41) (actual time=34.636..34.637 rows=41731 loops=4)
     139                          Buckets: 262144  Batches: 1  Memory Usage: 15200kB
     140                          Buffers: shared hit=3407
     141                          ->  Parallel Seq Scan on reviews r  (cost=0.00..4102.52 rows=69552 width=41) (actual time=0.026..11.172 rows=41731 loops=4)
     142                                Buffers: shared hit=3407
     143Planning:
     144  Buffers: shared hit=16
     145Planning Time: 0.444 ms
     146Execution Time: 135.097 ms
     147}}}
     148
     149'''Average execution time (10 attempts):''' 131.722ms
     150
     151We add this index:
     152{{{
     153#!sql
     154CREATE INDEX idx_reviews_booking_id
     155ON project.reviews (booking_id);
     156}}}
     157
     158Execution with indexes:
     159
     160{{{
     161Finalize Aggregate  (cost=33210.02..33210.03 rows=1 width=32) (actual time=125.303..138.277 rows=1 loops=1)
     162  Buffers: shared hit=25818
     163  InitPlan 1
     164    ->  Limit  (cost=0.00..0.02 rows=1 width=90) (actual time=0.013..0.014 rows=1 loops=1)
     165          Buffers: shared hit=1
     166          ->  Seq Scan on pet_sitters  (cost=0.00..16.90 rows=690 width=90) (actual time=0.012..0.012 rows=1 loops=1)
     167                Buffers: shared hit=1
     168  ->  Gather  (cost=33209.67..33209.98 rows=3 width=32) (actual time=123.351..138.255 rows=4 loops=1)
     169        Workers Planned: 3
     170        Workers Launched: 3
     171        Buffers: shared hit=25818
     172        ->  Partial Aggregate  (cost=32209.67..32209.68 rows=1 width=32) (actual time=93.069..93.072 rows=1 loops=4)
     173              Buffers: shared hit=25817
     174              ->  Parallel Hash Join  (cost=4971.92..32075.06 rows=53846 width=4) (actual time=78.783..93.066 rows=0 loops=4)
     175                    Hash Cond: ((b.booking_id)::text = (r.booking_id)::text)
     176                    Buffers: shared hit=25817
     177                    ->  Parallel Seq Scan on bookings b  (cost=0.00..26256.35 rows=322588 width=37) (actual time=42.860..57.138 rows=0 loops=4)
     178                          Filter: ((sitter_id)::text = ((InitPlan 1).col1)::text)
     179                          Rows Removed by Filter: 250005
     180                          Buffers: shared hit=22224
     181                    ->  Parallel Hash  (cost=4102.52..4102.52 rows=69552 width=41) (actual time=34.827..34.828 rows=41731 loops=4)
     182                          Buckets: 262144  Batches: 1  Memory Usage: 15168kB
     183                          Buffers: shared hit=3407
     184                          ->  Parallel Seq Scan on reviews r  (cost=0.00..4102.52 rows=69552 width=41) (actual time=0.029..11.437 rows=41731 loops=4)
     185                                Buffers: shared hit=3407
     186Planning:
     187  Buffers: shared hit=16
     188Planning Time: 0.554 ms
     189Execution Time: 138.346 ms
     190}}}
     191
     192'''Average execution time (10 attempts):''' 138.346ms
     193
     194There is practically no change in the execution time, and the query planner completely ignored the index because it determined that performing a Parallel Seq Scan and Parallel Hash Join was cheaper than doing individual index lookups on such a large dataset. Because the index is not utilized for this query and only adds complexity during `INSERT` and `UPDATE` operations, we will '''not''' keep this index.