| | 334 | |
| | 335 | |
| | 336 | === Сценарио 4 |
| | 337 | |
| | 338 | Последно, ќе го разгледаме извештај број 4 од [[AdvancedReports|Напредни извештаи]] - Детален извештај за најпопуларниот артист според месечни слушања за изминатата година. |
| | 339 | |
| | 340 | Од претходните примери веќе имаме генерирано доволно записи во потребните табели. |
| | 341 | |
| | 342 | Додаваме сложен индекс на `song_id` и `timestamp` во табелата `listens`. |
| | 343 | |
| | 344 | {{{ |
| | 345 | CREATE INDEX listens_song_timestamp_idx |
| | 346 | ON listens (song_id, timestamp); |
| | 347 | }}} |
| | 348 | |
| | 349 | Без индекс добиваме: |
| | 350 | {{{ |
| | 351 | +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| | 352 | |QUERY PLAN | |
| | 353 | +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| | 354 | |Sort (cost=51421.16..51421.23 rows=28 width=76) (actual time=2315.566..2315.578 rows=12 loops=1) | |
| | 355 | | Sort Key: g.n | |
| | 356 | | Sort Method: quicksort Memory: 25kB | |
| | 357 | | CTE one_year | |
| | 358 | | -> Result (cost=0.00..0.03 rows=1 width=16) (actual time=0.009..0.010 rows=1 loops=1) | |
| | 359 | | CTE monthly_artist_listens | |
| | 360 | | -> GroupAggregate (cost=37772.47..40267.47 rows=110889 width=48) (actual time=1931.028..2315.007 rows=21 loops=1) | |
| | 361 | | Group Key: (EXTRACT(month FROM l."timestamp")), a.user_id | |
| | 362 | | -> Sort (cost=37772.47..38049.69 rows=110889 width=48) (actual time=1931.015..2130.522 rows=820263 loops=1) | |
| | 363 | | Sort Key: (EXTRACT(month FROM l."timestamp")), a.user_id | |
| | 364 | | Sort Method: external merge Disk: 27352kB | |
| | 365 | | -> Hash Join (cost=2875.38..21655.66 rows=110889 width=48) (actual time=92.684..909.499 rows=820263 loops=1) | |
| | 366 | | Hash Cond: (me.released_by = a.user_id) | |
| | 367 | | -> Hash Join (cost=2814.53..21025.96 rows=110889 width=16) (actual time=92.643..618.359 rows=820263 loops=1) | |
| | 368 | | Hash Cond: (l.song_id = me.id) | |
| | 369 | | -> Nested Loop (cost=2813.04..20677.31 rows=110889 width=16) (actual time=92.618..444.931 rows=820263 loops=1) | |
| | 370 | | -> CTE Scan on one_year oy_1 (cost=0.00..0.02 rows=1 width=16) (actual time=0.000..0.003 rows=1 loops=1) | |
| | 371 | | -> Bitmap Heap Scan on listens l (cost=2813.04..19568.40 rows=110889 width=16) (actual time=92.615..348.352 rows=820263 loops=1) | |
| | 372 | | Recheck Cond: (("timestamp" >= oy_1.year_start) AND ("timestamp" <= oy_1.year_end)) | |
| | 373 | | Rows Removed by Index Recheck: 62048 | |
| | 374 | | Heap Blocks: exact=4139 lossy=2218 | |
| | 375 | | -> Bitmap Index Scan on listens_timestamp_idx (cost=0.00..2785.32 rows=110889 width=0) (actual time=91.946..91.946 rows=820263 loops=1)| |
| | 376 | | Index Cond: (("timestamp" >= oy_1.year_start) AND ("timestamp" <= oy_1.year_end)) | |
| | 377 | | -> Hash (cost=1.22..1.22 rows=22 width=16) (actual time=0.016..0.017 rows=23 loops=1) | |
| | 378 | | Buckets: 1024 Batches: 1 Memory Usage: 10kB | |
| | 379 | | -> Seq Scan on musical_entities me (cost=0.00..1.22 rows=22 width=16) (actual time=0.008..0.011 rows=23 loops=1) | |
| | 380 | | -> Hash (cost=32.60..32.60 rows=2260 width=8) (actual time=0.027..0.027 rows=10 loops=1) | |
| | 381 | | Buckets: 4096 Batches: 1 Memory Usage: 33kB | |
| | 382 | | -> Seq Scan on artists a (cost=0.00..32.60 rows=2260 width=8) (actual time=0.022..0.023 rows=10 loops=1) | |
| | 383 | | CTE max_per_month | |
| | 384 | | -> GroupAggregate (cost=0.00..2774.22 rows=200 width=40) (actual time=1931.044..2315.071 rows=9 loops=1) | |
| | 385 | | Group Key: mal_1.month | |
| | 386 | | -> CTE Scan on monthly_artist_listens mal_1 (cost=0.00..2217.78 rows=110889 width=40) (actual time=1931.030..2315.036 rows=21 loops=1) | |
| | 387 | | -> Nested Loop (cost=0.27..8378.76 rows=28 width=76) (actual time=1987.811..2315.549 rows=12 loops=1) | |
| | 388 | | -> CTE Scan on one_year oy (cost=0.00..0.02 rows=1 width=8) (actual time=0.011..0.012 rows=1 loops=1) | |
| | 389 | | -> Nested Loop Left Join (cost=0.27..8378.11 rows=28 width=25) (actual time=1987.766..2315.391 rows=12 loops=1) | |
| | 390 | | Join Filter: (mal.user_id = u.user_id) | |
| | 391 | | Rows Removed by Join Filter: 166 | |
| | 392 | | -> Hash Right Join (cost=0.27..8361.41 rows=28 width=20) (actual time=1987.723..2315.283 rows=12 loops=1) | |
| | 393 | | Hash Cond: (mal.month = (g.n)::numeric) | |
| | 394 | | -> Merge Anti Join (cost=0.00..8353.92 rows=462 width=48) (actual time=1987.698..2315.225 rows=9 loops=1) | |
| | 395 | | Merge Cond: (mal.month = mal1.month) | |
| | 396 | | Join Filter: (mal.user_id > mal1.user_id) | |
| | 397 | | Rows Removed by Join Filter: 10 | |
| | 398 | | -> Merge Join (cost=0.00..4162.84 rows=554 width=48) (actual time=1931.055..1931.092 rows=10 loops=1) | |
| | 399 | | Merge Cond: (mpm.month = mal.month) | |
| | 400 | | Join Filter: (mal.count = mpm.max_month_counter) | |
| | 401 | | Rows Removed by Join Filter: 11 | |
| | 402 | | -> CTE Scan on max_per_month mpm (cost=0.00..4.00 rows=200 width=40) (actual time=1931.045..1931.048 rows=9 loops=1) | |
| | 403 | | -> Materialize (cost=0.00..2495.00 rows=110889 width=48) (actual time=0.003..0.015 rows=21 loops=1) | |
| | 404 | | -> CTE Scan on monthly_artist_listens mal (cost=0.00..2217.78 rows=110889 width=48) (actual time=0.001..0.006 rows=21 loops=1) | |
| | 405 | | -> Materialize (cost=0.00..4164.22 rows=554 width=40) (actual time=0.006..384.106 rows=12 loops=1) | |
| | 406 | | -> Merge Join (cost=0.00..4162.84 rows=554 width=40) (actual time=0.004..384.097 rows=10 loops=1) | |
| | 407 | | Merge Cond: (mpm1.month = mal1.month) | |
| | 408 | | Join Filter: (mal1.count = mpm1.max_month_counter) | |
| | 409 | | Rows Removed by Join Filter: 11 | |
| | 410 | | -> CTE Scan on max_per_month mpm1 (cost=0.00..4.00 rows=200 width=40) (actual time=0.000..384.037 rows=9 loops=1) | |
| | 411 | | -> Materialize (cost=0.00..2495.00 rows=110889 width=48) (actual time=0.001..0.013 rows=21 loops=1) | |
| | 412 | | -> CTE Scan on monthly_artist_listens mal1 (cost=0.00..2217.78 rows=110889 width=48) (actual time=0.000..0.006 rows=21 loops=1) | |
| | 413 | | -> Hash (cost=0.12..0.12 rows=12 width=4) (actual time=0.012..0.013 rows=12 loops=1) | |
| | 414 | | Buckets: 1024 Batches: 1 Memory Usage: 9kB | |
| | 415 | | -> Function Scan on generate_series g (cost=0.00..0.12 rows=12 width=4) (actual time=0.005..0.006 rows=12 loops=1) | |
| | 416 | | -> Materialize (cost=0.00..2.51 rows=34 width=21) (actual time=0.003..0.006 rows=15 loops=12) | |
| | 417 | | -> Seq Scan on users u (cost=0.00..2.34 rows=34 width=21) (actual time=0.033..0.046 rows=20 loops=1) | |
| | 418 | |Planning Time: 0.873 ms | |
| | 419 | |Execution Time: 2321.677 ms | |
| | 420 | +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| | 421 | }}} |
| | 422 | |
| | 423 | Просечното `Execution Time` од 10 извршувања е **2248ms**. |
| | 424 | Можеме да забележиме дека тука се користи индексот кој претходно го додадовме на `timestamp`, но веднаш потоа е потребен атрибутот `song_id` на табелата `listens` за да се направи join, па за тоа се користи `Bitmap Heap Scan`. Додавање на индексот треба да го реши овој проблем. |
| | 425 | |
| | 426 | По додавање на индекс: |
| | 427 | {{{ |
| | 428 | +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| | 429 | |QUERY PLAN | |
| | 430 | +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| | 431 | |Sort (cost=55060.92..55060.99 rows=28 width=76) (actual time=1452.660..1452.672 rows=12 loops=1) | |
| | 432 | | Sort Key: g.n | |
| | 433 | | Sort Method: quicksort Memory: 25kB | |
| | 434 | | CTE one_year | |
| | 435 | | -> Result (cost=0.00..0.03 rows=1 width=16) (actual time=0.009..0.010 rows=1 loops=1) | |
| | 436 | | CTE monthly_artist_listens | |
| | 437 | | -> GroupAggregate (cost=16.41..35570.90 rows=110889 width=48) (actual time=0.248..1452.324 rows=21 loops=1) | |
| | 438 | | Group Key: a.user_id, (EXTRACT(month FROM l."timestamp")) | |
| | 439 | | -> Incremental Sort (cost=16.41..33353.12 rows=110889 width=48) (actual time=0.242..1287.613 rows=820263 loops=1) | |
| | 440 | | Sort Key: a.user_id, (EXTRACT(month FROM l."timestamp")) | |
| | 441 | | Presorted Key: a.user_id | |
| | 442 | | Full-sort Groups: 1 Sort Method: quicksort Average Memory: 27kB Peak Memory: 27kB | |
| | 443 | | Pre-sorted Groups: 5 Sort Methods: quicksort, external merge Average Memory: 20kB Peak Memory: 25kB Average Disk: 5464kB Peak Disk: 27320kB | |
| | 444 | | -> Nested Loop (cost=2.29..28807.69 rows=110889 width=48) (actual time=0.080..393.816 rows=820263 loops=1) | |
| | 445 | | -> Merge Join (cost=1.87..89.75 rows=22 width=16) (actual time=0.048..0.072 rows=23 loops=1) | |
| | 446 | | Merge Cond: (a.user_id = me.released_by) | |
| | 447 | | -> Index Only Scan using artists_pkey on artists a (cost=0.15..82.06 rows=2260 width=8) (actual time=0.019..0.027 rows=10 loops=1) | |
| | 448 | | Heap Fetches: 10 | |
| | 449 | | -> Sort (cost=1.71..1.77 rows=22 width=16) (actual time=0.026..0.030 rows=23 loops=1) | |
| | 450 | | Sort Key: me.released_by | |
| | 451 | | Sort Method: quicksort Memory: 25kB | |
| | 452 | | -> Seq Scan on musical_entities me (cost=0.00..1.22 rows=22 width=16) (actual time=0.013..0.017 rows=23 loops=1) | |
| | 453 | | -> Nested Loop (cost=0.42..1070.98 rows=22178 width=16) (actual time=0.005..8.666 rows=35664 loops=23) | |
| | 454 | | -> CTE Scan on one_year oy_1 (cost=0.00..0.02 rows=1 width=16) (actual time=0.000..0.000 rows=1 loops=23) | |
| | 455 | | -> Index Only Scan using listens_song_timestamp_idx on listens l (cost=0.42..849.18 rows=22178 width=16) (actual time=0.005..5.489 rows=35664 loops=23)| |
| | 456 | | Index Cond: ((song_id = me.id) AND ("timestamp" >= oy_1.year_start) AND ("timestamp" <= oy_1.year_end)) | |
| | 457 | | Heap Fetches: 62 | |
| | 458 | | CTE max_per_month | |
| | 459 | | -> Sort (cost=2781.87..2782.37 rows=200 width=40) (actual time=1452.185..1452.186 rows=9 loops=1) | |
| | 460 | | Sort Key: mal_1.month | |
| | 461 | | Sort Method: quicksort Memory: 25kB | |
| | 462 | | -> HashAggregate (cost=2772.22..2774.22 rows=200 width=40) (actual time=1452.171..1452.175 rows=9 loops=1) | |
| | 463 | | Group Key: mal_1.month | |
| | 464 | | Batches: 1 Memory Usage: 40kB | |
| | 465 | | -> CTE Scan on monthly_artist_listens mal_1 (cost=0.00..2217.78 rows=110889 width=40) (actual time=0.000..1452.107 rows=21 loops=1) | |
| | 466 | | -> Hash Left Join (cost=16698.15..16706.94 rows=28 width=76) (actual time=1452.627..1452.654 rows=12 loops=1) | |
| | 467 | | Hash Cond: (mal.user_id = u.user_id) | |
| | 468 | | -> Hash Right Join (cost=16695.39..16703.75 rows=28 width=28) (actual time=1452.560..1452.574 rows=12 loops=1) | |
| | 469 | | Hash Cond: (mal.month = (g.n)::numeric) | |
| | 470 | | -> Sort (cost=16694.98..16696.13 rows=462 width=48) (actual time=1452.522..1452.526 rows=9 loops=1) | |
| | 471 | | Sort Key: mal.month | |
| | 472 | | Sort Method: quicksort Memory: 25kB | |
| | 473 | | -> Hash Anti Join (cost=8343.14..16674.53 rows=462 width=48) (actual time=1452.502..1452.517 rows=9 loops=1) | |
| | 474 | | Hash Cond: (mal.month = mal1.month) | |
| | 475 | | Join Filter: (mal.user_id > mal1.user_id) | |
| | 476 | | Rows Removed by Join Filter: 11 | |
| | 477 | | -> Hash Join (cost=7.00..8329.22 rows=554 width=48) (actual time=1452.458..1452.468 rows=10 loops=1) | |
| | 478 | | Hash Cond: ((mal.month = mpm.month) AND (mal.count = mpm.max_month_counter)) | |
| | 479 | | -> CTE Scan on monthly_artist_listens mal (cost=0.00..2217.78 rows=110889 width=48) (actual time=0.249..0.252 rows=21 loops=1) | |
| | 480 | | -> Hash (cost=4.00..4.00 rows=200 width=40) (actual time=1452.199..1452.199 rows=9 loops=1) | |
| | 481 | | Buckets: 1024 Batches: 1 Memory Usage: 9kB | |
| | 482 | | -> CTE Scan on max_per_month mpm (cost=0.00..4.00 rows=200 width=40) (actual time=1452.187..1452.191 rows=9 loops=1) | |
| | 483 | | -> Hash (cost=8329.22..8329.22 rows=554 width=40) (actual time=0.028..0.029 rows=10 loops=1) | |
| | 484 | | Buckets: 1024 Batches: 1 Memory Usage: 9kB | |
| | 485 | | -> Hash Join (cost=7.00..8329.22 rows=554 width=40) (actual time=0.015..0.025 rows=10 loops=1) | |
| | 486 | | Hash Cond: ((mal1.month = mpm1.month) AND (mal1.count = mpm1.max_month_counter)) | |
| | 487 | | -> CTE Scan on monthly_artist_listens mal1 (cost=0.00..2217.78 rows=110889 width=48) (actual time=0.001..0.003 rows=21 loops=1) | |
| | 488 | | -> Hash (cost=4.00..4.00 rows=200 width=40) (actual time=0.006..0.006 rows=9 loops=1) | |
| | 489 | | Buckets: 1024 Batches: 1 Memory Usage: 9kB | |
| | 490 | | -> CTE Scan on max_per_month mpm1 (cost=0.00..4.00 rows=200 width=40) (actual time=0.001..0.002 rows=9 loops=1) | |
| | 491 | | -> Hash (cost=0.26..0.26 rows=12 width=12) (actual time=0.028..0.029 rows=12 loops=1) | |
| | 492 | | Buckets: 1024 Batches: 1 Memory Usage: 9kB | |
| | 493 | | -> Nested Loop (cost=0.00..0.26 rows=12 width=12) (actual time=0.019..0.023 rows=12 loops=1) | |
| | 494 | | -> CTE Scan on one_year oy (cost=0.00..0.02 rows=1 width=8) (actual time=0.012..0.012 rows=1 loops=1) | |
| | 495 | | -> Function Scan on generate_series g (cost=0.00..0.12 rows=12 width=4) (actual time=0.005..0.006 rows=12 loops=1) | |
| | 496 | | -> Hash (cost=2.34..2.34 rows=34 width=21) (actual time=0.038..0.038 rows=20 loops=1) | |
| | 497 | | Buckets: 1024 Batches: 1 Memory Usage: 10kB | |
| | 498 | | -> Seq Scan on users u (cost=0.00..2.34 rows=34 width=21) (actual time=0.027..0.032 rows=20 loops=1) | |
| | 499 | |Planning Time: 0.821 ms | |
| | 500 | |Execution Time: 1458.530 ms | |
| | 501 | +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| | 502 | }}} |
| | 503 | |
| | 504 | Просечното време е сега **1412ms**. |
| | 505 | |
| | 506 | Можеме да забележиме дека engine-от во едно поминување на индексот ги собира сите потребни редици. |
| | 507 | {{{ |
| | 508 | -> Index Only Scan using listens_song_timestamp_idx on listens l (cost=0.42..849.18 rows=22178 width=16) (actual time=0.005..5.489 rows=35664 loops=23) |
| | 509 | Index Cond: ((song_id = me.id) AND ("timestamp" >= oy_1.year_start) AND ("timestamp" <= oy_1.year_end)) |
| | 510 | }}} |
| | 511 | |
| | 512 | [[br]] |
| | 513 | **Забелешка**: Додавање на индекс за `listens(timestamp)` и `listens(song_id, timestamp)` може да се смета за делумно редундантно бидејќи во нашите случаи можат да се заменат со единствен сложен индекс на `listens(timestamp, song_id)`. Секако, редоследот на колоните во сложениот индекс има влијание врз перформансите. |
| | 514 | |
| | 515 | Во прашалникот кој го тестираме во сценарио 4 и двата пристапи покажаа подобрување во споредба со случајот без индекс, но перформансите со индексот `listens(song_id, timestamp)` беа подобри. Од друга страна, чување на два одделни индекси наместо еден троши повеќе меморија и го зголемува потребното време за INSERT операции. |
| | 516 | |
| | 517 | Станува збор за trade-off кој може дополнително да се дискутира. Ние одлучивме да чуваме два посебни индекси. |