| Version 2 (modified by , 4 days ago) ( diff ) |
|---|
QueryOptimization
Performance Analysis and Optimization
The performance of the non‑analytical views was analysed using EXPLAIN (ANALYZE, BUFFERS, VERBOSE, TIMING, FORMAT TEXT). Views with poor performance were optimized through indexing and, where necessary, replaced with parameterized functions.
Optimized Views
v_student_current_semester
- Performance: 23.459 ms
- This view performs well with the existing indexes. The execution plan shows efficient nested loops with index lookups throughout the join chain.
- Result: No optimization needed.
v_student_upcoming_deadlines
- Performance: 35.945 ms
- The view efficiently combines upcoming exercises and exams using a UNION ALL. Both branches use appropriate indexes.
- Result: No optimization needed.
v_student_transcript
- Performance: 0.426 ms
- Highly efficient, using primarily index scans with minimal buffer hits.
- Result: No optimization needed.
v_course_announcements
- Performance: 8.085 ms
- Performs well with the existing indexes on Course_Edition and Announcement.
- Result: No optimization needed.
v_exercise_grading_queue
- Performance: 122.986 ms
- This view is the most complex, joining through multiple tables. The execution time is still within the acceptable 3‑second threshold.
- Result: No optimization needed (performance is acceptable).
v_course_prerequisite_chain
- Performance: 0.563 ms
- The recursive CTE executes efficiently with minimal buffer hits.
- Result: No optimization needed.
Views Replaced with Functions
Two views exhibited severe performance issues that indexing could not resolve.
v_student_registration_state
Before Optimization
- Execution Time: ~2.3 seconds (2268 ms)
- Buffers: shared hit=6947 read=41906, temp read=64919 written=61623
Problem Analysis
The CTEs in the view scan all records before applying the WHERE filter. PostgreSQL materializes every CTE completely before executing the outer WHERE. Even though only data for a specific student_id is needed, the database:
- Scans all 1 M program enrollments
- Joins all 9.7 M grades with all 13.7 M course enrollments
- Expands through equivalences to ~16 M rows
- Then discards 99.999 % of them with the WHERE filter
Indexes cannot help because the CTEs have no direct condition on student_id. This wastes enormous I/O and CPU.
Optimization Attempt
Created the following indexes (which did not help):
CREATE INDEX CONCURRENTLY idx_pe_student_active ON Program_Enrollment(student_id) WHERE date_disenrollment IS NULL AND finished IS NOT TRUE; CREATE INDEX CONCURRENTLY idx_se_pe_year_sem ON Semester_Enrollment(program_enrollment_id, academic_year, semester); CREATE INDEX CONCURRENTLY idx_ce_semester ON Course_Enrollment(semester_enrollment_id); CREATE INDEX CONCURRENTLY idx_ce_edition ON Course_Enrollment(course_edition_id); CREATE INDEX CONCURRENTLY idx_sg_ce ON Student_Grade(course_enrollment_id); CREATE INDEX CONCURRENTLY idx_course_equiv_course1 ON Course_Equivalence(course1_code); CREATE INDEX CONCURRENTLY idx_course_equiv_course2 ON Course_Equivalence(course2_code);
- After Index Creation: No improvement – execution time ~2285 ms.
Solution: Parameterized Function
Replaced the view with a function that pushes the filter into the CTEs:
CREATE OR REPLACE FUNCTION fn_student_registration_state(p_student_id INTEGER) RETURNS TABLE (...) LANGUAGE plpgsql STABLE AS $$ BEGIN RETURN QUERY WITH active_pe AS ( -- Only one row due to student_id filter pushed down SELECT pe.id AS pe_id, pe.academic_program_code FROM Program_Enrollment pe WHERE pe.student_id = p_student_id AND pe.date_disenrollment IS NULL AND pe.finished IS NOT TRUE ), -- All subsequent CTEs now operate on one student's data ...
- After Optimization: ~370 ms — a 6× speed improvement.
v_teaching_assistant_eligibility
Before Optimization
- Execution Time: ~923 ms
- Buffers: shared hit=8369 read=40525, temp read=39620 written=39808
Problem Analysis
Similar to the previous view, the view processes all students and course editions before filtering, with large table scans and expensive joins.
Optimization Attempt
Created indexes (again with little effect). After indexing, execution time ~966 ms – no improvement.
Solution: Parameterized Function
Replaced the view with a function that first identifies the course edition and then filters to only students who have passed that course.
- After Optimization: ~940 ms – a slight improvement due to reduced data volume.
Optimization Summary
| View | Initial Time | Final Time | Method |
| v_student_current_semester | 23.459 ms | 23.459 ms | Existing indexes sufficient |
| v_student_upcoming_deadlines | 35.945 ms | 35.945 ms | Existing indexes sufficient |
| v_student_transcript | 0.426 ms | 0.426 ms | Existing indexes sufficient |
| v_course_announcements | 8.085 ms | 8.085 ms | Existing indexes sufficient |
| v_exercise_grading_queue | 122.986 ms | 122.986 ms | Acceptable performance |
| v_course_prerequisite_chain | 0.563 ms | 0.563 ms | Existing indexes sufficient |
| v_student_registration_state | 2268 ms | ~370 ms | Function replacement (6×) |
| v_teaching_assistant_eligibility | 923 ms | ~940 ms | Function replacement |
Lessons Learned
- CTE Materialization: PostgreSQL materializes CTEs, preventing early filtering. For queries that need to filter on student_id, CTEs can cause massive unnecessary work.
- Index Limitations: Indexes on individual tables cannot overcome the fundamental problem of processing data before filtering. When the query structure prevents early filtering, indexes provide limited benefit.
- Parameterized Functions: Pushing filters into functions allows PostgreSQL to apply conditions at the earliest possible stage, dramatically reducing I/O and processing.
- Analyze Before Indexing: Always analyse the execution plan before creating indexes. In some cases, the problem is structural, not missing indexes.
Attachments (2)
-
testing.sql
(3.0 KB
) - added by 4 days ago.
Code for prototyping optimization.
-
meta-functions.sql
(9.7 KB
) - added by 4 days ago.
Functional optimization of the two VIEWs.
Download all attachments as: .zip
