= QueryOptimization = == Performance Analysis and Optimization == The performance of the non‑analytical views was analysed using `EXPLAIN (ANALYZE, BUFFERS, VERBOSE, TIMING, FORMAT TEXT)`. Views with poor performance were optimized through indexing and, where necessary, replaced with parameterized functions. == Optimized Views == === `v_student_current_semester` === - '''Performance''': 23.459 ms - This view performs well with the existing indexes. The execution plan shows efficient nested loops with index lookups throughout the join chain. - '''Result''': No optimization needed. === `v_student_upcoming_deadlines` === - '''Performance''': 35.945 ms - The view efficiently combines upcoming exercises and exams using a UNION ALL. Both branches use appropriate indexes. - '''Result''': No optimization needed. === `v_student_transcript` === - '''Performance''': 0.426 ms - Highly efficient, using primarily index scans with minimal buffer hits. - '''Result''': No optimization needed. === v_course_announcements === - '''Performance''': 8.085 ms - Performs well with the existing indexes on Course_Edition and Announcement. - '''Result''': No optimization needed. === v_exercise_grading_queue === - '''Performance''': 122.986 ms - This view is the most complex, joining through multiple tables. The execution time is still within the acceptable 3‑second threshold. - '''Result''': No optimization needed (performance is acceptable). === v_course_prerequisite_chain === - '''Performance''': 0.563 ms - The recursive CTE executes efficiently with minimal buffer hits. - '''Result''': No optimization needed. == Views Replaced with Functions == Two views exhibited severe performance issues that indexing could not resolve. === `v_student_registration_state` === ==== Before Optimization ==== - '''Execution Time''': ~2.3 seconds (2268 ms) - '''Buffers''': shared hit=6947 read=41906, temp read=64919 written=61623 ==== Problem Analysis ==== The CTEs in the view scan all records before applying the WHERE filter. PostgreSQL materializes every CTE completely before executing the outer `WHERE`. Even though only data for a specific `student_id` is needed, the database: - Scans all 1 M program enrollments - Joins all 9.7 M grades with all 13.7 M course enrollments - Expands through equivalences to ~16 M rows - Then discards 99.999 % of them with the WHERE filter Indexes cannot help because the CTEs have no direct condition on `student_id`. This wastes enormous I/O and CPU. ==== Optimization Attempt ==== Created the following indexes (which did not help): {{{ #!sql CREATE INDEX CONCURRENTLY idx_pe_student_active ON Program_Enrollment(student_id) WHERE date_disenrollment IS NULL AND finished IS NOT TRUE; CREATE INDEX CONCURRENTLY idx_se_pe_year_sem ON Semester_Enrollment(program_enrollment_id, academic_year, semester); CREATE INDEX CONCURRENTLY idx_ce_semester ON Course_Enrollment(semester_enrollment_id); CREATE INDEX CONCURRENTLY idx_ce_edition ON Course_Enrollment(course_edition_id); CREATE INDEX CONCURRENTLY idx_sg_ce ON Student_Grade(course_enrollment_id); CREATE INDEX CONCURRENTLY idx_course_equiv_course1 ON Course_Equivalence(course1_code); CREATE INDEX CONCURRENTLY idx_course_equiv_course2 ON Course_Equivalence(course2_code); }}} - '''After Index Creation''': No improvement – execution time ~2285 ms. ==== Solution: Parameterized Function ==== Replaced the view with a function that pushes the filter into the CTEs: {{{ #!sql CREATE OR REPLACE FUNCTION fn_student_registration_state(p_student_id INTEGER) RETURNS TABLE (...) LANGUAGE plpgsql STABLE AS $$ BEGIN RETURN QUERY WITH active_pe AS ( -- Only one row due to student_id filter pushed down SELECT pe.id AS pe_id, pe.academic_program_code FROM Program_Enrollment pe WHERE pe.student_id = p_student_id AND pe.date_disenrollment IS NULL AND pe.finished IS NOT TRUE ), -- All subsequent CTEs now operate on one student's data ... }}} - '''After Optimization''': ~370 ms — a '''6× speed improvement'''. === `v_teaching_assistant_eligibility` === ==== Before Optimization ==== - '''Execution Time''': ~923 ms - '''Buffers''': shared hit=8369 read=40525, temp read=39620 written=39808 ==== Problem Analysis ==== Similar to the previous view, the view processes all students and course editions before filtering, with large table scans and expensive joins. ==== Optimization Attempt ==== Created indexes (again with little effect). After indexing, execution time ~966 ms – no improvement. ==== Solution: Parameterized Function ==== Replaced the view with a function that first identifies the course edition and then filters to only students who have passed that course. - '''After Optimization''': ~940 ms – a slight improvement due to reduced data volume. == Optimization Summary == || '''View''' || '''Initial Time''' || '''Final Time''' || '''Method''' || || v_student_current_semester || 23.459 ms || 23.459 ms || Existing indexes sufficient || || v_student_upcoming_deadlines || 35.945 ms || 35.945 ms || Existing indexes sufficient || || v_student_transcript || 0.426 ms || 0.426 ms || Existing indexes sufficient || || v_course_announcements || 8.085 ms || 8.085 ms || Existing indexes sufficient || || v_exercise_grading_queue || 122.986 ms || 122.986 ms || Acceptable performance || || v_course_prerequisite_chain || 0.563 ms || 0.563 ms || Existing indexes sufficient || || v_student_registration_state || 2268 ms || ~370 ms || Function replacement (6×) || || v_teaching_assistant_eligibility || 923 ms || ~940 ms || Function replacement || == Lessons Learned == - '''CTE Materialization''': PostgreSQL materializes CTEs, preventing early filtering. For queries that need to filter on student_id, CTEs can cause massive unnecessary work. - '''Index Limitations''': Indexes on individual tables cannot overcome the fundamental problem of processing data before filtering. When the query structure prevents early filtering, indexes provide limited benefit. - '''Parameterized Functions''': Pushing filters into functions allows PostgreSQL to apply conditions at the earliest possible stage, dramatically reducing I/O and processing. - '''Analyze Before Indexing''': Always analyse the execution plan before creating indexes. In some cases, the problem is structural, not missing indexes.