wiki:QueryOptimization

Query Optimization

Overview

This phase focuses on the analysis and optimization of database queries for the Courses Management System (CMS). The goal is to ensure that critical reporting and administrative tasks execute efficiently, even under high load. We distinguish between Analytical Views (using aggregate functions) and Non-Analytical Views (no aggregate functions allowed per course requirements).

The optimization process followed a structured approach:

  1. Analysis: Identified the primary filters and usage scenarios for each view.
  2. Performance Testing: Measured initial execution time using EXPLAIN (ANALYZE, BUFFERS, VERBOSE, TIMING, FORMAT TEXT).
  3. Optimization: Added necessary indexes or restructured queries.
  4. Re-testing: Measured final execution time.
  5. Decision: Determined if optimization was necessary based on whether the time was acceptable (< 3s for non-analytical, < 3s for analytical).

View 1: v_student_current_semester

  1. Primary Filter: The view is filtered by student_id (e.g., WHERE student_id = 12345).
  2. Usage Scenario: Used to display current semester courses for a specific student. This is critical for the application as it's used for course selection and grade viewing.
  3. Initial Execution Time: 0.94 ms.
  4. Slow Operations: None identified.
  5. Time with Indexes: N/A (No slow operations).
  6. Optimization Decision: No optimization needed. Execution time is acceptable.

View 2: v_student_upcoming_deadlines

  1. Primary Filter: The view is filtered by student_id (e.g., WHERE student_id = 12345).
  2. Usage Scenario: Used to display upcoming exercise and exam deadlines for a specific student. Critical for student dashboard.
  3. Initial Execution Time: 8.87 ms.
  4. Slow Operations: None identified.
  5. Time with Indexes: N/A (No slow operations).
  6. Optimization Decision: No optimization needed. Execution time is acceptable.

View 3: v_student_transcript

  1. Primary Filter: The view is filtered by student_id (e.g., WHERE student_id = 12345).
  2. Usage Scenario: Used to display the complete academic transcript for a student. Critical for student and teacher reporting.
  3. Initial Execution Time: 0.09 ms.
  4. Slow Operations: None identified.
  5. Time with Indexes: N/A (No slow operations).
  6. Optimization Decision: No optimization needed. Execution time is acceptable.

View 4: v_course_announcements

  1. Primary Filter: The view is filtered by edition_id (e.g., WHERE edition_id = 12345).
  2. Usage Scenario: Used to display course announcements. Critical for teacher and student communication.
  3. Initial Execution Time: 1.07 ms.
  4. Slow Operations: None identified.
  5. Time with Indexes: N/A (No slow operations).
  6. Optimization Decision: No optimization needed. Execution time is acceptable.

View 5: v_exercise_grading_queue

  1. Primary Filter: The view is filtered by exercise_id (e.g., WHERE exercise_id = 12345).
  2. Usage Scenario: Used to display the grading queue for a specific exercise. Critical for teacher grading workflow.
  3. Initial Execution Time: 92.67 ms.
  4. Slow Operations: Full scan on exercise_submission (2.4M rows) with parallel execution workers.
  5. Time with Indexes: N/A (No slow operations).
  6. Optimization Decision: No optimization needed. Execution time is acceptable.

View 6: v_course_prerequisite_chain

  1. Primary Filter: The view is filtered by course_code (e.g., WHERE course_code = 'F23L3S138').
  2. Usage Scenario: Used to display the prerequisite chain for a course. Critical for student enrollment eligibility.
  3. Initial Execution Time: 0.94 ms.
  4. Slow Operations: None identified.
  5. Time with Indexes: N/A (No slow operations).
  6. Optimization Decision: No optimization needed. Execution time is acceptable.

View 7: v_student_registration_state

  1. Primary Filter: The view is filtered by student_id (e.g., WHERE student_id = 12345).
  2. Usage Scenario: Used to check if a student can enroll in a course. Critical for enrollment system.
  3. Initial Execution Time: 2.18 s.
  4. Slow Operations: Full scan on Program_Enrollment, Course_Enrollment, Student_Grade, and Course_Equivalence. The CTEs inside the View materialized all data before applying the WHERE filter, wasting I/O and CPU.
  5. Time with Indexes: 2.20 s (Indexes did not help due to CTE materialization issue).
  6. Optimization Decision: The View was replaced by fn_student_registration_state function. The function pushes the filter down immediately, avoiding full table scans. Execution time reduced to ~400ms with proper filtering.

View 8: v_teaching_assistant_eligibility

  1. Primary Filter: The view is filtered by edition_id (e.g., WHERE edition_id = 13139).
  2. Usage Scenario: Used to check if a student is eligible to be a Teaching Assistant. Critical for TA assignment process.
  3. Initial Execution Time: 848 ms.
  4. Slow Operations: Full scan on Course_Enrollment, Teaches, Student_Grade, and Teaching_Assistant. The CTEs inside the View materialized all data before applying the WHERE filter.
  5. Time with Indexes: 841 ms (Indexes did not help due to CTE materialization issue).
  6. Optimization Decision: The View was replaced by fn_teaching_assistant_eligibility function. The function pushes the filter down immediately, avoiding full table scans. Execution time reduced to ~470ms with proper filtering.

Analytical Views (Extras)

The following views use aggregate functions and are included as extras. They execute in reasonable time (under 3s).

View 9: v_exam_results_summary

  1. Primary Filter: The view is filtered by exam_id (e.g., WHERE exam_id = 12345).
  2. Usage Scenario: Used to display exam results summary. Critical for teacher reporting.
  3. Initial Execution Time: 15 ms.
  4. Slow Operations: None identified.
  5. Time with Indexes: N/A (No slow operations).
  6. Optimization Decision: No optimization needed. Execution time is acceptable.

View 10: v_teacher_dashboard

  1. Primary Filter: The view is filtered by teacher_id (e.g., WHERE teacher_id = 12345).
  2. Usage Scenario: Used to display teacher dashboard. Critical for teacher workflow.
  3. Initial Execution Time: 8 ms.
  4. Slow Operations: None identified.
  5. Time with Indexes: N/A (No slow operations).
  6. Optimization Decision: No optimization needed. Execution time is acceptable.

Indexes Created During Optimization

Index Name Table Columns Purpose
idx_pe_student_active Program_Enrollment student_id Filter for active enrollments
idx_se_pe_year_sem Semester_Enrollment program_enrollment_id, academic_year, semester Filter for semester enrollments
idx_course_equiv_course1 Course_Equivalence course1_code Filter for course equivalences
idx_sg_ce Student_Grade course_enrollment_id Filter for grades
idx_teaches_teacher_edition Teaches teacher_id, course_edition_id Filter for teaching assignments
idx_exercise_edition Exercise course_edition_id Filter for exercises
idx_exam_edition Exam course_edition_id Filter for exams

Index Limitations and Function Replacement

Despite creating indexes, the v_student_registration_state and v_teaching_assistant_eligibility views still did not reduce query execution time significantly because:

  1. The CTEs inside the views did not have direct conditions on the indexed columns.
  2. PostgreSQL materialized every CTE completely before executing the outer WHERE filter.

Solution: Replaced the views with functions that push the filters down immediately.

Execution Plans Analysis

Execution plans were acquired using EXPLAIN (ANALYZE, BUFFERS, VERBOSE, TIMING, FORMAT TEXT).

Before Optimization (Views):

  • Seq Scan on Program_Enrollment followed by full join operations.
  • Full Table Scan on Student_Grade and Course_Enrollment.
  • High buffer usage due to materialized CTEs.

After Optimization (Functions):

  • Index Scan on Program_Enrollment immediately filtered by student_id.
  • Reduced buffer usage and faster I/O.
  • Lower CPU usage due to fewer unnecessary joins.

Database Size Impact

Optimization efforts focus on the 13GB total database size. By reducing unnecessary data movement (via function filtering) and utilizing specific indexes, we ensure that critical queries like v_student_current_semester and v_upcoming_deadlines run efficiently without triggering full table scans.

Summary

View Initial Time Optimized Time Optimization Needed
v_student_current_semester 0.94 ms 0.94 ms No
v_student_upcoming_deadlines 8.87 ms 8.87 ms No
v_student_transcript 0.09 ms 0.09 ms No
v_course_announcements 1.07 ms 1.07 ms No
v_exercise_grading_queue 92.67 ms 92.67 ms No
v_course_prerequisite_chain 0.94 ms 0.94 ms No
v_student_registration_state 2.18 s 0.40 s Yes (Function)
v_teaching_assistant_eligibility 848 ms 470 ms Yes (Function)
v_exam_results_summary 15 ms 15 ms No
v_teacher_dashboard 8 ms 8 ms No

All critical views now execute within acceptable time limits, ensuring the application functions smoothly.

Function Performance

The following functions were created to optimize critical queries:

Function Primary Filter Usage Scenario Execution Time
fn_student_registration_state student_id Check if a student can enroll in a course ~400ms
fn_teaching_assistant_eligibility edition_id Check if a student is eligible to be a TA ~470ms

No Additional Indices Retained

No additional indices were retained because they did not serve any purpose in the optimization process. The function-based approach proved more effective for the specific CTE materialization issues encountered.

Last modified 5 days ago Last modified on 05/20/26 15:28:05
Note: See TracWiki for help on using the wiki.