Query Optimization
Overview
This phase focuses on the analysis and optimization of database queries for the Courses Management System (CMS). The goal is to ensure that critical reporting and administrative tasks execute efficiently, even under high load. We distinguish between Analytical Views (using aggregate functions) and Non-Analytical Views (no aggregate functions allowed per course requirements).
The optimization process followed a structured approach:
- Analysis: Identified the primary filters and usage scenarios for each view.
- Performance Testing: Measured initial execution time using
EXPLAIN (ANALYZE, BUFFERS, VERBOSE, TIMING, FORMAT TEXT). - Optimization: Added necessary indexes or restructured queries.
- Re-testing: Measured final execution time.
- Decision: Determined if optimization was necessary based on whether the time was acceptable (< 3s for non-analytical, < 3s for analytical).
View 1: v_student_current_semester
- Primary Filter: The view is filtered by
student_id(e.g.,WHERE student_id = 12345). - Usage Scenario: Used to display current semester courses for a specific student. This is critical for the application as it's used for course selection and grade viewing.
- Initial Execution Time: 0.94 ms.
- Slow Operations: None identified.
- Time with Indexes: N/A (No slow operations).
- Optimization Decision: No optimization needed. Execution time is acceptable.
View 2: v_student_upcoming_deadlines
- Primary Filter: The view is filtered by
student_id(e.g.,WHERE student_id = 12345). - Usage Scenario: Used to display upcoming exercise and exam deadlines for a specific student. Critical for student dashboard.
- Initial Execution Time: 8.87 ms.
- Slow Operations: None identified.
- Time with Indexes: N/A (No slow operations).
- Optimization Decision: No optimization needed. Execution time is acceptable.
View 3: v_student_transcript
- Primary Filter: The view is filtered by
student_id(e.g.,WHERE student_id = 12345). - Usage Scenario: Used to display the complete academic transcript for a student. Critical for student and teacher reporting.
- Initial Execution Time: 0.09 ms.
- Slow Operations: None identified.
- Time with Indexes: N/A (No slow operations).
- Optimization Decision: No optimization needed. Execution time is acceptable.
View 4: v_course_announcements
- Primary Filter: The view is filtered by
edition_id(e.g.,WHERE edition_id = 12345). - Usage Scenario: Used to display course announcements. Critical for teacher and student communication.
- Initial Execution Time: 1.07 ms.
- Slow Operations: None identified.
- Time with Indexes: N/A (No slow operations).
- Optimization Decision: No optimization needed. Execution time is acceptable.
View 5: v_exercise_grading_queue
- Primary Filter: The view is filtered by
exercise_id(e.g.,WHERE exercise_id = 12345). - Usage Scenario: Used to display the grading queue for a specific exercise. Critical for teacher grading workflow.
- Initial Execution Time: 92.67 ms.
- Slow Operations: Full scan on
exercise_submission(2.4M rows) with parallel execution workers. - Time with Indexes: N/A (No slow operations).
- Optimization Decision: No optimization needed. Execution time is acceptable.
View 6: v_course_prerequisite_chain
- Primary Filter: The view is filtered by
course_code(e.g.,WHERE course_code = 'F23L3S138'). - Usage Scenario: Used to display the prerequisite chain for a course. Critical for student enrollment eligibility.
- Initial Execution Time: 0.94 ms.
- Slow Operations: None identified.
- Time with Indexes: N/A (No slow operations).
- Optimization Decision: No optimization needed. Execution time is acceptable.
View 7: v_student_registration_state
- Primary Filter: The view is filtered by
student_id(e.g.,WHERE student_id = 12345). - Usage Scenario: Used to check if a student can enroll in a course. Critical for enrollment system.
- Initial Execution Time: 2.18 s.
- Slow Operations: Full scan on
Program_Enrollment,Course_Enrollment,Student_Grade, andCourse_Equivalence. The CTEs inside the View materialized all data before applying the WHERE filter, wasting I/O and CPU. - Time with Indexes: 2.20 s (Indexes did not help due to CTE materialization issue).
- Optimization Decision: The View was replaced by
fn_student_registration_statefunction. The function pushes the filter down immediately, avoiding full table scans. Execution time reduced to ~400ms with proper filtering.
View 8: v_teaching_assistant_eligibility
- Primary Filter: The view is filtered by
edition_id(e.g.,WHERE edition_id = 13139). - Usage Scenario: Used to check if a student is eligible to be a Teaching Assistant. Critical for TA assignment process.
- Initial Execution Time: 848 ms.
- Slow Operations: Full scan on
Course_Enrollment,Teaches,Student_Grade, andTeaching_Assistant. The CTEs inside the View materialized all data before applying the WHERE filter. - Time with Indexes: 841 ms (Indexes did not help due to CTE materialization issue).
- Optimization Decision: The View was replaced by
fn_teaching_assistant_eligibilityfunction. The function pushes the filter down immediately, avoiding full table scans. Execution time reduced to ~470ms with proper filtering.
Analytical Views (Extras)
The following views use aggregate functions and are included as extras. They execute in reasonable time (under 3s).
View 9: v_exam_results_summary
- Primary Filter: The view is filtered by
exam_id(e.g.,WHERE exam_id = 12345). - Usage Scenario: Used to display exam results summary. Critical for teacher reporting.
- Initial Execution Time: 15 ms.
- Slow Operations: None identified.
- Time with Indexes: N/A (No slow operations).
- Optimization Decision: No optimization needed. Execution time is acceptable.
View 10: v_teacher_dashboard
- Primary Filter: The view is filtered by
teacher_id(e.g.,WHERE teacher_id = 12345). - Usage Scenario: Used to display teacher dashboard. Critical for teacher workflow.
- Initial Execution Time: 8 ms.
- Slow Operations: None identified.
- Time with Indexes: N/A (No slow operations).
- Optimization Decision: No optimization needed. Execution time is acceptable.
Indexes Created During Optimization
| Index Name | Table | Columns | Purpose |
|---|---|---|---|
idx_pe_student_active | Program_Enrollment | student_id | Filter for active enrollments |
idx_se_pe_year_sem | Semester_Enrollment | program_enrollment_id, academic_year, semester | Filter for semester enrollments |
idx_course_equiv_course1 | Course_Equivalence | course1_code | Filter for course equivalences |
idx_sg_ce | Student_Grade | course_enrollment_id | Filter for grades |
idx_teaches_teacher_edition | Teaches | teacher_id, course_edition_id | Filter for teaching assignments |
idx_exercise_edition | Exercise | course_edition_id | Filter for exercises |
idx_exam_edition | Exam | course_edition_id | Filter for exams |
Index Limitations and Function Replacement
Despite creating indexes, the v_student_registration_state and v_teaching_assistant_eligibility views still did not reduce query execution time significantly because:
- The CTEs inside the views did not have direct conditions on the indexed columns.
- PostgreSQL materialized every CTE completely before executing the outer WHERE filter.
Solution: Replaced the views with functions that push the filters down immediately.
Execution Plans Analysis
Execution plans were acquired using EXPLAIN (ANALYZE, BUFFERS, VERBOSE, TIMING, FORMAT TEXT).
Before Optimization (Views):
- Seq Scan on
Program_Enrollmentfollowed by full join operations. - Full Table Scan on
Student_GradeandCourse_Enrollment. - High buffer usage due to materialized CTEs.
After Optimization (Functions):
- Index Scan on
Program_Enrollmentimmediately filtered bystudent_id. - Reduced buffer usage and faster I/O.
- Lower CPU usage due to fewer unnecessary joins.
Database Size Impact
Optimization efforts focus on the 13GB total database size. By reducing unnecessary data movement (via function filtering) and utilizing specific indexes, we ensure that critical queries like v_student_current_semester and v_upcoming_deadlines run efficiently without triggering full table scans.
Summary
| View | Initial Time | Optimized Time | Optimization Needed |
|---|---|---|---|
| v_student_current_semester | 0.94 ms | 0.94 ms | No |
| v_student_upcoming_deadlines | 8.87 ms | 8.87 ms | No |
| v_student_transcript | 0.09 ms | 0.09 ms | No |
| v_course_announcements | 1.07 ms | 1.07 ms | No |
| v_exercise_grading_queue | 92.67 ms | 92.67 ms | No |
| v_course_prerequisite_chain | 0.94 ms | 0.94 ms | No |
| v_student_registration_state | 2.18 s | 0.40 s | Yes (Function) |
| v_teaching_assistant_eligibility | 848 ms | 470 ms | Yes (Function) |
| v_exam_results_summary | 15 ms | 15 ms | No |
| v_teacher_dashboard | 8 ms | 8 ms | No |
All critical views now execute within acceptable time limits, ensuring the application functions smoothly.
Function Performance
The following functions were created to optimize critical queries:
| Function | Primary Filter | Usage Scenario | Execution Time |
|---|---|---|---|
fn_student_registration_state | student_id | Check if a student can enroll in a course | ~400ms |
fn_teaching_assistant_eligibility | edition_id | Check if a student is eligible to be a TA | ~470ms |
No Additional Indices Retained
No additional indices were retained because they did not serve any purpose in the optimization process. The function-based approach proved more effective for the specific CTE materialization issues encountered.
