| 1 | | == SQL Performance and Index Optimization == |
| 2 | | |
| 3 | | The goal of this phase is to analyze and improve the performance of the most important database operations in the Online Learning Platform Management System. |
| 4 | | |
| 5 | | The application contains several frequent operations: |
| 6 | | |
| 7 | | browsing and filtering courses |
| 8 | | checking existing enrollments |
| 9 | | validating active user subscriptions |
| 10 | | loading course modules and lessons |
| 11 | | managing support tickets |
| 12 | | generating reports for enrollments, certificates and quiz attempts |
| 13 | | |
| 14 | | The following indexes are not automatically created by primary key constraints, but they can improve the performance of WHERE filters, JOIN operations and report queries. |
| 15 | | |
| 16 | | == Testing Methodology == |
| 17 | | |
| 18 | | The performance analysis is done using PostgreSQL EXPLAIN ANALYZE. |
| 19 | | |
| 20 | | The testing process is: |
| 21 | | |
| 22 | | first, the query is executed without additional indexes |
| 23 | | then, the required index is created |
| 24 | | the same query is executed again |
| 25 | | the execution plan and execution time are compared |
| 26 | | |
| 27 | | This allows us to check whether PostgreSQL uses Seq Scan, Index Scan or another optimized query plan. |
| 28 | | |
| 29 | | == Scenario 1: Course Browsing and Filtering == |
| 30 | | |
| 31 | | Users frequently browse available courses and filter them by course status, name or price. |
| 32 | | |
| 33 | | SQL query: |
| 34 | | |
| 35 | | {{{ |
| 36 | | EXPLAIN ANALYZE |
| 37 | | SELECT c.course_id, c.name, c.price, c.status |
| 38 | | FROM course c |
| 39 | | WHERE c.status = 'ACTIVE' |
| 40 | | ORDER BY c.price; |
| 41 | | }}} |
| | 1 | == SQL Performance Analysis of Complex Queries == |
| | 2 | |
| | 3 | The performance analysis in this phase focuses on the two most complex analytical queries developed in Phase 6. |
| | 4 | |
| | 5 | Instead of analyzing simple CRUD operations, we analyzed the most computationally expensive reports in the system. The goal was to identify the most expensive JOIN operations, grouping operations and date-based aggregations, and then improve their execution using indexes. |
| | 6 | |
| | 7 | For each query, we analyzed: |
| | 8 | |
| | 9 | JOIN conditions |
| | 10 | GROUP BY operations |
| | 11 | aggregate functions |
| | 12 | date-based filtering and grouping |
| | 13 | foreign key relationships |
| | 14 | |
| | 15 | The queries were tested using PostgreSQL EXPLAIN ANALYZE before and after index creation. |
| | 16 | |
| | 17 | === Scenario 1: Yearly Analysis of Overall Platform Activity and Student Success === |
| | 18 | |
| | 19 | This query provides a yearly analytical overview of the platform. |
| | 20 | |
| | 21 | It combines information from: |
| | 22 | |
| | 23 | courses |
| | 24 | enrollments |
| | 25 | certificates |
| | 26 | modules |
| | 27 | lessons |
| | 28 | quizzes |
| | 29 | quiz attempts |
| | 30 | |
| | 31 | The report calculates: |
| | 32 | |
| | 33 | total instructor activity |
| | 34 | total students |
| | 35 | total enrollments |
| | 36 | total quiz attempts |
| | 37 | completed enrollments |
| | 38 | total certificates |
| | 39 | average student success within courses |
| | 40 | average final grade per course |
| | 41 | final yearly success indicator |
| | 42 | |
| | 43 | The query performs multiple JOIN operations across the learning hierarchy: |
| | 44 | |
| | 45 | {{{ |
| | 46 | Course → Module → Lesson → Quiz → QuizAttempt |
| | 47 | Enrollment → Certificate |
| | 48 | Enrollment → QuizAttempt |
| | 49 | }}} |
| | 50 | |
| | 51 | The most expensive operations are the joins between: |
| | 52 | |
| | 53 | {{{ |
| | 54 | cert.enrollment_id = e.enrollment_id |
| | 55 | m.course_id = c.course_id |
| | 56 | l.module_id = m.module_id |
| | 57 | q.lesson_id = l.lesson_id |
| | 58 | qa.quiz_id = q.quiz_id |
| | 59 | qa.user_id = e.user_id |
| | 60 | }}} |
| | 61 | |
| | 62 | To improve performance, indexes were added on the columns used in the JOIN conditions. |
| 46 | | CREATE INDEX IF NOT EXISTS idx_course_status_price |
| 47 | | ON course(status, price); |
| 48 | | |
| 49 | | CREATE INDEX IF NOT EXISTS idx_course_name |
| 50 | | ON course(name); |
| 51 | | }}} |
| 52 | | |
| 53 | | Explanation: |
| 54 | | |
| 55 | | The composite index on (status, price) is useful because the query filters by status and sorts by price. |
| 56 | | |
| 57 | | The index on name improves course searching by course name. |
| 58 | | |
| 59 | | == Scenario 2: Enrollment Checking == |
| 60 | | |
| 61 | | Before a user enrolls in a course, the system checks whether the user is already enrolled in that course. |
| 62 | | |
| 63 | | SQL query: |
| 64 | | |
| 65 | | {{{ |
| 66 | | EXPLAIN ANALYZE |
| 67 | | SELECT e.enrollment_id, e.user_id, e.course_id, e.completion_status |
| 68 | | FROM enrollment e |
| 69 | | WHERE e.user_id = 1 |
| 70 | | AND e.course_id = 3; |
| 71 | | }}} |
| 72 | | |
| 73 | | Index: |
| 74 | | |
| 75 | | {{{ |
| 76 | | CREATE INDEX IF NOT EXISTS idx_enrollment_user_course |
| 77 | | ON enrollment(user_id, course_id); |
| 78 | | }}} |
| 79 | | |
| 80 | | Explanation: |
| 81 | | |
| 82 | | This index is useful because the application frequently searches enrollments by both user_id and course_id. |
| 83 | | |
| 84 | | Without this index, PostgreSQL may scan the whole enrollment table. With the index, the lookup is faster and more direct. |
| 85 | | |
| 86 | | == Scenario 3: Active Subscription Validation == |
| 87 | | |
| 88 | | When a user wants to access course content, the system checks whether the user has an active subscription. |
| 89 | | |
| 90 | | SQL query: |
| 91 | | |
| 92 | | {{{ |
| 93 | | EXPLAIN ANALYZE |
| 94 | | SELECT us.subscription_id, us.user_id, us.plan_id, us.status, us.end_date |
| 95 | | FROM user_subscription us |
| 96 | | WHERE us.user_id = 1 |
| 97 | | AND us.status = 'ACTIVE' |
| 98 | | AND us.end_date >= CURRENT_DATE; |
| 99 | | }}} |
| 100 | | |
| 101 | | Index: |
| 102 | | |
| 103 | | {{{ |
| 104 | | CREATE INDEX IF NOT EXISTS idx_user_subscription_active |
| 105 | | ON user_subscription(user_id, status, end_date); |
| 106 | | }}} |
| 107 | | |
| 108 | | Explanation: |
| 109 | | |
| 110 | | The index improves subscription validation because the query filters by user_id, status and end_date. |
| 111 | | |
| 112 | | This is important because subscription validation is performed before allowing access to paid course content. |
| 113 | | |
| 114 | | == Scenario 4: Instructor Course Management == |
| 115 | | |
| 116 | | Instructors frequently manage their own courses. The system displays all courses created by a selected instructor. |
| 117 | | |
| 118 | | SQL query: |
| 119 | | |
| 120 | | {{{ |
| 121 | | EXPLAIN ANALYZE |
| 122 | | SELECT c.course_id, c.name, c.price, c.status |
| 123 | | FROM course c |
| 124 | | WHERE c.instructor_id = 5; |
| 125 | | }}} |
| 126 | | |
| 127 | | Index: |
| 128 | | |
| 129 | | {{{ |
| 130 | | CREATE INDEX IF NOT EXISTS idx_course_instructor_id |
| 131 | | ON course(instructor_id); |
| 132 | | }}} |
| 133 | | |
| 134 | | Explanation: |
| 135 | | |
| 136 | | The instructor_id column is used for filtering courses by instructor. This index improves the performance of the instructor dashboard. |
| 137 | | |
| 138 | | == Scenario 5: Course Content Loading == |
| 139 | | |
| 140 | | When a user opens a course, the application loads all modules and lessons for that course. |
| 141 | | |
| 142 | | SQL query: |
| 143 | | |
| 144 | | {{{ |
| 145 | | EXPLAIN ANALYZE |
| 146 | | SELECT |
| 147 | | m.module_id, |
| 148 | | m.title AS module_title, |
| 149 | | l.lesson_id, |
| 150 | | l.title AS lesson_title |
| 151 | | FROM module m |
| 152 | | JOIN lesson l ON l.module_id = m.module_id |
| 153 | | WHERE m.course_id = 3 |
| 154 | | ORDER BY m.module_id, l.lesson_id; |
| 155 | | }}} |
| 156 | | |
| 157 | | Indexes: |
| 158 | | |
| 159 | | {{{ |
| 160 | | CREATE INDEX IF NOT EXISTS idx_module_course_id |
| | 67 | CREATE INDEX IF NOT EXISTS idx_certificate_enrollment |
| | 68 | ON certificate(enrollment_id); |
| | 69 | |
| | 70 | CREATE INDEX IF NOT EXISTS idx_module_course |
| 165 | | }}} |
| 166 | | |
| 167 | | Explanation: |
| 168 | | |
| 169 | | The index on module(course_id) improves loading modules for a selected course. |
| 170 | | |
| 171 | | The index on lesson(module_id) improves loading lessons for each module. |
| 172 | | |
| 173 | | == Scenario 6: Admin Support Ticket Filtering == |
| 174 | | |
| 175 | | Administrators frequently filter support tickets by status and check the newest tickets first. |
| 176 | | |
| 177 | | SQL query: |
| 178 | | |
| 179 | | {{{ |
| 180 | | EXPLAIN ANALYZE |
| 181 | | SELECT st.ticket_id, st.subject, st.status, st.created_at |
| 182 | | FROM support_ticket st |
| 183 | | WHERE st.status = 'OPEN' |
| 184 | | ORDER BY st.created_at DESC; |
| 185 | | }}} |
| 186 | | |
| 187 | | Index: |
| 188 | | |
| 189 | | {{{ |
| 190 | | CREATE INDEX IF NOT EXISTS idx_support_ticket_status_created |
| 191 | | ON support_ticket(status, created_at DESC); |
| 192 | | }}} |
| 193 | | |
| 194 | | Explanation: |
| 195 | | |
| 196 | | This index is useful because the admin panel filters tickets by status and sorts them by created_at. |
| 197 | | |
| 198 | | == Scenario 7: Annual Enrollment and Certificate Report == |
| 199 | | |
| 200 | | This report analyzes enrollments, completed courses and issued certificates per year. |
| 201 | | |
| 202 | | SQL query: |
| 203 | | |
| 204 | | {{{ |
| 205 | | EXPLAIN ANALYZE |
| 206 | | SELECT |
| 207 | | EXTRACT(YEAR FROM e.enroll_date) AS year, |
| 208 | | COUNT(DISTINCT e.enrollment_id) AS total_enrollments, |
| 209 | | COUNT(DISTINCT CASE |
| 210 | | WHEN e.completion_status = 'COMPLETED' THEN e.enrollment_id |
| 211 | | END) AS completed_enrollments, |
| 212 | | COUNT(DISTINCT c.certificate_id) AS issued_certificates, |
| 213 | | AVG(e.progress_percentage) AS average_progress |
| 214 | | FROM enrollment e |
| 215 | | LEFT JOIN certificate c ON c.enrollment_id = e.enrollment_id |
| 216 | | GROUP BY EXTRACT(YEAR FROM e.enroll_date) |
| 217 | | ORDER BY year; |
| 218 | | }}} |
| 219 | | |
| 220 | | Indexes: |
| 221 | | |
| 222 | | {{{ |
| 223 | | CREATE INDEX IF NOT EXISTS idx_enrollment_enroll_date |
| | 75 | |
| | 76 | CREATE INDEX IF NOT EXISTS idx_quiz_lesson |
| | 77 | ON quiz(lesson_id); |
| | 78 | |
| | 79 | CREATE INDEX IF NOT EXISTS idx_quiz_attempt_quiz_user |
| | 80 | ON quiz_attempt(quiz_id, user_id); |
| | 81 | |
| | 82 | CREATE INDEX IF NOT EXISTS idx_enrollment_date |
| 225 | | |
| 226 | | CREATE INDEX IF NOT EXISTS idx_enrollment_completion_status |
| 227 | | ON enrollment(completion_status); |
| 228 | | |
| 229 | | CREATE INDEX IF NOT EXISTS idx_certificate_enrollment_id |
| 230 | | ON certificate(enrollment_id); |
| 231 | | }}} |
| 232 | | |
| 233 | | Explanation: |
| 234 | | |
| 235 | | The index on enroll_date improves time-based reports. |
| 236 | | |
| 237 | | The index on completion_status helps when filtering completed enrollments. |
| 238 | | |
| 239 | | The index on certificate(enrollment_id) improves the join between certificate and enrollment. |
| 240 | | |
| 241 | | == Scenario 8: Quiz Attempts Report == |
| 242 | | |
| 243 | | The system can analyze quiz attempts by user, quiz and attempt date. |
| 244 | | |
| 245 | | SQL query: |
| 246 | | |
| 247 | | {{{ |
| 248 | | EXPLAIN ANALYZE |
| 249 | | SELECT |
| 250 | | qa.user_id, |
| 251 | | qa.quiz_id, |
| 252 | | COUNT(*) AS total_attempts, |
| 253 | | AVG(qa.score) AS average_score |
| 254 | | FROM quiz_attempt qa |
| 255 | | GROUP BY qa.user_id, qa.quiz_id |
| 256 | | ORDER BY qa.user_id; |
| 257 | | }}} |
| 258 | | |
| 259 | | Indexes: |
| 260 | | |
| 261 | | {{{ |
| 262 | | CREATE INDEX IF NOT EXISTS idx_quiz_attempt_user_id |
| 263 | | ON quiz_attempt(user_id); |
| 264 | | |
| 265 | | CREATE INDEX IF NOT EXISTS idx_quiz_attempt_quiz_id |
| | 84 | }}} |
| | 85 | |
| | 86 | Explanation: |
| | 87 | |
| | 88 | The first five indexes reduce the cost of JOIN operations by allowing PostgreSQL to directly locate related records instead of scanning entire tables. |
| | 89 | |
| | 90 | The index on enroll_date improves the yearly grouping operation because the report groups data by year. |
| | 91 | |
| | 92 | === Scenario 2: Monthly Analysis of Course Activity, Completions, Instructor Involvement and Student Success === |
| | 93 | |
| | 94 | This query provides a month-by-month analytical overview of the platform for a selected year. |
| | 95 | |
| | 96 | The report combines: |
| | 97 | |
| | 98 | enrollment activity |
| | 99 | course completions |
| | 100 | instructor participation |
| | 101 | quiz performance |
| | 102 | student success metrics |
| | 103 | |
| | 104 | The query performs multiple JOIN operations through the complete learning structure: |
| | 105 | |
| | 106 | {{{ |
| | 107 | QuizAttempt → Quiz → Lesson → Module → Course |
| | 108 | Enrollment → Course |
| | 109 | Enrollment → User |
| | 110 | }}} |
| | 111 | |
| | 112 | The most important JOIN conditions are: |
| | 113 | |
| | 114 | {{{ |
| | 115 | qa.quiz_id = q.quiz_id |
| | 116 | q.lesson_id = l.lesson_id |
| | 117 | l.module_id = m.module_id |
| | 118 | m.course_id = c.course_id |
| | 119 | e.course_id = c.course_id |
| | 120 | }}} |
| | 121 | |
| | 122 | Additionally, the report groups data by: |
| | 123 | |
| | 124 | {{{ |
| | 125 | YEAR(enroll_date) |
| | 126 | MONTH(enroll_date) |
| | 127 | |
| | 128 | YEAR(attempt_date) |
| | 129 | MONTH(attempt_date) |
| | 130 | }}} |
| | 131 | |
| | 132 | To optimize these operations, the following indexes were added: |
| | 133 | |
| | 134 | {{{ |
| | 135 | CREATE INDEX IF NOT EXISTS idx_module_course |
| | 136 | ON module(course_id); |
| | 137 | |
| | 138 | CREATE INDEX IF NOT EXISTS idx_lesson_module |
| | 139 | ON lesson(module_id); |
| | 140 | |
| | 141 | CREATE INDEX IF NOT EXISTS idx_quiz_lesson |
| | 142 | ON quiz(lesson_id); |
| | 143 | |
| | 144 | CREATE INDEX IF NOT EXISTS idx_quiz_attempt_quiz |