wiki:OtherDevelopments

Context Navigation

Version 8 (modified by 221296, 5 days ago) ( diff )
--

SQL Performance Analysis of Complex Queries

The performance analysis in this phase focuses on the two most complex analytical queries developed in Phase 6.

Instead of analyzing simple CRUD operations, we analyzed the most computationally expensive reports in the system. The goal was to identify the most expensive JOIN operations, grouping operations and date-based aggregations, and evaluate whether indexes could improve their execution performance.

For each query, we analyzed:

JOIN conditions GROUP BY operations aggregate functions date-based filtering and grouping foreign key relationships

The queries were tested using PostgreSQL EXPLAIN ANALYZE before and after index creation.

Scenario 1: Yearly analysis of overall platform activity and student success across enrollments, quiz attempts, completions, certificates, and course performance

WITH course_year_performance AS (
    SELECT
        EXTRACT(YEAR FROM qa.attempt_date) AS year,
        c.course_id,
        ROUND(AVG(qa.score), 2) AS final_grade_per_course
    FROM quiz_attempt qa
    JOIN quiz q
        ON q.quiz_id = qa.quiz_id
    JOIN lesson l
        ON l.lesson_id = q.lesson_id
    JOIN module m
        ON m.module_id = l.module_id
    JOIN course c
        ON c.course_id = m.course_id
    GROUP BY
        EXTRACT(YEAR FROM qa.attempt_date),
        c.course_id
),
year_course_summary AS (
    SELECT
        year,
        ROUND(AVG(final_grade_per_course), 2) AS final_year_success
    FROM course_year_performance
    GROUP BY year
)
SELECT
    EXTRACT(YEAR FROM e.enroll_date) AS year,
    COUNT(DISTINCT c.instructor_id) AS total_instructor_activity,
    COUNT(DISTINCT e.user_id) AS total_students,
    COUNT(DISTINCT e.enrollment_id) AS total_enrollments,
    COUNT(DISTINCT qa.attempt_id) AS total_attempts,
    COUNT(DISTINCT CASE
        WHEN e.completion_status = 'COMPLETED' THEN e.enrollment_id
    END) AS completed_enrollments,
    COUNT(DISTINCT cert.certificate_id) AS total_certificates,
    ROUND(AVG(e.progress_percentage), 2) AS avg_student_success_within_courses,
    ROUND(AVG(qa.score), 2) AS avg_final_grade_per_course, ycs.final_year_success
FROM enrollment e
JOIN course c
    ON c.course_id = e.course_id
LEFT JOIN certificate cert
    ON cert.enrollment_id = e.enrollment_id
LEFT JOIN module m
    ON m.course_id = c.course_id
LEFT JOIN lesson l
    ON l.module_id = m.module_id
LEFT JOIN quiz q
    ON q.lesson_id = l.lesson_id
LEFT JOIN quiz_attempt qa
    ON qa.quiz_id = q.quiz_id
    AND qa.user_id = e.user_id
LEFT JOIN year_course_summary ycs
    ON ycs.year = EXTRACT(YEAR FROM e.enroll_date)
GROUP BY
    EXTRACT(YEAR FROM e.enroll_date), ycs.final_year_success
ORDER BY year;

This query provides a yearly analytical overview of the platform.

It combines information from:

courses enrollments certificates modules lessons quizzes quiz attempts

The report calculates:

total instructor activity total students total enrollments total quiz attempts completed enrollments total certificates average student success within courses average final grade per course final yearly success indicator

The query performs multiple JOIN operations across the learning hierarchy:

Course → Module → Lesson → Quiz → QuizAttempt
Enrollment → Certificate
Enrollment → QuizAttempt

The most expensive operations are the joins between:

cert.enrollment_id = e.enrollment_id
m.course_id = c.course_id
l.module_id = m.module_id
q.lesson_id = l.lesson_id
qa.quiz_id = q.quiz_id
qa.user_id = e.user_id

To improve performance, indexes were added on the columns used in the JOIN conditions.

Indexes:

CREATE INDEX IF NOT EXISTS idx_certificate_enrollment
ON certificate(enrollment_id);

CREATE INDEX IF NOT EXISTS idx_module_course
ON module(course_id);

CREATE INDEX IF NOT EXISTS idx_lesson_module
ON lesson(module_id);

CREATE INDEX IF NOT EXISTS idx_quiz_lesson
ON quiz(lesson_id);

CREATE INDEX IF NOT EXISTS idx_quiz_attempt_quiz_user
ON quiz_attempt(quiz_id, user_id);

CREATE INDEX IF NOT EXISTS idx_enrollment_date
ON enrollment(enroll_date);

Explanation:

The first five indexes reduce the cost of JOIN operations by allowing PostgreSQL to directly locate related records instead of scanning entire tables.

The index on enroll_date improves the yearly grouping operation because the report groups data by year.

Result without indexes:

GroupAggregate  (cost=80709.32..144979.11 rows=73000 width=176) (actual time=164.281..170.938 rows=2 loops=1)
  Group Key: (EXTRACT(year FROM e.enroll_date)), ycs.final_year_success
  ->  Incremental Sort  (cost=80709.32..128051.98 rows=555805 width=103) (actual time=148.388..157.374 rows=20006 loops=1)
        Sort Key: (EXTRACT(year FROM e.enroll_date)), ycs.final_year_success, c.instructor_id
        Presorted Key: (EXTRACT(year FROM e.enroll_date))
        Full-sort Groups: 2  Sort Method: quicksort  Average Memory: 29kB  Peak Memory: 29kB
        Pre-sorted Groups: 2  Sort Method: quicksort  Average Memory: 1054kB  Peak Memory: 1054kB
        ->  Merge Left Join  (cost=80598.37..91715.97 rows=555805 width=103) (actual time=137.946..145.502 rows=20006 loops=1)
              Merge Cond: ((EXTRACT(year FROM e.enroll_date)) = ycs.year)
              ->  Sort  (cost=79099.25..80488.76 rows=555805 width=43) (actual time=108.940..110.102 rows=20006 loops=1)
                    Sort Key: (EXTRACT(year FROM e.enroll_date))
                    Sort Method: quicksort  Memory: 2019kB
                    ->  Hash Right Join  (cost=1576.68..8962.73 rows=555805 width=43) (actual time=91.417..97.214 rows=20006 loops=1)
                          Hash Cond: ((qa.quiz_id = q.quiz_id) AND (qa.user_id = e.user_id))
                          ->  Seq Scan on quiz_attempt qa  (cost=0.00..328.00 rows=20000 width=16) (actual time=0.016..1.444 rows=20000 loops=1)
                          ->  Hash  (cost=805.50..805.50 rows=51412 width=39) (actual time=86.618..86.623 rows=20006 loops=1)
                                Buckets: 65536  Batches: 1  Memory Usage: 1841kB
                                ->  Hash Right Join  (cost=235.03..805.50 rows=51412 width=39) (actual time=76.887..80.470 rows=20006 loops=1)
                                      Hash Cond: (m.course_id = c.course_id)
                                      ->  Hash Right Join  (cost=29.45..67.86 rows=1850 width=8) (actual time=0.114..0.139 rows=18 loops=1)
                                            Hash Cond: (l.module_id = m.module_id)
                                            ->  Hash Right Join  (cost=14.72..48.18 rows=1850 width=8) (actual time=0.054..0.070 rows=18 loops=1)
                                                  Hash Cond: (q.lesson_id = l.lesson_id)
                                                  ->  Seq Scan on quiz q  (cost=0.00..28.50 rows=1850 width=8) (actual time=0.007..0.011 rows=18 loops=1)
                                                  ->  Hash  (cost=12.10..12.10 rows=210 width=8) (actual time=0.027..0.027 rows=18 loops=1)
                                                        Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                        ->  Seq Scan on lesson l  (cost=0.00..12.10 rows=210 width=8) (actual time=0.015..0.017 rows=18 loops=1)
                                            ->  Hash  (cost=12.10..12.10 rows=210 width=8) (actual time=0.040..0.040 rows=10 loops=1)
                                                  Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                  ->  Seq Scan on module m  (cost=0.00..12.10 rows=210 width=8) (actual time=0.029..0.031 rows=10 loops=1)
                                      ->  Hash  (cost=143.05..143.05 rows=5002 width=39) (actual time=76.677..76.679 rows=5002 loops=1)
                                            Buckets: 8192  Batches: 1  Memory Usage: 397kB
                                            ->  Hash Left Join  (cost=29.45..143.05 rows=5002 width=39) (actual time=72.865..75.398 rows=5002 loops=1)
                                                  Hash Cond: (e.enrollment_id = cert.enrollment_id)
                                                  ->  Hash Join  (cost=14.05..114.51 rows=5002 width=35) (actual time=72.822..74.641 rows=5002 loops=1)
                                                        Hash Cond: (e.course_id = c.course_id)
                                                        ->  Seq Scan on enrollment e  (cost=0.00..87.02 rows=5002 width=31) (actual time=0.014..0.477 rows=5002 loops=1)
                                                        ->  Hash  (cost=11.80..11.80 rows=180 width=8) (actual time=72.758..72.758 rows=5 loops=1)
                                                              Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                              ->  Seq Scan on course c  (cost=0.00..11.80 rows=180 width=8) (actual time=72.722..72.724 rows=5 loops=1)
                                                  ->  Hash  (cost=12.40..12.40 rows=240 width=8) (actual time=0.026..0.026 rows=1 loops=1)
                                                        Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                        ->  Seq Scan on certificate cert  (cost=0.00..12.40 rows=240 width=8) (actual time=0.019..0.019 rows=1 loops=1)
              ->  Sort  (cost=1499.12..1499.62 rows=200 width=64) (actual time=28.978..28.981 rows=2 loops=1)
                    Sort Key: ycs.year
                    Sort Method: quicksort  Memory: 25kB
                    ->  Subquery Scan on ycs  (cost=1486.47..1491.47 rows=200 width=64) (actual time=28.966..28.973 rows=2 loops=1)
                          ->  HashAggregate  (cost=1486.47..1489.47 rows=200 width=64) (actual time=28.960..28.966 rows=2 loops=1)
                                Group Key: (EXTRACT(year FROM qa_1.attempt_date))
                                Batches: 1  Memory Usage: 40kB
                                ->  HashAggregate  (cost=836.47..1186.47 rows=20000 width=68) (actual time=28.889..28.938 rows=2 loops=1)
                                      Group Key: EXTRACT(year FROM qa_1.attempt_date), c_1.course_id
                                      Batches: 1  Memory Usage: 793kB
                                      ->  Hash Join  (cost=95.12..686.47 rows=20000 width=40) (actual time=0.189..22.495 rows=20000 loops=1)
                                            Hash Cond: (m_1.course_id = c_1.course_id)
                                            ->  Hash Join  (cost=81.07..568.81 rows=20000 width=12) (actual time=0.142..15.695 rows=20000 loops=1)
                                                  Hash Cond: (l_1.module_id = m_1.module_id)
                                                  ->  Hash Join  (cost=66.35..500.53 rows=20000 width=12) (actual time=0.095..11.438 rows=20000 loops=1)
                                                        Hash Cond: (q_1.lesson_id = l_1.lesson_id)
                                                        ->  Hash Join  (cost=51.62..432.25 rows=20000 width=12) (actual time=0.053..7.286 rows=20000 loops=1)
                                                              Hash Cond: (qa_1.quiz_id = q_1.quiz_id)
                                                              ->  Seq Scan on quiz_attempt qa_1  (cost=0.00..328.00 rows=20000 width=12) (actual time=0.009..1.902 rows=20000 loops=1)
                                                              ->  Hash  (cost=28.50..28.50 rows=1850 width=8) (actual time=0.024..0.024 rows=18 loops=1)
                                                                    Buckets: 2048  Batches: 1  Memory Usage: 17kB
                                                                    ->  Seq Scan on quiz q_1  (cost=0.00..28.50 rows=1850 width=8) (actual time=0.012..0.015 rows=18 loops=1)
                                                        ->  Hash  (cost=12.10..12.10 rows=210 width=8) (actual time=0.023..0.023 rows=18 loops=1)
                                                              Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                              ->  Seq Scan on lesson l_1  (cost=0.00..12.10 rows=210 width=8) (actual time=0.012..0.014 rows=18 loops=1)
                                                  ->  Hash  (cost=12.10..12.10 rows=210 width=8) (actual time=0.028..0.029 rows=10 loops=1)
                                                        Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                        ->  Seq Scan on module m_1  (cost=0.00..12.10 rows=210 width=8) (actual time=0.018..0.020 rows=10 loops=1)
                                            ->  Hash  (cost=11.80..11.80 rows=180 width=4) (actual time=0.025..0.026 rows=5 loops=1)
                                                  Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                  ->  Seq Scan on course c_1  (cost=0.00..11.80 rows=180 width=4) (actual time=0.017..0.018 rows=5 loops=1)
Planning Time: 2.538 ms
JIT:
  Functions: 112
  Options: Inlining false, Optimization false, Expressions true, Deforming true
  Timing: Generation 8.302 ms (Deform 4.108 ms), Inlining 0.000 ms, Optimization 3.069 ms, Emission 69.974 ms, Total 81.346 ms
Execution Time: 944.483 ms

# Before adding indexes, PostgreSQL used sequential scans on the main joined tables, especially on quiz_attempt, which contains 20000 test records. The query execution time was 202.790 ms.

Result with indexes:

GroupAggregate  (cost=80709.32..144979.11 rows=73000 width=176) (actual time=157.669..164.135 rows=2 loops=1)
  Group Key: (EXTRACT(year FROM e.enroll_date)), ycs.final_year_success
  ->  Incremental Sort  (cost=80709.32..128051.98 rows=555805 width=103) (actual time=141.941..150.977 rows=20006 loops=1)
        Sort Key: (EXTRACT(year FROM e.enroll_date)), ycs.final_year_success, c.instructor_id
        Presorted Key: (EXTRACT(year FROM e.enroll_date))
        Full-sort Groups: 2  Sort Method: quicksort  Average Memory: 29kB  Peak Memory: 29kB
        Pre-sorted Groups: 2  Sort Method: quicksort  Average Memory: 1054kB  Peak Memory: 1054kB
        ->  Merge Left Join  (cost=80598.37..91715.97 rows=555805 width=103) (actual time=131.481..139.062 rows=20006 loops=1)
              Merge Cond: ((EXTRACT(year FROM e.enroll_date)) = ycs.year)
              ->  Sort  (cost=79099.25..80488.76 rows=555805 width=43) (actual time=104.761..105.930 rows=20006 loops=1)
                    Sort Key: (EXTRACT(year FROM e.enroll_date))
                    Sort Method: quicksort  Memory: 2019kB
                    ->  Hash Right Join  (cost=1576.68..8962.73 rows=555805 width=43) (actual time=87.109..92.938 rows=20006 loops=1)
                          Hash Cond: ((qa.quiz_id = q.quiz_id) AND (qa.user_id = e.user_id))
                          ->  Seq Scan on quiz_attempt qa  (cost=0.00..328.00 rows=20000 width=16) (actual time=0.012..1.530 rows=20000 loops=1)
                          ->  Hash  (cost=805.50..805.50 rows=51412 width=39) (actual time=82.485..82.490 rows=20006 loops=1)
                                Buckets: 65536  Batches: 1  Memory Usage: 1841kB
                                ->  Hash Right Join  (cost=235.03..805.50 rows=51412 width=39) (actual time=72.927..76.507 rows=20006 loops=1)
                                      Hash Cond: (m.course_id = c.course_id)
                                      ->  Hash Right Join  (cost=29.45..67.86 rows=1850 width=8) (actual time=0.106..0.131 rows=18 loops=1)
                                            Hash Cond: (l.module_id = m.module_id)
                                            ->  Hash Right Join  (cost=14.72..48.18 rows=1850 width=8) (actual time=0.053..0.069 rows=18 loops=1)
                                                  Hash Cond: (q.lesson_id = l.lesson_id)
                                                  ->  Seq Scan on quiz q  (cost=0.00..28.50 rows=1850 width=8) (actual time=0.006..0.010 rows=18 loops=1)
                                                  ->  Hash  (cost=12.10..12.10 rows=210 width=8) (actual time=0.025..0.026 rows=18 loops=1)
                                                        Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                        ->  Seq Scan on lesson l  (cost=0.00..12.10 rows=210 width=8) (actual time=0.014..0.016 rows=18 loops=1)
                                            ->  Hash  (cost=12.10..12.10 rows=210 width=8) (actual time=0.034..0.034 rows=10 loops=1)
                                                  Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                  ->  Seq Scan on module m  (cost=0.00..12.10 rows=210 width=8) (actual time=0.023..0.025 rows=10 loops=1)
                                      ->  Hash  (cost=143.05..143.05 rows=5002 width=39) (actual time=72.755..72.757 rows=5002 loops=1)
                                            Buckets: 8192  Batches: 1  Memory Usage: 397kB
                                            ->  Hash Left Join  (cost=29.45..143.05 rows=5002 width=39) (actual time=68.918..71.468 rows=5002 loops=1)
                                                  Hash Cond: (e.enrollment_id = cert.enrollment_id)
                                                  ->  Hash Join  (cost=14.05..114.51 rows=5002 width=35) (actual time=68.863..70.694 rows=5002 loops=1)
                                                        Hash Cond: (e.course_id = c.course_id)
                                                        ->  Seq Scan on enrollment e  (cost=0.00..87.02 rows=5002 width=31) (actual time=0.013..0.508 rows=5002 loops=1)
                                                        ->  Hash  (cost=11.80..11.80 rows=180 width=8) (actual time=68.810..68.810 rows=5 loops=1)
                                                              Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                              ->  Seq Scan on course c  (cost=0.00..11.80 rows=180 width=8) (actual time=68.775..68.777 rows=5 loops=1)
                                                  ->  Hash  (cost=12.40..12.40 rows=240 width=8) (actual time=0.037..0.038 rows=1 loops=1)
                                                        Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                        ->  Seq Scan on certificate cert  (cost=0.00..12.40 rows=240 width=8) (actual time=0.030..0.031 rows=1 loops=1)
              ->  Sort  (cost=1499.12..1499.62 rows=200 width=64) (actual time=26.690..26.694 rows=2 loops=1)
                    Sort Key: ycs.year
                    Sort Method: quicksort  Memory: 25kB
                    ->  Subquery Scan on ycs  (cost=1486.47..1491.47 rows=200 width=64) (actual time=26.678..26.684 rows=2 loops=1)
                          ->  HashAggregate  (cost=1486.47..1489.47 rows=200 width=64) (actual time=26.673..26.678 rows=2 loops=1)
                                Group Key: (EXTRACT(year FROM qa_1.attempt_date))
                                Batches: 1  Memory Usage: 40kB
                                ->  HashAggregate  (cost=836.47..1186.47 rows=20000 width=68) (actual time=26.606..26.652 rows=2 loops=1)
                                      Group Key: EXTRACT(year FROM qa_1.attempt_date), c_1.course_id
                                      Batches: 1  Memory Usage: 793kB
                                      ->  Hash Join  (cost=95.12..686.47 rows=20000 width=40) (actual time=0.181..20.697 rows=20000 loops=1)
                                            Hash Cond: (m_1.course_id = c_1.course_id)
                                            ->  Hash Join  (cost=81.07..568.81 rows=20000 width=12) (actual time=0.135..14.468 rows=20000 loops=1)
                                                  Hash Cond: (l_1.module_id = m_1.module_id)
                                                  ->  Hash Join  (cost=66.35..500.53 rows=20000 width=12) (actual time=0.094..10.598 rows=20000 loops=1)
                                                        Hash Cond: (q_1.lesson_id = l_1.lesson_id)
                                                        ->  Hash Join  (cost=51.62..432.25 rows=20000 width=12) (actual time=0.052..6.728 rows=20000 loops=1)
                                                              Hash Cond: (qa_1.quiz_id = q_1.quiz_id)
                                                              ->  Seq Scan on quiz_attempt qa_1  (cost=0.00..328.00 rows=20000 width=12) (actual time=0.005..1.791 rows=20000 loops=1)
                                                              ->  Hash  (cost=28.50..28.50 rows=1850 width=8) (actual time=0.024..0.025 rows=18 loops=1)
                                                                    Buckets: 2048  Batches: 1  Memory Usage: 17kB
                                                                    ->  Seq Scan on quiz q_1  (cost=0.00..28.50 rows=1850 width=8) (actual time=0.013..0.015 rows=18 loops=1)
                                                        ->  Hash  (cost=12.10..12.10 rows=210 width=8) (actual time=0.023..0.023 rows=18 loops=1)
                                                              Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                              ->  Seq Scan on lesson l_1  (cost=0.00..12.10 rows=210 width=8) (actual time=0.012..0.014 rows=18 loops=1)
                                                  ->  Hash  (cost=12.10..12.10 rows=210 width=8) (actual time=0.023..0.023 rows=10 loops=1)
                                                        Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                        ->  Seq Scan on module m_1  (cost=0.00..12.10 rows=210 width=8) (actual time=0.013..0.015 rows=10 loops=1)
                                            ->  Hash  (cost=11.80..11.80 rows=180 width=4) (actual time=0.026..0.026 rows=5 loops=1)
                                                  Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                  ->  Seq Scan on course c_1  (cost=0.00..11.80 rows=180 width=4) (actual time=0.017..0.018 rows=5 loops=1)
Planning Time: 1.811 ms
JIT:
  Functions: 112
  Options: Inlining false, Optimization false, Expressions true, Deforming true
  Timing: Generation 5.875 ms (Deform 2.882 ms), Inlining 0.000 ms, Optimization 2.428 ms, Emission 66.682 ms, Total 74.985 ms
Execution Time: 171.114 ms

# After adding indexes, the execution time decreased from 944.483 ms to 171.114 ms. This shows that the added indexes significantly improved the execution of the yearly analysis query. The improvement is especially visible because the query joins multiple tables and processes 20000 quiz_attempt records.

Scenario 2: Monthly Analysis of Course Activity, Completions, Instructor Involvement and Student Success

This query provides a month-by-month analytical overview of the platform for a selected year.

The report combines:

enrollment activity course completions instructor participation quiz performance student success metrics

The query performs multiple JOIN operations through the complete learning structure:

QuizAttempt → Quiz → Lesson → Module → Course
Enrollment → Course
Enrollment → User

The most important JOIN conditions are:

qa.quiz_id = q.quiz_id
q.lesson_id = l.lesson_id
l.module_id = m.module_id
m.course_id = c.course_id
e.course_id = c.course_id

Additionally, the report groups data by:

YEAR(enroll_date)
MONTH(enroll_date)

YEAR(attempt_date)
MONTH(attempt_date)

To optimize these operations, the following indexes were added:

CREATE INDEX IF NOT EXISTS idx_module_course
ON module(course_id);

CREATE INDEX IF NOT EXISTS idx_lesson_module
ON lesson(module_id);

CREATE INDEX IF NOT EXISTS idx_quiz_lesson
ON quiz(lesson_id);

CREATE INDEX IF NOT EXISTS idx_quiz_attempt_quiz
ON quiz_attempt(quiz_id);

CREATE INDEX IF NOT EXISTS idx_enrollment_course
ON enrollment(course_id);

CREATE INDEX IF NOT EXISTS idx_enrollment_date
ON enrollment(enroll_date);

CREATE INDEX IF NOT EXISTS idx_quiz_attempt_date
ON quiz_attempt(attempt_date);

Explanation:

Indexes on foreign key columns improve JOIN performance, while indexes on date columns improve monthly grouping and chronological analysis.

These indexes reduce the amount of data PostgreSQL needs to scan when generating monthly reports.

Result without indexes:

Seq Scan on enrollment e
Seq Scan on course c
Hash Join
Merge Left Join
Sort
HashAggregate

Planning Time: 5.537 ms
Execution Time: 87.215 ms

Result њith indexes:

Seq Scan on enrollment e
Seq Scan on course c
Hash Join
Merge Left Join
Sort
HashAggregate

Planning Time: 6.799 ms
Execution Time: 111.731 ms

# After adding indexes, PostgreSQL still selected sequential scans for the main tables. The query performs monthly aggregation over most of the enrollment and quiz_attempt records, so the planner estimated that sequential scanning was cheaper than using indexes. The execution time increased from 87.215 ms to 111.731 ms, therefore these indexes did not improve this specific analytical query.

Overall Conclusion

Both analytical queries were tested using PostgreSQL EXPLAIN ANALYZE before and after index creation.

Although indexes were added on the main JOIN and date columns, PostgreSQL continued to choose Sequential Scan operations for the largest tables.

This behavior is expected for analytical reports that process a large portion of the available data through JOIN, GROUP BY, COUNT and AVG operations.

The analysis demonstrates how PostgreSQL selects the most cost-effective execution plan and confirms that indexes do not necessarily improve every type of query.

Security and Protection Measures

Spring Security Authentication

The application uses Spring Security for authentication and authorization.

Unauthenticated users can access public pages such as:

home page login page registration page static resources such as CSS and JavaScript files

All other requests require authentication.

Example configuration:

@Bean
public SecurityFilterChain securityFilterChain(HttpSecurity http) throws Exception {

http
    .csrf(Customizer.withDefaults())
    .authorizeHttpRequests(auth -> auth
        .requestMatchers("/", "/login", "/register", "/css/**", "/js/**").permitAll()
        .anyRequest().authenticated()
    )
    .formLogin(form -> form
        .defaultSuccessUrl("/", true)
        .permitAll()
    )
    .logout(logout -> logout
        .logoutUrl("/logout")
        .logoutSuccessUrl("/login?logout")
        .permitAll()
    );

return http.build();

}

Explanation:

Spring Security protects the application by requiring users to log in before accessing protected resources.

The configuration also defines login and logout behavior.

Password Hashing with BCrypt

User passwords are not stored in plain text. The application uses BCrypt for password hashing.

Example:

@Bean
public PasswordEncoder passwordEncoder() {
return new BCryptPasswordEncoder();
}

During login, the raw password entered by the user is compared with the stored hashed password.

passwordEncoder.matches(rawPassword, storedHashedPassword);

Explanation:

This protects user accounts because even if someone gets access to the database, the original passwords are not directly visible.

Role-Based Access Control

The database contains a role attribute in the user_entity table.

role VARCHAR(20) NOT NULL CHECK (role IN ('USER', 'ADMIN', 'INSTRUCTOR'))

The system supports three roles:

USER — can browse courses, enroll in courses and access learning content INSTRUCTOR — can manage courses, modules, lessons and quizzes ADMIN — can manage categories, subscription plans and support tickets

Explanation:

The CHECK constraint prevents invalid roles from being inserted into the database.

The separate tables users, administrators and instructors also support the supertype/subtype structure of the system.

SQL Injection Prevention

SQL Injection is prevented by using Spring Data JPA and parameterized queries instead of directly concatenating user input into SQL strings.

Safe example:

@Query("SELECT u FROM UserEntity u WHERE u.email = ")
Optional findByEmail(@Param("email") String email);

Unsafe example:

String query = "SELECT * FROM user_entity WHERE email = '" + email + "'";

Explanation:

The safe approach treats user input as data, not as executable SQL code.

This prevents malicious input from changing the meaning of the SQL query.

Database Constraints and Data Integrity

The database schema contains constraints that protect the data from invalid states.

Examples:

email VARCHAR(150) UNIQUE NOT NULL

This prevents duplicate user accounts with the same email.

progress_percentage INT CHECK (progress_percentage BETWEEN 0 AND 100)

This prevents invalid course progress values.

certificate_code VARCHAR(100) UNIQUE NOT NULL

This ensures that every certificate has a unique certificate code.

enrollment_id INT UNIQUE NOT NULL

This ensures that one enrollment can have only one certificate.

Foreign keys are also used to protect relationships between tables:

FOREIGN KEY (user_id) REFERENCES users(id)
FOREIGN KEY (course_id) REFERENCES course(course_id)
FOREIGN KEY (enrollment_id) REFERENCES enrollment(enrollment_id)

Explanation:

These constraints prevent orphan records and keep the database consistent.

Transaction Management

Some operations must be executed as one atomic unit.

For example, when a user enrolls in a course, the system may need to:

check whether the user has an active subscription check whether the user is already enrolled insert a new enrollment record update related data

Example:

@Transactional
public void enrollUserInCourse(Long userId, Long courseId) {
// check active subscription
// check existing enrollment
// create new enrollment
}

Explanation:

If one step fails, the whole transaction is rolled back.

This protects the database from partial or inconsistent changes.

Database Connection Configuration

The application uses PostgreSQL as the database system.

Example configuration:

spring.application.name=db
spring.datasource.url=jdbc:postgresql://localhost:3307/db_202526z_va_prj_olpms
spring.datasource.username=USERNAME
spring.datasource.password=PASSWORD
spring.datasource.driver-class-name=org.postgresql.Driver

spring.jpa.hibernate.ddl-auto=validate
spring.jpa.show-sql=true
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.PostgreSQLDialect

Explanation:

Real database credentials should not be published on the wiki page. Because of that, the username and password are replaced with placeholders.

Spring Boot uses HikariCP as the default database connection pool implementation. Connection pooling improves performance by reusing already opened database connections instead of opening a new connection for every request.

Note: See TracWiki for help on using the wiki.

Download in other formats:

Plain Text