wiki:OtherDevelopments

SQL Performance Analysis of Complex Queries

The performance analysis in this phase focuses on the two most complex analytical queries developed in Phase 6.

Instead of analyzing simple CRUD operations, we analyzed the most computationally expensive reports in the system. The goal was to identify the most expensive JOIN operations, grouping operations and date-based aggregations, and evaluate whether indexes could improve their execution performance.

For each query, we analyzed:

JOIN conditions GROUP BY operations aggregate functions date-based filtering and grouping foreign key relationships

The queries were tested using PostgreSQL EXPLAIN ANALYZE before and after index creation.

Scenario 1: Yearly Analysis of Overall Platform Activity and Student Success

This query provides a yearly analytical overview of the platform.

It combines information from:

courses enrollments certificates modules lessons quizzes quiz attempts

The report calculates:

total instructor activity total students total enrollments total quiz attempts completed enrollments total certificates average student success within courses average final grade per course final yearly success indicator

The query performs multiple JOIN operations across the learning hierarchy:

Course → Module → Lesson → Quiz → QuizAttempt
Enrollment → Certificate
Enrollment → QuizAttempt

The most expensive operations are the joins between:

cert.enrollment_id = e.enrollment_id
m.course_id = c.course_id
l.module_id = m.module_id
q.lesson_id = l.lesson_id
qa.quiz_id = q.quiz_id
qa.user_id = e.user_id

To improve performance, indexes were added on the columns used in the JOIN conditions.

Indexes:

CREATE INDEX IF NOT EXISTS idx_certificate_enrollment
ON certificate(enrollment_id);

CREATE INDEX IF NOT EXISTS idx_module_course
ON module(course_id);

CREATE INDEX IF NOT EXISTS idx_lesson_module
ON lesson(module_id);

CREATE INDEX IF NOT EXISTS idx_quiz_lesson
ON quiz(lesson_id);

CREATE INDEX IF NOT EXISTS idx_quiz_attempt_quiz_user
ON quiz_attempt(quiz_id, user_id);

CREATE INDEX IF NOT EXISTS idx_enrollment_date
ON enrollment(enroll_date);

Explanation:

The first five indexes reduce the cost of JOIN operations by allowing PostgreSQL to directly locate related records instead of scanning entire tables.

The index on enroll_date improves the yearly grouping operation because the report groups data by year.

Result without indexes:

Seq Scan on quiz_attempt qa_1
Seq Scan on quiz q_1
Seq Scan on lesson l_1
Seq Scan on module m_1
Seq Scan on course c_1

Planning Time: 3.565 ms
Execution Time: 202.790 ms

# Before adding indexes, PostgreSQL used sequential scans on the main joined tables, especially on quiz_attempt, which contains 20000 test records. The query execution time was 202.790 ms.

Result with indexes:

Seq Scan on quiz_attempt qa_1
Seq Scan on quiz q_1
Seq Scan on lesson l_1
Seq Scan on module m_1
Seq Scan on course c_1

Planning Time: 8.008 ms
Execution Time: 218.476 ms

# After adding indexes, PostgreSQL still selected sequential scans for the main tables. This happened because the query aggregates almost all quiz_attempt records, so reading the whole table was cheaper than using indexes. The execution time increased slightly from 202.790 ms to 218.476 ms; these indexes did not improve this specific analytical query.

Scenario 2: Monthly Analysis of Course Activity, Completions, Instructor Involvement and Student Success

This query provides a month-by-month analytical overview of the platform for a selected year.

The report combines:

enrollment activity course completions instructor participation quiz performance student success metrics

The query performs multiple JOIN operations through the complete learning structure:

QuizAttempt → Quiz → Lesson → Module → Course
Enrollment → Course
Enrollment → User

The most important JOIN conditions are:

qa.quiz_id = q.quiz_id
q.lesson_id = l.lesson_id
l.module_id = m.module_id
m.course_id = c.course_id
e.course_id = c.course_id

Additionally, the report groups data by:

YEAR(enroll_date)
MONTH(enroll_date)

YEAR(attempt_date)
MONTH(attempt_date)

To optimize these operations, the following indexes were added:

CREATE INDEX IF NOT EXISTS idx_module_course
ON module(course_id);

CREATE INDEX IF NOT EXISTS idx_lesson_module
ON lesson(module_id);

CREATE INDEX IF NOT EXISTS idx_quiz_lesson
ON quiz(lesson_id);

CREATE INDEX IF NOT EXISTS idx_quiz_attempt_quiz
ON quiz_attempt(quiz_id);

CREATE INDEX IF NOT EXISTS idx_enrollment_course
ON enrollment(course_id);

CREATE INDEX IF NOT EXISTS idx_enrollment_date
ON enrollment(enroll_date);

CREATE INDEX IF NOT EXISTS idx_quiz_attempt_date
ON quiz_attempt(attempt_date);

Explanation:

Indexes on foreign key columns improve JOIN performance, while indexes on date columns improve monthly grouping and chronological analysis.

These indexes reduce the amount of data PostgreSQL needs to scan when generating monthly reports.

Result without indexes:

Seq Scan on enrollment e
Seq Scan on course c
Hash Join
Merge Left Join
Sort
HashAggregate

Planning Time: 5.537 ms
Execution Time: 87.215 ms

Result њith indexes:

Seq Scan on enrollment e
Seq Scan on course c
Hash Join
Merge Left Join
Sort
HashAggregate

Planning Time: 6.799 ms
Execution Time: 111.731 ms

# After adding indexes, PostgreSQL still selected sequential scans for the main tables. The query performs monthly aggregation over most of the enrollment and quiz_attempt records, so the planner estimated that sequential scanning was cheaper than using indexes. The execution time increased from 87.215 ms to 111.731 ms, therefore these indexes did not improve this specific analytical query.

Overall Conclusion

Both analytical queries were tested using PostgreSQL EXPLAIN ANALYZE before and after index creation.

Although indexes were added on the main JOIN and date columns, PostgreSQL continued to choose Sequential Scan operations for the largest tables.

This behavior is expected for analytical reports that process a large portion of the available data through JOIN, GROUP BY, COUNT and AVG operations.

The analysis demonstrates how PostgreSQL selects the most cost-effective execution plan and confirms that indexes do not necessarily improve every type of query.

Security and Protection Measures

Spring Security Authentication

The application uses Spring Security for authentication and authorization.

Unauthenticated users can access public pages such as:

home page login page registration page static resources such as CSS and JavaScript files

All other requests require authentication.

Example configuration:

@Bean
public SecurityFilterChain securityFilterChain(HttpSecurity http) throws Exception {

http
    .csrf(Customizer.withDefaults())
    .authorizeHttpRequests(auth -> auth
        .requestMatchers("/", "/login", "/register", "/css/**", "/js/**").permitAll()
        .anyRequest().authenticated()
    )
    .formLogin(form -> form
        .defaultSuccessUrl("/", true)
        .permitAll()
    )
    .logout(logout -> logout
        .logoutUrl("/logout")
        .logoutSuccessUrl("/login?logout")
        .permitAll()
    );

return http.build();

}

Explanation:

Spring Security protects the application by requiring users to log in before accessing protected resources.

The configuration also defines login and logout behavior.

Password Hashing with BCrypt

User passwords are not stored in plain text. The application uses BCrypt for password hashing.

Example:

@Bean
public PasswordEncoder passwordEncoder() {
return new BCryptPasswordEncoder();
}

During login, the raw password entered by the user is compared with the stored hashed password.

passwordEncoder.matches(rawPassword, storedHashedPassword);

Explanation:

This protects user accounts because even if someone gets access to the database, the original passwords are not directly visible.

Role-Based Access Control

The database contains a role attribute in the user_entity table.

role VARCHAR(20) NOT NULL CHECK (role IN ('USER', 'ADMIN', 'INSTRUCTOR'))

The system supports three roles:

USER — can browse courses, enroll in courses and access learning content INSTRUCTOR — can manage courses, modules, lessons and quizzes ADMIN — can manage categories, subscription plans and support tickets

Explanation:

The CHECK constraint prevents invalid roles from being inserted into the database.

The separate tables users, administrators and instructors also support the supertype/subtype structure of the system.

SQL Injection Prevention

SQL Injection is prevented by using Spring Data JPA and parameterized queries instead of directly concatenating user input into SQL strings.

Safe example:

@Query("SELECT u FROM UserEntity u WHERE u.email = ")
Optional findByEmail(@Param("email") String email);

Unsafe example:

String query = "SELECT * FROM user_entity WHERE email = '" + email + "'";

Explanation:

The safe approach treats user input as data, not as executable SQL code.

This prevents malicious input from changing the meaning of the SQL query.

Database Constraints and Data Integrity

The database schema contains constraints that protect the data from invalid states.

Examples:

email VARCHAR(150) UNIQUE NOT NULL

This prevents duplicate user accounts with the same email.

progress_percentage INT CHECK (progress_percentage BETWEEN 0 AND 100)

This prevents invalid course progress values.

certificate_code VARCHAR(100) UNIQUE NOT NULL

This ensures that every certificate has a unique certificate code.

enrollment_id INT UNIQUE NOT NULL

This ensures that one enrollment can have only one certificate.

Foreign keys are also used to protect relationships between tables:

FOREIGN KEY (user_id) REFERENCES users(id)
FOREIGN KEY (course_id) REFERENCES course(course_id)
FOREIGN KEY (enrollment_id) REFERENCES enrollment(enrollment_id)

Explanation:

These constraints prevent orphan records and keep the database consistent.

Transaction Management

Some operations must be executed as one atomic unit.

For example, when a user enrolls in a course, the system may need to:

check whether the user has an active subscription check whether the user is already enrolled insert a new enrollment record update related data

Example:

@Transactional
public void enrollUserInCourse(Long userId, Long courseId) {
// check active subscription
// check existing enrollment
// create new enrollment
}

Explanation:

If one step fails, the whole transaction is rolled back.

This protects the database from partial or inconsistent changes.

Database Connection Configuration

The application uses PostgreSQL as the database system.

Example configuration:

spring.application.name=db
spring.datasource.url=jdbc:postgresql://localhost:3307/db_202526z_va_prj_olpms
spring.datasource.username=USERNAME
spring.datasource.password=PASSWORD
spring.datasource.driver-class-name=org.postgresql.Driver

spring.jpa.hibernate.ddl-auto=validate
spring.jpa.show-sql=true
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.PostgreSQLDialect

Explanation:

Real database credentials should not be published on the wiki page. Because of that, the username and password are replaced with placeholders.

Spring Boot uses HikariCP as the default database connection pool implementation. Connection pooling improves performance by reusing already opened database connections instead of opening a new connection for every request.

Last modified 6 days ago Last modified on 06/04/26 22:47:07
Note: See TracWiki for help on using the wiki.