Normalization
Initial de-normalized relation and functional dependencies
Global set of attributes
This represents the initial de-normalized relation for the Online Learning Platform Management System (OLPMS), before applying any normalization steps.
All attributes from the conceptual (ER) model are combined into a single universal relation R, assuming that all data is stored in one table. Attribute names are made unique to avoid duplication.
R = {
user_id, first_name, last_name, email, password, role,
plan_id, plan_name, price, duration_months, plan_description, access_type,
subscription_id, start_date, end_date, subscription_status,
payment_id, amount,
ticket_id, admin_id, subject, ticket_description, ticket_status, created_at,
category_id, category_name, category_description,
course_id, course_name, course_price, course_status, instructor_id,
module_id, module_title, module_description,
lesson_id, lesson_title, material,
quiz_id, total_points, passing_score,
attempt_id, score, attempt_date,
enrollment_id, enroll_date, completion_status, progress_percentage,
certificate_id, issue_date, certificate_code, certificate_status
}
Functional dependencies (initial set)
The following functional dependencies represent the initial global set of dependencies that hold in the universal de-normalized relation R for the Online Learning Platform Management System (OLPMS).
They are derived from the semantics of the ER model and the business rules of the system. The goal is to capture the essential determinants that will later guide the normalization process.
User / specialization
FD1: user_id → first_name, last_name, email, password, role FD2: instructor_id → user_id FD3: admin_id → user_id
Subscription plan / subscription / payment
FD4: plan_id → plan_name, price, duration_months, plan_description, access_type FD5: subscription_id → user_id, plan_id, start_date, end_date, subscription_status FD6: payment_id → user_id, subscription_id, amount
Support ticket
FD7: ticket_id → user_id, admin_id, subject, ticket_description, ticket_status, created_at
Category / course
FD8: category_id → category_name, category_description FD9: course_id → course_name, course_price, course_status, instructor_id
CourseCategory relation
The relation CourseCategory has composite key (course_id, category_id) and no additional non-trivial functional dependencies.
Course content hierarchy
FD10: module_id → course_id, module_title, module_description FD11: lesson_id → module_id, lesson_title, material FD12: quiz_id → lesson_id, total_points, passing_score FD13: attempt_id → user_id, quiz_id, score, attempt_date
Enrollment / certificate
FD14: enrollment_id → user_id, course_id, enroll_date, completion_status, progress_percentage FD15: certificate_id → enrollment_id, issue_date, certificate_code, certificate_status
Candidate keys and primary key selection
Determining a candidate key
To determine a candidate key of the universal de-normalized relation, we identify a minimal set of identifiers whose closure determines all attributes of R.
Because the universal relation combines several independent components of the system (subscriptions, payments, support tickets, course-category assignments, quiz attempts, and certificates), the candidate key must include one identifier from each independent component that is not functionally determined by another component.
A candidate key for the initial de-normalized relation is:
K = {payment_id, ticket_id, course_id, category_id, attempt_id, certificate_id}
Its closure includes all attributes of R, so it determines the whole relation.
Closure proof for K
Let K = {payment_id, ticket_id, course_id, category_id, attempt_id, certificate_id}
Compute K+:
Start: K+ = {payment_id, ticket_id, course_id, category_id, attempt_id, certificate_id}
From FD6:
payment_id → user_id, subscription_id, amount
Add: user_id, subscription_id, amount
From FD5:
subscription_id → plan_id, start_date, end_date, subscription_status
Add: plan_id, start_date, end_date, subscription_status
From FD4:
plan_id → plan_name, price, duration_months, plan_description, access_type
Add: plan_name, price, duration_months, plan_description, access_type
From FD1:
user_id → first_name, last_name, email, password, role
Add: first_name, last_name, email, password, role
From FD7:
ticket_id → user_id, admin_id, subject, ticket_description, ticket_status, created_at
Add: admin_id, subject, ticket_description, ticket_status, created_at
(user_id already belongs to K+)
From FD3:
admin_id → user_id
user_id already belongs to K+
From FD8:
category_id → category_name, category_description
Add: category_name, category_description
From FD9:
course_id → course_name, course_price, course_status, instructor_id
Add: course_name, course_price, course_status, instructor_id
From FD2:
instructor_id → user_id
user_id already belongs to K+
From FD13:
attempt_id → user_id, quiz_id, score, attempt_date
Add: quiz_id, score, attempt_date
(user_id already belongs to K+)
From FD12:
quiz_id → lesson_id, total_points, passing_score
Add: lesson_id, total_points, passing_score
From FD11:
lesson_id → module_id, lesson_title, material
Add: module_id, lesson_title, material
From FD10:
module_id → course_id, module_title, module_description
Add: module_title, module_description
(course_id already belongs to K+)
From FD15:
certificate_id → enrollment_id, issue_date, certificate_code, certificate_status
Add: enrollment_id, issue_date, certificate_code, certificate_status
From FD14:
enrollment_id → user_id, course_id, enroll_date, completion_status, progress_percentage
Add: enroll_date, completion_status, progress_percentage
(user_id and course_id already belong to K+)
Therefore, K+ contains all attributes of the universal relation R, so K is a superkey.
K is also minimal, because removing any attribute from K would cause the loss of at least one independent component of the universal relation:
- without payment_id, payment/subscription information is lost
- Without ticket_id, support ticket information is lost
- without course_id or category_id, the course-category association is lost
- without attempt_id, quiz attempt, and course content assessment information is lost
- Without certificate_id, certificate, and enrollment completion information is lost
So, K is a candidate key.
Chosen primary key
We select the candidate key above as the primary key of the initial de-normalized relation:
K = {payment_id, ticket_id, course_id, category_id, attempt_id, certificate_id}
Normal form checks
Check for 1NF
R satisfies 1NF because all attributes are atomic and there are no repeating groups or multi-valued attributes inside a single tuple. Each attribute stores a single value, and the universal relation is identified by the chosen composite candidate key. Therefore, R is in 1NF
Check for 2NF
R does not satisfy 2NF because it has a composite candidate key and many non-key attributes depend only on part of that key, instead of depending on the whole key.
Examples of partial dependencies are:
- payment_id → user_id, subscription_id, amount
- ticket_id → user_id, admin_id, subject, ticket_description, ticket_status, created_at
- course_id → course_name, course_price, course_status, instructor_id
- category_id → category_name, category_description
- attempt_id → user_id, quiz_id, score, attempt_date
- certificate_id → enrollment_id, issue_date, certificate_code, certificate_status
These are partial dependencies on components of the composite key, so R violates 2NF.
Decomposition to 2NF
Grouping by determinants (partial dependencies)
To remove partial dependencies, the attributes are grouped according to the determinants from the functional dependencies:
- user_id → first_name, last_name, email, password, role
- instructor_id → user_id
- admin_id → user_id
- plan_id → plan_name, price, duration_months, plan_description, access_type
- subscription_id → user_id, plan_id, start_date, end_date, subscription_status
- payment_id → user_id, subscription_id, amount
- ticket_id → user_id, admin_id, subject, ticket_description, ticket_status, created_at
- category_id → category_name, category_description
- course_id → course_name, course_price, course_status, instructor_id
- module_id → course_id, module_title, module_description
- lesson_id → module_id, lesson_title, material
- quiz_id → lesson_id, total_points, passing_score
- attempt_id → user_id, quiz_id, score, attempt_date
- enrollment_id → user_id, course_id, enroll_date, completion_status, progress_percentage
- certificate_id → enrollment_id, issue_date, certificate_code, certificate_status
The relation CourseCategory has composite key (course_id, category_id) and no additional non-trivial functional dependencies.
2NF relations (first decomposition)
After removing the partial dependencies, we obtain the following relations:
UserEntity (user_id, first_name, last_name, email, password, role)Users(user_id)Administrators(admin_id, user_id)Instructors(instructor_id, user_id)
SubscriptionPlan(plan_id, plan_name, price, duration_months, plan_description, access_type)UserSubscription(subscription_id, user_id, plan_id, start_date, end_date, subscription_status)Payment(payment_id, user_id, subscription_id, amount)
SupportTicket(ticket_id, user_id, admin_id, subject, ticket_description, ticket_status, created_at)
Category(category_id, category_name, category_description)Course(course_id, course_name, course_price, course_status, instructor_id)CourseCategory(course_id, category_id)
Module(module_id, course_id, module_title, module_description)Lesson(lesson_id, module_id, lesson_title, material)Quiz(quiz_id, lesson_id, total_points, passing_score)QuizAttempt(attempt_id, user_id, quiz_id, score, attempt_date)
Enrollment(enrollment_id, user_id, course_id, enroll_date, completion_status, progress_percentage)Certificate(certificate_id, enrollment_id, issue_date, certificate_code, certificate_status)
Lossless join:
The decomposition is lossless because each new relation is formed around a determinant that becomes the key of that relation. The common attributes used in the decomposition are keys in at least one of the resulting relations, so no information is lost.
Dependency preservation:
The decomposition is dependency-preserving because each functional dependency FD1–FD15 is represented fully inside one of the resulting relations. CourseCategory has only its composite key and no additional non-trivial functional dependencies.
Check for 3NF (and decomposition)
Now we look for transitive dependencies inside the relations obtained after the 2NF decomposition.
Typical transitive issues from the ER
- In Payment(payment_id, user_id, subscription_id, amount), we have:
payment_id → subscription_id
subscription_id → user_id, plan_id, start_date, end_date, subscription_status.
Because subscription details are stored in UserSubscription, they should not be repeated in Payment. Payment keeps only payment_id, user_id, subscription_id and amount, so no transitive dependency remains inside Payment.
- In UserSubscription(subscription_id, user_id, plan_id, start_date, end_date, subscription_status), we have: subscription_id → plan_id plan_id → plan_name, price, duration_months, plan_description, access_type Since plan descriptive attributes are stored in SubscriptionPlan, no transitive dependency remains inside UserSubscription.
- In Course(course_id, course_name, course_price, course_status, instructor_id), we have: course_id → instructor_id instructor_id → user_id
The user descriptive attributes are stored in UserEntity, not in Course, so no transitive dependency remains inside Course.
- In QuizAttempt(attempt_id, user_id, quiz_id, score, attempt_date), we have: attempt_id → quiz_id quiz_id → lesson_id, total_points, passing_score Quiz details are stored in Quiz, so QuizAttempt does not contain transitive dependencies.
- In Certificate(certificate_id, enrollment_id, issue_date, certificate_code, certificate_status), we have: certificate_id → enrollment_id enrollment_id → user_id, course_id, enroll_date, completion_status, progress_percentage
Enrollment details are stored in Enrollment, so no transitive dependency remains inside Certificate.
*
User specializations (User, Administrator and Instructor)
The ER model contains a specialization of UserEntity into User, Administrator and Instructor.
The common user attributes are stored in UserEntity:
UserEntity(user_id, first_name, last_name, email, password, role)
The subtype relations store only the identifier that connects the subtype with the general user entity:
Users(user_id) Administrators(admin_id, user_id) Instructors(instructor_id, user_id)
This avoids storing user descriptive attributes multiple times in the subtype relations. Therefore, no transitive dependency remains inside the specialization relations.
Results after 3NF step
After checking the relations obtained from the 2NF decomposition, no additional decomposition is required for 3NF.
The resulting relations after the 3NF step are:
UserEntity(user_id, first_name, last_name, email, password, role)Users(user_id)Administrators(admin_id, user_id)Instructors(instructor_id, user_id)
SubscriptionPlan(plan_id, plan_name, price, duration_months, plan_description, access_type)UserSubscription(subscription_id, user_id, plan_id, start_date, end_date, subscription_status)Payment(payment_id, user_id, subscription_id, amount)
SupportTicket(ticket_id, user_id, admin_id, subject, ticket_description, ticket_status, created_at)
Category(category_id, category_name, category_description)Course(course_id, course_name, course_price, course_status, instructor_id)CourseCategory(course_id, category_id)
Module(module_id, course_id, module_title, module_description)Lesson(lesson_id, module_id, lesson_title, material)Quiz(quiz_id, lesson_id, total_points, passing_score)QuizAttempt(attempt_id, user_id, quiz_id, score, attempt_date)
Enrollment(enrollment_id, user_id, course_id, enroll_date, completion_status, progress_percentage)Certificate(certificate_id, enrollment_id, issue_date, certificate_code, certificate_status)
All resulting relations satisfy 3NF because for every non-trivial functional dependency X → Y inside a relation, X is a key of that relation, or the dependent attributes are stored in a separate relation determined by their own key.
Check for BCNF
Each resulting relation is also in BCNF, because in every non-trivial functional dependency within a relation, the determinant is a superkey of that relation.
UserEntity: user_id is key → BCNFUsers: user_id is key → BCNFAdministrators: admin_id is key → BCNFInstructors: instructor_id is key → BCNFSubscriptionPlan: plan_id is key → BCNFUserSubscription: subscription_id is key → BCNFPayment: payment_id is key → BCNFSupportTicket: ticket_id is key → BCNFCategory: category_id is key → BCNFCourse: course_id is key → BCNFCourseCategory: (course_id, category_id) is composite key → BCNFModule: module_id is key → BCNFLesson: lesson_id is key → BCNFQuiz: quiz_id is key → BCNFQuizAttempt: attempt_id is key → BCNFEnrollment: enrollment_id is key → BCNFCertificate: certificate_id is key → BCNF
Therefore, the resulting schema satisfies BCNF.
Check for 4NF (multivalued dependencies)
No additional non-trivial multivalued dependencies are identified in the resulting schema.
The many-to-many relationship between Course and Category is already decomposed into the separate relation:
CourseCategory(course_id, category_id)
This relation contains only the composite key and no additional descriptive attributes, so no further decomposition is required.
All other relations describe a single entity or relationship around one key, and no independent multi-valued facts are stored together in the same relation.
Therefore, the schema satisfies 4NF.
Final relations
Final relations
After applying normalization up to 4NF, the final normalized schema consists of the following relations:
UserEntity(user_id, first_name, last_name, email, password, role) Users(user_id) Administrators(admin_id, user_id) Instructors(instructor_id, user_id) SubscriptionPlan(plan_id, plan_name, price, duration_months, plan_description, access_type) UserSubscription(subscription_id, user_id, plan_id, start_date, end_date, subscription_status) Payment(payment_id, user_id, subscription_id, amount) SupportTicket(ticket_id, user_id, admin_id, subject, ticket_description, ticket_status, created_at) Category(category_id, category_name, category_description) Course(course_id, course_name, course_price, course_status, instructor_id) CourseCategory(course_id, category_id) Module(module_id, course_id, module_title, module_description) Lesson(lesson_id, module_id, lesson_title, material) Quiz(quiz_id, lesson_id, total_points, passing_score) QuizAttempt(attempt_id, user_id, quiz_id, score, attempt_date) Enrollment(enrollment_id, user_id, course_id, enroll_date, completion_status, progress_percentage) Certificate(certificate_id, enrollment_id, issue_date, certificate_code, certificate_status)
The final schema removes partial and transitive dependencies, avoids unnecessary redundancy, and preserves the functional dependencies from the initial universal relation.
Conclusion
The normalization process confirms the same structural design obtained from the ER model in phase 2.
