| 3 | | This page describes the normalization process of the ''Wedding Planner'' database system. |
| 4 | | The goal is to transform an initial denormalized structure into a fully normalized relational schema that is lossless, dependency-preserving, and aligned with the ER diagram and SQL implementation. |
| 5 | | |
| 6 | | |
| 7 | | == Denormalized Form (UNF) == |
| 8 | | |
| 9 | | The denormalized form represents a structure where all data related to users, weddings, events, guests, venues, and service providers are stored in a single table, without enforcing normalization rules. |
| 10 | | This form contains repeating groups, redundancy, and multiple anomalies. |
| | 3 | == Introduction == |
| | 4 | |
| | 5 | This section presents a complete, formal, and provable normalization process for the Wedding Planner database. |
| | 6 | The normalization is demonstrated step by step, starting from a denormalized relation and ending in Third Normal Form (3NF). |
| | 7 | |
| | 8 | -Each transformation is justified using: |
| | 9 | |
| | 10 | -Functional dependencies |
| | 11 | |
| | 12 | -Primary key verification |
| | 13 | |
| | 14 | -Lossless join decomposition proof |
| | 15 | |
| | 16 | -Functional dependency preservation proof |
| | 17 | |
| | 18 | All examples are derived directly from the Wedding Planner domain. |
| | 19 | |
| | 20 | == Initial Denormalized Relation (UNF) == |
| | 21 | |
| | 22 | The system initially stores all wedding-related information in a single relation. |
| | 23 | |
| | 24 | || Relation || Attributes || |
| | 25 | || R || user_id, user_first_name, user_last_name, user_email, wedding_id, wedding_date, wedding_budget, venue_id, venue_name, venue_type, venue_capacity, booking_date, start_time, end_time || |
| | 26 | |
| | 27 | This relation violates normalization principles. |
| 19 | | |
| 20 | | |
| 21 | | === Functional Dependencies (UNF) === |
| 22 | | |
| 23 | | {user_id} → {first_name, last_name, email} |
| 24 | | |
| 25 | | {wedding_id} → {wedding_date, wedding_budget} |
| 26 | | |
| 27 | | {event_id} → {event_type, event_date, start_time, end_time} |
| 28 | | |
| 29 | | {guest_id} → {guest_first_name, guest_last_name} |
| 30 | | |
| 31 | | {venue_name} → {venue_type} |
| 32 | | |
| 33 | | === Problems in UNF === |
| 34 | | |
| 35 | | • Data duplication (same user, wedding, venue repeated) |
| 36 | | • Repeating groups (events and guests) |
| 37 | | • Update anomalies |
| 38 | | • Insertion anomalies |
| 39 | | • Deletion anomalies |
| | 36 | == Identified Problems in UNF == |
| | 37 | |
| | 38 | || Problem || Explanation || |
| | 39 | || Repetition || Venue data is repeated for every wedding |
| | 40 | || Update anomaly || Updating venue name requires multiple updates |
| | 41 | || Insertion anomaly || Venue cannot exist without a wedding |
| | 42 | || Deletion anomaly || Deleting a wedding removes venue data |
| | 43 | || Mixed entities || User, wedding, venue, and booking data stored together || |
| | 44 | |
| | 45 | == Functional Dependency Analysis == |
| | 46 | |
| | 47 | Based on business rules of the Wedding Planner system, the following functional dependencies hold: |
| | 48 | |
| | 49 | || ID || Functional Dependency || Justification || |
| | 50 | || FD1 || user_id → user_first_name, user_last_name, user_email || User attributes depend only on user_id |
| | 51 | || FD2 || wedding_id → wedding_date, wedding_budget, user_id || A wedding uniquely defines its properties |
| | 52 | || FD3 || venue_id → venue_name, venue_type, venue_capacity || Venue attributes depend only on venue_id |
| | 53 | || FD4 || booking_id → venue_id, wedding_id, booking_date, start_time, end_time || A booking uniquely defines a venue reservation || |
| | 54 | |
| | 55 | == Candidate Key Determination == |
| | 56 | |
| | 57 | To uniquely identify a tuple in R, the following composite key is required: |
| | 58 | |
| | 59 | |
| | 60 | Please review the proof here [[Proof of Candidate Key]] |
| | 61 | |
| | 62 | || Candidate Key || |
| | 63 | || (user_id, wedding_id, venue_id, booking_date) || |
| | 64 | |
| | 65 | This confirms that the initial relation has a composite primary key, which leads to partial dependencies. |
| 43 | | === Theoretical Explanation === |
| 44 | | |
| 45 | | A relation is in First Normal Form (1NF) if: |
| 46 | | • all attributes contain atomic values |
| 47 | | • there are no repeating groups |
| 48 | | • each record can be uniquely identified |
| 49 | | |
| 50 | | In this step, the table structure remains the same, |
| 51 | | but we define a composite primary key to uniquely identify each row. |
| 52 | | |
| 53 | | === 1NF Relation === |
| 54 | | |
| 55 | | Primary Key: |
| 56 | | (user_id, wedding_id, event_id, guest_id) |
| 57 | | |
| 58 | | All values are atomic and no multivalued attributes exist: |
| 59 | | |
| 60 | | |
| 61 | | R( |
| 62 | | user_id, first_name, last_name, email, |
| 63 | | wedding_id, wedding_date, wedding_budget, |
| 64 | | event_id, event_type, event_date, start_time, end_time, |
| 65 | | guest_id, guest_first_name, guest_last_name, |
| 66 | | rsvp_status, attendance_status, |
| 67 | | venue_name, venue_type, |
| 68 | | photographer_name, band_name |
| 69 | | ) |
| 70 | | |
| 71 | | Primary Key: |
| 72 | | (user_id, wedding_id, event_id, guest_id) |
| | 69 | === Definition === |
| | 70 | A relation is in 1NF if: |
| | 71 | |
| | 72 | All attributes contain atomic values |
| | 73 | |
| | 74 | No repeating groups exist |
| | 75 | |
| | 76 | The UNF relation contains repeating venue and booking attributes. |
| | 77 | |
| | 78 | === Transformation to 1NF === |
| | 79 | |
| | 80 | The relation is decomposed into atomic tuples by separating entity data. |
| | 81 | |
| | 82 | || Table || Attributes || |
| | 83 | || R1 || user_id, user_first_name, user_last_name, user_email |
| | 84 | || R2 || wedding_id, wedding_date, wedding_budget, user_id |
| | 85 | || R3 || venue_id, venue_name, venue_type, venue_capacity |
| | 86 | || R4 || booking_id, venue_id, wedding_id, booking_date, start_time, end_time || |
| | 87 | |
| | 88 | All attributes are now atomic. |
| 76 | | === Theoretical Explanation === |
| 77 | | |
| 78 | | A relation is in Second Normal Form (2NF) if: |
| 79 | | • it is already in 1NF |
| 80 | | • no non-key attribute depends on only part of a composite key |
| 81 | | |
| 82 | | Partial dependencies are removed by decomposing the table into smaller relations. |
| 83 | | |
| 84 | | === Decomposition === |
| 85 | | |
| 86 | | === USER === |
| 87 | | |
| 88 | | || user_id || first_name || last_name || email || |
| 89 | | || 1 || Ana || Trajkovska || ana.trajkovska@gmail.com|| |
| 90 | | |
| 91 | | FD: |
| 92 | | {user_id} → {first_name, last_name, email} |
| 93 | | |
| 94 | | === WEDDING === |
| 95 | | |
| 96 | | || wedding_id || date || budget || user_id || |
| 97 | | || 1 || 2026-06-20 || 8500 || 1 || |
| 98 | | |
| 99 | | FD: |
| 100 | | {wedding_id} → {wedding_date, wedding_budget, user_id} |
| 101 | | |
| 102 | | === EVENT === |
| 103 | | |
| 104 | | || event_id || event_type || date || start_time || end_time || wedding_id || |
| 105 | | || 1 || Church Ceremony || 2026-06-20 || 12:00 || 13:00 || 1 || |
| 106 | | || 2 || Reception || 2026-06-20 || 16:00 || 23:00 || 1 || |
| 107 | | |
| 108 | | FD: |
| 109 | | {event_id} → {event_type, event_date, start_time, end_time, wedding_id} |
| 110 | | |
| 111 | | === GUEST === |
| 112 | | |
| 113 | | || guest_id || first_name || last_name || wedding_id || |
| 114 | | || 1 || Ana || Markovska || 1 || |
| 115 | | || 2 || Daniel || Stojanov || 1 || |
| 116 | | |
| 117 | | FD: |
| 118 | | {guest_id} → {first_name, last_name, wedding_id} |
| 119 | | |
| 120 | | === Lossless Join Explanation (2NF) === |
| 121 | | |
| 122 | | The decomposition is lossless because: |
| 123 | | • primary keys are preserved |
| 124 | | • joining the relations reconstructs the original UNF data without loss |
| | 92 | === Definition === |
| | 93 | A relation is in 2NF if: |
| | 94 | |
| | 95 | It is in 1NF |
| | 96 | |
| | 97 | No non-key attribute is partially dependent on a composite key |
| | 98 | |
| | 99 | === Partial Dependency Proof === |
| | 100 | |
| | 101 | || Attribute || Depends On || Violation || |
| | 102 | || user_first_name || user_id || Partial dependency |
| | 103 | || wedding_date || wedding_id || Partial dependency |
| | 104 | || venue_name || venue_id || Partial dependency |
| | 105 | |
| | 106 | Thus, 2NF is violated in the original relation. |
| | 107 | |
| | 108 | === Decomposition to 2NF === |
| | 109 | |
| | 110 | || Table || Primary Key || Attributes || |
| | 111 | || user || user_id || user_first_name, user_last_name, user_email |
| | 112 | || wedding || wedding_id || wedding_date, wedding_budget, user_id |
| | 113 | || venue || venue_id || venue_name, venue_type, venue_capacity |
| | 114 | || venue_booking || booking_id || venue_id, wedding_id, booking_date, start_time, end_time |
| | 115 | |
| | 116 | Each non-key attribute now fully depends on the primary key. |
| | 117 | |
| | 118 | == Explanation of Entity Composition (User Table) == |
| | 119 | |
| | 120 | The user table exists because user data represents an independent entity. |
| | 121 | |
| | 122 | || Reason || Formal Explanation || |
| | 123 | || Functional dependency || user_id → user attributes |
| | 124 | || Independence || User may exist without a wedding |
| | 125 | || Reusability || One user can manage multiple weddings |
| | 126 | || Normalization rule || Prevents transitive dependencies || |
| | 127 | |
| | 128 | This decomposition is mandatory for correctness. |
| 128 | | === Theoretical Explanation === |
| 129 | | |
| 130 | | A relation is in Third Normal Form (3NF) if: |
| 131 | | • it is already in 2NF |
| 132 | | • no transitive dependencies exist |
| 133 | | |
| 134 | | This means non-key attributes must depend only on the primary key. |
| 135 | | == Third Normal Form (3NF) == |
| 136 | | |
| 137 | | Third Normal Form removes transitive dependencies. |
| 138 | | |
| 139 | | === Venue Separation === |
| 140 | | |
| 141 | | === VENUE_TYPE === |
| 142 | | |
| 143 | | || type_id || type_name || |
| 144 | | || 3 || Outdoor Garden || |
| 145 | | |
| 146 | | FD: |
| 147 | | {type_id} → {type_name} |
| 148 | | |
| 149 | | === VENUE === |
| 150 | | |
| 151 | | || venue_id || name || location || city || capacity || price_per_guest || type_id || |
| 152 | | || 1 || Lakeside Garden Venue || Matka || Skopje || 200 || 35 || 3 || |
| 153 | | |
| 154 | | FD: |
| 155 | | {venue_id} → {name, location, city, capacity, price_per_guest, type_id} |
| 156 | | |
| 157 | | === EVENT RSVP === |
| 158 | | |
| 159 | | || response_id || guest_id || event_id || status || response_date || |
| 160 | | || 1 || 1 || 1 || accepted || 2026-05-15 || |
| 161 | | || 2 || 2 || 2 || accepted || 2026-05-15 || |
| 162 | | |
| 163 | | Composite uniqueness: |
| 164 | | (guest_id, event_id) |
| 165 | | |
| 166 | | === Attendance === |
| 167 | | |
| 168 | | || attendance_id || guest_id || event_id || status || table_number || role || |
| 169 | | || 1 || 1 || 1 || attending || 5 || Guest || |
| 170 | | || 2 || 2 || 2 || attending || 10 || Guest || |
| 171 | | |
| 172 | | == Final Normalized Schema (3NF / BCNF) == |
| 173 | | |
| 174 | | USER(user_id, first_name, last_name, email) |
| 175 | | |
| 176 | | WEDDING(wedding_id, date, budget, user_id) |
| 177 | | |
| 178 | | EVENT(event_id, event_type, date, start_time, end_time, wedding_id) |
| 179 | | |
| 180 | | GUEST(guest_id, first_name, last_name, wedding_id) |
| 181 | | |
| 182 | | EVENT_RSVP(response_id, guest_id, event_id, status, response_date) |
| 183 | | |
| 184 | | ATTENDANCE(attendance_id, guest_id, event_id, status, table_number, role) |
| 185 | | |
| 186 | | VENUE(venue_id, name, location, city, capacity, price_per_guest, type_id) |
| 187 | | |
| 188 | | VENUE_TYPE(type_id, type_name) |
| 189 | | |
| 190 | | PHOTOGRAPHER(photographer_id, name, price_per_hour) |
| 191 | | |
| 192 | | BAND(band_id, band_name, price_per_hour) |
| 193 | | |
| 194 | | == Conclusion == |
| 195 | | |
| 196 | | The normalization process eliminates redundancy, prevents anomalies, and produces a clean relational schema fully aligned with the ER diagram and SQL implementation. |
| 197 | | |
| 198 | | |
| 199 | | The final design is lossless, dependency-preserving, and in BCNF, making it suitable for a real-world wedding planner system. |
| | 132 | === Definition === |
| | 133 | A relation is in 3NF if: |
| | 134 | |
| | 135 | It is in 2NF |
| | 136 | |
| | 137 | No transitive dependencies exist |
| | 138 | |
| | 139 | === Transitive Dependency Analysis === |
| | 140 | |
| | 141 | || Dependency Chain || Explanation || |
| | 142 | || wedding_id → user_id → user_email || Transitive dependency |
| | 143 | || booking_id → venue_id → venue_name || Transitive dependency |
| | 144 | |
| | 145 | === Removal of Transitive Dependencies === |
| | 146 | |
| | 147 | Each dependency is isolated into its own table, resulting in full 3NF compliance. |
| | 148 | |
| | 149 | == Primary Key Verification == |
| | 150 | |
| | 151 | Each table has a primary key that uniquely identifies records. |
| | 152 | |
| | 153 | || Table || Primary Key || Uniqueness Proof || |
| | 154 | || user || user_id || One user per ID |
| | 155 | || wedding || wedding_id || One wedding per ID |
| | 156 | || venue || venue_id || One venue per ID |
| | 157 | || venue_booking || booking_id || One booking per ID || |
| | 158 | |
| | 159 | This ensures entity integrity. |
| | 160 | |
| | 161 | == Lossless Join Decomposition Proof == |
| | 162 | |
| | 163 | === Formal Rule === |
| | 164 | |
| | 165 | A decomposition of R into R1 and R2 is lossless if: |
| | 166 | |
| | 167 | || Condition || |
| | 168 | || (R1 ∩ R2) → R1 OR (R1 ∩ R2) → R2 || |
| | 169 | |
| | 170 | === Step-by-Step Explanation === |
| | 171 | |
| | 172 | || Join || Common Attribute || Proof || |
| | 173 | || wedding ⋈ user || user_id || user_id is PK in user |
| | 174 | || venue_booking ⋈ wedding || wedding_id || wedding_id is PK |
| | 175 | || venue_booking ⋈ venue || venue_id || venue_id is PK |
| | 176 | |
| | 177 | Each decomposition satisfies the lossless join condition. |
| | 178 | |
| | 179 | ❗ Lossless join is proven pairwise, not by joining all tables simultaneously. |
| | 180 | |
| | 181 | == Functional Dependency Preservation Proof == |
| | 182 | |
| | 183 | === Definition === |
| | 184 | A decomposition preserves dependencies if all FDs can be enforced without joins. |
| | 185 | |
| | 186 | === Verification === |
| | 187 | |
| | 188 | || Functional Dependency || Table || |
| | 189 | || user_id → user_first_name, user_last_name, user_email || user |
| | 190 | || wedding_id → wedding_date, wedding_budget, user_id || wedding |
| | 191 | || venue_id → venue_name, venue_type, venue_capacity || venue |
| | 192 | || booking_id → venue_id, wedding_id, booking_date, start_time, end_time || venue_booking || |
| | 193 | |
| | 194 | All dependencies are preserved locally. |
| | 195 | |
| | 196 | == Final Normalized Schema (3NF) == |
| | 197 | |
| | 198 | || Table || Primary Key || Foreign Keys || |
| | 199 | || user || user_id |
| | 200 | || wedding || wedding_id || user_id → user |
| | 201 | || venue || venue_id |
| | 202 | || venue_booking || booking_id || venue_id → venue, wedding_id → wedding || |
| | 203 | |
| | 204 | == Final Conclusion == |
| | 205 | |
| | 206 | The Wedding Planner database schema is fully normalized to Third Normal Form. |
| | 207 | |
| | 208 | The normalization: |
| | 209 | |
| | 210 | -Eliminates redundancy |
| | 211 | |
| | 212 | -Prevents anomalies |
| | 213 | |
| | 214 | -Preserves functional dependencies |
| | 215 | |
| | 216 | -Guarantees lossless joins |
| | 217 | |
| | 218 | -Uses provable primary keys |