Changes between Version 4 and Version 5 of Normalization


Ignore:
Timestamp:
03/10/26 00:09:48 (11 days ago)
Author:
231035
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • Normalization

    v4 v5  
    11= Normalization
     2
    23== Initial de-normalized relation and functional dependencies
    3 === Global set of attributes (single de-normalized relation)
     4
     5=== Global set of attributes
    46This represents the initial de-normalized relation before any normalization is applied.
    57Attribute names are made unique to avoid duplication.
     
    3133}
    3234}}}
     35
    3336=== Functional dependencies (initial set)
     37
    3438User / specialization
    3539{{{
     
    3741FD2: client_id → user_id, is_blocked, blocked_at, blocked_reason
    3842FD3: admin_id → user_id
    39 FD4: owner_id → client_id
    40 }}}
     43FD4: owner_id → client_id
     44}}}
     45
    4146Notification
    4247{{{
    4348FD5: notification_id → notification_type, notification_message, notification_created_at, is_read, user_id
    4449}}}
     50
    4551Vet Clinic / application
    4652{{{
     
    4854FD7: application_id → app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status, reviewed_at, reviewed_by, denial_reason, clinic_id
    4955}}}
     56
    5057Animal
    5158{{{
    5259FD8: animal_id → animal_name, located_name, species, animal_type, breed, sex, date_of_birth, photo_url, owner_id
    5360}}}
     61
    5462Listing
    5563{{{
    5664FD9: listing_id → listing_status, listing_created_at, price, listing_description, animal_id, admin_id
    5765}}}
     66
    5867Appointment
    5968{{{
    6069FD10: appointment_id → date_time, appointment_status, notes, animal_id, clinic_id
    6170}}}
    62 Health Record
     71
     72Health Record
    6373{{{
    6474FD11: healthrecord_id → hr_type, hr_description, hr_date, animal_id, appointment_id
    6575}}}
     76
    6677Review and subtypes
    6778{{{
    6879FD12: review_id → reviewer_id, rating, comment, review_created_at, updated_at, is_deleted
    69 FD13: review_id → target_user_id
    70 FD14: review_id → target_clinic_id
    71 }}}
     80FD13: review_id → target_user_id
     81FD14: review_id → target_clinic_id
     82}}}
     83
    7284Favorite Listing
    7385{{{
    74 FD15: (saved_user_id, saved_listing_id) → ∅
    75 }}}
     86The relation FavoriteListings has composite key (saved_user_id, saved_listing_id)
     87and no additional non-trivial functional dependencies.
     88}}}
     89
    7690== Candidate keys and primary key selection
    77 === Attributes appearing only on the left side
    78 {{{
    79 user_id, client_id, admin_id, owner_id,
    80 notification_id, clinic_id, application_id,
    81 animal_id, listing_id, appointment_id, healthrecord_id, review_id,
    82 (saved_user_id, saved_listing_id)
    83 }}}
    84 [[BR]]
    85 Because these identifiers can’t be derived from other attributes, they must be included in any superkey of the fully de-normalized R.
    86 [[BR]]
    87 A valid superkey for R is:
    88 {{{
    89 K = {notification_id, application_id, clinic_id, animal_id, listing_id,
    90  appointment_id, healthrecord_id, review_id, saved_user_id, saved_listing_id}
    91 }}}
    92 Its closure includes all descriptive attributes via FD1–FD15, so it determines all attributes of R.
    93 === Closure proof for K
    94 {{{
    95 Let K = {notification_id, application_id, clinic_id, animal_id, listing_id,
    96          appointment_id, healthrecord_id, review_id, saved_user_id, saved_listing_id}
     91=== Determining a candidate key
     92To determine a candidate key of the universal de-normalized relation, we must identify a minimal set of identifiers whose closure determines all attributes in R.
     93{{{#!comment It is not correct to simply collect all identifiers, because some identifiers are functionally determined by others. For example:
     94application_id → clinic_id
     95listing_id → animal_id, admin_id
     96animal_id → owner_id
     97owner_id → client_id
     98healthrecord_id → appointment_id, animal_id
     99appointment_id → clinic_id, animal_id
     100
     101
     102Therefore, attributes such as clinic_id, animal_id, appointment_id, admin_id, owner_id, and client_id do not need to appear explicitly in the candidate key if they can already be derived from other identifiers.
     103}}}
     104
     105A candidate key for the initial de-normalized relation is:
     106
     107{{{
     108K = {notification_id, application_id, listing_id, healthrecord_id, review_id, saved_user_id, saved_listing_id}
     109}}}
     110
     111Its closure includes all attributes of R, so it determines the whole relation.
     112
     113=== Closure proof for K
     114{{{
     115Let K = {notification_id, application_id, listing_id, healthrecord_id, review_id, saved_user_id, saved_listing_id}
    97116
    98117Compute K+:
     
    107126Add: username, email, name, surname, full_name, password_hash, user_created_at
    108127
     128From FD7:
     129application_id → app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status, reviewed_at, reviewed_by, denial_reason, clinic_id
     130Add: app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status, reviewed_at, reviewed_by, denial_reason, clinic_id
     131
     132From FD6:
     133clinic_id → clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city
     134Add: clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city
     135
    109136From FD9:
    110137listing_id → listing_status, listing_created_at, price, listing_description, animal_id, admin_id
    111 Add: listing_status, listing_created_at, price, listing_description, admin_id
    112 (animal_id already in K)
     138Add: listing_status, listing_created_at, price, listing_description, animal_id, admin_id
    113139
    114140From FD3:
    115 admin_id → user_id (already in K+)
     141admin_id → user_id
     142user_id already belongs to K+
    116143
    117144From FD8:
     
    125152From FD2:
    126153client_id → user_id, is_blocked, blocked_at, blocked_reason
    127 Add: is_blocked, blocked_at, blocked_reason (user_id already in K+)
    128 
    129 From FD6:
    130 clinic_id → clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city
    131 Add: clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city
    132 
    133 From FD7:
    134 application_id → app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status,
    135                  reviewed_at, reviewed_by, denial_reason, clinic_id
    136 Add: app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status,
    137      reviewed_at, reviewed_by, denial_reason (clinic_id already in K+)
     154Add: is_blocked, blocked_at, blocked_reason
     155user_id already belongs to K+
     156
     157From FD11:
     158healthrecord_id → hr_type, hr_description, hr_date, animal_id, appointment_id
     159Add: hr_type, hr_description, hr_date, appointment_id
     160animal_id already belongs to K+
    138161
    139162From FD10:
    140163appointment_id → date_time, appointment_status, notes, animal_id, clinic_id
    141 Add: date_time, appointment_status, notes (animal_id, clinic_id already in K+)
    142 
    143 From FD11:
    144 healthrecord_id → hr_type, hr_description, hr_date, animal_id, appointment_id
    145 Add: hr_type, hr_description, hr_date
     164Add: date_time, appointment_status, notes
     165animal_id and clinic_id already belong to K+
    146166
    147167From FD12:
     
    149169Add: reviewer_id, rating, comment, review_created_at, updated_at, is_deleted
    150170
    151 From FD13 and FD14:
    152 review_id → target_user_id, target_clinic_id
    153 Add: target_user_id, target_clinic_id
    154 
    155 FD15 adds no attributes.
    156 
    157 Therefore K+ = R, so K is a superkey.
    158 [[BR]]
    159 K is also a minimal and a candidate key.
    160 }}}
     171From FD13:
     172review_id → target_user_id
     173Add: target_user_id
     174
     175From FD14:
     176review_id → target_clinic_id
     177Add: target_clinic_id
     178}}}
     179
     180Therefore, K+ = R, so K is a superkey.
     181
     182K is also minimal, because removing any attribute from K would cause the loss of one independent component of the universal relation
     183{{{#!comment
     184* without notification_id, notification attributes cannot be derived
     185* without application_id, application attributes cannot be derived
     186* without listing_id, listing attributes cannot be derived
     187* without healthrecord_id, health record attributes cannot be derived
     188* without review_id, review attributes cannot be derived
     189* without saved_user_id or saved_listing_id, the FavoriteListings relationship cannot be identified
     190}}}
     191
     192So K is a candidate key.
     193
    161194=== Chosen primary key
    162 We select the minimal superkey built from "left-only identifiers":
    163 [[BR]]
    164 Primary key (of the initial de-normalized R):
    165 [[BR]]
    166 {{{
    167 {notification_id, application_id, clinic_id, animal_id, listing_id,
    168 appointment_id, healthrecord_id, review_id, saved_user_id, saved_listing_id}
    169 }}}
     195We select the candidate key above as the primary key of the initial de-normalized relation:
     196
     197{{{
     198{notification_id, application_id, listing_id, healthrecord_id, review_id, saved_user_id, saved_listing_id}
     199}}}
     200
    170201== Normal form checks
     202
    171203=== Check for 1NF
    172 R satisfies 1NF (all attributes are atomic and each row is uniquely identified by the chosen primary key).
     204R satisfies 1NF because all attributes are atomic and there are no repeating groups.
     205Each row is uniquely identified by the chosen composite candidate key.
     206
    173207=== Check for 2NF
    174 R does not satisfy 2NF, because many non-key attributes depend only on a part of the composite primary key:
    175 * listing_id → price, listing_description, listing_status, …
    176 * animal_id → species, breed, …
    177 * clinic_id → clinic_name, …
    178 * user_id → username, email, …
    179 * review_id → rating, comment, …
    180 * notification_id → message, type, …
    181 So we decompose by grouping attributes by the key they depend on.
     208R does not satisfy 2NF because it has a composite primary key and many non-key attributes depend only on part of that key rather than on the whole key.
     209
     210Examples:
     211* listing_id → listing_status, listing_created_at, price, listing_description
     212* animal_id → animal_name, species, breed, sex, date_of_birth, photo_url, owner_id
     213* clinic_id → clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city
     214* review_id → reviewer_id, rating, comment, review_created_at, updated_at, is_deleted
     215* notification_id → notification_type, notification_message, notification_created_at, is_read, user_id
     216
     217These are partial dependencies on components of the composite key, so R violates 2NF.
     218
    182219== Decomposition to 2NF
     220
    183221=== Grouping by determinants (partial dependencies)
    184222* user_id → username, email, name, surname, full_name, password_hash, user_created_at
     
    188226* notification_id → notification_type, notification_message, notification_created_at, is_read, user_id
    189227* clinic_id → clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city
    190 * application_id → app_*, submitted_at, app_status, reviewed_at, reviewed_by, denial_reason, clinic_id
    191 * animal_id → animal_*, owner_id
    192 * listing_id → listing_*, animal_id, admin_id
     228* application_id → app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status, reviewed_at, reviewed_by, denial_reason, clinic_id
     229* animal_id → animal_name, located_name, species, animal_type, breed, sex, date_of_birth, photo_url, owner_id
     230* listing_id → listing_status, listing_created_at, price, listing_description, animal_id, admin_id
    193231* appointment_id → date_time, appointment_status, notes, animal_id, clinic_id
    194232* healthrecord_id → hr_type, hr_description, hr_date, animal_id, appointment_id
    195233* review_id → reviewer_id, rating, comment, review_created_at, updated_at, is_deleted
    196 * review_id → target_user_id / target_clinic_id (subtype dependent)
    197 * (saved_user_id, saved_listing_id) → /
    198 === 2NF relations(first decomposition)
     234* review_id → target_user_id / target_clinic_id
     235* FavoriteListings has key (saved_user_id, saved_listing_id) only
     236
     237=== 2NF relations (first decomposition)
    199238* Users(user_id, username, email, name, surname, full_name, password_hash, user_created_at)
    200239* Clients(client_id, user_id, is_blocked, blocked_at, blocked_reason)
     
    210249* Reviews(review_id, reviewer_id, rating, comment, review_created_at, updated_at, is_deleted)
    211250* FavoriteListings(saved_user_id, saved_listing_id)
     251
    212252{{{Lossless join:}}}
    213253[[BR]]
    214 For each decomposition R → R1, R2, the intersection (R1 ∩ R2) is a key of at least one of the relations(user_id is key of Users etc.).So each decomposition step is lossless.
    215 {{{Dependency preservation:}}} FD1–FD15 are preserved because each FD now lives entirely inside one of the new relations.
    216 == Check for 3NF(and decomposition)
     254The decomposition is lossless because each new relation is formed from a determinant that becomes the key of that relation. The common attributes used in the decomposition are keys in at least one of the resulting relations, so no information is lost.
     255
     256{{{Dependency preservation:}}}
     257[[BR]]
     258The decomposition is dependency-preserving because each functional dependency FD1–FD14 is represented fully inside one of the resulting relations. FavoriteListings has only its composite key and no additional non-trivial dependencies.
     259
     260== Check for 3NF (and decomposition)
    217261Now we look for transitive dependencies inside the produced relations.
     262
    218263=== Typical transitive issues from the ER
    219 * In Clients(client_id, user_id, …) we have client_id → user_id and user_id → (email, username, …) but those user attributes are no longer in Clients, so this is already clean.
    220 * In Listings(listing_id, admin_id, animal_id, …) we have admin_id → user_id (FD3) but user data is in Users, not in Listings, so no transitive dependency remains in Listings.
    221 * In VetClinicApplications(application_id, clinic_id, …) we have clinic_id → clinic attributes, but clinic attributes are not stored there.
    222 === Review subtypes(User Review and Clinic Review)
    223 The ER has specialization:
    224 [[BR]]
    225  Review is generalized into UserReview and ClinicReview. If we keep target_user_id and target_clinic_id inside Reviews, we introduce subtype semantics and potential null-heavy design.
    226 [[BR]]
     264* In Clients(client_id, user_id, …), we have client_id → user_id, but user descriptive attributes are stored in Users, so no transitive dependency remains inside Clients.
     265* In Listings(listing_id, admin_id, animal_id, …), we have listing_id → admin_id and globally admin_id → user_id, but user attributes are stored in Users, not in Listings, so no transitive dependency remains inside Listings.
     266* In VetClinicApplications(application_id, clinic_id, …), we have application_id → clinic_id and clinic_id determines clinic attributes, but those attributes are stored in VetClinics, not in VetClinicApplications, so no transitive dependency remains there either.
     267
     268=== Review subtypes (User Review and Clinic Review)
     269The ER model contains a specialization of Review into UserReview and ClinicReview.
     270If both target_user_id and target_clinic_id are stored in Reviews, the relation mixes subtype-specific semantics and may introduce nulls or invalid combinations.
    227271So we decompose:
    228 [[BR]]
    229 {{{UserReviews}}}(review_id, target_user_id)
    230 {{{ClinicReviews}}}(review_id, target_clinic_id)
     272{{{
     273UserReviews(review_id, target_user_id)
     274ClinicReviews(review_id, target_clinic_id)
     275}}}
     276
    231277Now:
    232 * review_id determines the base review fields in Reviews (FD12)
    233 * review_id determines subtype target in the subtype table (FD13/FD14)
     278* Reviews stores the common review attributes
     279* UserReviews stores the target user of a user review
     280* ClinicReviews stores the target clinic of a clinic review
     281
    234282=== Results after 3NF step
    235283* UserReviews(review_id, target_user_id)
    236284* ClinicReviews(review_id, target_clinic_id)
    237 All relations are now in 3NF.
     285
     286After separating subtype-specific attributes into UserReviews and ClinicReviews, all resulting relations satisfy 3NF.
     287
    238288== Check for BCNF
    239 For each relation, every non-trivial FD has a determinant that is a superkey:
     289Each resulting relation is also in BCNF because, in every non-trivial functional dependency within a relation, the determinant is a superkey of that relation.
     290
    240291* Users: user_id is key → BCNF
    241292* Clients: client_id is key → BCNF
     
    250301* HealthRecords: healthrecord_id is key → BCNF
    251302* Reviews: review_id is key → BCNF
    252 * UserReviews / ClinicReviews: review_id is key → BCNF
     303* UserReviews: review_id is key → BCNF
     304* ClinicReviews: review_id is key → BCNF
    253305* FavoriteListings: (saved_user_id, saved_listing_id) is key → BCNF
    254306{{{So the schema satisfies BCNF.}}}
    255 == Check for 4NF(multivalued dependencies)
    256 No additional multivalued dependencies are identified beyond the M:N relationship FavoriteListing,which is already represented as its own relation FavoriteListings(saved_user_id, saved_listing_id). So no further 4NF decomposition is required.
     307
     308== Check for 4NF (multivalued dependencies)
     309No additional non-trivial multivalued dependencies are identified in the resulting schema.
     310The many-to-many relationship FavoriteListing is already decomposed into the separate relation FavoriteListings(saved_user_id, saved_listing_id), so no further decomposition is required for 4NF.
    257311{{{The schema satisfies 4NF.}}}
     312
    258313== Final relations
    259314* Users(user_id, username, email, name, surname, full_name, password_hash, user_created_at)
     
    261316* Admins(admin_id, user_id)
    262317* Owners(owner_id, client_id)
    263 
    264318* Notifications(notification_id, user_id, notification_type, notification_message, notification_created_at, is_read)
    265 
    266319* VetClinics(clinic_id, clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city)
    267 
    268320* VetClinicApplications(application_id, clinic_id, app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status, reviewed_at, reviewed_by, denial_reason)
    269 
    270321* Animals(animal_id, owner_id, animal_name, located_name, species, animal_type, breed, sex, date_of_birth, photo_url)
    271 
    272322* Listings(listing_id, admin_id, animal_id, listing_status, listing_created_at, price, listing_description)
    273 
    274323* FavoriteListings(saved_user_id, saved_listing_id)
    275 
    276324* Appointments(appointment_id, clinic_id, animal_id, date_time, appointment_status, notes)
    277 
    278325* HealthRecords(healthrecord_id, appointment_id, animal_id, hr_type, hr_description, hr_date)
    279 
    280326* Reviews(review_id, reviewer_id, rating, comment, review_created_at, updated_at, is_deleted)
    281327* UserReviews(review_id, target_user_id)
    282328* ClinicReviews(review_id, target_clinic_id)
    283329
    284 Business constraint:
    285 A ClinicReview is allowed only if the reviewer has at least one Appointment at that clinic with status = 'DONE'.
    286330== Conclusion
    287 The final normalized relational design matches the Phase 2 ER-based design:
    288 * Each entity becomes a relation.
    289 * Specializations are represented as subtype relations.
    290 * The M:N relationship FavoriteListing is represented as a junction relation FavoriteListings.
    291 Therefore, there are no structural differences from Phase 2.
     331The normalization process confirms the same structural design obtained from the ER model in phase 2.