wiki:Normalization

Normalization

Initial de-normalized relation and functional dependencies

Global set of attributes

This represents the initial de-normalized relation before any normalization is applied. Attribute names are made unique to avoid duplication.

R = {
user_id, username, email, name, surname, full_name, password_hash, user_created_at,
client_id, is_blocked, blocked_at, blocked_reason,
admin_id,
owner_id,

notification_id, notification_type, notification_message, notification_created_at, is_read,

animal_id, animal_name, located_name, species, animal_type, breed, sex, date_of_birth, photo_url,

listing_id, listing_status, listing_created_at, price, listing_description,

saved_user_id, saved_listing_id,

clinic_id, clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city,

application_id, app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status, reviewed_at, reviewed_by, denial_reason,

appointment_id, date_time, appointment_status, notes,

healthrecord_id, hr_type, hr_description, hr_date,

review_id, reviewer_id, rating, comment, review_created_at, updated_at, is_deleted,
target_user_id, target_clinic_id
}

Functional dependencies (initial set)

User / specialization

FD1: user_id → username, email, name, surname, full_name, password_hash, user_created_at
FD2: client_id → user_id, is_blocked, blocked_at, blocked_reason
FD3: admin_id → user_id
FD4: owner_id → client_id

Notification

FD5: notification_id → notification_type, notification_message, notification_created_at, is_read, user_id

Vet Clinic / application

FD6: clinic_id → clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city
FD7: application_id → app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status, reviewed_at, reviewed_by, denial_reason, clinic_id

Animal

FD8: animal_id → animal_name, located_name, species, animal_type, breed, sex, date_of_birth, photo_url, owner_id

Listing

FD9: listing_id → listing_status, listing_created_at, price, listing_description, animal_id, admin_id

Appointment

FD10: appointment_id → date_time, appointment_status, notes, animal_id, clinic_id

Health Record

FD11: healthrecord_id → hr_type, hr_description, hr_date, animal_id, appointment_id

Review and subtypes

FD12: review_id → reviewer_id, rating, comment, review_created_at, updated_at, is_deleted
FD13: review_id → target_user_id
FD14: review_id → target_clinic_id

Favorite Listing

The relation FavoriteListings has composite key (saved_user_id, saved_listing_id)
and no additional non-trivial functional dependencies.

Candidate keys and primary key selection

Determining a candidate key

To determine a candidate key of the universal de-normalized relation, we must identify a minimal set of identifiers whose closure determines all attributes in R.

A candidate key for the initial de-normalized relation is:

K = {notification_id, application_id, listing_id, healthrecord_id, review_id, saved_user_id, saved_listing_id}

Its closure includes all attributes of R, so it determines the whole relation.

Closure proof for K

Let K = {notification_id, application_id, listing_id, healthrecord_id, review_id, saved_user_id, saved_listing_id}

Compute K+:
Start: K+ = K

From FD5:
notification_id → notification_type, notification_message, notification_created_at, is_read, user_id
Add: notification_type, notification_message, notification_created_at, is_read, user_id

From FD1:
user_id → username, email, name, surname, full_name, password_hash, user_created_at
Add: username, email, name, surname, full_name, password_hash, user_created_at

From FD7:
application_id → app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status, reviewed_at, reviewed_by, denial_reason, clinic_id
Add: app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status, reviewed_at, reviewed_by, denial_reason, clinic_id

From FD6:
clinic_id → clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city
Add: clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city

From FD9:
listing_id → listing_status, listing_created_at, price, listing_description, animal_id, admin_id
Add: listing_status, listing_created_at, price, listing_description, animal_id, admin_id

From FD3:
admin_id → user_id
user_id already belongs to K+

From FD8:
animal_id → animal_name, located_name, species, animal_type, breed, sex, date_of_birth, photo_url, owner_id
Add: animal_name, located_name, species, animal_type, breed, sex, date_of_birth, photo_url, owner_id

From FD4:
owner_id → client_id
Add: client_id

From FD2:
client_id → user_id, is_blocked, blocked_at, blocked_reason
Add: is_blocked, blocked_at, blocked_reason
user_id already belongs to K+

From FD11:
healthrecord_id → hr_type, hr_description, hr_date, animal_id, appointment_id
Add: hr_type, hr_description, hr_date, appointment_id
animal_id already belongs to K+

From FD10:
appointment_id → date_time, appointment_status, notes, animal_id, clinic_id
Add: date_time, appointment_status, notes
animal_id and clinic_id already belong to K+

From FD12:
review_id → reviewer_id, rating, comment, review_created_at, updated_at, is_deleted
Add: reviewer_id, rating, comment, review_created_at, updated_at, is_deleted

From FD13:
review_id → target_user_id
Add: target_user_id

From FD14:
review_id → target_clinic_id
Add: target_clinic_id

Therefore, K+ = R, so K is a superkey.

K is also minimal, because removing any attribute from K would cause the loss of one independent component of the universal relation

So K is a candidate key.

Chosen primary key

We select the candidate key above as the primary key of the initial de-normalized relation:

{notification_id, application_id, listing_id, healthrecord_id, review_id, saved_user_id, saved_listing_id}

Normal form checks

Check for 1NF

R satisfies 1NF because all attributes are atomic and there are no repeating groups. Each row is uniquely identified by the chosen composite candidate key.

Check for 2NF

R does not satisfy 2NF because it has a composite primary key and many non-key attributes depend only on part of that key rather than on the whole key.

Examples:

  • listing_id → listing_status, listing_created_at, price, listing_description
  • animal_id → animal_name, species, breed, sex, date_of_birth, photo_url, owner_id
  • clinic_id → clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city
  • review_id → reviewer_id, rating, comment, review_created_at, updated_at, is_deleted
  • notification_id → notification_type, notification_message, notification_created_at, is_read, user_id

These are partial dependencies on components of the composite key, so R violates 2NF.

Decomposition to 2NF

Grouping by determinants (partial dependencies)

  • user_id → username, email, name, surname, full_name, password_hash, user_created_at
  • client_id → user_id, is_blocked, blocked_at, blocked_reason
  • admin_id → user_id
  • owner_id → client_id
  • notification_id → notification_type, notification_message, notification_created_at, is_read, user_id
  • clinic_id → clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city
  • application_id → app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status, reviewed_at, reviewed_by, denial_reason, clinic_id
  • animal_id → animal_name, located_name, species, animal_type, breed, sex, date_of_birth, photo_url, owner_id
  • listing_id → listing_status, listing_created_at, price, listing_description, animal_id, admin_id
  • appointment_id → date_time, appointment_status, notes, animal_id, clinic_id
  • healthrecord_id → hr_type, hr_description, hr_date, animal_id, appointment_id
  • review_id → reviewer_id, rating, comment, review_created_at, updated_at, is_deleted
  • review_id → target_user_id / target_clinic_id
  • FavoriteListings has key (saved_user_id, saved_listing_id) only

2NF relations (first decomposition)

  • Users(user_id, username, email, name, surname, full_name, password_hash, user_created_at)
  • Clients(client_id, user_id, is_blocked, blocked_at, blocked_reason)
  • Admins(admin_id, user_id)
  • Owners(owner_id, client_id)
  • Notifications(notification_id, user_id, notification_type, notification_message, notification_created_at, is_read)
  • VetClinics(clinic_id, clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city)
  • VetClinicApplications(application_id, clinic_id, app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status, reviewed_at, reviewed_by, denial_reason)
  • Animals(animal_id, owner_id, animal_name, located_name, species, animal_type, breed, sex, date_of_birth, photo_url)
  • Listings(listing_id, admin_id, animal_id, listing_status, listing_created_at, price, listing_description)
  • Appointments(appointment_id, clinic_id, animal_id, date_time, appointment_status, notes)
  • HealthRecords(healthrecord_id, appointment_id, animal_id, hr_type, hr_description, hr_date)
  • Reviews(review_id, reviewer_id, rating, comment, review_created_at, updated_at, is_deleted)
  • FavoriteListings(saved_user_id, saved_listing_id)

Lossless join:
The decomposition is lossless because each new relation is formed from a determinant that becomes the key of that relation. The common attributes used in the decomposition are keys in at least one of the resulting relations, so no information is lost.

Dependency preservation:
The decomposition is dependency-preserving because each functional dependency FD1–FD14 is represented fully inside one of the resulting relations. FavoriteListings has only its composite key and no additional non-trivial dependencies.

Check for 3NF (and decomposition)

Now we look for transitive dependencies inside the produced relations.

Typical transitive issues from the ER

  • In Clients(client_id, user_id, …), we have client_id → user_id, but user descriptive attributes are stored in Users, so no transitive dependency remains inside Clients.
  • In Listings(listing_id, admin_id, animal_id, …), we have listing_id → admin_id and globally admin_id → user_id, but user attributes are stored in Users, not in Listings, so no transitive dependency remains inside Listings.
  • In VetClinicApplications(application_id, clinic_id, …), we have application_id → clinic_id and clinic_id determines clinic attributes, but those attributes are stored in VetClinics, not in VetClinicApplications, so no transitive dependency remains there either.

Review subtypes (User Review and Clinic Review)

The ER model contains a specialization of Review into UserReview and ClinicReview. If both target_user_id and target_clinic_id are stored in Reviews, the relation mixes subtype-specific semantics and may introduce nulls or invalid combinations. So we decompose:

UserReviews(review_id, target_user_id)
ClinicReviews(review_id, target_clinic_id)

Now:

  • Reviews stores the common review attributes
  • UserReviews stores the target user of a user review
  • ClinicReviews stores the target clinic of a clinic review

Results after 3NF step

After separating subtype-specific attributes into UserReviews and ClinicReviews, all resulting relations satisfy 3NF.

Check for BCNF

Each resulting relation is also in BCNF because, in every non-trivial functional dependency within a relation, the determinant is a superkey of that relation.

  • Users: user_id is key → BCNF
  • Clients: client_id is key → BCNF
  • Admins: admin_id is key → BCNF
  • Owners: owner_id is key → BCNF
  • Notifications: notification_id is key → BCNF
  • VetClinics: clinic_id is key → BCNF
  • VetClinicApplications: application_id is key → BCNF
  • Animals: animal_id is key → BCNF
  • Listings: listing_id is key → BCNF
  • Appointments: appointment_id is key → BCNF
  • HealthRecords: healthrecord_id is key → BCNF
  • Reviews: review_id is key → BCNF
  • UserReviews: review_id is key → BCNF
  • ClinicReviews: review_id is key → BCNF
  • FavoriteListings: (saved_user_id, saved_listing_id) is key → BCNF

So the schema satisfies BCNF.

Check for 4NF (multivalued dependencies)

No additional non-trivial multivalued dependencies are identified in the resulting schema. The many-to-many relationship FavoriteListing is already decomposed into the separate relation FavoriteListings(saved_user_id, saved_listing_id), so no further decomposition is required for 4NF. The schema satisfies 4NF.

Final relations

  • Users(user_id, username, email, name, surname, full_name, password_hash, user_created_at)
  • Clients(client_id, user_id, is_blocked, blocked_at, blocked_reason)
  • Admins(admin_id, user_id)
  • Owners(owner_id, client_id)
  • Notifications(notification_id, user_id, notification_type, notification_message, notification_created_at, is_read)
  • VetClinics(clinic_id, clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city)
  • VetClinicApplications(application_id, clinic_id, app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status, reviewed_at, reviewed_by, denial_reason)
  • Animals(animal_id, owner_id, animal_name, located_name, species, animal_type, breed, sex, date_of_birth, photo_url)
  • Listings(listing_id, admin_id, animal_id, listing_status, listing_created_at, price, listing_description)
  • FavoriteListings(saved_user_id, saved_listing_id)
  • Appointments(appointment_id, clinic_id, animal_id, date_time, appointment_status, notes)
  • HealthRecords(healthrecord_id, appointment_id, animal_id, hr_type, hr_description, hr_date)
  • Reviews(review_id, reviewer_id, rating, comment, review_created_at, updated_at, is_deleted)
  • UserReviews(review_id, target_user_id)
  • ClinicReviews(review_id, target_clinic_id)

Conclusion

The normalization process confirms the same structural design obtained from the ER model in phase 2.

Last modified 11 days ago Last modified on 03/10/26 00:09:48
Note: See TracWiki for help on using the wiki.