wiki:Normalization

Normalization

Initial de-normalized relation and functional dependencies

Global set of attributes

This represents the initial de-normalized relation before any normalization is applied. Attribute names are made unique to avoid duplication.

R = {
user_id, username, email, name, surname, full_name, password_hash, user_created_at,
client_id, is_blocked, blocked_at, blocked_reason,
admin_id,
owner_id,

notification_id, notification_type, notification_message, notification_created_at, is_read,

animal_id, animal_name, located_name, species, animal_type, breed, sex, date_of_birth, photo_url,

listing_id, listing_status, listing_created_at, price, listing_description,

saved_user_id, saved_listing_id,

clinic_id, clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city,

application_id, app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status, reviewed_at, reviewed_by, denial_reason,

appointment_id, date_time, appointment_status, notes,

healthrecord_id, hr_type, hr_description, hr_date,

review_id, reviewer_id, rating, comment, review_created_at, updated_at, is_deleted,
target_user_id, target_clinic_id
}

Functional dependencies (initial set)

User / specialization

FD1: user_id → username, email, name, surname, full_name, password_hash, user_created_at
FD2: client_id → user_id, is_blocked, blocked_at, blocked_reason
FD3: admin_id → user_id
FD4: owner_id → client_id

Notification

FD5: notification_id → notification_type, notification_message, notification_created_at, is_read, user_id

Vet Clinic / application

FD6: clinic_id → clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city
FD7: application_id → app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status, reviewed_at, reviewed_by, denial_reason, clinic_id

Animal

FD8: animal_id → animal_name, located_name, species, animal_type, breed, sex, date_of_birth, photo_url, owner_id

Listing

FD9: listing_id → listing_status, listing_created_at, price, listing_description, animal_id, admin_id

Appointment

FD10: appointment_id → date_time, appointment_status, notes, animal_id, clinic_id

Health Record

FD11: healthrecord_id → hr_type, hr_description, hr_date, animal_id, appointment_id

Review and subtypes

FD12: review_id → reviewer_id, rating, comment, review_created_at, updated_at, is_deleted
FD13: review_id → target_user_id
FD14: review_id → target_clinic_id

Favorite Listing

The relation FavoriteListings has composite key (saved_user_id, saved_listing_id)
and no additional non-trivial functional dependencies.

Candidate keys and primary key selection

Determining a candidate key

To determine a candidate key of the universal de-normalized relation, we must identify a minimal set of identifiers whose closure determines all attributes in R.

A candidate key for the initial de-normalized relation is:

K = {notification_id, application_id, listing_id, healthrecord_id, review_id, saved_user_id, saved_listing_id}

Its closure includes all attributes of R, so it determines the whole relation.

Closure proof for K

Let K = {notification_id, application_id, listing_id, healthrecord_id, review_id, saved_user_id, saved_listing_id}

Compute K+:
Start: K+ = K

From FD5:
notification_id → notification_type, notification_message, notification_created_at, is_read, user_id
Add: notification_type, notification_message, notification_created_at, is_read, user_id

From FD1:
user_id → username, email, name, surname, full_name, password_hash, user_created_at
Add: username, email, name, surname, full_name, password_hash, user_created_at

From FD7:
application_id → app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status, reviewed_at, reviewed_by, denial_reason, clinic_id
Add: app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status, reviewed_at, reviewed_by, denial_reason, clinic_id

From FD6:
clinic_id → clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city
Add: clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city

From FD9:
listing_id → listing_status, listing_created_at, price, listing_description, animal_id, admin_id
Add: listing_status, listing_created_at, price, listing_description, animal_id, admin_id

From FD3:
admin_id → user_id
user_id already belongs to K+

From FD8:
animal_id → animal_name, located_name, species, animal_type, breed, sex, date_of_birth, photo_url, owner_id
Add: animal_name, located_name, species, animal_type, breed, sex, date_of_birth, photo_url, owner_id

From FD4:
owner_id → client_id
Add: client_id

From FD2:
client_id → user_id, is_blocked, blocked_at, blocked_reason
Add: is_blocked, blocked_at, blocked_reason
user_id already belongs to K+

From FD11:
healthrecord_id → hr_type, hr_description, hr_date, animal_id, appointment_id
Add: hr_type, hr_description, hr_date, appointment_id
animal_id already belongs to K+

From FD10:
appointment_id → date_time, appointment_status, notes, animal_id, clinic_id
Add: date_time, appointment_status, notes
animal_id and clinic_id already belong to K+

From FD12:
review_id → reviewer_id, rating, comment, review_created_at, updated_at, is_deleted
Add: reviewer_id, rating, comment, review_created_at, updated_at, is_deleted

From FD13:
review_id → target_user_id
Add: target_user_id

From FD14:
review_id → target_clinic_id
Add: target_clinic_id

Therefore, K+ = R, so K is a superkey.

K is also minimal, because removing any attribute from K would cause the loss of one independent component of the universal relation

So K is a candidate key.

Chosen primary key

We select the candidate key above as the primary key of the initial de-normalized relation:

{notification_id, application_id, listing_id, healthrecord_id, review_id, saved_user_id, saved_listing_id}

Normal form checks

Check for 1NF

R satisfies 1NF because all attributes are atomic and there are no repeating groups. Each row is uniquely identified by the chosen composite candidate key.

Check for 2NF

R does not satisfy 2NF because it has a composite primary key and many non-key attributes depend only on part of that key rather than on the whole key.

Examples:

  • listing_id → listing_status, listing_created_at, price, listing_description
  • animal_id → animal_name, species, breed, sex, date_of_birth, photo_url, owner_id
  • clinic_id → clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city
  • review_id → reviewer_id, rating, comment, review_created_at, updated_at, is_deleted
  • notification_id → notification_type, notification_message, notification_created_at, is_read, user_id

These are partial dependencies on components of the composite key, so R violates 2NF.

Decomposition to 2NF

Grouping by determinants (partial dependencies)

  • user_id → username, email, name, surname, full_name, password_hash, user_created_at
  • client_id → user_id, is_blocked, blocked_at, blocked_reason
  • admin_id → user_id
  • owner_id → client_id
  • notification_id → notification_type, notification_message, notification_created_at, is_read, user_id
  • clinic_id → clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city
  • application_id → app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status, reviewed_at, reviewed_by, denial_reason, clinic_id
  • animal_id → animal_name, located_name, species, animal_type, breed, sex, date_of_birth, photo_url, owner_id
  • listing_id → listing_status, listing_created_at, price, listing_description, animal_id, admin_id
  • appointment_id → date_time, appointment_status, notes, animal_id, clinic_id
  • healthrecord_id → hr_type, hr_description, hr_date, animal_id, appointment_id
  • review_id → reviewer_id, rating, comment, review_created_at, updated_at, is_deleted
  • review_id → target_user_id / target_clinic_id
  • FavoriteListings has key (saved_user_id, saved_listing_id) only

Explanation of grouping by determinants

The grouping is performed based on the principle that each set of attributes is determined by a single identifier. This ensures that each relation will have a primary key on which all other attributes fully depend, thus satisfying 2NF.

  • user_id → username, email, name, surname, full_name, password_hash, user_created_at

These attributes describe a user and depend only on user_id, so they are grouped into the Users relation.

  • client_id → user_id, is_blocked, blocked_at, blocked_reason

These attributes describe the client role of a user, so they are grouped into Clients, separated from Users.

  • admin_id → user_id

This represents the admin role, so it is separated into its own relation to reflect specialization.

  • owner_id → client_id

Owner is a specialization of Client, so it is grouped separately to preserve this hierarchy.

  • notification_id → notification_type, notification_message, notification_created_at, is_read, user_id

These attributes describe a notification and depend only on notification_id, so they form the Notifications relation.

  • clinic_id → clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city

These attributes describe a clinic and depend only on clinic_id, so they are grouped into VetClinics.

  • application_id → app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status, reviewed_at, reviewed_by, denial_reason, clinic_id

These attributes describe a clinic application and depend only on application_id, so they form VetClinicApplications.

  • animal_id → animal_name, located_name, species, animal_type, breed, sex, date_of_birth, photo_url, owner_id

These attributes describe an animal and depend only on animal_id, so they form the Animals relation.

  • listing_id → listing_status, listing_created_at, price, listing_description, animal_id, admin_id

These attributes describe a listing and depend only on listing_id, so they form the Listings relation.

  • appointment_id → date_time, appointment_status, notes, animal_id, clinic_id

These attributes describe an appointment and depend only on appointment_id, so they form the Appointments relation.

  • healthrecord_id → hr_type, hr_description, hr_date, animal_id, appointment_id

These attributes describe a health record and depend only on healthrecord_id, so they form the HealthRecords relation.

  • review_id → reviewer_id, rating, comment, review_created_at, updated_at, is_deleted

These attributes describe a review and depend only on review_id, so they form the Reviews relation.

  • review_id → target_user_id / target_clinic_id

These attributes represent specialization (UserReview and ClinicReview), so they will be separated later.

This represents a many-to-many relationship between users and listings. Since it has no non-key attributes, it is already correctly grouped and does not require further decomposition.

2NF relations (first decomposition)

  • Users(user_id, username, email, name, surname, full_name, password_hash, user_created_at)
  • Clients(client_id, user_id, is_blocked, blocked_at, blocked_reason)
  • Admins(admin_id, user_id)
  • Owners(owner_id, client_id)
  • Notifications(notification_id, user_id, notification_type, notification_message, notification_created_at, is_read)
  • VetClinics(clinic_id, clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city)
  • VetClinicApplications(application_id, clinic_id, app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status, reviewed_at, reviewed_by, denial_reason)
  • Animals(animal_id, owner_id, animal_name, located_name, species, animal_type, breed, sex, date_of_birth, photo_url)
  • Listings(listing_id, admin_id, animal_id, listing_status, listing_created_at, price, listing_description)
  • Appointments(appointment_id, clinic_id, animal_id, date_time, appointment_status, notes)
  • HealthRecords(healthrecord_id, appointment_id, animal_id, hr_type, hr_description, hr_date)
  • Reviews(review_id, reviewer_id, rating, comment, review_created_at, updated_at, is_deleted)
  • FavoriteListings(saved_user_id, saved_listing_id)

Lossless join:
The decomposition is lossless because each new relation is formed from a determinant that becomes the key of that relation. The common attributes used in the decomposition are keys in at least one of the resulting relations, so no information is lost.

Dependency preservation:
The decomposition is dependency-preserving because each functional dependency FD1–FD14 is represented fully inside one of the resulting relations. FavoriteListings has only its composite key and no additional non-trivial dependencies.

Check for 3NF (and decomposition)

Now we look for transitive dependencies inside the produced relations.

Typical transitive issues from the ER

  • In Clients(client_id, user_id, …), we have client_id → user_id, but user descriptive attributes are stored in Users, so no transitive dependency remains inside Clients.
  • In Listings(listing_id, admin_id, animal_id, …), we have listing_id → admin_id and globally admin_id → user_id, but user attributes are stored in Users, not in Listings, so no transitive dependency remains inside Listings.
  • In VetClinicApplications(application_id, clinic_id, …), we have application_id → clinic_id and clinic_id determines clinic attributes, but those attributes are stored in VetClinics, not in VetClinicApplications, so no transitive dependency remains there either.

Review subtypes (User Review and Clinic Review)

The ER model contains a specialization of Review into UserReview and ClinicReview. If both target_user_id and target_clinic_id are stored in Reviews, the relation mixes subtype-specific semantics and may introduce nulls or invalid combinations. So we decompose:

UserReviews(review_id, target_user_id)
ClinicReviews(review_id, target_clinic_id)

Now:

  • Reviews stores the common review attributes
  • UserReviews stores the target user of a user review
  • ClinicReviews stores the target clinic of a clinic review

Results after 3NF step

After separating subtype-specific attributes into UserReviews and ClinicReviews, all resulting relations satisfy 3NF.

Check for BCNF

Each resulting relation is also in BCNF because, in every non-trivial functional dependency within a relation, the determinant is a superkey of that relation.

  • Users: user_id is key → BCNF
  • Clients: client_id is key → BCNF
  • Admins: admin_id is key → BCNF
  • Owners: owner_id is key → BCNF
  • Notifications: notification_id is key → BCNF
  • VetClinics: clinic_id is key → BCNF
  • VetClinicApplications: application_id is key → BCNF
  • Animals: animal_id is key → BCNF
  • Listings: listing_id is key → BCNF
  • Appointments: appointment_id is key → BCNF
  • HealthRecords: healthrecord_id is key → BCNF
  • Reviews: review_id is key → BCNF
  • UserReviews: review_id is key → BCNF
  • ClinicReviews: review_id is key → BCNF
  • FavoriteListings: (saved_user_id, saved_listing_id) is key → BCNF

So the schema satisfies BCNF.

Check for 4NF (multivalued dependencies)

No additional non-trivial multivalued dependencies are identified in the resulting schema. The many-to-many relationship FavoriteListing is already decomposed into the separate relation FavoriteListings(saved_user_id, saved_listing_id), so no further decomposition is required for 4NF. The schema satisfies 4NF.

Final relations

  • Users(user_id, username, email, name, surname, full_name, password_hash, user_created_at)
  • Clients(client_id, user_id, is_blocked, blocked_at, blocked_reason)
  • Admins(admin_id, user_id)
  • Owners(owner_id, client_id)
  • Notifications(notification_id, user_id, notification_type, notification_message, notification_created_at, is_read)
  • VetClinics(clinic_id, clinic_name, clinic_email, clinic_phone, clinic_address, clinic_location, clinic_city)
  • VetClinicApplications(application_id, clinic_id, app_name, app_email, app_phone, app_city, app_address, submitted_at, app_status, reviewed_at, reviewed_by, denial_reason)
  • Animals(animal_id, owner_id, animal_name, located_name, species, animal_type, breed, sex, date_of_birth, photo_url)
  • Listings(listing_id, admin_id, animal_id, listing_status, listing_created_at, price, listing_description)
  • FavoriteListings(saved_user_id, saved_listing_id)
  • Appointments(appointment_id, clinic_id, animal_id, date_time, appointment_status, notes)
  • HealthRecords(healthrecord_id, appointment_id, animal_id, hr_type, hr_description, hr_date)
  • Reviews(review_id, reviewer_id, rating, comment, review_created_at, updated_at, is_deleted)
  • UserReviews(review_id, target_user_id)
  • ClinicReviews(review_id, target_clinic_id)

Conclusion

The normalization process confirms the same structural design obtained from the ER model in phase 2.

Last modified 11 days ago Last modified on 04/21/26 12:12:53
Note: See TracWiki for help on using the wiki.