wiki:Normalization

Normalization

Initial de-normalized relation and functional dependencies

Global set of attributes

Single unified de-normalized relation that includes all attributes from the ER model. It is important that there are no duplicate attribute names.

R = { user_id, username, first_name, last_name, password, email,

admin_id,

owner_id,

sitter_id,

pettype_id, species, average_lifespan, needs_outdoor_walk,

pet_id, pet_name, photo, age, special_needs, pet_description,

service_id, service_type, service_description,

booking_id, booking_status, date_from, date_to, address,

review_id, rating, comment,

payment_id, amount, payment_type

}

Functional dependencies

FD1: user_idusername, first_name, last_name, password, email

FD2: admin_iduser_id

FD3: owner_iduser_id

FD4: sitter_iduser_id

FD5: pettype_idspecies, average_lifespan, needs_outdoor_walk

FD6: pet_idpet_name, photo, age, special_needs, pet_description, owner_id, pettype_id

FD7: service_idservice_type, service_description

FD8: booking_idbooking_status, date_from, date_to, address, owner_id, sitter_id

FD9: review_idrating, comment, booking_id

FD10: payment_idamount, payment_type, booking_id

Candidate keys and primary key selection

Formally identify candidate keys

To determine a candidate key of the universal de-normalized relation R, we must find a minimal set of attributes whose closure (K+) determines all other attributes in the relation.

Because there are independent M:N relationships and disjoint entities that do not functionally determine each other, the candidate key must be a composite of the independent determinants.

Candidate Key K = { admin_id, owner_id, sitter_id, pet_id, service_id, booking_id, review_id, payment_id}

Closure Proof for K

Let K = { admin_id, owner_id, sitter_id, pet_id, service_id, booking_id, review_id, payment_id }

Compute K+:

  • Start: K+ = K
  • From FD2: admin_id -> user_id
    • Add: user_id
  • From FD1: user_id -> username, first_name, last_name, password, email
    • Add: username, first_name, last_name, password, email
  • From FD3 & FD4: owner_id -> user_id and sitter_id -> user_id
    • (user_id already belongs to K+)
  • From FD6: pet_id -> pet_name, photo, age, special_needs, pet_description, owner_id, pettype_id
    • Add: pet_name, photo, age, special_needs, pet_description, pettype_id
    • (owner_id already belongs to K+)
  • From FD5: pettype_id -> species, average_lifespan, needs_outdoor_walk
    • Add: species, average_lifespan, needs_outdoor_walk
  • From FD7: service_id -> service_type, service_description
    • Add: service_type, service_description
  • From FD8: booking_id -> booking_status, date_from, date_to, address, owner_id, sitter_id
    • Add: booking_status, date_from, date_to, address
    • (owner_id and sitter_id already belong to K+)
  • From FD9: review_id -> rating, comment, booking_id
    • Add: rating, comment
    • (booking_id already belongs to K+)
  • From FD10: payment_id -> amount, payment_type, booking_id
    • Add: amount, payment_type
    • (booking_id already belongs to K+)

Therefore, K+ = R, so K is a valid superkey.

K is minimal. If we remove any single attribute from K (such as removing service_id or pet_id), we entirely lose our starting point for that independent element, so its specific attributes could never be used to the closure. Which means, K is an official Candidate Key.

Primary Key Selection

We select the single composite candidate key K as the official primary key of the initial de-normalized relation:

admin_id, owner_id, sitter_id, pet_id, service_id, booking_id, review_id, payment_id

Primary Key Selection

We select the candidate key K as the primary key of the initial de-normalized relation.

Current Normal Form Status

Before decomposition, the universal relation R is in 1NF. It satisfies 1NF because all attributes contain atomic values and there are no repeating groups. However, it violates 2NF due to severe partial dependencies on the composite primary key.

Step-by-step decomposition to highest possible normal form

Step 1: Decomposition to 2NF

Relation analyzed: The universal de-normalized relation R.

Issues with higher normal forms: R violates 2NF because non-prime attributes are partially dependent on components of the composite primary key. For example, pet_name depends ONLY on pet_id, not on the entire composite key K.

Action taken: We decompose R into smaller relations by grouping attributes based on their exact determinants (eliminating partial dependencies).

Relations obtained after 2NF decomposition:

Users (user_id, username, first_name, last_name, password, email)

Admins (admin_id, user_id)

PetOwners (owner_id, user_id)

PetSitters (sitter_id, user_id)

PetTypes (pettype_id, species, average_lifespan, needs_outdoor_walk)

Pets (pet_id, pet_name, photo, age, special_needs, pet_description, owner_id, pettype_id)

Services (service_id, service_type, service_description)

Bookings (booking_id, booking_status, date_from, date_to, address, owner_id, sitter_id)

Reviews (review_id, rating, comment, booking_id)

Payments (payment_id, amount, payment_type, booking_id)

AdminManagement (admin_id, user_id)

BookingPets (booking_id, pet_id)

SitterServices (sitter_id, service_id)

BookingServices (booking_id, service_id)

Validation:

Loss-less join property: Preserved. Every split was made along a functional dependency where the common attribute became a candidate key in the new relation.

Dependency preservation: Preserved. Every functional dependency (FD1 through FD10) is fully contained within a single decomposed table.

Step 2: Decomposition to 3NF

Relations analyzed: All 14 relations obtained from the 2NF decomposition.

Issues with higher normal forms: A relation violates 3NF if there is a transitive dependency (A → B → C) where a non-prime attribute depends on another non-prime attribute.

Action taken: We inspect the 2NF relations for transitive dependencies.

In Bookings, booking_id determines owner_id, and globally owner_id determines user_id details. However, the user details are NOT stored inside Bookings; they are safely isolated in the Users table.

In Reviews, review_id determines booking_id, which globally determines the Sitter. However, sitter details are not stored inside the Reviews table.

Result: Because we did not store redundant foreign keys (like putting sitter_id inside the Reviews table), there are absolutely no transitive dependencies within the non-prime attributes of any relation.

Status: All relations are already in 3NF.

Step 3: Decomposition to BCNF

Relations analyzed: All 14 relations currently in 3NF.

Issues with higher normal forms: A relation violates Boyce-Codd Normal Form (BCNF) if a non-trivial functional dependency X → Y exists where X is not a superkey.

Action taken: We evaluate the left side (determinant) of every functional dependency within all 14 relations:

In Users, user_id is a superkey.

In Admins, admin_id is a superkey.

In PetOwners, owner_id is a superkey.

In PetSitters, sitter_id is a superkey.

In PetTypes, pettype_id is a superkey.

In Pets, pet_id is a superkey.

In Services, service_id is a superkey.

In Bookings, booking_id is a superkey.

In Reviews, review_id is a superkey.

In Payments, payment_id is a superkey.

In AdminManagement, the composite key (admin_id, user_id) is the only determinant, meaning it is trivially a superkey.

In BookingPets, the composite key (booking_id, pet_id) is the only determinant, meaning it is trivially a superkey.

In SitterServices, the composite key (sitter_id, service_id) is the only determinant, meaning it is trivially a superkey.

In BookingServices, the composite key (booking_id, service_id) is the only determinant, meaning it is trivially a superkey.

Result: In every single relation, the determinant is a candidate/superkey.

Status: All 14 relations naturally satisfy BCNF. The decomposition process is complete.

Final result and discussion

Final normalized relational design

The final database schema operates in strict BCNF with the following 14 relations:

Users (user_id, username, first_name, last_name, password, email)

Admins (admin_id, user_id)

PetOwners (owner_id, user_id)

PetSitters (sitter_id, user_id)

PetTypes (pettype_id, species, average_lifespan, needs_outdoor_walk)

Pets (pet_id, pet_name, photo, age, special_needs, pet_description, owner_id, pettype_id)

Services (service_id, service_type, service_description)

Bookings (booking_id, booking_status, date_from, date_to, address, owner_id, sitter_id)

Reviews (review_id, rating, comment, booking_id)

Payments (payment_id, amount, payment_type, booking_id)

AdminManagement (admin_id, user_id)

BookingPets (booking_id, pet_id)

SitterServices (sitter_id, service_id)

BookingServices (booking_id, service_id)

Discussion - compare with Phase 2

With completing the normalization process up to BCNF, we have proven that the resulting schema is identical to the logical design created in Phase 2.

Last modified 7 days ago Last modified on 05/19/26 09:58:05
Note: See TracWiki for help on using the wiki.