Normalization
Initial de-normalized relation and functional dependencies
Global set of attributes
Single unified de-normalized relation that includes all attributes from the ER model. It is important that there are no duplicate attribute names.
R = { user_id, username, first_name, last_name, password, email,
admin_id,
owner_id,
sitter_id,
pettype_id, species, average_lifespan, needs_outdoor_walk,
pet_id, pet_name, photo, age, special_needs, pet_description,
service_id, service_type, service_description,
booking_id, booking_status, date_from, date_to, address,
review_id, rating, comment,
payment_id, amount, payment_type
}
Functional dependencies
FD1: user_id → username, first_name, last_name, password, email
FD2: admin_id → user_id
FD3: owner_id → user_id
FD4: sitter_id → user_id
FD5: pettype_id → species, average_lifespan, needs_outdoor_walk
FD6: pet_id → pet_name, photo, age, special_needs, pet_description, owner_id, pettype_id
FD7: service_id → service_type, service_description
FD8: booking_id → booking_status, date_from, date_to, address, owner_id, sitter_id
FD9: review_id → rating, comment, booking_id
FD10: payment_id → amount, payment_type, booking_id
Candidate keys and primary key selection
Formally identify candidate keys
To determine a candidate key of the universal de-normalized relation R, we must find a minimal set of attributes whose closure (K+) determines all other attributes in the relation.
Because there are independent M:N relationships and disjoint entities that do not functionally determine each other, the candidate key must be a composite of the independent determinants.
Candidate Key K = { admin_id, owner_id, sitter_id, pet_id, service_id, booking_id, review_id, payment_id}
Closure Proof for K
Let K = { admin_id, owner_id, sitter_id, pet_id, service_id, booking_id, review_id, payment_id }
Compute K+:
- Start: K+ = K
- From FD2: admin_id -> user_id
- Add: user_id
- From FD1: user_id -> username, first_name, last_name, password, email
- Add: username, first_name, last_name, password, email
- From FD3 & FD4: owner_id -> user_id and sitter_id -> user_id
- (user_id already belongs to K+)
- From FD6: pet_id -> pet_name, photo, age, special_needs, pet_description, owner_id, pettype_id
- Add: pet_name, photo, age, special_needs, pet_description, pettype_id
- (owner_id already belongs to K+)
- From FD5: pettype_id -> species, average_lifespan, needs_outdoor_walk
- Add: species, average_lifespan, needs_outdoor_walk
- From FD7: service_id -> service_type, service_description
- Add: service_type, service_description
- From FD8: booking_id -> booking_status, date_from, date_to, address, owner_id, sitter_id
- Add: booking_status, date_from, date_to, address
- (owner_id and sitter_id already belong to K+)
- From FD9: review_id -> rating, comment, booking_id
- Add: rating, comment
- (booking_id already belongs to K+)
- From FD10: payment_id -> amount, payment_type, booking_id
- Add: amount, payment_type
- (booking_id already belongs to K+)
Therefore, K+ = R, so K is a valid superkey.
K is minimal. If we remove any single attribute from K (such as removing service_id or pet_id), we entirely lose our starting point for that independent element, so its specific attributes could never be used to the closure. Which means, K is an official Candidate Key.
Primary Key Selection
We select the single composite candidate key K as the official primary key of the initial de-normalized relation:
admin_id, owner_id, sitter_id, pet_id, service_id, booking_id, review_id, payment_id
Primary Key Selection
We select the candidate key K as the primary key of the initial de-normalized relation.
Current Normal Form Status
Before decomposition, the universal relation R is in 1NF. It satisfies 1NF because all attributes contain atomic values and there are no repeating groups. However, it violates 2NF due to severe partial dependencies on the composite primary key.
Step-by-step decomposition to highest possible normal form
Step 1: Decomposition to 2NF
Relation analyzed: The universal de-normalized relation R.
Issues with higher normal forms: R violates 2NF because non-prime attributes are partially dependent on components of the composite primary key. For example, pet_name depends ONLY on pet_id, not on the entire composite key K.
Action taken: We decompose R into smaller relations by grouping attributes based on their exact determinants (eliminating partial dependencies).
Relations obtained after 2NF decomposition:
Users (user_id, username, first_name, last_name, password, email)
Admins (admin_id, user_id)
PetOwners (owner_id, user_id)
PetSitters (sitter_id, user_id)
PetTypes (pettype_id, species, average_lifespan, needs_outdoor_walk)
Pets (pet_id, pet_name, photo, age, special_needs, pet_description, owner_id, pettype_id)
Services (service_id, service_type, service_description)
Bookings (booking_id, booking_status, date_from, date_to, address, owner_id, sitter_id)
Reviews (review_id, rating, comment, booking_id)
Payments (payment_id, amount, payment_type, booking_id)
AdminManagement (admin_id, user_id)
BookingPets (booking_id, pet_id)
SitterServices (sitter_id, service_id)
BookingServices (booking_id, service_id)
Validation:
Loss-less join property: Preserved. Every split was made along a functional dependency where the common attribute became a candidate key in the new relation.
Dependency preservation: Preserved. Every functional dependency (FD1 through FD10) is fully contained within a single decomposed table.
Step 2: Decomposition to 3NF
Relations analyzed: All 14 relations obtained from the 2NF decomposition.
Issues with higher normal forms: A relation violates 3NF if there is a transitive dependency (A → B → C) where a non-prime attribute depends on another non-prime attribute.
Action taken: We inspect the 2NF relations for transitive dependencies.
In Bookings, booking_id determines owner_id, and globally owner_id determines user_id details. However, the user details are NOT stored inside Bookings; they are safely isolated in the Users table.
In Reviews, review_id determines booking_id, which globally determines the Sitter. However, sitter details are not stored inside the Reviews table.
Result: Because we did not store redundant foreign keys (like putting sitter_id inside the Reviews table), there are absolutely no transitive dependencies within the non-prime attributes of any relation.
Status: All relations are already in 3NF.
Step 3: Decomposition to BCNF
Relations analyzed: All 14 relations currently in 3NF.
Issues with higher normal forms: A relation violates Boyce-Codd Normal Form (BCNF) if a non-trivial functional dependency X → Y exists where X is not a superkey.
Action taken: We evaluate the left side (determinant) of every functional dependency within all 14 relations:
In Users, user_id is a superkey.
In Admins, admin_id is a superkey.
In PetOwners, owner_id is a superkey.
In PetSitters, sitter_id is a superkey.
In PetTypes, pettype_id is a superkey.
In Pets, pet_id is a superkey.
In Services, service_id is a superkey.
In Bookings, booking_id is a superkey.
In Reviews, review_id is a superkey.
In Payments, payment_id is a superkey.
In AdminManagement, the composite key (admin_id, user_id) is the only determinant, meaning it is trivially a superkey.
In BookingPets, the composite key (booking_id, pet_id) is the only determinant, meaning it is trivially a superkey.
In SitterServices, the composite key (sitter_id, service_id) is the only determinant, meaning it is trivially a superkey.
In BookingServices, the composite key (booking_id, service_id) is the only determinant, meaning it is trivially a superkey.
Result: In every single relation, the determinant is a candidate/superkey.
Status: All 14 relations naturally satisfy BCNF. The decomposition process is complete.
Final result and discussion
Final normalized relational design
The final database schema operates in strict BCNF with the following 14 relations:
Users (user_id, username, first_name, last_name, password, email)
Admins (admin_id, user_id)
PetOwners (owner_id, user_id)
PetSitters (sitter_id, user_id)
PetTypes (pettype_id, species, average_lifespan, needs_outdoor_walk)
Pets (pet_id, pet_name, photo, age, special_needs, pet_description, owner_id, pettype_id)
Services (service_id, service_type, service_description)
Bookings (booking_id, booking_status, date_from, date_to, address, owner_id, sitter_id)
Reviews (review_id, rating, comment, booking_id)
Payments (payment_id, amount, payment_type, booking_id)
AdminManagement (admin_id, user_id)
BookingPets (booking_id, pet_id)
SitterServices (sitter_id, service_id)
BookingServices (booking_id, service_id)
Discussion - compare with Phase 2
With completing the normalization process up to BCNF, we have proven that the resulting schema is identical to the logical design created in Phase 2.
