| Version 5 (modified by , 7 hours ago) ( diff ) |
|---|
Normalization
Initial de-normalized relation and functional dependencies
Global set of attributes
Single unified de-normalized relation that includes all attributes from the ER model. It is important that there are no duplicate attribute names.
R = { user_id, username, first_name, last_name, password, email,
admin_id,
owner_id,
sitter_id,
pettype_id, species, average_lifespan, needs_outdoor_walk,
pet_id, pet_name, photo, age, special_needs, pet_description,
service_id, service_type, service_description,
booking_id, booking_status, date_from, date_to, address,
review_id, rating, comment,
payment_id, amount, payment_type
}
Functional dependencies
FD1: user_id → username, first_name, last_name, password, email
FD2: admin_id → user_id
FD3: owner_id → user_id
FD4: sitter_id → user_id
FD5: pettype_id → species, average_lifespan, needs_outdoor_walk
FD6: pet_id → pet_name, photo, age, special_needs, pet_description, owner_id, pettype_id
FD7: service_id → service_type, service_description
FD8: booking_id → booking_status, date_from, date_to, address, owner_id, sitter_id
FD9: review_id → rating, comment, booking_id
FD10: payment_id → amount, payment_type, booking_id
FD11: (admin_id, user_id) → ∅ (represents AdminManagement M:N)
FD12: (booking_id, pet_id) → ∅ (represents BookingPets M:N)
FD13: (sitter_id, service_id) → ∅ (represents SitterServices M:N)
FD14: (booking_id, service_id) → ∅ (represents BookingServices M:N)
Candidate keys and primary key selection
Determination of Candidate Keys
To find a candidate key, we must find a minimal set of attributes whose closure contains all attributes of the relation. We begin by classifying all attributes based on where they appear in the Functional Dependencies.
Attribute Classification (Left / Right Side)
| Attribute | Left Side | Right Side | Classification |
| admin_id | ✓ (FD2, FD11) | ✗ | Left only |
| pet_id | ✓ (FD6, FD12) | ✗ | Left only |
| service_id | ✓ (FD7, FD13, FD14) | ✗ | Left only |
| review_id | ✓ (FD9) | ✗ | Left only |
| payment_id | ✓ (FD10) | ✗ | Left only |
user_id | ✓ (FD1, FD11) | ✓ (FD2, FD3, FD4) | Both |
owner_id | ✓ (FD3) | ✓ (FD6, FD8) | Both |
sitter_id | ✓ (FD4, FD13) | ✓ (FD8) | Both |
pettype_id | ✓ (FD5) | ✓ (FD6) | Both |
booking_id | ✓ (FD8, FD12, FD14) | ✓ (FD9, FD10) | Both |
username, first_name, last_name, password, email | ✗ | ✓ (FD1) | Right only |
species, average_lifespan, needs_outdoor_walk | ✗ | ✓ (FD5) | Right only |
pet_name, photo, age, special_needs, pet_description | ✗ | ✓ (FD6) | Right only |
service_type, service_description | ✗ | ✓ (FD7) | Right only |
booking_status, date_from, date_to, address | ✗ | ✓ (FD8) | Right only |
rating, comment | ✗ | ✓ (FD9) | Right only |
amount, payment_type | ✗ | ✓ (FD10) | Right only |
Attributes That Appear ONLY on the Left Side
According to relational theory, attributes that appear only on the left side of functional dependencies can never be derived from any other attribute. Therefore, they must be part of every candidate key:
- admin_id
- pet_id
- service_id
- review_id
- payment_id
Closure Computation
Step 1: We start with our mandatory attributes {admin_id, pet_id, service_id, review_id, payment_id} and compute the mathematical closure:
{admin_id, pet_id, service_id, review_id, payment_id}⁺ :
- From FD2 (
admin_id→user_id): We obtainuser_id - From FD1 (
user_id→username,first_name...): We obtainusername, first_name, last_name, password, email - From FD6 (
pet_id→pet_name,photo...owner_id,pettype_id): We obtainpet_name, photo, age, special_needs, pet_description, owner_id, pettype_id - From FD5 (
pettype_id→species...): We obtainspecies, average_lifespan, needs_outdoor_walk - From FD7 (
service_id→service_type...): We obtainservice_type, service_description - From FD9 (
review_id→rating,comment,booking_id): We obtainrating, comment, booking_id - From FD8 (
booking_id→booking_status...sitter_id): We obtainbooking_status, date_from, date_to, address, sitter_id - From FD10 (
payment_id→amount,payment_type,booking_id): We obtainamount, payment_type(booking_idis already in closure) - From FD3 & FD4:
owner_idandsitter_idmap touser_id, which is already in the closure. - From FD11-FD14: All composite pairings (e.g.,
booking_idandpet_id) are already present in the closure, satisfying the M:N relations.
Closure = Universal_Relation ✓ (all 33 attributes are successfully derived)
Minimality Check
To formally prove this key is minimal, we test proper subsets of K = {admin_id, pet_id, service_id, review_id, payment_id}:
| Subset | Closure Equals Universal_Relation? | Justification |
| K − {admin_id} | ✗ NO | Cannot derive admin_id.
|
| K − {pet_id} | ✗ NO | Cannot derive pet attributes (pet_name, photo, age, special_needs, pet_description).
|
| K − {service_id} | ✗ NO | Cannot derive service attributes (service_type, service_description).
|
| K − {review_id} | ✗ NO | Cannot derive review attributes (review_id, rating, comment).
|
| K − {payment_id} | ✗ NO | Cannot derive payment attributes (payment_id, amount, payment_type).
|
Conclusion: K = {admin_id, pet_id, service_id, review_id, payment_id} is minimal and is the strictly mathematically correct candidate key.
Choice of Primary Key
Chosen Primary Key: {admin_id, pet_id, service_id, review_id, payment_id}
Justification: This composite key is the officially proven minimal Candidate Key. Attempting to include other logical identifiers (such as owner_id, sitter_id, or booking_id) would violate the rule of minimality, because the closure proof demonstrates those attributes are functionally dependent on the core transactional IDs.
Current Normal Form Status
Before decomposition, the universal relation R is in 1NF. It satisfies 1NF because all attributes contain atomic values and there are no repeating groups. However, it violates 2NF due to severe partial dependencies on the composite primary key.
Step-by-step decomposition to highest possible normal form
1. Decomposition to 2NF
Relation analyzed: The universal de-normalized relation R.
Issues with higher normal forms: R violates 2NF because non-prime attributes are partially dependent on components of the composite primary key. For example, pet_name depends ONLY on pet_id, not on the entire composite key K.
Action taken: We decompose R into smaller relations by grouping attributes based on their exact determinants (eliminating partial dependencies).
Relations obtained after 2NF decomposition:
Users (user_id, username, first_name, last_name, password, email)
Admins (admin_id, user_id)
PetOwners (owner_id, user_id)
PetSitters (sitter_id, user_id)
PetTypes (pettype_id, species, average_lifespan, needs_outdoor_walk)
Pets (pet_id, pet_name, photo, age, special_needs, pet_description, owner_id, pettype_id)
Services (service_id, service_type, service_description)
Bookings (booking_id, booking_status, date_from, date_to, address, owner_id, sitter_id)
Reviews (review_id, rating, comment, booking_id)
Payments (payment_id, amount, payment_type, booking_id)
AdminManagement (admin_id, user_id)
BookingPets (booking_id, pet_id)
SitterServices (sitter_id, service_id)
BookingServices (booking_id, service_id)
Validation of 2NF Decomposition:
1. Loss-less Join Property: By definition (Heath's Theorem), a decomposition of relation R into R1 and R2 is lossless if the common attributes (R1 ∩ R2) form a superkey for either R1 or R2. Our 14 relations were not created in a single split, but through successive binary decompositions.
- For example, in the first step, R was decomposed into R_Users and R_Remainder based on FD1. The intersection of these two relations is
user_id. Becauseuser_idis the primary key of R_Users, the functional dependencyuser_id → username, first_name, last_name, password, emailis preserved, proving this specific binary split is lossless. - In another step, the remainder was split into R_Pets and a new remainder based on FD6. The intersection is
pet_id. Sincepet_idis the primary key of R_Pets, this split is also lossless. - By recursively applying this exact binary splitting logic along the boundaries of FD2 through FD14, every single decomposition step shared a primary key with the remainder relation. Therefore, the final 14-table schema is strictly lossless.
2. Dependency Preservation: A decomposition is dependency preserving if the union of the functional dependencies in the new tables is equivalent to the original set of FDs (FD1 to FD14). Because we explicitly created our relations based on the exact determinants of our FDs, every single functional dependency is fully localized within a single table.
- For example, FD5 (
pettype_id → species, average_lifespan, needs_outdoor_walk) can be completely verified within the new PetTypes table without needing to perform a JOIN operation with any other table. - Similarly, FD7 (
service_id → service_type, service_description) is entirely contained within the Services table.
Since all 14 original FDs are preserved natively inside the 14 new relations, the decomposition is strictly dependency preserving.
Step 2: Decomposition to 3NF
Relations analyzed: All 14 relations obtained from the 2NF decomposition.
Issues with higher normal forms: A relation violates 3NF if there is a transitive dependency (A → B → C) where a non-prime attribute determines another non-prime attribute.
Action taken: We inspect the 2NF relations to ensure that all non-key attributes depend only on the primary key, and not on any other non-key attributes. We specifically target tables containing Foreign Keys, as this is where transitive chains can be present.
- In Bookings: While
booking_iddeterminesowner_id, we must check ifowner_id(a non-prime attribute in this table) determines any other non-prime attributes within this table. It does not. The descriptive user details for the owner are safely isolated in the Users table. - In Reviews: While
review_iddeterminesbooking_id, we must check ifbooking_id(a non-prime attribute in this table) determines any other non-prime attributes within this table (such as the sitter's details). It does not. Those details are isolated in their respective tables. - In Base Entities (Users, PetTypes, Services): All non-prime attributes logically depend directly on the primary key (example
speciesdepends only onpettype_id), making transitive chains mathematically impossible. - In Junction Tables (BookingPets, AdminManagement): These relations consist purely of composite primary keys with zero non-prime attributes, automatically satisfying 3NF.
Result: Because we did not store redundant, descriptive data alongside our foreign keys, there are no instances where a non-prime attribute determines another non-prime attribute within the same relation.
Status: All 14 relations naturally satisfy 3NF.
3. Decomposition to BCNF
Relations analyzed: All 14 relations currently in 3NF.
Issues with higher normal forms: A relation violates Boyce-Codd Normal Form (BCNF) if a non-trivial functional dependency X → Y exists where X is not a superkey.
Action taken: We evaluate the left side (determinant) of every functional dependency within all 14 relations:
In Users, user_id is a superkey.
In Admins, admin_id is a superkey.
In PetOwners, owner_id is a superkey.
In PetSitters, sitter_id is a superkey.
In PetTypes, pettype_id is a superkey.
In Pets, pet_id is a superkey.
In Services, service_id is a superkey.
In Bookings, booking_id is a superkey.
In Reviews, review_id is a superkey.
In Payments, payment_id is a superkey.
In AdminManagement, the composite key (admin_id, user_id) is the only determinant, meaning it is trivially a superkey.
In BookingPets, the composite key (booking_id, pet_id) is the only determinant, meaning it is trivially a superkey.
In SitterServices, the composite key (sitter_id, service_id) is the only determinant, meaning it is trivially a superkey.
In BookingServices, the composite key (booking_id, service_id) is the only determinant, meaning it is trivially a superkey.
Result: In every single relation, the determinant is a candidate/superkey.
Status: All 14 relations naturally satisfy BCNF. The decomposition process is complete.
Final result and discussion
Final normalized relational design
The final database schema operates in strict BCNF with the following 14 relations:
Users (user_id, username, first_name, last_name, password, email)
Admins (admin_id, user_id)
PetOwners (owner_id, user_id)
PetSitters (sitter_id, user_id)
PetTypes (pettype_id, species, average_lifespan, needs_outdoor_walk)
Pets (pet_id, pet_name, photo, age, special_needs, pet_description, owner_id, pettype_id)
Services (service_id, service_type, service_description)
Bookings (booking_id, booking_status, date_from, date_to, address, owner_id, sitter_id)
Reviews (review_id, rating, comment, booking_id)
Payments (payment_id, amount, payment_type, booking_id)
AdminManagement (admin_id, user_id)
BookingPets (booking_id, pet_id)
SitterServices (sitter_id, service_id)
BookingServices (booking_id, service_id)
Discussion - compare with Phase 2
With completing the normalization process up to BCNF, we have proven that the resulting schema is identical to the logical design created in Phase 2.
