= Normalization = == Initial de-normalized relation and functional dependencies == === Global set of attributes === Single unified de-normalized relation that includes all attributes from the ER model. It is important that there are no duplicate attribute names. '''R = {''' user_id, username, first_name, last_name, password, email, admin_id, owner_id, sitter_id, pettype_id, species, average_lifespan, needs_outdoor_walk, pet_id, pet_name, photo, age, special_needs, pet_description, service_id, service_type, service_description, booking_id, booking_status, date_from, date_to, address, review_id, rating, comment, payment_id, amount, payment_type '''}''' === Functional dependencies === '''FD1:''' {{{user_id}}} → {{{username}}}, {{{first_name}}}, {{{last_name}}}, {{{password}}}, {{{email}}} '''FD2:''' {{{admin_id}}} → {{{user_id}}} '''FD3:''' {{{owner_id}}} → {{{user_id}}} '''FD4:''' {{{sitter_id}}} → {{{user_id}}} '''FD5:''' {{{pettype_id}}} → {{{species}}}, {{{average_lifespan}}}, {{{needs_outdoor_walk}}} '''FD6:''' {{{pet_id}}} → {{{pet_name}}}, {{{photo}}}, {{{age}}}, {{{special_needs}}}, {{{pet_description}}}, {{{owner_id}}}, {{{pettype_id}}} '''FD7:''' {{{service_id}}} → {{{service_type}}}, {{{service_description}}} '''FD8:''' {{{booking_id}}} → {{{booking_status}}}, {{{date_from}}}, {{{date_to}}}, {{{address}}}, {{{owner_id}}}, {{{sitter_id}}} '''FD9:''' {{{review_id}}} → {{{rating}}}, {{{comment}}}, {{{booking_id}}} '''FD10:''' {{{payment_id}}} → {{{amount}}}, {{{payment_type}}}, {{{booking_id}}} '''FD11:''' ({{{admin_id}}}, {{{user_id}}}) → ∅ (represents AdminManagement M:N) '''FD12:''' ({{{booking_id}}}, {{{pet_id}}}) → ∅ (represents BookingPets M:N) '''FD13:''' ({{{sitter_id}}}, {{{service_id}}}) → ∅ (represents SitterServices M:N) '''FD14:''' ({{{booking_id}}}, {{{service_id}}}) → ∅ (represents BookingServices M:N) == Candidate keys and primary key selection == === Determination of Candidate Keys === To find a candidate key, we must find a minimal set of attributes whose closure contains all attributes of the relation. We begin by classifying all attributes based on where they appear in the Functional Dependencies. === Attribute Classification (Left / Right Side) === || '''Attribute''' || '''Left Side''' || '''Right Side''' || '''Classification''' || || '''admin_id''' || ✓ (FD2, FD11) || ✗ || '''Left only''' || || '''pet_id''' || ✓ (FD6, FD12) || ✗ || '''Left only''' || || '''service_id''' || ✓ (FD7, FD13, FD14) || ✗ || '''Left only''' || || '''review_id''' || ✓ (FD9) || ✗ || '''Left only''' || || '''payment_id''' || ✓ (FD10) || ✗ || '''Left only''' || || {{{user_id}}} || ✓ (FD1, FD11) || ✓ (FD2, FD3, FD4) || Both || || {{{owner_id}}} || ✓ (FD3) || ✓ (FD6, FD8) || Both || || {{{sitter_id}}} || ✓ (FD4, FD13) || ✓ (FD8) || Both || || {{{pettype_id}}} || ✓ (FD5) || ✓ (FD6) || Both || || {{{booking_id}}} || ✓ (FD8, FD12, FD14) || ✓ (FD9, FD10) || Both || || {{{username}}}, {{{first_name}}}, {{{last_name}}}, {{{password}}}, {{{email}}} || ✗ || ✓ (FD1) || Right only || || {{{species}}}, {{{average_lifespan}}}, {{{needs_outdoor_walk}}} || ✗ || ✓ (FD5) || Right only || || {{{pet_name}}}, {{{photo}}}, {{{age}}}, {{{special_needs}}}, {{{pet_description}}} || ✗ || ✓ (FD6) || Right only || || {{{service_type}}}, {{{service_description}}} || ✗ || ✓ (FD7) || Right only || || {{{booking_status}}}, {{{date_from}}}, {{{date_to}}}, {{{address}}} || ✗ || ✓ (FD8) || Right only || || {{{rating}}}, {{{comment}}} || ✗ || ✓ (FD9) || Right only || || {{{amount}}}, {{{payment_type}}} || ✗ || ✓ (FD10) || Right only || === Attributes That Appear ONLY on the Left Side === According to relational theory, attributes that appear ''only'' on the left side of functional dependencies can never be derived from any other attribute. Therefore, they '''must''' be part of every candidate key: * '''admin_id''' * '''pet_id''' * '''service_id''' * '''review_id''' * '''payment_id''' === Closure Computation === '''Step 1:''' We start with our mandatory attributes {{{ {admin_id, pet_id, service_id, review_id, payment_id} }}} and compute the mathematical closure: {{{ {admin_id, pet_id, service_id, review_id, payment_id}⁺ }}}: * '''From FD2''' ({{{admin_id}}} → {{{user_id}}}): We obtain {{{user_id}}} * '''From FD1''' ({{{user_id}}} → {{{username}}}, {{{first_name}}}...): We obtain {{{username, first_name, last_name, password, email}}} * '''From FD6''' ({{{pet_id}}} → {{{pet_name}}}, {{{photo}}}... {{{owner_id}}}, {{{pettype_id}}}): We obtain {{{pet_name, photo, age, special_needs, pet_description, owner_id, pettype_id}}} * '''From FD5''' ({{{pettype_id}}} → {{{species}}}...): We obtain {{{species, average_lifespan, needs_outdoor_walk}}} * '''From FD7''' ({{{service_id}}} → {{{service_type}}}...): We obtain {{{service_type, service_description}}} * '''From FD9''' ({{{review_id}}} → {{{rating}}}, {{{comment}}}, {{{booking_id}}}): We obtain {{{rating, comment, booking_id}}} * '''From FD8''' ({{{booking_id}}} → {{{booking_status}}}... {{{sitter_id}}}): We obtain {{{booking_status, date_from, date_to, address, sitter_id}}} * '''From FD10''' ({{{payment_id}}} → {{{amount}}}, {{{payment_type}}}, {{{booking_id}}}): We obtain {{{amount, payment_type}}} ({{{booking_id}}} is already in closure) * '''From FD3 & FD4:''' {{{owner_id}}} and {{{sitter_id}}} map to {{{user_id}}}, which is already in the closure. * '''From FD11-FD14:''' All composite pairings (e.g., {{{booking_id}}} and {{{pet_id}}}) are already present in the closure, satisfying the M:N relations. '''Closure = Universal_Relation ✓''' (all 33 attributes are successfully derived) === Minimality Check === To formally prove this key is minimal, we test proper subsets of '''K = {admin_id, pet_id, service_id, review_id, payment_id}''': || '''Subset''' || '''Closure Equals Universal_Relation?''' || '''Justification''' || || '''K − {admin_id}''' || ✗ NO || Cannot derive {{{admin_id}}}. || || '''K − {pet_id}''' || ✗ NO || Cannot derive pet attributes ({{{pet_name, photo, age, special_needs, pet_description}}}). || || '''K − {service_id}''' || ✗ NO || Cannot derive service attributes ({{{service_type, service_description}}}). || || '''K − {review_id}''' || ✗ NO || Cannot derive review attributes ({{{review_id, rating, comment}}}). || || '''K − {payment_id}''' || ✗ NO || Cannot derive payment attributes ({{{payment_id, amount, payment_type}}}). || '''Conclusion:''' K = {admin_id, pet_id, service_id, review_id, payment_id} is minimal and is the strictly mathematically correct candidate key. === Choice of Primary Key === '''Chosen Primary Key:''' {{{ {admin_id, pet_id, service_id, review_id, payment_id} }}} '''Justification:''' This composite key is the officially proven minimal Candidate Key. Attempting to include other logical identifiers (such as {{{owner_id}}}, {{{sitter_id}}}, or {{{booking_id}}}) would violate the rule of minimality, because the closure proof demonstrates those attributes are functionally dependent on the core transactional IDs. === Current Normal Form Status === Before decomposition, the universal relation '''R''' is in '''1NF'''. It satisfies 1NF because all attributes contain atomic values and there are no repeating groups. However, it violates 2NF due to severe partial dependencies on the composite primary key. == Step-by-step decomposition to highest possible normal form == === 1. Decomposition to 2NF === '''Relation analyzed:''' The universal de-normalized relation '''R'''. '''Issues with higher normal forms:''' R violates 2NF because non-prime attributes are partially dependent on components of the composite primary key. For example, {{{pet_name}}} depends ONLY on {{{pet_id}}}, not on the entire composite key '''K'''. '''Action taken:''' We decompose R into smaller relations by grouping attributes based on their exact determinants (eliminating partial dependencies). '''Relations obtained after 2NF decomposition:''' '''Users''' ({{{user_id}}}, {{{username}}}, {{{first_name}}}, {{{last_name}}}, {{{password}}}, {{{email}}}) '''Admins''' ({{{admin_id}}}, {{{user_id}}}) '''PetOwners''' ({{{owner_id}}}, {{{user_id}}}) '''PetSitters''' ({{{sitter_id}}}, {{{user_id}}}) '''PetTypes''' ({{{pettype_id}}}, {{{species}}}, {{{average_lifespan}}}, {{{needs_outdoor_walk}}}) '''Pets''' ({{{pet_id}}}, {{{pet_name}}}, {{{photo}}}, {{{age}}}, {{{special_needs}}}, {{{pet_description}}}, {{{owner_id}}}, {{{pettype_id}}}) '''Services''' ({{{service_id}}}, {{{service_type}}}, {{{service_description}}}) '''Bookings''' ({{{booking_id}}}, {{{booking_status}}}, {{{date_from}}}, {{{date_to}}}, {{{address}}}, {{{owner_id}}}, {{{sitter_id}}}) '''Reviews''' ({{{review_id}}}, {{{rating}}}, {{{comment}}}, {{{booking_id}}}) '''Payments''' ({{{payment_id}}}, {{{amount}}}, {{{payment_type}}}, {{{booking_id}}}) '''AdminManagement''' ({{{admin_id}}}, {{{user_id}}}) '''BookingPets''' ({{{booking_id}}}, {{{pet_id}}}) '''SitterServices''' ({{{sitter_id}}}, {{{service_id}}}) '''BookingServices''' ({{{booking_id}}}, {{{service_id}}}) '''Validation of 2NF Decomposition:''' '''1. Loss-less Join Property:''' By definition (Heath's Theorem), a decomposition of relation ''R'' into ''R1'' and ''R2'' is lossless if the common attributes (''R1'' ∩ ''R2'') form a superkey for either ''R1'' or ''R2''. Our 14 relations were not created in a single split, but through '''successive binary decompositions'''. * For example, in the first step, ''R'' was decomposed into ''R_Users'' and ''R_Remainder'' based on FD1. The intersection of these two relations is {{{user_id}}}. Because {{{user_id}}} is the primary key of ''R_Users'', the functional dependency {{{user_id → username, first_name, last_name, password, email}}} is preserved, proving this specific binary split is lossless. * In another step, the remainder was split into ''R_Pets'' and a new remainder based on FD6. The intersection is {{{pet_id}}}. Since {{{pet_id}}} is the primary key of ''R_Pets'', this split is also lossless. * By recursively applying this exact binary splitting logic along the boundaries of FD2 through FD14, every single decomposition step shared a primary key with the remainder relation. Therefore, the final 14-table schema is strictly lossless. '''2. Dependency Preservation:''' A decomposition is dependency preserving if the union of the functional dependencies in the new tables is equivalent to the original set of FDs (FD1 to FD14). Because we explicitly created our relations based on the exact determinants of our FDs, every single functional dependency is fully localized within a single table. * For example, FD5 ({{{pettype_id → species, average_lifespan, needs_outdoor_walk}}}) can be completely verified within the new '''PetTypes''' table without needing to perform a JOIN operation with any other table. * Similarly, FD7 ({{{service_id → service_type, service_description}}}) is entirely contained within the '''Services''' table. Since all 14 original FDs are preserved natively inside the 14 new relations, the decomposition is strictly dependency preserving. === Step 2: Decomposition to 3NF === '''Relations analyzed:''' All 14 relations obtained from the 2NF decomposition. '''Issues with higher normal forms:''' A relation violates 3NF if there is a transitive dependency (A → B → C) where a non-prime attribute determines another non-prime attribute. '''Action taken:''' We inspect the 2NF relations to ensure that all non-key attributes depend ''only'' on the primary key, and not on any other non-key attributes. We specifically target tables containing Foreign Keys, as this is where transitive chains can be present. * '''In Bookings:''' While {{{booking_id}}} determines {{{owner_id}}}, we must check if {{{owner_id}}} (a non-prime attribute in this table) determines any other non-prime attributes ''within'' this table. It does not. The descriptive user details for the owner are safely isolated in the '''Users''' table. * '''In Reviews:''' While {{{review_id}}} determines {{{booking_id}}}, we must check if {{{booking_id}}} (a non-prime attribute in this table) determines any other non-prime attributes ''within'' this table (such as the sitter's details). It does not. Those details are isolated in their respective tables. * '''In Base Entities (Users, PetTypes, Services):''' All non-prime attributes logically depend directly on the primary key (example {{{species}}} depends only on {{{pettype_id}}}), making transitive chains mathematically impossible. * '''In Junction Tables (BookingPets, AdminManagement):''' These relations consist purely of composite primary keys with zero non-prime attributes, automatically satisfying 3NF. '''Result:''' Because we did not store redundant, descriptive data alongside our foreign keys, there are no instances where a non-prime attribute determines another non-prime attribute within the same relation. '''Status:''' All 14 relations naturally satisfy '''3NF'''. === 3. Decomposition to BCNF === '''Relations analyzed:''' All 14 relations currently in 3NF. '''Issues with higher normal forms:''' A relation violates Boyce-Codd Normal Form (BCNF) if a non-trivial functional dependency X → Y exists where X is not a superkey. '''Action taken:''' We evaluate the left side (determinant) of every functional dependency within all 14 relations: In '''Users''', {{{user_id}}} is a superkey. In '''Admins''', {{{admin_id}}} is a superkey. In '''PetOwners''', {{{owner_id}}} is a superkey. In '''PetSitters''', {{{sitter_id}}} is a superkey. In '''PetTypes''', {{{pettype_id}}} is a superkey. In '''Pets''', {{{pet_id}}} is a superkey. In '''Services''', {{{service_id}}} is a superkey. In '''Bookings''', {{{booking_id}}} is a superkey. In '''Reviews''', {{{review_id}}} is a superkey. In '''Payments''', {{{payment_id}}} is a superkey. In '''AdminManagement''', the composite key ({{{admin_id}}}, {{{user_id}}}) is the only determinant, meaning it is trivially a superkey. In '''BookingPets''', the composite key ({{{booking_id}}}, {{{pet_id}}}) is the only determinant, meaning it is trivially a superkey. In '''SitterServices''', the composite key ({{{sitter_id}}}, {{{service_id}}}) is the only determinant, meaning it is trivially a superkey. In '''BookingServices''', the composite key ({{{booking_id}}}, {{{service_id}}}) is the only determinant, meaning it is trivially a superkey. '''Result:''' In every single relation, the determinant is a candidate/superkey. '''Status:''' All 14 relations naturally satisfy '''BCNF'''. The decomposition process is complete. == Final result and discussion == === Final normalized relational design === The final database schema operates in strict BCNF with the following 14 relations: '''Users''' ({{{user_id}}}, {{{username}}}, {{{first_name}}}, {{{last_name}}}, {{{password}}}, {{{email}}}) '''Admins''' ({{{admin_id}}}, {{{user_id}}}) '''PetOwners''' ({{{owner_id}}}, {{{user_id}}}) '''PetSitters''' ({{{sitter_id}}}, {{{user_id}}}) '''PetTypes''' ({{{pettype_id}}}, {{{species}}}, {{{average_lifespan}}}, {{{needs_outdoor_walk}}}) '''Pets''' ({{{pet_id}}}, {{{pet_name}}}, {{{photo}}}, {{{age}}}, {{{special_needs}}}, {{{pet_description}}}, {{{owner_id}}}, {{{pettype_id}}}) '''Services''' ({{{service_id}}}, {{{service_type}}}, {{{service_description}}}) '''Bookings''' ({{{booking_id}}}, {{{booking_status}}}, {{{date_from}}}, {{{date_to}}}, {{{address}}}, {{{owner_id}}}, {{{sitter_id}}}) '''Reviews''' ({{{review_id}}}, {{{rating}}}, {{{comment}}}, {{{booking_id}}}) '''Payments''' ({{{payment_id}}}, {{{amount}}}, {{{payment_type}}}, {{{booking_id}}}) '''AdminManagement''' ({{{admin_id}}}, {{{user_id}}}) '''BookingPets''' ({{{booking_id}}}, {{{pet_id}}}) '''SitterServices''' ({{{sitter_id}}}, {{{service_id}}}) '''BookingServices''' ({{{booking_id}}}, {{{service_id}}}) === Discussion - compare with Phase 2 === With completing the normalization process up to BCNF, we have proven that the resulting schema is identical to the logical design created in Phase 2.