= Phase P5: Normalization = == Initial de-normalized relation and functional dependencies == **Universal Relation R** All attributes from the ER model are merged into one flat relation. Attribute names are changed where the same concept appears in multiple entities. {{{ R( admin_id, admin_name, admin_email, admin_password, architect_id, architect_name, building_id, building_name, building_address, building_description, floor_id, floor_number, layout_image, unit_id, unit_number, room_number, floor_area, unit_status, unit_price, unit_image, unit_floorplan, vector_image, agent_id, agent_name, agent_email, agent_password, client_id, client_name, client_email, client_phone, timeslot_id, ts_date, ts_time_start, ts_time_end, ts_status, inquiry_id, inquiry_message, inquiry_status, inquiry_created_at, appointment_id, appt_status ) }}} === Canonical Cover === {{{ FD01: admin_id → admin_name, admin_email, admin_password FD02: admin_email → admin_id FD03: architect_id → architect_name FD04: building_id → building_name, building_address, building_description, admin_id FD05: floor_id → floor_number, layout_image, building_id FD06: (building_id, floor_number) → floor_id FD07: unit_id → unit_number, room_number, floor_area, unit_status, unit_price, unit_image, unit_floorplan, vector_image, floor_id FD08: (floor_id, unit_number) → unit_id FD09: agent_id → agent_name, agent_email, agent_password FD10: agent_email → agent_id FD11: client_id → client_name, client_email, client_phone FD12: client_email → client_id FD13: timeslot_id → ts_date, ts_time_start, ts_time_end, ts_status, agent_id FD14: (agent_id, ts_date, ts_time_start) → timeslot_id FD15: inquiry_id → inquiry_message, inquiry_status, inquiry_created_at, unit_id, client_id, agent_id FD16: appointment_id → appt_status, client_id, unit_id, timeslot_id FD17: timeslot_id → appointment_id }}} ---- == Candidate Keys & Primary Key of R == === Identifying the Candidate Key === A candidate key must functionally determine all attributes in R and be minimal. **Candidate Keys:** * `{architect_id, inquiry_id, appointment_id}` * `{architect_id, inquiry_id, timeslot_id}` **Primary Key:** `{architect_id, inquiry_id, appointment_id}` === Normal Form of R Before Decomposition === R is in 1NF but not in 2NF. * **1NF**: All attributes hold atomic scalar values, and there are no repeating groups. Every row is uniquely identified by the PK. '''R is in 1NF.''' * **2NF**: A relation is in 2NF if it is in 1NF and every non‑key attribute is fully functionally dependent on the entire primary key. '''R does not conform to 2NF.''' **Example of a 2NF violation:** `architect_name` is determined by `architect_id` (FD03), which is only a proper subset of the primary key `{architect_id, inquiry_id, appointment_id}`. If an architect is associated with multiple appointments, their name is repeated unnecessarily which is a clear partial dependency. ---- == Step-by-step Decomposition == === Decompose from 1NF to 2NF === ==== 1.1 Extract `architect_id → architect_name` (FD03) ==== * **FD causing issue:** FD03, where `architect_id` ⊂ PK. * **New relation:** {{{ R_Architect( architect_id PK, architect_name ) FDs: FD03 (architect_id → architect_name) Candidate key / PK: {architect_id} }}} * **Remaining relation R1:** R minus `architect_name`, with PK unchanged `{architect_id, inquiry_id, appointment_id}` and FK `architect_id → R_Architect`. * **Lossless:** Yes – `architect_id` is the key of `R_Architect`, natural join on `architect_id` reconstructs R. * **Dependency preservation:** FD03 is wholly inside `R_Architect`; all other FDs remain in R1. ==== 1.2 Extract `inquiry_id → inquiry_message, inquiry_status, inquiry_created_at, unit_id, client_id, agent_id` (FD15) ==== * **FD causing issue:** FD15, `inquiry_id` ⊂ PK. * **From R1**, extract: {{{ R_Inquiry( inquiry_id PK, inquiry_message, inquiry_status, inquiry_created_at, unit_id, client_id, agent_id ) FDs: FD15 (inquiry_id → inquiry_message, inquiry_status, inquiry_created_at, unit_id, client_id, agent_id) Candidate key / PK: {inquiry_id} }}} * **Remaining relation R2:** R1 minus the dependent attributes {inquiry_message, inquiry_status, inquiry_created_at, unit_id, client_id, agent_id}, PK unchanged, FK `inquiry_id → R_Inquiry`. * **Lossless:** Yes – `inquiry_id` is key of `R_Inquiry`. * **Dependency preservation:** FD15 is in `R_Inquiry`. ==== 1.3 Extract `appointment_id → appt_status, client_id, unit_id, timeslot_id` (FD16) ==== * **FD causing issue:** FD16, `appointment_id` ⊂ PK. * **From R2**, extract: {{{ R_Appointment( appointment_id PK, appt_status, client_id, unit_id, timeslot_id ) FDs: FD16 (appointment_id → appt_status, client_id, unit_id, timeslot_id) FD17 (timeslot_id → appointment_id) Candidate keys: {appointment_id}, {timeslot_id} PK: {appointment_id} }}} * **Remaining relation R3:** R2 minus the dependent attributes {appt_status, client_id, unit_id, timeslot_id}, PK unchanged, FK `appointment_id → R_Appointment`. Note: `timeslot_id` moves entirely into R_Appointment and does NOT remain in R3; it is a dependent attribute of appointment_id, not a property of the residual relation. * **Lossless:** Yes – `appointment_id` is key of `R_Appointment`. * **Dependency preservation:** FD16 and FD17 are in `R_Appointment`. ==== 1.4 Extract `timeslot_id → ts_date, ts_time_start, ts_time_end, ts_status, agent_id` (FD13) ==== `timeslot_id` is the PK of R_Appointment, and FD13 shows it determines further attributes (ts_date, ts_time_start, ts_time_end, ts_status, agent_id) that were placed in R_Appointment in step 1.3 only as part of the dependent set of FD16. Those timeslot attributes are themselves determined by timeslot_id via FD13 — a partial dependency on timeslot_id (a candidate key of R_Appointment, not the chosen PK) — so they must be extracted into their own relation. * **From R_Appointment**, extract: {{{ R_Timeslot( timeslot_id PK, ts_date, ts_time_start, ts_time_end, ts_status, agent_id ) FDs: FD13 (timeslot_id → ts_date, ts_time_start, ts_time_end, ts_status, agent_id) FD14 ((agent_id, ts_date, ts_time_start) → timeslot_id) Candidate keys: {timeslot_id}, {agent_id, ts_date, ts_time_start} PK: {timeslot_id} }}} * **R_Appointment updated:** timeslot attributes removed; timeslot_id is retained as FK → R_Timeslot. R_Appointment now holds: {{{ R_Appointment( appointment_id PK, appt_status, client_id, unit_id, timeslot_id FK → R_Timeslot ) }}} * **Lossless:** Yes – `timeslot_id` is key of `R_Timeslot`. * **Dependency preservation:** FD13 and FD14 are in `R_Timeslot`. ==== 1.5 Extract `unit_id → unit_number, room_number, floor_area, unit_status, unit_price, unit_image, unit_floorplan, vector_image, floor_id` (FD07) ==== In R3, `unit_id` is a non-key attribute (it depends on `inquiry_id` and `appointment_id`, both proper subsets of the PK). Its dependent attributes cause a 2NF violation. * **From R3**, extract: {{{ R_Unit( unit_id PK, unit_number, room_number, floor_area, unit_status, unit_price, unit_image, unit_floorplan, vector_image, floor_id ) FDs: FD07 (unit_id → unit_number, room_number, floor_area, unit_status, unit_price, unit_image, unit_floorplan, vector_image, floor_id) FD08 ((floor_id, unit_number) → unit_id) Candidate keys: {unit_id}, {floor_id, unit_number} PK: {unit_id} }}} * **Remaining relation R4:** R3 minus the dependent attributes, PK unchanged, FK `unit_id → R_Unit`. * **Lossless:** Yes – `unit_id` is key of `R_Unit`. * **Dependency preservation:** FD07 and FD08 are in `R_Unit`. ==== 1.6 Extract `floor_id → floor_number, layout_image, building_id` (FD05) ==== In R4, `floor_id` is a non-key attribute, depending on `unit_id` which is itself a partial determinant — a 2NF violation. * **From R4**, extract: {{{ R_Floor( floor_id PK, floor_number, layout_image, building_id ) FDs: FD05 (floor_id → floor_number, layout_image, building_id) FD06 ((building_id, floor_number) → floor_id) Candidate keys: {floor_id}, {building_id, floor_number} PK: {floor_id} }}} * **Remaining relation R5:** R4 minus the dependent attributes, PK unchanged, FK `floor_id → R_Floor`. * **Lossless:** Yes – `floor_id` is key of `R_Floor`. * **Dependency preservation:** FD05 and FD06 are in `R_Floor`. ==== 1.7 Extract `building_id → building_name, building_address, building_description, admin_id` (FD04) ==== In R5, `building_id` is a non-key attribute depending on `floor_id` — a 2NF violation. * **From R5**, extract: {{{ R_Building( building_id PK, building_name, building_address, building_description, admin_id ) FDs: FD04 (building_id → building_name, building_address, building_description, admin_id) Candidate key / PK: {building_id} }}} * **Remaining relation R6:** R5 minus the dependent attributes, PK unchanged, FK `building_id → R_Building`. * **Lossless:** Yes – `building_id` is key of `R_Building`. * **Dependency preservation:** FD04 is in `R_Building`. ==== 1.8 Extract `admin_id → admin_name, admin_email, admin_password` (FD01, FD02) ==== In R6, `admin_id` is a non-key attribute depending on `building_id` — a 2NF violation. * **From R6**, extract: {{{ R_Admin( admin_id PK, admin_name, admin_email UNIQUE, admin_password ) FDs: FD01 (admin_id → admin_name, admin_email, admin_password) FD02 (admin_email → admin_id) Candidate keys: {admin_id}, {admin_email} PK: {admin_id} }}} * **Remaining relation R7:** R6 minus the dependent attributes, PK unchanged, FK `admin_id → R_Admin`. * **Lossless:** Yes – `admin_id` is key of `R_Admin`. * **Dependency preservation:** FD01 and FD02 are in `R_Admin`. ==== 1.9 Extract `agent_id → agent_name, agent_email, agent_password` (FD09, FD10) ==== `agent_id` appears as a non-key attribute in both R_Inquiry (step 1.2) and R_Timeslot (step 1.4). At the point of R_Inquiry's extraction, agent_id became a non-key attribute in that relation with FD09 applying to it — a partial dependency on inquiry_id. We extract the agent entity from R_Inquiry: {{{ R_Agent( agent_id PK, agent_name, agent_email UNIQUE, agent_password ) FDs: FD09 (agent_id → agent_name, agent_email, agent_password) FD10 (agent_email → agent_id) Candidate keys: {agent_id}, {agent_email} PK: {agent_id} }}} R_Inquiry and R_Timeslot both retain `agent_id` as FK → R_Agent. * **Lossless:** Yes – `agent_id` is key of `R_Agent`. * **Dependency preservation:** FD09 and FD10 are in `R_Agent`. ==== 1.10 Extract `client_id → client_name, client_email, client_phone` (FD11, FD12) ==== `client_id` appears as a non-key attribute in both R_Inquiry (step 1.2) and R_Appointment (step 1.3). At the point of R_Inquiry's extraction, client_id became a non-key attribute with FD11 applying — a partial dependency on inquiry_id. We extract the client entity from R_Inquiry: {{{ R_Client( client_id PK, client_name, client_email UNIQUE, client_phone ) FDs: FD11 (client_id → client_name, client_email, client_phone) FD12 (client_email → client_id) Candidate keys: {client_id}, {client_email} PK: {client_id} }}} R_Inquiry and R_Appointment both retain `client_id` as FK → R_Client. * **Lossless:** Yes – `client_id` is key of `R_Client`. * **Dependency preservation:** FD11 and FD12 are in `R_Client`. ==== 1.11 Materialise the M:N relationship "Designs" ==== After all extractions, R7 still contains `architect_id` (part of the original PK) and `building_id` (now a FK via the floor→building chain). The architect-building relationship is M:N with no non-key attributes of its own. It is extracted as a pure association relation: {{{ R_Designs( architect_id PK, FK → R_Architect, building_id PK, FK → R_Building ) FDs: none beyond the composite PK Candidate key / PK: {architect_id, building_id} }}} After this extraction, the residual relation R8 contains only `{architect_id, inquiry_id, appointment_id}` — the original PK of the universal relation R. This residual has no non-key attributes of its own; it exists solely as a structural artifact of treating all entities as one flat relation. The three attributes are already individually covered as PKs or FKs in R_Architect, R_Inquiry, and R_Appointment respectively, and no domain rule requires tracking the specific combination of (architect, inquiry, appointment) as a meaningful association. R8 is therefore dissolved; all information and all constraints are captured by the 11 extracted relations. **Result after Step 1: 11 relations, all in 2NF.** Lossless join holds because every extraction used a determinant that is the key of the new relation. All functional dependencies FD01-FD17 are preserved locally within exactly one relation. === Decompose from 2NF to 3NF === Each relation from Step 1 is checked below. ==== R_Admin ==== {{{ Attributes : admin_id, admin_name, admin_email, admin_password FDs : FD01 (admin_id → admin_name, admin_email, admin_password) FD02 (admin_email → admin_id) CKs : {admin_id}, {admin_email} PK : admin_id }}} Both FD01 and FD02 have superkeys on the left-hand side. There is no non-key attribute that determines another — admin_email is a full candidate key, not a "middle step" in a transitive chain. '''3NF. No decomposition needed.''' ==== R_Architect ==== {{{ Attributes : architect_id, architect_name FDs : FD03 (architect_id → architect_name) CKs / PK : {architect_id} }}} Only one non-key attribute exists. '''3NF. No decomposition needed.''' ==== R_Building ==== {{{ Attributes : building_id, building_name, building_address, building_description, admin_id FDs : FD04 (building_id → building_name, building_address, building_description, admin_id) CKs / PK : {building_id} }}} admin_id is a non-key attribute (FK to R_Admin). The question is whether admin_id determines any other attribute inside R_Building — it does not. admin_id → admin_name etc. is a fact about R_Admin, not about R_Building. No transitive chain exists within this relation. '''3NF. No decomposition needed.''' ==== R_Designs ==== {{{ Attributes : architect_id, building_id FDs : none beyond the composite PK CKs / PK : {architect_id, building_id} }}} No non-key attributes. '''3NF. No decomposition needed.''' ==== R_Floor ==== {{{ Attributes : floor_id, floor_number, layout_image, building_id FDs : FD05 (floor_id → floor_number, layout_image, building_id) FD06 ((building_id, floor_number) → floor_id) CKs : {floor_id}, {building_id, floor_number} PK : floor_id }}} building_id is a non-key attribute (FK to R_Building). FD06 uses (building_id, floor_number) together as a composite candidate key, which is a superkey. No transitive chain. '''3NF. No decomposition needed.''' ==== R_Unit ==== {{{ Attributes : unit_id, unit_number, room_number, floor_area, unit_status, unit_price, unit_image, unit_floorplan, vector_image, floor_id FDs : FD07 (unit_id → unit_number, room_number, floor_area, unit_status, unit_price, unit_image, unit_floorplan, vector_image, floor_id) FD08 ((floor_id, unit_number) → unit_id) CKs : {unit_id}, {floor_id, unit_number} PK : unit_id }}} floor_id is a non-key attribute (FK to R_Floor). Within R_Unit, floor_id does not determine room_number, floor_area, or any other attribute on its own — those all require unit_id. FD08 (floor_id, unit_number) is a candidate key. No transitive chain. '''3NF. No decomposition needed.''' ==== R_Agent ==== {{{ Attributes : agent_id, agent_name, agent_email, agent_password FDs : FD09 (agent_id → agent_name, agent_email, agent_password) FD10 (agent_email → agent_id) CKs : {agent_id}, {agent_email} PK : agent_id }}} Same structure as R_Admin. agent_email is a candidate key, so FD10 does not introduce a transitive dependency — both attributes are superkeys. '''3NF. No decomposition needed.''' ==== R_Client ==== {{{ Attributes : client_id, client_name, client_email, client_phone FDs : FD11 (client_id → client_name, client_email, client_phone) FD12 (client_email → client_id) CKs : {client_id}, {client_email} PK : client_id }}} client_email is a candidate key, so FD12 is not a transitive dependency. '''3NF. No decomposition needed.''' ==== R_Timeslot ==== {{{ Attributes : timeslot_id, ts_date, ts_time_start, ts_time_end, ts_status, agent_id FDs : FD13 (timeslot_id → ts_date, ts_time_start, ts_time_end, ts_status, agent_id) FD14 ((agent_id, ts_date, ts_time_start) → timeslot_id) CKs : {timeslot_id}, {agent_id, ts_date, ts_time_start} PK : timeslot_id }}} agent_id is a non-key attribute (FK to R_Agent). The composite (agent_id, ts_date, ts_time_start) is a candidate key, so FD14 has a superkey. No transitive chain. '''3NF. No decomposition needed.''' ==== R_Inquiry ==== {{{ Attributes : inquiry_id, inquiry_message, inquiry_status, inquiry_created_at, unit_id, client_id, agent_id FDs : FD15 (inquiry_id → inquiry_message, inquiry_status, inquiry_created_at, unit_id, client_id, agent_id) CK / PK : {inquiry_id} }}} unit_id, client_id, and agent_id are all non-key FK attributes. None of them determines any other attribute within R_Inquiry. All non-key attributes depend directly on inquiry_id. '''3NF. No decomposition needed.''' ==== R_Appointment ==== {{{ Attributes : appointment_id, appt_status, client_id, unit_id, timeslot_id FDs : FD16 (appointment_id → appt_status, client_id, unit_id, timeslot_id) FD17 (timeslot_id → appointment_id) CKs : {appointment_id}, {timeslot_id} PK : appointment_id }}} client_id and unit_id are FKs that determine nothing within R_Appointment. '''3NF. No decomposition needed.''' ---- '''No decomposition was required at the 3NF step.''' All 11 relations produced in Step 1 are already in 3NF. === STEP 3 — Decompose from 3NF to BCNF === Check every FD in each relation: * R_Admin: FD01 (admin_id superkey), FD02 (admin_email superkey) – BCNF. * R_Architect: FD03 (architect_id superkey) – BCNF. * R_Building: FD04 (building_id superkey) – BCNF. * R_Designs: no non‑trivial FDs – BCNF. * R_Floor: FD05 (floor_id superkey), FD06 (composite CK superkey) – BCNF. * R_Unit: FD07 (unit_id superkey), FD08 (composite CK superkey) – BCNF. * R_Agent: FD09 (agent_id superkey), FD10 (agent_email superkey) – BCNF. * R_Client: FD11 (client_id superkey), FD12 (client_email superkey) – BCNF. * R_Timeslot: FD13 (timeslot_id superkey), FD14 (composite CK superkey) – BCNF. * R_Inquiry: FD15 (inquiry_id superkey) – BCNF. * R_Appointment: FD16 (appointment_id superkey), FD17 (timeslot_id superkey) – BCNF. '''All 11 relations are in BCNF. No decomposition needed.''' === 4NF Note === No non‑trivial multi‑valued dependencies exist – every attribute is single‑valued per key. Therefore '''all relations are also in 4NF'''. BCNF / 4NF is the highest normal form achieved. ---- == Final Normalized Design & Discussion == === Final Relations === {{{ Admin( admin_id PK, admin_name, admin_email UNIQUE, admin_password ) Architect( architect_id PK, architect_name ) Building( building_id PK, building_name, building_address, building_description, admin_id FK → Admin ) Designs( architect_id PK, FK → Architect, building_id PK, FK → Building ) Floor( floor_id PK, floor_number, layout_image, building_id FK → Building, UNIQUE (building_id, floor_number) ) Unit( unit_id PK, unit_number, room_number, floor_area, unit_status, unit_price, unit_image, unit_floorplan, vector_image, floor_id FK → Floor, UNIQUE (floor_id, unit_number) ) Agent( agent_id PK, agent_name, agent_email UNIQUE, agent_password ) Client( client_id PK, client_name, client_email UNIQUE, client_phone ) Timeslot( timeslot_id PK, ts_date, ts_time_start, ts_time_end, ts_status, agent_id FK → Agent, UNIQUE (agent_id, ts_date, ts_time_start) ) Inquiry( inquiry_id PK, inquiry_message, inquiry_status, inquiry_created_at, unit_id FK → Unit, client_id FK → Client, agent_id FK → Agent ) Appointment( appointment_id PK, appt_status, client_id FK → Client, unit_id FK → Unit, timeslot_id FK → Timeslot, UNIQUE ) }}} === Differences vs. Phase 2 Design === || '''Aspect''' || '''Phase 2 Design''' || '''P5 Normalized Design''' || || `floor.admin_id` || Present as FK column || Not present as it is reachable via floor→building→admin || || `unit.admin_id` || Present as FK column || Not present as it is reachable via unit→floor→building→admin || || `appointment.agent_id` || Present as FK column || Not present as it is reachable via appointment→timeslot→agent || || Highest normal form || 2NF (three transitive dependencies) || BCNF / 4NF || || Admin reachability || Direct FK on floor and unit || Via FK chain – no data loss, no redundancy || || Agent reachability || Direct FK on appointment || Via FK chain through timeslot || || Update anomaly risk || Changing building’s admin forced UPDATE on all floor & unit rows || Single UPDATE on `Building` row only || === Changes Required for Subsequent Phases === The P5 normalized design (BCNF/4NF) will be used for all following project phases. The Phase 2 DDL must be updated by: 1. Dropping column `admin_id` and its FK constraint from `floor`. 2. Dropping column `admin_id` and its FK constraint from `unit`. 3. Dropping column `agent_id` and its FK constraint from `appointment`. All application queries that previously joined directly on those redundant columns must instead go trough the FK chain (e.g., `JOIN Floor → Building → Admin`).