| Version 3 (modified by , 27 hours ago) ( diff ) |
|---|
Phase P5: Normalization
Initial de-normalized relation and functional dependencies
Universal Relation R
All attributes from the ER model are merged into one flat relation. Attribute names are changed where the same concept appears in multiple entities.
R(
admin_id, admin_name, admin_email, admin_password,
architect_id, architect_name,
building_id, building_name, building_address, building_description,
floor_id, floor_number, layout_image,
unit_id, unit_number, room_number, floor_area, unit_status,
unit_price, unit_image, unit_floorplan, vector_image,
agent_id, agent_name, agent_email, agent_password,
client_id, client_name, client_email, client_phone,
timeslot_id, ts_date, ts_time_start, ts_time_end, ts_status,
inquiry_id, inquiry_message, inquiry_status, inquiry_created_at,
appointment_id, appt_status
)
Canonical Cover
FD01: admin_id → admin_name, admin_email, admin_password FD02: admin_email → admin_id FD03: architect_id → architect_name FD04: building_id → building_name, building_address, building_description, admin_id FD05: floor_id → floor_number, layout_image, building_id FD06: (building_id, floor_number) → floor_id FD07: unit_id → unit_number, room_number, floor_area, unit_status, unit_price, unit_image, unit_floorplan, vector_image, floor_id FD08: (floor_id, unit_number) → unit_id FD09: agent_id → agent_name, agent_email, agent_password FD10: agent_email → agent_id FD11: client_id → client_name, client_email, client_phone FD12: client_email → client_id FD13: timeslot_id → ts_date, ts_time_start, ts_time_end, ts_status, agent_id FD14: (agent_id, ts_date, ts_time_start) → timeslot_id FD15: inquiry_id → inquiry_message, inquiry_status, inquiry_created_at, unit_id, client_id, agent_id FD16: appointment_id → appt_status, client_id, unit_id, timeslot_id FD17: timeslot_id → appointment_id
Candidate Keys & Primary Key of R
Identifying the Candidate Key
A candidate key must functionally determine all attributes in R and be minimal.
Candidate Keys:
{architect_id, inquiry_id, appointment_id}{architect_id, inquiry_id, timeslot_id}
Primary Key: {architect_id, inquiry_id, appointment_id}
Normal Form of R Before Decomposition
R is in 1NF but not in 2NF.
- 1NF: All attributes hold atomic scalar values, and there are no repeating groups. Every row is uniquely identified by the PK. R is in 1NF.
- 2NF: A relation is in 2NF if it is in 1NF and every non‑key attribute is fully functionally dependent on the entire primary key. R does not conform to 2NF.
Example of a 2NF violation:
architect_name is determined by architect_id (FD03), which is only a
proper subset of the primary key {architect_id, inquiry_id, appointment_id}.
If an architect is associated with multiple appointments,
their name is repeated unnecessarily which is a clear partial dependency.
Step-by-step Decomposition
Decompose from 1NF to 2NF
1.1 Extract architect_id → architect_name (FD03)
- FD causing issue: FD03, where
architect_id⊂ PK. - New relation:
R_Architect( architect_id PK, architect_name ) FDs: FD03 (architect_id → architect_name) Candidate key / PK: {architect_id} - Remaining relation R1: R minus
architect_name, with PK unchanged{architect_id, inquiry_id, appointment_id}and FKarchitect_id → R_Architect. - Lossless: Yes –
architect_idis the key ofR_Architect, natural join onarchitect_idreconstructs R. - Dependency preservation: FD03 is wholly inside
R_Architect; all other FDs remain in R1.
1.2 Extract inquiry_id → inquiry_message, inquiry_status, inquiry_created_at, unit_id, client_id, agent_id (FD15)
- FD causing issue: FD15,
inquiry_id⊂ PK. - From R1, extract:
R_Inquiry( inquiry_id PK, inquiry_message, inquiry_status, inquiry_created_at, unit_id, client_id, agent_id ) FDs: FD15 (inquiry_id → inquiry_message, inquiry_status, inquiry_created_at, unit_id, client_id, agent_id) Candidate key / PK: {inquiry_id} - Remaining relation R2: R1 minus the dependent attributes
{inquiry_message, inquiry_status, inquiry_created_at, unit_id, client_id, agent_id},
PK unchanged, FK
inquiry_id → R_Inquiry. - Lossless: Yes –
inquiry_idis key ofR_Inquiry. - Dependency preservation: FD15 is in
R_Inquiry.
1.3 Extract appointment_id → appt_status, client_id, unit_id, timeslot_id (FD16)
- FD causing issue: FD16,
appointment_id⊂ PK. - From R2, extract:
R_Appointment( appointment_id PK, appt_status, client_id, unit_id, timeslot_id ) FDs: FD16 (appointment_id → appt_status, client_id, unit_id, timeslot_id) FD17 (timeslot_id → appointment_id) Candidate keys: {appointment_id}, {timeslot_id} PK: {appointment_id} - Remaining relation R3: R2 minus the dependent attributes
{appt_status, client_id, unit_id, timeslot_id}, PK unchanged,
FK
appointment_id → R_Appointment. Note:timeslot_idmoves entirely into R_Appointment and does NOT remain in R3; it is a dependent attribute of appointment_id, not a property of the residual relation. - Lossless: Yes –
appointment_idis key ofR_Appointment. - Dependency preservation: FD16 and FD17 are in
R_Appointment.
1.4 Extract timeslot_id → ts_date, ts_time_start, ts_time_end, ts_status, agent_id (FD13)
timeslot_idis the PK of R_Appointment, and FD13 shows it determines further attributes (ts_date, ts_time_start, ts_time_end, ts_status, agent_id) that were placed in R_Appointment in step 1.3 only as part of the dependent set of FD16. Those timeslot attributes are themselves determined by timeslot_id via FD13 — a partial dependency on timeslot_id (a candidate key of R_Appointment, not the chosen PK) — so they must be extracted into their own relation.
- From R_Appointment, extract:
R_Timeslot( timeslot_id PK, ts_date, ts_time_start, ts_time_end, ts_status, agent_id ) FDs: FD13 (timeslot_id → ts_date, ts_time_start, ts_time_end, ts_status, agent_id) FD14 ((agent_id, ts_date, ts_time_start) → timeslot_id) Candidate keys: {timeslot_id}, {agent_id, ts_date, ts_time_start} PK: {timeslot_id} - R_Appointment updated: timeslot attributes removed; timeslot_id is retained
as FK → R_Timeslot. R_Appointment now holds:
R_Appointment( appointment_id PK, appt_status, client_id, unit_id, timeslot_id FK → R_Timeslot )
- Lossless: Yes –
timeslot_idis key ofR_Timeslot. - Dependency preservation: FD13 and FD14 are in
R_Timeslot.
1.5 Extract unit_id → unit_number, room_number, floor_area, unit_status, unit_price, unit_image, unit_floorplan, vector_image, floor_id (FD07)
In R3,
unit_idis a non-key attribute (it depends oninquiry_idandappointment_id, both proper subsets of the PK). Its dependent attributes cause a 2NF violation.
- From R3, extract:
R_Unit( unit_id PK, unit_number, room_number, floor_area, unit_status, unit_price, unit_image, unit_floorplan, vector_image, floor_id ) FDs: FD07 (unit_id → unit_number, room_number, floor_area, unit_status, unit_price, unit_image, unit_floorplan, vector_image, floor_id) FD08 ((floor_id, unit_number) → unit_id) Candidate keys: {unit_id}, {floor_id, unit_number} PK: {unit_id} - Remaining relation R4: R3 minus the dependent attributes, PK unchanged,
FK
unit_id → R_Unit. - Lossless: Yes –
unit_idis key ofR_Unit. - Dependency preservation: FD07 and FD08 are in
R_Unit.
1.6 Extract floor_id → floor_number, layout_image, building_id (FD05)
In R4,
floor_idis a non-key attribute, depending onunit_idwhich is itself a partial determinant — a 2NF violation.
- From R4, extract:
R_Floor( floor_id PK, floor_number, layout_image, building_id ) FDs: FD05 (floor_id → floor_number, layout_image, building_id) FD06 ((building_id, floor_number) → floor_id) Candidate keys: {floor_id}, {building_id, floor_number} PK: {floor_id} - Remaining relation R5: R4 minus the dependent attributes, PK unchanged,
FK
floor_id → R_Floor. - Lossless: Yes –
floor_idis key ofR_Floor. - Dependency preservation: FD05 and FD06 are in
R_Floor.
1.7 Extract building_id → building_name, building_address, building_description, admin_id (FD04)
In R5,
building_idis a non-key attribute depending onfloor_id— a 2NF violation.
- From R5, extract:
R_Building( building_id PK, building_name, building_address, building_description, admin_id ) FDs: FD04 (building_id → building_name, building_address, building_description, admin_id) Candidate key / PK: {building_id} - Remaining relation R6: R5 minus the dependent attributes, PK unchanged,
FK
building_id → R_Building. - Lossless: Yes –
building_idis key ofR_Building. - Dependency preservation: FD04 is in
R_Building.
1.8 Extract admin_id → admin_name, admin_email, admin_password (FD01, FD02)
In R6,
admin_idis a non-key attribute depending onbuilding_id— a 2NF violation.
- From R6, extract:
R_Admin( admin_id PK, admin_name, admin_email UNIQUE, admin_password ) FDs: FD01 (admin_id → admin_name, admin_email, admin_password) FD02 (admin_email → admin_id) Candidate keys: {admin_id}, {admin_email} PK: {admin_id} - Remaining relation R7: R6 minus the dependent attributes, PK unchanged,
FK
admin_id → R_Admin. - Lossless: Yes –
admin_idis key ofR_Admin. - Dependency preservation: FD01 and FD02 are in
R_Admin.
1.9 Extract agent_id → agent_name, agent_email, agent_password (FD09, FD10)
agent_idappears as a non-key attribute in both R_Inquiry (step 1.2) and R_Timeslot (step 1.4). At the point of R_Inquiry's extraction, agent_id became a non-key attribute in that relation with FD09 applying to it — a partial dependency on inquiry_id. We extract the agent entity from R_Inquiry:
R_Agent(
agent_id PK,
agent_name,
agent_email UNIQUE,
agent_password
)
FDs: FD09 (agent_id → agent_name, agent_email, agent_password)
FD10 (agent_email → agent_id)
Candidate keys: {agent_id}, {agent_email}
PK: {agent_id}
R_Inquiry and R_Timeslot both retain
agent_idas FK → R_Agent.
- Lossless: Yes –
agent_idis key ofR_Agent. - Dependency preservation: FD09 and FD10 are in
R_Agent.
1.10 Extract client_id → client_name, client_email, client_phone (FD11, FD12)
client_idappears as a non-key attribute in both R_Inquiry (step 1.2) and R_Appointment (step 1.3). At the point of R_Inquiry's extraction, client_id became a non-key attribute with FD11 applying — a partial dependency on inquiry_id. We extract the client entity from R_Inquiry:
R_Client(
client_id PK,
client_name,
client_email UNIQUE,
client_phone
)
FDs: FD11 (client_id → client_name, client_email, client_phone)
FD12 (client_email → client_id)
Candidate keys: {client_id}, {client_email}
PK: {client_id}
R_Inquiry and R_Appointment both retain
client_idas FK → R_Client.
- Lossless: Yes –
client_idis key ofR_Client. - Dependency preservation: FD11 and FD12 are in
R_Client.
1.11 Materialise the M:N relationship "Designs"
After all extractions, R7 still contains
architect_id(part of the original PK) andbuilding_id(now a FK via the floor→building chain). The architect-building relationship is M:N with no non-key attributes of its own. It is extracted as a pure association relation:
R_Designs(
architect_id PK, FK → R_Architect,
building_id PK, FK → R_Building
)
FDs: none beyond the composite PK
Candidate key / PK: {architect_id, building_id}
After this extraction, the residual relation R8 contains only
{architect_id, inquiry_id, appointment_id}— the original PK of the universal relation R. This residual has no non-key attributes of its own; it exists solely as a structural artifact of treating all entities as one flat relation. The three attributes are already individually covered as PKs or FKs in R_Architect, R_Inquiry, and R_Appointment respectively, and no domain rule requires tracking the specific combination of (architect, inquiry, appointment) as a meaningful association. R8 is therefore dissolved; all information and all constraints are captured by the 11 extracted relations.
Result after Step 1: 11 relations, all in 2NF. Lossless join holds because every extraction used a determinant that is the key of the new relation. All functional dependencies FD01-FD17 are preserved locally within exactly one relation.
Decompose from 2NF to 3NF
Each relation from Step 1 is checked below.
R_Admin
Attributes : admin_id, admin_name, admin_email, admin_password
FDs : FD01 (admin_id → admin_name, admin_email, admin_password)
FD02 (admin_email → admin_id)
CKs : {admin_id}, {admin_email}
PK : admin_id
Both FD01 and FD02 have superkeys on the left-hand side. There is no non-key attribute that determines another — admin_email is a full candidate key, not a "middle step" in a transitive chain. 3NF. No decomposition needed.
R_Architect
Attributes : architect_id, architect_name
FDs : FD03 (architect_id → architect_name)
CKs / PK : {architect_id}
Only one non-key attribute exists. 3NF. No decomposition needed.
R_Building
Attributes : building_id, building_name, building_address,
building_description, admin_id
FDs : FD04 (building_id → building_name, building_address,
building_description, admin_id)
CKs / PK : {building_id}
admin_id is a non-key attribute (FK to R_Admin). The question is whether admin_id determines any other attribute inside R_Building — it does not. admin_id → admin_name etc. is a fact about R_Admin, not about R_Building. No transitive chain exists within this relation. 3NF. No decomposition needed.
R_Designs
Attributes : architect_id, building_id
FDs : none beyond the composite PK
CKs / PK : {architect_id, building_id}
No non-key attributes. 3NF. No decomposition needed.
R_Floor
Attributes : floor_id, floor_number, layout_image, building_id
FDs : FD05 (floor_id → floor_number, layout_image, building_id)
FD06 ((building_id, floor_number) → floor_id)
CKs : {floor_id}, {building_id, floor_number}
PK : floor_id
building_id is a non-key attribute (FK to R_Building). FD06 uses (building_id, floor_number) together as a composite candidate key, which is a superkey. No transitive chain. 3NF. No decomposition needed.
R_Unit
Attributes : unit_id, unit_number, room_number, floor_area, unit_status,
unit_price, unit_image, unit_floorplan, vector_image, floor_id
FDs : FD07 (unit_id → unit_number, room_number, floor_area,
unit_status, unit_price, unit_image,
unit_floorplan, vector_image, floor_id)
FD08 ((floor_id, unit_number) → unit_id)
CKs : {unit_id}, {floor_id, unit_number}
PK : unit_id
floor_id is a non-key attribute (FK to R_Floor). Within R_Unit, floor_id does not determine room_number, floor_area, or any other attribute on its own — those all require unit_id. FD08 (floor_id, unit_number) is a candidate key. No transitive chain. 3NF. No decomposition needed.
R_Agent
Attributes : agent_id, agent_name, agent_email, agent_password
FDs : FD09 (agent_id → agent_name, agent_email, agent_password)
FD10 (agent_email → agent_id)
CKs : {agent_id}, {agent_email}
PK : agent_id
Same structure as R_Admin. agent_email is a candidate key, so FD10 does not introduce a transitive dependency — both attributes are superkeys. 3NF. No decomposition needed.
R_Client
Attributes : client_id, client_name, client_email, client_phone
FDs : FD11 (client_id → client_name, client_email, client_phone)
FD12 (client_email → client_id)
CKs : {client_id}, {client_email}
PK : client_id
client_email is a candidate key, so FD12 is not a transitive dependency. 3NF. No decomposition needed.
R_Timeslot
Attributes : timeslot_id, ts_date, ts_time_start, ts_time_end,
ts_status, agent_id
FDs : FD13 (timeslot_id → ts_date, ts_time_start, ts_time_end,
ts_status, agent_id)
FD14 ((agent_id, ts_date, ts_time_start) → timeslot_id)
CKs : {timeslot_id}, {agent_id, ts_date, ts_time_start}
PK : timeslot_id
agent_id is a non-key attribute (FK to R_Agent). The composite (agent_id, ts_date, ts_time_start) is a candidate key, so FD14 has a superkey. No transitive chain. 3NF. No decomposition needed.
R_Inquiry
Attributes : inquiry_id, inquiry_message, inquiry_status,
inquiry_created_at, unit_id, client_id, agent_id
FDs : FD15 (inquiry_id → inquiry_message, inquiry_status,
inquiry_created_at, unit_id, client_id, agent_id)
CK / PK : {inquiry_id}
unit_id, client_id, and agent_id are all non-key FK attributes. None of them determines any other attribute within R_Inquiry. All non-key attributes depend directly on inquiry_id. 3NF. No decomposition needed.
R_Appointment
Attributes : appointment_id, appt_status, client_id, unit_id, timeslot_id
FDs : FD16 (appointment_id → appt_status, client_id, unit_id, timeslot_id)
FD17 (timeslot_id → appointment_id)
CKs : {appointment_id}, {timeslot_id}
PK : appointment_id
client_id and unit_id are FKs that determine nothing within R_Appointment. 3NF. No decomposition needed.
No decomposition was required at the 3NF step. All 11 relations produced in Step 1 are already in 3NF.
STEP 3 — Decompose from 3NF to BCNF
Check every FD in each relation:
- R_Admin: FD01 (admin_id superkey), FD02 (admin_email superkey) – BCNF.
- R_Architect: FD03 (architect_id superkey) – BCNF.
- R_Building: FD04 (building_id superkey) – BCNF.
- R_Designs: no non‑trivial FDs – BCNF.
- R_Floor: FD05 (floor_id superkey), FD06 (composite CK superkey) – BCNF.
- R_Unit: FD07 (unit_id superkey), FD08 (composite CK superkey) – BCNF.
- R_Agent: FD09 (agent_id superkey), FD10 (agent_email superkey) – BCNF.
- R_Client: FD11 (client_id superkey), FD12 (client_email superkey) – BCNF.
- R_Timeslot: FD13 (timeslot_id superkey), FD14 (composite CK superkey) – BCNF.
- R_Inquiry: FD15 (inquiry_id superkey) – BCNF.
- R_Appointment: FD16 (appointment_id superkey), FD17 (timeslot_id superkey) – BCNF.
All 11 relations are in BCNF. No decomposition needed.
4NF Note
No non‑trivial multi‑valued dependencies exist – every attribute is single‑valued per key. Therefore all relations are also in 4NF.
BCNF / 4NF is the highest normal form achieved.
Final Normalized Design & Discussion
Final Relations
Admin( admin_id PK, admin_name, admin_email UNIQUE, admin_password ) Architect( architect_id PK, architect_name ) Building( building_id PK, building_name, building_address, building_description, admin_id FK → Admin ) Designs( architect_id PK, FK → Architect, building_id PK, FK → Building ) Floor( floor_id PK, floor_number, layout_image, building_id FK → Building, UNIQUE (building_id, floor_number) ) Unit( unit_id PK, unit_number, room_number, floor_area, unit_status, unit_price, unit_image, unit_floorplan, vector_image, floor_id FK → Floor, UNIQUE (floor_id, unit_number) ) Agent( agent_id PK, agent_name, agent_email UNIQUE, agent_password ) Client( client_id PK, client_name, client_email UNIQUE, client_phone ) Timeslot( timeslot_id PK, ts_date, ts_time_start, ts_time_end, ts_status, agent_id FK → Agent, UNIQUE (agent_id, ts_date, ts_time_start) ) Inquiry( inquiry_id PK, inquiry_message, inquiry_status, inquiry_created_at, unit_id FK → Unit, client_id FK → Client, agent_id FK → Agent ) Appointment( appointment_id PK, appt_status, client_id FK → Client, unit_id FK → Unit, timeslot_id FK → Timeslot, UNIQUE )
Differences vs. Phase 2 Design
| Aspect | Phase 2 Design | P5 Normalized Design |
floor.admin_id | Present as FK column | Not present as it is reachable via floor→building→admin |
unit.admin_id | Present as FK column | Not present as it is reachable via unit→floor→building→admin |
appointment.agent_id | Present as FK column | Not present as it is reachable via appointment→timeslot→agent |
| Highest normal form | 2NF (three transitive dependencies) | BCNF / 4NF |
| Admin reachability | Direct FK on floor and unit | Via FK chain – no data loss, no redundancy |
| Agent reachability | Direct FK on appointment | Via FK chain through timeslot |
| Update anomaly risk | Changing building’s admin forced UPDATE on all floor & unit rows | Single UPDATE on Building row only
|
Changes Required for Subsequent Phases
The P5 normalized design (BCNF/4NF) will be used for all following project phases. The Phase 2 DDL must be updated by:
- Dropping column
admin_idand its FK constraint fromfloor. - Dropping column
admin_idand its FK constraint fromunit. - Dropping column
agent_idand its FK constraint fromappointment.
All application queries that previously joined directly on those redundant columns
must instead go trough the FK chain (e.g., JOIN Floor → Building → Admin).
