| Version 5 (modified by , 24 hours ago) ( diff ) |
|---|
Phase P5: Normalization
Initial de-normalized relation and functional dependencies
Universal Relation R
All attributes from the ER model are merged into one flat relation. Attribute names are changed where the same concept appears in multiple entities.
R(
admin_id, admin_name, admin_email, admin_password,
architect_id, architect_name,
building_id, building_name, building_address, building_description,
floor_id, floor_number, layout_image,
unit_id, unit_number, room_number, floor_area, unit_status,
unit_price, unit_image, unit_floorplan, vector_image,
agent_id, agent_name, agent_email, agent_password,
client_id, client_name, client_email, client_phone,
timeslot_id, ts_date, ts_time_start, ts_time_end, ts_status,
inquiry_id, inquiry_message, inquiry_status, inquiry_created_at,
appointment_id, appt_status
)
Canonical Cover
FD01: admin_id → admin_name, admin_email, admin_password FD02: admin_email → admin_id FD03: architect_id → architect_name FD04: building_id → building_name, building_address, building_description, admin_id FD05: floor_id → floor_number, layout_image, building_id FD06: (building_id, floor_number) → floor_id FD07: unit_id → unit_number, room_number, floor_area, unit_status, unit_price, unit_image, unit_floorplan, vector_image, floor_id FD08: (floor_id, unit_number) → unit_id FD09: agent_id → agent_name, agent_email, agent_password FD10: agent_email → agent_id FD11: client_id → client_name, client_email, client_phone FD12: client_email → client_id FD13: timeslot_id → ts_date, ts_time_start, ts_time_end, ts_status, agent_id FD14: (agent_id, ts_date, ts_time_start) → timeslot_id FD15: inquiry_id → inquiry_message, inquiry_status, inquiry_created_at, unit_id, client_id, agent_id FD16: appointment_id → appt_status, client_id, unit_id, timeslot_id FD17: timeslot_id → appointment_id
Candidate Keys & Primary Key of R
Identifying the Candidate Key
Before we start the aim of a candidate key is to satisfy the following requirements:
- The candidate key closure must be able to cover every attribute in the relation R.
- And for the key to be minimal, we only keep ones that will be able to satisfy the above mentioned rule.
The starting point would be finding every attributes that are never found on the right side in any of the functional dependencies.
This means that these attributes don’t get determined by any other. So due to this we need to include them always in the candidate keys.
architect_id — nothing in Fc produces it
inquiry_id — nothing in Fc produces it
appointment_id — appears on the RHS of FD17 (timeslot_id → appointment_id),
but timeslot_id itself appears on the RHS of FD16
(appointment_id → timeslot_id), meaning the two mutually
determine each other. Neither can be derived from anything
outside that pair, so one of them must be in every key.
This gives us the base set to test: {architect_id, inquiry_id, appointment_id}.
Testing the coverage of this candidate key results in the complete cover of all attributes in R {architect_id, inquiry_id, appointment_id} is a superkey.
If we try to remove each to get a minimal candidate key the following happens:
By removing architect_id we lose the ability to reach architect_name which is only reachable via FD03 and that requires an architect_id.
By removing inquiry_id we lose the ability to reach inquiry_message, inquiry_status, inquiry_created_at which are only reachable via FD15.
By removing appointment_id we lose the ability to reach multiple attributes (appt_status, timeslot_id, ts_date, ts_time_start, ts_time_end, ts_status).
Another viable candidate key is {architect_id, inquiry_id, timeslot_id}:
From the closure test above we know appointment_id is required to reach appt_status and the timeslot chain. However, appointment_id and timeslot_id mutually determine each other:
FD16: appointment_id → timeslot_id FD17: timeslot_id → appointment_id
This means timeslot_id can substitute for appointment_id. Testing the closure of {architect_id, inquiry_id, timeslot_id} results in all attributes being covered.
There arent any other substitutable attributes left, so that leaves us with the 2 candidate keys we tested.
A candidate key must functionally determine all attributes in R and be minimal.
Candidate Keys:
{architect_id, inquiry_id, appointment_id}{architect_id, inquiry_id, timeslot_id}
Primary Key: {architect_id, inquiry_id, appointment_id}
Normal Form of R Before Decomposition
R is in 1NF but not in 2NF.
- 1NF: All attributes hold atomic scalar values, and there are no repeating groups. Every row is uniquely identified by the PK. R is in 1NF.
- 2NF: A relation is in 2NF if it is in 1NF and every non‑key attribute is fully functionally dependent on the entire primary key. R does not conform to 2NF.
Example of a 2NF violation:
architect_name is determined by architect_id (FD03), which is only a
proper subset of the primary key {architect_id, inquiry_id, appointment_id}.
If an architect is associated with multiple appointments,
their name is repeated unnecessarily which is a clear partial dependency.
Step-by-step Decomposition
Decompose from 1NF to 2NF
1.1 Extract architect_id → architect_name (FD03)
- FD causing issue: FD03, where
architect_id⊂ PK. - New relation:
R_Architect( architect_id PK, architect_name ) FDs: FD03 (architect_id → architect_name) Candidate key / PK: {architect_id} - Remaining relation R1: R minus
architect_name, with PK unchanged{architect_id, inquiry_id, appointment_id}and FKarchitect_id → R_Architect. - Lossless: Yes –
architect_idis the key ofR_Architect, natural join onarchitect_idreconstructs R. - Dependency preservation: FD03 is wholly inside
R_Architect; all other FDs remain in R1.
1.2 Extract inquiry_id → inquiry_message, inquiry_status, inquiry_created_at, unit_id, client_id, agent_id (FD15)
- FD causing issue: FD15,
inquiry_id⊂ PK. - From R1, extract:
R_Inquiry( inquiry_id PK, inquiry_message, inquiry_status, inquiry_created_at, unit_id, client_id, agent_id ) FDs: FD15 (inquiry_id → inquiry_message, inquiry_status, inquiry_created_at, unit_id, client_id, agent_id) Candidate key / PK: {inquiry_id} - Remaining relation R2: R1 minus the dependent attributes
{inquiry_message, inquiry_status, inquiry_created_at, unit_id, client_id, agent_id},
PK unchanged, FK
inquiry_id → R_Inquiry. - Lossless: Yes –
inquiry_idis key ofR_Inquiry. - Dependency preservation: FD15 is in
R_Inquiry.
1.3 Extract appointment_id → appt_status, client_id, unit_id, timeslot_id (FD16)
- FD causing issue: FD16,
appointment_id⊂ PK. - From R2, extract:
R_Appointment( appointment_id PK, appt_status, client_id, unit_id, timeslot_id ) FDs: FD16 (appointment_id → appt_status, client_id, unit_id, timeslot_id) FD17 (timeslot_id → appointment_id) Candidate keys: {appointment_id}, {timeslot_id} PK: {appointment_id} - Remaining relation R3: R2 minus the dependent attributes
{appt_status, client_id, unit_id, timeslot_id}, PK unchanged,
FK
appointment_id → R_Appointment. Note:timeslot_idmoves entirely into R_Appointment and does NOT remain in R3; it is a dependent attribute of appointment_id, not a property of the residual relation. - Lossless: Yes –
appointment_idis key ofR_Appointment. - Dependency preservation: FD16 and FD17 are in
R_Appointment.
1.4 Extract timeslot_id → ts_date, ts_time_start, ts_time_end, ts_status, agent_id (FD13)
timeslot_idis the PK of R_Appointment, and FD13 shows it determines further attributes (ts_date, ts_time_start, ts_time_end, ts_status, agent_id) that were placed in R_Appointment in step 1.3 only as part of the dependent set of FD16. Those timeslot attributes are themselves determined by timeslot_id via FD13 — a partial dependency on timeslot_id (a candidate key of R_Appointment, not the chosen PK) — so they must be extracted into their own relation.
- From R_Appointment, extract:
R_Timeslot( timeslot_id PK, ts_date, ts_time_start, ts_time_end, ts_status, agent_id ) FDs: FD13 (timeslot_id → ts_date, ts_time_start, ts_time_end, ts_status, agent_id) FD14 ((agent_id, ts_date, ts_time_start) → timeslot_id) Candidate keys: {timeslot_id}, {agent_id, ts_date, ts_time_start} PK: {timeslot_id} - R_Appointment updated: timeslot attributes removed; timeslot_id is retained
as FK → R_Timeslot. R_Appointment now holds:
R_Appointment( appointment_id PK, appt_status, client_id, unit_id, timeslot_id FK → R_Timeslot )
- Lossless: Yes –
timeslot_idis key ofR_Timeslot. - Dependency preservation: FD13 and FD14 are in
R_Timeslot.
1.5 Extract unit_id → unit_number, room_number, floor_area, unit_status, unit_price, unit_image, unit_floorplan, vector_image, floor_id (FD07)
In R3,
unit_idis a non-key attribute (it depends oninquiry_idandappointment_id, both proper subsets of the PK). Its dependent attributes cause a 2NF violation.
- From R3, extract:
R_Unit( unit_id PK, unit_number, room_number, floor_area, unit_status, unit_price, unit_image, unit_floorplan, vector_image, floor_id ) FDs: FD07 (unit_id → unit_number, room_number, floor_area, unit_status, unit_price, unit_image, unit_floorplan, vector_image, floor_id) FD08 ((floor_id, unit_number) → unit_id) Candidate keys: {unit_id}, {floor_id, unit_number} PK: {unit_id} - Remaining relation R4: R3 minus the dependent attributes, PK unchanged,
FK
unit_id → R_Unit. - Lossless: Yes –
unit_idis key ofR_Unit. - Dependency preservation: FD07 and FD08 are in
R_Unit.
1.6 Extract floor_id → floor_number, layout_image, building_id (FD05)
In R4,
floor_idis a non-key attribute, depending onunit_idwhich is itself a partial determinant — a 2NF violation.
- From R4, extract:
R_Floor( floor_id PK, floor_number, layout_image, building_id ) FDs: FD05 (floor_id → floor_number, layout_image, building_id) FD06 ((building_id, floor_number) → floor_id) Candidate keys: {floor_id}, {building_id, floor_number} PK: {floor_id} - Remaining relation R5: R4 minus the dependent attributes, PK unchanged,
FK
floor_id → R_Floor. - Lossless: Yes –
floor_idis key ofR_Floor. - Dependency preservation: FD05 and FD06 are in
R_Floor.
1.7 Extract building_id → building_name, building_address, building_description, admin_id (FD04)
In R5,
building_idis a non-key attribute depending onfloor_id— a 2NF violation.
- From R5, extract:
R_Building( building_id PK, building_name, building_address, building_description, admin_id ) FDs: FD04 (building_id → building_name, building_address, building_description, admin_id) Candidate key / PK: {building_id} - Remaining relation R6: R5 minus the dependent attributes, PK unchanged,
FK
building_id → R_Building. - Lossless: Yes –
building_idis key ofR_Building. - Dependency preservation: FD04 is in
R_Building.
1.8 Extract admin_id → admin_name, admin_email, admin_password (FD01, FD02)
In R6,
admin_idis a non-key attribute depending onbuilding_id— a 2NF violation.
- From R6, extract:
R_Admin( admin_id PK, admin_name, admin_email UNIQUE, admin_password ) FDs: FD01 (admin_id → admin_name, admin_email, admin_password) FD02 (admin_email → admin_id) Candidate keys: {admin_id}, {admin_email} PK: {admin_id} - Remaining relation R7: R6 minus the dependent attributes, PK unchanged,
FK
admin_id → R_Admin. - Lossless: Yes –
admin_idis key ofR_Admin. - Dependency preservation: FD01 and FD02 are in
R_Admin.
1.9 Extract agent_id → agent_name, agent_email, agent_password (FD09, FD10)
agent_idappears as a non-key attribute in both R_Inquiry (step 1.2) and R_Timeslot (step 1.4). At the point of R_Inquiry's extraction, agent_id became a non-key attribute in that relation with FD09 applying to it — a partial dependency on inquiry_id. We extract the agent entity from R_Inquiry:
R_Agent(
agent_id PK,
agent_name,
agent_email UNIQUE,
agent_password
)
FDs: FD09 (agent_id → agent_name, agent_email, agent_password)
FD10 (agent_email → agent_id)
Candidate keys: {agent_id}, {agent_email}
PK: {agent_id}
R_Inquiry and R_Timeslot both retain
agent_idas FK → R_Agent.
- Lossless: Yes –
agent_idis key ofR_Agent. - Dependency preservation: FD09 and FD10 are in
R_Agent.
1.10 Extract client_id → client_name, client_email, client_phone (FD11, FD12)
client_idappears as a non-key attribute in both R_Inquiry (step 1.2) and R_Appointment (step 1.3). At the point of R_Inquiry's extraction, client_id became a non-key attribute with FD11 applying — a partial dependency on inquiry_id. We extract the client entity from R_Inquiry:
R_Client(
client_id PK,
client_name,
client_email UNIQUE,
client_phone
)
FDs: FD11 (client_id → client_name, client_email, client_phone)
FD12 (client_email → client_id)
Candidate keys: {client_id}, {client_email}
PK: {client_id}
R_Inquiry and R_Appointment both retain
client_idas FK → R_Client.
- Lossless: Yes –
client_idis key ofR_Client. - Dependency preservation: FD11 and FD12 are in
R_Client.
1.11 Materialise the M:N relationship "Designs"
After all extractions, R7 still contains
architect_id(part of the original PK) andbuilding_id(now a FK via the floor→building chain). The architect-building relationship is M:N with no non-key attributes of its own. It is extracted as a pure association relation:
R_Designs(
architect_id PK, FK → R_Architect,
building_id PK, FK → R_Building
)
FDs: none beyond the composite PK
Candidate key / PK: {architect_id, building_id}
After this extraction, the residual relation R8 contains only
{architect_id, inquiry_id, appointment_id}— the original PK of the universal relation R. This residual has no non-key attributes of its own; it exists solely as a structural artifact of treating all entities as one flat relation. The three attributes are already individually covered as PKs or FKs in R_Architect, R_Inquiry, and R_Appointment respectively, and no domain rule requires tracking the specific combination of (architect, inquiry, appointment) as a meaningful association. R8 is therefore dissolved; all information and all constraints are captured by the 11 extracted relations.
Result after Step 1: 11 relations, all in 2NF. Lossless join holds because every extraction used a determinant that is the key of the new relation. All functional dependencies FD01-FD17 are preserved locally within exactly one relation.
Decompose from 2NF to 3NF
Each relation from 2NF is checked below.
R_Admin
Attributes : admin_id, admin_name, admin_email, admin_password
FDs : FD01 (admin_id → admin_name, admin_email, admin_password)
FD02 (admin_email → admin_id)
CKs : {admin_id}, {admin_email}
PK : admin_id
Both FD01 and FD02 have superkeys on the left-hand side. There is no non-key attribute that determines another — admin_email is a full candidate key, not a "middle step" in a transitive chain. 3NF. No decomposition needed.
R_Architect
Attributes : architect_id, architect_name
FDs : FD03 (architect_id → architect_name)
CKs / PK : {architect_id}
Only one non-key attribute exists. 3NF. No decomposition needed.
R_Building
Attributes : building_id, building_name, building_address,
building_description, admin_id
FDs : FD04 (building_id → building_name, building_address,
building_description, admin_id)
CKs / PK : {building_id}
admin_id is a non-key attribute (FK to R_Admin). The question is whether admin_id determines any other attribute inside R_Building — it does not. admin_id → admin_name etc. is a fact about R_Admin, not about R_Building. No transitive chain exists within this relation. 3NF. No decomposition needed.
R_Designs
Attributes : architect_id, building_id
FDs : none beyond the composite PK
CKs / PK : {architect_id, building_id}
No non-key attributes. 3NF. No decomposition needed.
R_Floor
Attributes : floor_id, floor_number, layout_image, building_id
FDs : FD05 (floor_id → floor_number, layout_image, building_id)
FD06 ((building_id, floor_number) → floor_id)
CKs : {floor_id}, {building_id, floor_number}
PK : floor_id
building_id is a non-key attribute (FK to R_Building). FD06 uses (building_id, floor_number) together as a composite candidate key, which is a superkey. No transitive chain. 3NF. No decomposition needed.
R_Unit
Attributes : unit_id, unit_number, room_number, floor_area, unit_status,
unit_price, unit_image, unit_floorplan, vector_image, floor_id
FDs : FD07 (unit_id → unit_number, room_number, floor_area,
unit_status, unit_price, unit_image,
unit_floorplan, vector_image, floor_id)
FD08 ((floor_id, unit_number) → unit_id)
CKs : {unit_id}, {floor_id, unit_number}
PK : unit_id
floor_id is a non-key attribute (FK to R_Floor). Within R_Unit, floor_id does not determine room_number, floor_area, or any other attribute on its own — those all require unit_id. FD08 (floor_id, unit_number) is a candidate key. No transitive chain. 3NF. No decomposition needed.
R_Agent
Attributes : agent_id, agent_name, agent_email, agent_password
FDs : FD09 (agent_id → agent_name, agent_email, agent_password)
FD10 (agent_email → agent_id)
CKs : {agent_id}, {agent_email}
PK : agent_id
Same structure as R_Admin. agent_email is a candidate key, so FD10 does not introduce a transitive dependency — both attributes are superkeys. 3NF. No decomposition needed.
R_Client
Attributes : client_id, client_name, client_email, client_phone
FDs : FD11 (client_id → client_name, client_email, client_phone)
FD12 (client_email → client_id)
CKs : {client_id}, {client_email}
PK : client_id
client_email is a candidate key, so FD12 is not a transitive dependency. 3NF. No decomposition needed.
R_Timeslot
Attributes : timeslot_id, ts_date, ts_time_start, ts_time_end,
ts_status, agent_id
FDs : FD13 (timeslot_id → ts_date, ts_time_start, ts_time_end,
ts_status, agent_id)
FD14 ((agent_id, ts_date, ts_time_start) → timeslot_id)
CKs : {timeslot_id}, {agent_id, ts_date, ts_time_start}
PK : timeslot_id
agent_id is a non-key attribute (FK to R_Agent). The composite (agent_id, ts_date, ts_time_start) is a candidate key, so FD14 has a superkey. No transitive chain. 3NF. No decomposition needed.
R_Inquiry
Attributes : inquiry_id, inquiry_message, inquiry_status,
inquiry_created_at, unit_id, client_id, agent_id
FDs : FD15 (inquiry_id → inquiry_message, inquiry_status,
inquiry_created_at, unit_id, client_id, agent_id)
CK / PK : {inquiry_id}
unit_id, client_id, and agent_id are all non-key FK attributes. None of them determines any other attribute within R_Inquiry. All non-key attributes depend directly on inquiry_id. 3NF. No decomposition needed.
R_Appointment
Attributes : appointment_id, appt_status, client_id, unit_id, timeslot_id
FDs : FD16 (appointment_id → appt_status, client_id, unit_id, timeslot_id)
FD17 (timeslot_id → appointment_id)
CKs : {appointment_id}, {timeslot_id}
PK : appointment_id
client_id and unit_id are FKs that determine nothing within R_Appointment. 3NF. No decomposition needed.
No decomposition was required at the 3NF step. All 11 relations produced in Step 1 are already in 3NF.
Decompose from 3NF to BCNF
Check every FD in each relation:
- R_Admin: FD01 (admin_id superkey), FD02 (admin_email superkey) – BCNF.
- R_Architect: FD03 (architect_id superkey) – BCNF.
- R_Building: FD04 (building_id superkey) – BCNF.
- R_Designs: no non‑trivial FDs – BCNF.
- R_Floor: FD05 (floor_id superkey), FD06 (composite CK superkey) – BCNF.
- R_Unit: FD07 (unit_id superkey), FD08 (composite CK superkey) – BCNF.
- R_Agent: FD09 (agent_id superkey), FD10 (agent_email superkey) – BCNF.
- R_Client: FD11 (client_id superkey), FD12 (client_email superkey) – BCNF.
- R_Timeslot: FD13 (timeslot_id superkey), FD14 (composite CK superkey) – BCNF.
- R_Inquiry: FD15 (inquiry_id superkey) – BCNF.
- R_Appointment: FD16 (appointment_id superkey), FD17 (timeslot_id superkey) – BCNF.
All 11 relations are in BCNF. No decomposition needed.
Final Normalized Design & Discussion
Final Relations
Admin( admin_id PK, admin_name, admin_email UNIQUE, admin_password ) Architect( architect_id PK, architect_name ) Building( building_id PK, building_name, building_address, building_description, admin_id FK → Admin ) Designs( architect_id PK, FK → Architect, building_id PK, FK → Building ) Floor( floor_id PK, floor_number, layout_image, building_id FK → Building, UNIQUE (building_id, floor_number) ) Unit( unit_id PK, unit_number, room_number, floor_area, unit_status, unit_price, unit_image, unit_floorplan, vector_image, floor_id FK → Floor, UNIQUE (floor_id, unit_number) ) Agent( agent_id PK, agent_name, agent_email UNIQUE, agent_password ) Client( client_id PK, client_name, client_email UNIQUE, client_phone ) Timeslot( timeslot_id PK, ts_date, ts_time_start, ts_time_end, ts_status, agent_id FK → Agent, UNIQUE (agent_id, ts_date, ts_time_start) ) Inquiry( inquiry_id PK, inquiry_message, inquiry_status, inquiry_created_at, unit_id FK → Unit, client_id FK → Client, agent_id FK → Agent ) Appointment( appointment_id PK, appt_status, client_id FK → Client, unit_id FK → Unit, timeslot_id FK → Timeslot, UNIQUE )
Differences compared to Phase 2
| Aspect | Phase 2 Design | P5 Normalized Design |
floor.admin_id | Present as FK column | Not present as it is reachable via floor→building→admin |
unit.admin_id | Present as FK column | Not present as it is reachable via unit→floor→building→admin |
appointment.agent_id | Present as FK column | Not present as it is reachable via appointment→timeslot→agent |
| Highest normal form | 2NF (three transitive dependencies) | BCNF / 4NF |
| Admin reachability | Direct FK on floor and unit | Via FK chain – no data loss, no redundancy |
| Agent reachability | Direct FK on appointment | Via FK chain through timeslot |
| Update anomaly risk | Changing building’s admin forced UPDATE on all floor & unit rows | Single UPDATE on Building row only
|
Changes Required for Subsequent Phases
The P5 normalized design (BCNF/4NF) will be used for all following project phases. The Phase 2 DDL must be updated by:
- Dropping column
admin_idand its FK constraint fromfloor. - Dropping column
admin_idand its FK constraint fromunit. - Dropping column
agent_idand its FK constraint fromappointment.
All application queries that previously joined directly on those redundant columns
must instead go trough the FK chain (e.g., JOIN Floor → Building → Admin).
