wiki:Normalization

Version 2 (modified by 233144, 3 weeks ago) ( diff )

--

Normalization

Our database has the following identificators: user_id, component_id, build_id, review_id and suggestion_id.
Some of the relations have composite primary keys, such as:
(build_id, component_id) - build_component
(build_id, user_id) - rating_build
(build_id, user_id) - favorite_build
The component subtypes like CPU, GPU, Motherboard etc. do not have separate identificators, because they use component_id as both primary and foreign key referencing the components table.

Functional Dependencies

user_id → username, password, email
user_id → / (admins is a subtype of user with no additional attributes)

component_id → name, brand, price, type, img_url

component_id → socket, cores, threads, base_clock, boost_clock, tdp (cpu)
component_id → vram, tdp, base_clock, boost_clock, chipset, length (gpu)
component_id → memory_type, speed, capacity, modules (memory)
component_id → storage_type, capacity, form_factor (storage)
component_id → psu_type, wattage, form_factor (power_supply)
component_id → socket, chipset, form_factor, ram_type, num_ram_slots, max_ram_capacity, pci_express_slots (motherboard)
component_id → cooler_max_height, gpu_max_length (pc_case)
component_id → cooler_type, height, max_tdp_supported (cooler)
component_id → num_slots, interface (memory_card)
component_id → form_factor, type, interface, write_speed, read_speed (optical_drive)
component_id → sample_rate, bit_depth, chipset, interface, channel (sound_carad)
component_id → length_cm, type (cable)
component_id → wifi_version, interface, num_antennas (network_adapter)
component_id → num_ports, speed, interface (network_card)

build_id → user_id, name, created_at, description, total_price, is_approved
build_id, component_id → / (build_component)

build_id, user_id → / (favorite_build)
build_id, user_id → value (rating_build)

review_id → build_id, user_id, content, created_at

suggestion_id → user_id, admin_id, link, description, admin_comment, status, component_type

Universal Relation (R)

R = { user_id, username, password, email, component_id, name, brand, price, type, img_url, cpu_socket, cpu_cores, cpu_threads, cpu_base_clock, cpu_boost_clock, cpu_tdp, gpu_vram, gpu_tdp, gpu_base_clock, gpu_boost_clock, gpu_chipset, gpu_length, memory_type, memory_speed, memory_capacity, memory_modules, storage_type, storage_capacity, storage_form_factor, psu_type, psu_wattage, psu_form_factor, motherboard_socket, motherboard_chipset, motherboard_form_factor, ram_type, num_ram_slots, max_ram_capacity, pci_express_slots, case_cooler_max_height, case_gpu_max_length, cooler_type, cooler_height, max_tdp_supported, build_id, build_name, created_at, description, total_price, is_approved, rating_value, review_id, review_content, review_created_at, suggestion_id, link, admin_comment, status, component_type }

Left (determinants):
user_id
component_id
build_id
review_id
suggestion_id

Right (descriptive attributes), attributes with similar names are prefixed (cpu_, gpu_, build_, etc.):
username, password, email,
name, brand, price, type, img_url,
cpu_*, gpu_*, memory_*, storage_*, psu_*, motherboard_*, case_*, cooler_*,
build_name, created_at, description, total_price, is_approved,
rating_value,
review_content, review_created_at,
link, admin_comment, status, component_type

Left and right:
user_id
component_id
build_id

Attribute Closures

user_id+ = { user_id, username, password, email }

component_id+ = { component_id, name, brand, price, type, img_url, cpu_socket, cpu_cores, cpu_threads, cpu_base_clock, cpu_boost_clock, cpu_tdp, gpu_vram, gpu_tdp, gpu_base_clock, gpu_boost_clock, gpu_chipset, gpu_length, memory_type, memory_speed, memory_capacity, memory_modules, storage_type, storage_capacity, storage_form_factor, psu_type, psu_wattage, psu_form_factor, motherboard_socket, motherboard_chipset, motherboard_form_factor, ram_type, num_ram_slots, max_ram_capacity, pci_express_slots, case_cooler_max_height, case_gpu_max_length, cooler_type, cooler_height, max_tdp_supported, num_slots, interface, form_factor, type, write_speed, read_speed, sample_rate, bit_depth, chipset, interface, channel, length_cm, type, wifi_version, interface, num_antennas, num_ports, speed, interface }

build_id+ = { build_id, user_id, name, created_at, description, total_price, is_approved }

(build_id, component_id)+ = { build_id, component_id } -- Build_Component

(build_id, user_id)+ = { build_id, user_id, rating_value } -- RatingBuild

review_id+ = { review_id, build_id, user_id, content, created_at }

suggestion_id+ = { suggestion_id, user_id, admin_id, link, description, admin_comment, status, component_type }

The union of
{ user_id, component_id, build_id, review_id, suggestion_id }+
covers all attributes of R, so 1NF (First Normal Form) is satisfied. Now in this (lets call it) "super-table", if we consider component_id + other IDs as a composite key for the super-table, attributes like cpu_socket depend on component_id only, not on other IDs like build_id or user_id. This is a partial dependency, which means that 2NF is not satisfied.

2NF Decomposition

First we will identify the partial dependencies:

user_id → username, email, password, is_admin

component_id → name, brand, price, type, img_url, cpu_socket, cpu_cores, cpu_threads, cpu_base_clock, cpu_boost_clock, cpu_tdp, gpu_vram, gpu_tdp, gpu_base_clock, gpu_boost_clock, gpu_chipset, gpu_length, memory_type, memory_speed, memory_capacity, memory_modules, storage_type, storage_capacity, storage_form_factor, psu_type, psu_wattage, psu_form_factor, motherboard_socket, motherboard_chipset, motherboard_form_factor, ram_type, num_ram_slots, max_ram_capacity, pci_express_slots, case_cooler_max_height, case_gpu_max_length, cooler_type, cooler_height, max_tdp_supported

build_id → user_id, name, created_at, description, total_price, is_approved

(build_id, user_id) → rating_value

review_id → review_build_id (FK build_id), review_user_id (FK user_id), review_content

suggestion_id → suggestion_user_id, suggestion_admin_id, suggestion_link, suggestion_admin_comment, suggestion_description, suggestion_status, suggestion_component_type

(build_id, component_id) → /

(build_id, user_id) → /

Then we will extract the proper relations:

users { user_id, username, email, password } - BCNF
admins { user_id } → FK to users - BCNF

components { component_id, name, brand, price, type, img_url } - BCNF

cpu { component_id, socket, cores, threads, base_clock, boost_clock, tdp } → FK to components - BCNF
gpu { component_id, vram, tdp, base_clock, boost_clock, chipset, length } → FK to components - BCNF
memory { component_id, type, speed, capacity, modules } → FK to components - BCNF
storage { component_id, type, capacity, form_factor } → FK to components - BCNF
power_supply { component_id, type, wattage, form_factor } → FK to components - BCNF
motherboard{ component_id, socket, chipset, form_factor, ram_type, num_ram_slots, max_ram_capacity, pci_express_slots } → FK to components - BCNF
pc_case { component_id, cooler_max_height, gpu_max_length } → FK to components - BCNF
case_storage_form_factors { case_id, form_factor, num_slots } → FK to pc_case - BCNF
case_ps_form_factors { case_id, form_factor } → FK to pc_case - BCNF
case_mobo_form_factors { case_id, form_factor } → FK to pc_case - BCNF
cooler { component_id, type, height, max_tdp_supported } → FK to components - BCNF
cooler_cpu_sockets { cooler_id, socket } → FK to cooler - BCNF
memory_card { component_id, num_slots, interface } → FK to components - BCNF
optical_drive { component_id, form_factor, type, interface, write_speed, read_speed } → FK to components - BCNF
sound_card { component_id, sample_rate, bit_depth, chipset, interface, channel } → FK to components - BCNF
cables { component_id, length_cm, type } → FK to components - BCNF
network_adapter { component_id, wifi_version, interface, num_antennas } → FK to components - BCNF
network_card { component_id, num_ports, speed, interface } → FK to components - BCNF

build { build_id, build_user_id, build_name, created_at, description, total_price, is_approved } → FK to users - BCNF
build_component { build_id, component_id } → FK to build and components - BCNF

favorite_build { build_id, user_id } → FK to build and users - BCNF
rating_build { build_id, user_id, value } → FK to build and users - BCNF
review { review_id, build_id, user_id, content, created_at } → FK to build and users - BCNF

suggestions { id, user_id, admin_id, link, admin_comment, description, status, component_type } → FK to users/admins - BCNF

Check 3NF and BCNF

We can see that all foreign keys point to primary keys, so no transitive dependencies can be detected here. All component subtypes are in separate tables which removes the partial dependencies from before. We also separated the user roles, and in the relations all non-key attributes fully depend on their primary key.

3NF/BCNF is satisfied.

Note: See TracWiki for help on using the wiki.