Changes between Version 3 and Version 4 of Normalization


Ignore:
Timestamp:
01/28/26 01:07:32 (12 days ago)
Author:
233144
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • Normalization

    v3 v4  
    11= Normalization
    22
    3 Our database has the following identificators: user_id, component_id, build_id, review_id and suggestion_id.\\
    4 Some of the relations have composite primary keys, such as:\\
    5 (build_id, component_id) - build_component\\
    6 (build_id, user_id) - rating_build\\
    7 (build_id, user_id) - favorite_build\\
    8 The component subtypes like CPU, GPU, Motherboard etc. do not have separate identificators, because they use component_id as both primary and foreign key referencing the **components** table.
    9 
    10 == Functional Dependencies
    11 
    12 user_id → username, password, email\\
    13 user_id → / (admins is a subtype of user with no additional attributes)\\
    14 
    15 component_id → name, brand, price, type, img_url\\
    16 
    17 component_id → socket, cores, threads, base_clock, boost_clock, tdp                 (cpu)\\
    18 component_id → vram, tdp, base_clock, boost_clock, chipset, length                (gpu)\\
    19 component_id → memory_type, speed, capacity, modules                              (memory)\\
    20 component_id → storage_type, capacity, form_factor                                (storage)\\
    21 component_id → psu_type, wattage, form_factor                                     (power_supply)\\
    22 component_id → socket, chipset, form_factor, ram_type, num_ram_slots, max_ram_capacity, pci_express_slots                                (motherboard)\\
    23 component_id → cooler_max_height, gpu_max_length                                   (pc_case)\\
    24 component_id → cooler_type, height, max_tdp_supported                               (cooler)\\
    25 component_id → num_slots, interface                                                (memory_card)\\
    26 component_id → form_factor, type, interface, write_speed, read_speed               (optical_drive)\\
    27 component_id → sample_rate, bit_depth, chipset, interface, channel                 (sound_carad)\\
    28 component_id → length_cm, type                                                     (cable)\\
    29 component_id → wifi_version, interface, num_antennas                               (network_adapter)\\
    30 component_id → num_ports, speed, interface                                        (network_card)\\
    31 
    32 build_id → user_id, name, created_at, description, total_price, is_approved\\
    33 build_id, component_id → / (build_component)\\
    34 
    35 build_id, user_id → / (favorite_build)\\
    36 build_id, user_id → value (rating_build)\\
    37 
    38 review_id → build_id, user_id, content, created_at\\
    39 
    40 suggestion_id → user_id, admin_id, link, description, admin_comment, status, component_type\\
    41 
    42 == Universal Relation (R)
    43 
    44 R = {
    45 user_id, username, password, email,
    46 component_id, name, brand, price, type, img_url,
    47 cpu_socket, cpu_cores, cpu_threads, cpu_base_clock, cpu_boost_clock, cpu_tdp,
    48 gpu_vram, gpu_tdp, gpu_base_clock, gpu_boost_clock, gpu_chipset, gpu_length,
    49 memory_type, memory_speed, memory_capacity, memory_modules,
    50 storage_type, storage_capacity, storage_form_factor,
    51 psu_type, psu_wattage, psu_form_factor,
    52 motherboard_socket, motherboard_chipset, motherboard_form_factor,
    53 ram_type, num_ram_slots, max_ram_capacity, pci_express_slots,
    54 case_cooler_max_height, case_gpu_max_length,
    55 cooler_type, cooler_height, max_tdp_supported,
    56 build_id, build_name, created_at, description, total_price, is_approved,
    57 rating_value,
    58 review_id, review_content, review_created_at,
    59 suggestion_id, link, admin_comment, status, component_type
    60 }
    61 
    62 Left (determinants):\\
    63 user_id\\
    64 component_id\\
    65 build_id\\
    66 review_id\\
    67 suggestion_id\\
    68 
    69 Right (descriptive attributes), attributes with similar names are prefixed (cpu_, gpu_, build_, etc.):\\
    70 username, password, email,\\
    71 name, brand, price, type, img_url,\\
    72 cpu_*, gpu_*, memory_*, storage_*, psu_*, motherboard_*, case_*, cooler_*,\\
    73 build_name, created_at, description, total_price, is_approved,\\
    74 rating_value,\\
    75 review_content, review_created_at,\\
    76 link, admin_comment, status, component_type\\
    77 
    78 Left and right:\\
    79 user_id\\
    80 component_id\\
    81 build_id\\
    82 
    83 == Attribute Closures
    84 
    85 user_id+ = { user_id, username, password, email }
    86 
    87 component_id+ = { component_id, name, brand, price, type, img_url,
    88 cpu_socket, cpu_cores, cpu_threads, cpu_base_clock, cpu_boost_clock, cpu_tdp,
    89 gpu_vram, gpu_tdp, gpu_base_clock, gpu_boost_clock, gpu_chipset, gpu_length,
    90 memory_type, memory_speed, memory_capacity, memory_modules,
    91 storage_type, storage_capacity, storage_form_factor,
    92 psu_type, psu_wattage, psu_form_factor,
    93 motherboard_socket, motherboard_chipset, motherboard_form_factor,
    94 ram_type, num_ram_slots, max_ram_capacity, pci_express_slots,
    95 case_cooler_max_height, case_gpu_max_length,
    96 cooler_type, cooler_height, max_tdp_supported,
    97 num_slots, interface, form_factor, type, write_speed, read_speed,
    98 sample_rate, bit_depth, chipset, interface, channel,
    99 length_cm, type,
    100 wifi_version, interface, num_antennas,
    101 num_ports, speed, interface }
    102 
    103 build_id+ = { build_id, user_id, name, created_at, description, total_price, is_approved }
    104 
    105 (build_id, component_id)+ = { build_id, component_id } (build_component)
    106 
    107 (build_id, user_id)+ = { build_id, user_id, rating_value }
    108 
    109 review_id+ = { review_id, build_id, user_id, content, created_at }
    110 
    111 suggestion_id+ = { suggestion_id, user_id, admin_id, link, description, admin_comment, status, component_type }
    112 
    113 
    114 The union of\\
    115 { user_id, component_id, build_id, review_id, suggestion_id }+\\
    116 covers all attributes of R, so 1NF (First Normal Form) is satisfied. Now in this (lets call it) "super-table", if we consider **component_id** + other IDs as a composite key for the super-table, attributes like **cpu_socket** depend on **component_id** only, not on other IDs like **build_id** or **user_id**. This is a partial dependency, which means that 2NF is not satisfied.
    117 
    118 == 2NF Decomposition
    119 
    120 **First we will identify the partial dependencies:**\\
    121 
    122 user_id → username, email, password, is_admin\\
    123 
    124 component_id → name, brand, price, type, img_url, cpu_socket, cpu_cores, cpu_threads, cpu_base_clock, cpu_boost_clock, cpu_tdp,
    125 gpu_vram, gpu_tdp, gpu_base_clock, gpu_boost_clock, gpu_chipset, gpu_length,
    126 memory_type, memory_speed, memory_capacity, memory_modules,
    127 storage_type, storage_capacity, storage_form_factor,
    128 psu_type, psu_wattage, psu_form_factor,
    129 motherboard_socket, motherboard_chipset, motherboard_form_factor,
    130 ram_type, num_ram_slots, max_ram_capacity, pci_express_slots,
    131 case_cooler_max_height, case_gpu_max_length,
    132 cooler_type, cooler_height, max_tdp_supported
    133 
    134 build_id → user_id, name, created_at, description, total_price, is_approved
     3== Initial de-normalized relation (R) and functional dependencies
     4
     5R(
     6  user_id, username, password, email,
     7  component_id, component_name, brand, price, component_type, img_url,
     8  cpu_socket, cpu_cores, cpu_threads, cpu_base_clock, cpu_boost_clock, cpu_tdp,
     9  gpu_vram, gpu_tdp, gpu_base_clock, gpu_boost_clock, gpu_chipset, gpu_length,
     10  memory_type, memory_speed, memory_capacity, memory_modules,
     11  storage_type, storage_capacity, storage_form_factor,
     12  psu_type, psu_wattage, psu_form_factor,
     13  motherboard_socket, motherboard_chipset, motherboard_form_factor,
     14  ram_type, num_ram_slots, max_ram_capacity, pci_express_slots,
     15  case_cooler_max_height, case_gpu_max_length,
     16  case_storage_form_factor, case_storage_slots,
     17  case_ps_form_factor,
     18  case_mobo_form_factor,
     19  cooler_type, cooler_height, cooler_max_tdp,
     20  cooler_socket,
     21  memory_card_slots, memory_card_interface,
     22  optical_form_factor, optical_type, optical_interface,
     23  optical_write_speed, optical_read_speed,
     24  sound_sample_rate, sound_bit_depth, sound_chipset, sound_interface, sound_channel,
     25  cable_length, cable_type,
     26  net_adapter_wifi, net_adapter_interface, net_adapter_antennas,
     27  net_card_ports, net_card_speed, net_card_interface,
     28  build_id, build_name, build_created_at, build_description,
     29  build_total_price, build_is_approved,
     30  rating_value, review_content, review_created_at,
     31  suggestion_id, suggestion_link, suggestion_description,
     32  suggestion_status, suggestion_admin_comment, suggestion_component_type
     33)
     34
     35Functional dependencies:
     36
     37user_id → username, password, email
     38
     39component_id → name, brand, price, component_type, img_url
     40
     41component_id → cpu_socket, cpu_cores, cpu_threads, cpu_base_clock, cpu_boost_clock, cpu_tdp
     42
     43component_id → gpu_vram, gpu_tdp, gpu_base_clock, gpu_boost_clock, gpu_chipset, gpu_length
     44
     45component_id → memory_type, memory_speed, memory_capacity, memory_modules
     46
     47component_id → storage_type, storage_capacity, storage_form_factor
     48
     49component_id → psu_type, psu_wattage, psu_form_factor
     50
     51component_id → motherboard_socket, motherboard_chipset, motherboard_form_factor, ram_type, num_ram_slots, max_ram_capacity, pci_express_slots
     52
     53component_id → case_cooler_max_height, case_gpu_max_length
     54
     55component_id → cooler_type, cooler_height, cooler_max_tdp
     56
     57component_id → memory_card_slots, memory_card_interface
     58
     59component_id → optical_form_factor, optical_type, optical_interface, optical_write_speed, optical_read_speed
     60
     61component_id → sound_sample_rate, sound_bit_depth, sound_chipset, sound_interface, sound_channel
     62
     63component_id → cable_length, cable_type
     64
     65component_id → net_adapter_wifi, net_adapter_interface, net_adapter_antennas
     66
     67component_id → net_card_ports, net_card_speed, net_card_interface
     68 
     69(component_id, form_factor) → /, for case motherboard and power supply form factors
     70
     71(component_id, form_factor) → num_slots, for case storage form factors
     72
     73(component_id, socket) → /, for cooler cpu sockets
     74
     75build_id → user_id, build_name, build_created_at, build_description, build_total_price, build_is_approved
    13576
    13677(build_id, user_id) → rating_value
    13778
    138 review_id → review_build_id (FK build_id), review_user_id (FK user_id), review_content
    139 
    140 suggestion_id → suggestion_user_id, suggestion_admin_id, suggestion_link, suggestion_admin_comment, suggestion_description, suggestion_status, suggestion_component_type
    141 
    142 (build_id, component_id) → /
    143 
    144 (build_id, user_id) → /
    145 
    146 **Then we will extract the proper relations:**\\
    147 
    148 users { user_id, username, email, password } - BCNF\\
    149 admins { user_id } → FK to users - BCNF\\
    150 
    151 components { component_id, name, brand, price, type, img_url } - BCNF
    152 
    153 cpu { component_id, socket, cores, threads, base_clock, boost_clock, tdp } → FK to components - BCNF\\
    154 gpu { component_id, vram, tdp, base_clock, boost_clock, chipset, length } → FK to components - BCNF\\
    155 memory { component_id, type, speed, capacity, modules } → FK to components - BCNF\\
    156 storage { component_id, type, capacity, form_factor } → FK to components - BCNF\\
    157 power_supply { component_id, type, wattage, form_factor } → FK to components - BCNF\\
    158 motherboard{ component_id, socket, chipset, form_factor, ram_type, num_ram_slots, max_ram_capacity, pci_express_slots } → FK to components - BCNF\\
    159 pc_case { component_id, cooler_max_height, gpu_max_length } → FK to components - BCNF\\
    160 case_storage_form_factors { case_id, form_factor, num_slots } → FK to pc_case - BCNF\\
    161 case_ps_form_factors { case_id, form_factor } → FK to pc_case - BCNF\\
    162 case_mobo_form_factors { case_id, form_factor } → FK to pc_case - BCNF\\
    163 cooler { component_id, type, height, max_tdp_supported } → FK to components - BCNF\\
    164 cooler_cpu_sockets { cooler_id, socket } → FK to cooler - BCNF\\
    165 memory_card { component_id, num_slots, interface } → FK to components - BCNF\\
    166 optical_drive { component_id, form_factor, type, interface, write_speed, read_speed } → FK to components - BCNF\\
    167 sound_card { component_id, sample_rate, bit_depth, chipset, interface, channel } → FK to components - BCNF\\
    168 cables { component_id, length_cm, type } → FK to components - BCNF\\
    169 network_adapter { component_id, wifi_version, interface, num_antennas } → FK to components - BCNF\\
    170 network_card { component_id, num_ports, speed, interface } → FK to components - BCNF\\
    171 
    172 build { build_id, build_user_id, build_name, created_at, description, total_price, is_approved } → FK to users - BCNF\\
    173 build_component { build_id, component_id } → FK to build and components - BCNF\\
    174 
    175 favorite_build { build_id, user_id } → FK to build and users - BCNF\\
    176 rating_build { build_id, user_id, value } → FK to build and users - BCNF\\
    177 review { review_id, build_id, user_id, content, created_at } → FK to build and users - BCNF\\
    178 
    179 suggestions { id, user_id, admin_id, link, admin_comment, description, status, component_type } → FK to users/admins - BCNF\\
    180 
    181 == Check 3NF and BCNF
    182 
    183 We can see that all foreign keys point to primary keys, so no transitive dependencies can be detected here. All component subtypes are in separate tables which removes the partial dependencies from before. We also separated the user roles, and in the relations all non-key attributes fully depend on their primary key.
    184 
    185 3NF/BCNF is satisfied.
     79review_id → build_id, user_id, review_content, review_created_at
     80
     81suggestion_id → user_id, admin_id, link, description, status, admin_comment, component_type
     82
     83The de-normalized relation violates 1NF because it contains multi-valued attributes.
     84
     85== Candidate keys of the de-normalized relation
     86
     87Because the universal relation combines users, builds, components, reviews, and suggestions, a minimal key must uniquely identify all of them.
     88
     89The artifical candidate key is {user_id, build_id, component_id, review_id, suggestion_id}
     90
     91Universal relation primary key = {user_id, build_id, component_id, review_id, suggestion_id}
     92
     93== 1NF Decomposition
     94
     95Here we have to separate the multi-valued attributes into individual relations. This will ensure that all values are atomic.
     96
     97case_storage_form_factors(case_id, form_factor, num_slots)
     98
     99case_ps_form_factors(case_id, form_factor)
     100
     101case_mobo_form_factors(case_id, form_factor)
     102
     103cooler_cpu_sockets(cooler_id, socket)
     104
     105build_component(build_id, component_id)
     106
     107
     108We separate obvious entities in individual relations:
     109
     110users(user_id, username, password, email)
     111
     112admins(user_id)
     113
     114components(component_id, name, brand, price, component_type, img_url)
     115
     116cpu(component_id, cpu_socket, cpu_cores, cpu_threads, cpu_base_clock, cpu_boost_clock, cpu_tdp)
     117
     118gpu(component_id, gpu_vram, gpu_tdp, gpu_base_clock, gpu_boost_clock, gpu_chipset, gpu_length)
     119
     120memory(component_id, memory_type, memory_speed, memory_capacity, memory_modules)
     121
     122storage(component_id, storage_type, storage_capacity, storage_form_factor)
     123
     124power_supply(component_id, psu_type, psu_wattage, psu_form_factor)
     125
     126motherboard(component_id, motherboard_socket, motherboard_chipset, motherboard_form_factor, ram_type, num_ram_slots, max_ram_capacity, pci_express_slots)
     127
     128pc_case(component_id, case_cooler_max_height, case_gpu_max_length)
     129
     130cooler(component_id, cooler_type, cooler_height, cooler_max_tdp)
     131
     132memory_card(component_id, memory_card_slots, memory_card_interface)
     133
     134optical_drive(component_id, optical_form_factor, optical_type, optical_interface, optical_write_speed, optical_read_speed)
     135
     136sound_card(component_id, sound_sample_rate, sound_bit_depth, sound_chipset, sound_interface, sound_channel)
     137
     138cables(component_id, cable_length, cable_type)
     139
     140network_adapter(component_id, net_adapter_wifi, net_adapter_interface, net_adapter_antennas)
     141
     142network_card(component_id, net_card_ports, net_card_speed, net_card_interface)
     143
     144build(build_id, user_id, build_name, build_created_at, build_description, build_total_price, build_is_approved)
     145
     146review(review_id, build_id, user_id, review_content, review_created_at)
     147
     148rating_build(build_id, user_id, rating_value)
     149
     150suggestions(suggestion_id, user_id, admin_id, link, description, status, admin_comment, component_type)
     151
     152all of which satisfy 1NF.
     153
     154
     155Candidate keys & primary keys:
     156
     157users → user_id
     158
     159admins → user_id
     160
     161components → component_id
     162
     163component subclasses (like cpu, gpu etc.) → component_id
     164
     165build → build_id
     166
     167review → review_id
     168
     169rating_build → (build_id, user_id)
     170
     171favorite_build → (build_id, user_id)
     172
     173suggestions → suggestion_id
     174
     175== Second Normal Form (2NF)
     176
     1772NF focuses on partial dependencies, which can only occur when a relation has a composite primary key and a non-key attribute depends on only part of that key. Relations with a single-attribute primary key were considered to be automatically in 2NF. Most of the relations obtained after 1NF like users, components, build, review and suggestions, have single-attribute primary keys so partial dependencies cannot exist in these relations. The remaining relations that have composite primary keys build_component, favorite_build, rating_build were analyzed separately.
     178
     179build_component(build_id, component_id) - doesn't have non-key attributes so it is already 2NF
     180
     181favorite_build(build_id, user_id) - doesn't have non-key attributes so it is already 2NF
     182
     183rating_build(build_id, user_id, rating_value) - rating_value depends on full key so it is 2NF
     184
     185In the case of associative tables like build_component and favorite_build no non-key attributes are present so partial dependencies are not possible. For rating_build the non-key attribute depend on the full composite key (a rating is uniquely determined by both the user and the build). Since no non-key attribute was found to be partially dependent all relations satisfy 2NF.
     186
     187== Third normal Form (3NF)
     188
     189users(user_id, username, password, email) - all attributes depend directly on user_id so 3NF is satisfied
     190
     191components(component_id, name, brand, price, component_type, img_url) - all attributes depend directly on component_id so this is 3NF
     192
     193build(build_id, user_id, build_name, build_created_at, build_description, build_total_price, build_is_approved) - all attributes depend on build_id which means that 3NF is satisfied
     194
     195component subclasses (like cpu, gpu, memory etc.) - all attributes depend directly on component_id so 3NF is satisfied
     196
     197review(review_id, build_id, user_id, review_content, review_created_at) - attributes depend on review_id, 3NF is satisfied
     198
     199rating_build(build_id, user_id, rating_value) → rating_value depends on (build_id, user_id), 3NF is satisfied
     200
     201favorite_build(build_id, user_id) - no non-key attributes so this automatically satisfies 3NF
     202
     203suggestions(suggestion_id, user_id, admin_id, link, description, status, admin_comment, component_type) - attributes depend on suggestion_id, 3NF is satisfied
     204
     205case_storage_form_factors(case_id, form_factor, num_slots) - no transitive dependencies, 3NF is satisfied
     206
     207case_ps_form_factors(case_id, form_factor) - no transitive dependencies, 3NF is satisfied
     208
     209case_mobo_form_factors(case_id, form_factor) - no transitive dependencies, 3NF is satisfied
     210
     211cooler_cpu_sockets(cooler_id, socket) - no transitive dependencies, 3NF is satisfied
     212
     213build_component(build_id, component_id) - no transitive dependencies because there are no non-key attributes, therefore 3NF is satisfied.
     214
     215== BCNF decomposition
     216
     217BCNF checks if every determinant is a super-key so we need to analyze all 3NF relations.
     218
     219users(user_id, username, password, email):
     220
     221 FD: user_id → username, password, email
     222
     223 Determinant user_id is the primary key (superkey), BCNF
     224
     225
     226admins(user_id):
     227
     228 Determinant user_id is the primary key (superkey), BCNF
     229
     230
     231components(component_id, name, brand, price, component_type, img_url):
     232
     233 FD: component_id → name, brand, price, component_type, img_url
     234
     235 Determinant component_id is the primary key (superkey), BCNF
     236
     237
     238Component specialization tables (cpu, gpu, memory, etc.):
     239
     240 Each table: primary key = component_id
     241
     242 All attributes depend only on component_id
     243
     244 Determinant is primary key(superkey), BCNF
     245
     246
     247build(build_id, user_id, build_name, build_created_at, build_description, build_total_price, build_is_approved):
     248
     249 FD: build_id → user_id, build_name, build_created_at, build_description, build_total_price, build_is_approved
     250
     251 Determinant build_id is primary key(superkey), BCNF
     252
     253
     254build_component(build_id, component_id):
     255
     256 FD: composite key (build_id, component_id) → /
     257
     258 No non-key attributes so its by default BCNF
     259
     260
     261favorite_build(build_id, user_id):
     262
     263 FD: Composite key (build_id, user_id) → /
     264
     265 No non-key attributes so its by default BCNF
     266
     267
     268rating_build(build_id, user_id, rating_value):
     269
     270 FD: (build_id, user_id) → rating_value
     271
     272 Determinant is full primary key (superkey), BCNF
     273
     274
     275review(review_id, build_id, user_id, review_content, review_created_at):
     276
     277 FD: review_id → build_id, user_id, review_content, review_created_at
     278
     279 Determinant = primary key (superkey), BCNF
     280
     281
     282suggestions(suggestion_id, user_id, admin_id, link, description, status, admin_comment, component_type):
     283
     284 FD: suggestion_id → user_id, admin_id, link, description, status, admin_comment, component_type
     285
     286 Determinant is primary key (superkey)
     287
     288
     289case_storage_form_factors(case_id, form_factor, num_slots):
     290
     291 FD: (case_id, form_factor) → num_slots
     292
     293 Full composite key is determinant (superkey), BCNF
     294
     295
     296case_ps_form_factors(case_id, form_factor):
     297
     298 FD: (case_id, form_factor) → /
     299
     300 Full composite key is determinant (superkey), BCNF
     301
     302
     303case_mobo_form_factors(case_id, form_factor):
     304 
     305 FD: (case_id, form_factor) → /
     306
     307 Full composite key is determinant (superkey), BCNF
     308
     309 
     310cooler_cpu_sockets(cooler_id, socket):
     311
     312 FD: (cooler_id, socket) → /
     313
     314 Full composite key is determinant (superkey), BCNF
     315
     316
     317We can now see that all determinants are superkeys, so BCNF is satisfied.
     318
     319