Normalization and design improvements
In my initial DDL design, several entities shared the same attribute names, so I renamed them to make the design clearer and eliminate any ambiguous information. The changes I made follow the format (relation_name_specification):
content → story_content, chapter_content, comment_content, notification_content, list_content
created_at → user_created_at, story_created_at, chapter_created_at, list_created_at, notification_created_at, like_created_at, comment_created_at, collab_created_at, suggestion_created_at
updated_at → user_updated_at, story_updated_at, chapter_updated_at, list_updated_at, comment_updated_at
name → user_name, genre_name, list_name
After these changes, I obtained the following de-normalized relation:
Global Relation
R = {user_id, username, email, user_name, surname, password, user_created_at, user_updated_at, story_id, mature_content, short_description, image, story_content, story_created_at, story_updated_at, status, chapter_id, chapter_number, chapter_name, title, chapter_content, word_count, rating, published_at, view_count, chapter_created_at, chapter_updated_at, genre_id, genre_name, list_id, list_name, list_content, is_public, list_created_at, list_updated_at, added_at, notification_id, notification_content, is_read, type, link, notification_created_at, content_type, suggestion_id, original_text, suggested_text, accepted, applied_at, suggestion_created_at, suggestion_type, comment_id, comment_content, comment_created_at, comment_updated_at, roles, permission_level, collab_created_at, like_created_at}
Functional Dependencies:
- FD1: user_id → username, email, user_name, surname, password, user_created_at, user_updated_at
- FD2: username → user_id
- FD3: email → user_id
- FD4: story_id → mature_content, short_description, image, story_content, user_id, story_created_at, story_updated_at
- FD5: chapter_id → chapter_number, chapter_name, title, chapter_content, word_count, rating, published_at, view_count, story_id, chapter_created_at, chapter_updated_at
- FD6: {story_id, chapter_number} → chapter_id
- FD7: genre_id → genre_name
- FD8: genre_name → genre_id
- FD9: list_id → list_name, list_content, is_public, user_id, list_created_at, list_updated_at
- FD10: notification_id → notification_content, is_read, recipient_user_id, type, link, notification_created_at
- FD11: suggestion_id → original_text, suggested_text, accepted, suggestion_created_at, applied_at, story_id
- FD12:comment_id → comment_content, user_id, story_id, comment_created_at, comment_updated_at
- FD13: {user_id, story_id} → like_created_at
- FD14: {user_id, story_id} → collab_created_at
- FD15:{list_id, story_id} → added_at
Closures
user_id+ = {user_id, username, email, user_name, surname, password,user_created_at, user_updated_at} → Does NOT contain all attributes of R
username+ = {username, user_id, email, user_name, surname, password, user_created_at, user_updated_at} → Does NOT contain all attributes of R
email+ = {email, user_id, username, user_name, surname, password, user_created_at, user_updated_at} → Does NOT contain all attributes of R
story_id+ = {story_id, mature_content, short_description, image, story_content, user_id, story_created_at, story_updated_at, username, email, user_name, surname, password, user_created_at, user_updated_at} → Does NOT contain all attributes of R
chapter_id+ = {chapter_id, chapter_number, chapter_name, title, chapter_content, word_count, rating, published_at, view_count, story_id, chapter_created_at, chapter_updated_at, mature_content, short_description, image, story_content, user_id, story_created_at, story_updated_at, username, email, user_name, surname, password, user_created_at, user_updated_at} → Does NOT contain all attributes of R
{story_id, chapter_number}+ = {story_id, chapter_number, chapter_id, chapter_name, title, chapter_content, word_count, rating, published_at, view_count, chapter_created_at, chapter_updated_at, mature_content, short_description, image, story_content, user_id, story_created_at, story_updated_at, username, email, user_name, surname, password, user_created_at, user_updated_at} → Does NOT contain all attributes of R
genre_id+ = {genre_id, genre_name} → Does NOT contain all attributes of R
genre_name+ = {genre_name, genre_id} → Does NOT contain all attributes of R
list_id+ = {list_id, list_name, list_content, is_public, user_id, list_created_at, list_updated_at, username, email, user_name, surname, password, user_created_at, user_updated_at} → Does NOT contain all attributes of R
notification_id+ = {notification_id, notification_content, is_read, recipient_user_id, type, link, notification_created_at} → Does NOT contain all attributes of R
suggestion_id+ = {suggestion_id, original_text, suggested_text, accepted, suggestion_created_at, applied_at, story_id, mature_content, short_description, image, story_content, user_id, story_created_at, story_updated_at, username, email, user_name, surname, password, user_created_at, user_updated_at} → Does NOT contain all attributes of R
comment_id+ = {comment_id, comment_content, user_id, story_id, comment_created_at, comment_updated_at, mature_content, short_description, image, story_content, story_created_at, story_updated_at, username, email, user_name, surname, password, user_created_at, user_updated_at} → Does NOT contain all attributes of R
{user_id, story_id}+ = {user_id, story_id, like_created_at, collab_created_at, mature_content, short_description, image, story_content, story_created_at, story_updated_at, username, email, user_name, surname, password, user_created_at, user_updated_at} → Does NOT contain all attributes of R
{list_id, story_id}+ = {list_id, story_id, added_at, list_name, list_content, is_public, list_created_at, list_updated_at, mature_content, short_description, image, story_content,user_id, story_created_at, story_updated_at, username, email, user_name, surname, password, user_created_at, user_updated_at} → Does NOT contain all attributes of R
1NF Check
Every column in the database holds just one value, so there are no lists or multi-value attributes in the cells. There are no repeating columns, each table has a primary key, and all attributes are atomic, so my relations satisfy 1NF.
2NF Check
Every table that has a simple (non-composite) primary key is already in 2NF, so we only need to check the tables that have composite keys.
STATUS {story_id, status} → There are no other attributes, so it satisfies 2NF.
CONTENT_TYPE {notification_id, content_type} → There are no other attributes, so it satisfies 2NF.
SUGGESTION_TYPE {suggestion_id, suggestion_type} → There are no other attributes, so it satisfies 2NF.
LIKES {user_id, story_id} → like_created_at depends on {user_id, story_id} together, so it satisfies 2NF.
COLLABORATION {user_id, story_id} → collab_created_at depends on {user_id, story_id} together, so it satisfies 2NF.
ROLES {user_id, story_id, roles} → There are no other attributes, so it satisfies 2NF.
PERMISSION_LEVEL {user_id, story_id, permission_level} → There are no other attributes, so it satisfies 2NF.
HAS_GENRE {story_id, genre_id} → There are no other attributes, so it satisfies 2NF.
READING_LIST_ITEMS {list_id, story_id} → added_at depends on {list_id, story_id} together, so it satisfies 2NF.
NOTIFY {user_id, story_id, notification_id} → There are no other attributes, so it satisfies 2NF.
NEED_APPROVAL {suggestion_id, story_id, chapter_id} → There are no other attributes, so it satisfies 2NF.
Since there are no partial dependencies in any of the tables, this schema satisfies 2NF.
3NF Check
We checked all FDs where the left side is a primary key, so we can skip those. We only need to check: FD6: {story_id, chapter_number} → chapter_id → here we can see that there is a UNIQUE constraint on {story_id, chapter_number}, which tells us that it is also a superkey. After this we can safely say that the schema satisfies 3NF.
BCNF Check
- USERS(user_id, username, email, user_name, surname, password,
user_created_at, user_updated_at) → satisfies BCNF
- ADMINS(user_id) → satisfies BCNF
- REGULAR_USER(user_id) → satisfies BCNF
- WRITER(user_id) → satisfies BCNF
- STORY(story_id, mature_content, short_description, image, story_content, user_id, story_created_at, story_updated_at) → satisfies BCNF
- STATUS(story_id, status) → satisfies BCNF
- CHAPTER(chapter_id, chapter_number, chapter_name, title, chapter_content, word_count, rating, published_at, view_count, story_id, chapter_created_at, chapter_updated_at) → satisfies BCNF
- GENRE(genre_id, genre_name) → satisfies BCNF
- READING_LIST(list_id, list_name, list_content, is_public, user_id, list_created_at, list_updated_at) → satisfies BCNF
- NOTIFICATION(notification_id, notification_content, is_read, recipient_user_id, type, link, notification_created_at) → satisfies BCNF
- CONTENT_TYPE(notification_id, content_type) → satisfies BCNF
- AI_SUGGESTION(suggestion_id, original_text, suggested_text, accepted, suggestion_created_at, applied_at, story_id) → satisfies BCNF
- SUGGESTION_TYPE(suggestion_id, suggestion_type) → satisfies BCNF
- LIKES(user_id, story_id, like_created_at) → satisfies BCNF
- COMMENT(comment_id, comment_content, user_id, story_id, comment_created_at, comment_updated_at) → satisfies BCNF
- COLLABORATION(user_id, story_id, collab_created_at) → satisfies BCNF
- ROLES(user_id, story_id, roles) → satisfies BCNF
- PERMISSION_LEVEL(user_id, story_id, permission_level) → satisfies BCNF
- HAS_GENRE(story_id, genre_id) → satisfies BCNF
- READING_LIST_ITEMS(list_id, story_id, added_at) → satisfies BCNF
- NOTIFY(user_id, story_id, notification_id) → satisfies BCNF
- NEED_APPROVAL(suggestion_id, story_id, chapter_id) → satisfies BCNF
Conclusion
After normalization we get same relations from phase 2. So this mean that the database was well modeled.
