= DatabaseCreation =

== Introduction ==

The database schema is implemented in PostgreSQL using the DDL from `DDL_manual.sql`. This represents the final, refined version of the relational model after addressing tool limitations and incorporating feedback from professor consultations.

== Key Design Decisions ==

=== Primary Key Strategy ===

Two distinct primary key strategies were employed:

1. Surrogate keys for excessively long composite natural keys - Entities that would inherit composite keys from multiple parent entities (potentially 5‑20 attributes) receive a single id column with `GENERATED ALWAYS AS IDENTITY`. The natural key columns are retained with `UNIQUE NOT NULL` constraints to preserve semantic integrity without propagating keys further down the chain.

2. Surrogate keys as genuine identifiers – For many‑to‑many relationships where the foreign key pair alone does not uniquely identify a record (e.g., `Exam_Attempt` where a student may attempt the same exam multiple times), a surrogate `id` is added because no natural key exists.

=== `GENERATED ALWAYS AS IDENTITY` vs `SERIAL` ===

All surrogate keys use `INTEGER GENERATED ALWAYS AS IDENTITY` instead of `SERIAL` for the following reasons:

|| '''Feature''' || '''SERIAL''' || '''GENERATED ALWAYS AS IDENTITY''' ||
|| SQL Standard || Non‑standard (PostgreSQL specific) || SQL Standard compliant ||
|| Under the Hood || Creates a separate SEQUENCE object implicitly || Links a SEQUENCE directly to the column ||
|| Accidental Overwrites || Easy to accidentally bypass or break || Strictly protected (unless forced) ||
|| Permissions || Requires separate permissions on the sequence || Permissions are tied directly to the table ||

`SERIAL` is syntactic sugar that creates a sequence and sets the column’s `DEFAULT` to fetch from it. Because it is just a default, PostgreSQL does not prevent manual value insertion. `GENERATED ALWAYS AS IDENTITY` enforces strict safety rules – the database is solely responsible for generating IDs. If overriding is absolutely necessary (e.g., during data migration), `OVERRIDING SYSTEM VALUE` can be explicitly specified.

This choice ensures portability across enterprise database engines (Oracle, DB2, SQL Server) that support the SQL:2003 standard.

=== Constraint Naming Convention ===

Constraints follow a consistent naming pattern:

- Primary Keys: `pk_{table_name}`
- Foreign Keys: `fk_{table_name}_{referenced_table}`
- Not Null: `nn_{table_name}_{attribute}`
- Unique: `uq_{table_name}_{attribute(s)}`
- Check: `ck_{table_name}_{constraint_description}`

Example:

{{{
#!sql
CONSTRAINT pk_exercise PRIMARY KEY(id)
CONSTRAINT fk_exercise_course_edition_id FOREIGN KEY(course_edition_id) REFERENCES Course_Edition(id)
CONSTRAINT nn_exercise_title CHECK(title IS NOT NULL)
CONSTRAINT uq_semester_enrollment_peidays UNIQUE(program_enrollment_id, academic_year, semester)
CONSTRAINT ck_exercise_max_grade CHECK(max_grade > 0)
}}}

=== `ON DELETE SET NULL` for Historical Data Preservation ===

All foreign keys use `ON DELETE SET NULL ON UPDATE CASCADE` to preserve historical data. When a referenced record is deleted, dependent records remain in the database with a `NULL` foreign key, maintaining the audit trail. This is critical for academic systems where grade records, exam attempts, and submissions must be preserved even if the associated course edition or student record is removed.

=== Enum for Exam Types ===

Exam types are defined as a PostgreSQL `ENUM` type:

{{{
#!sql
CREATE TYPE EXAM_TYPE AS ENUM (
    'partial_1',
    'partial_2',
    'session_jan',
    'session_jun',
    'session_sep'
);
}}}

PostgreSQL stores only a 4‑byte numeric OID per value rather than the literal string. In contrast, a `TEXT` column with a `CHECK` constraint would store variable‑length strings (10‑12 bytes per value).