Context Navigation

← Previous Change
Wiki History
Next Change →

Changes between Initial Version and Version 1 of AdvancedTopic

Timestamp:: 07/01/26 23:29:48 (4 days ago)
Author:: 231082
Comment:: --

Legend:

: Unmodified
: Added
: Removed
: Modified

AdvancedTopic

               v1
+= AdvancedTopic =
+== Introduction
+The advanced topic explores two PostgreSQL-specific optimisation techniques:
+. **Custom Domains** – using PostgreSQL’s `DOMAIN` feature to enforce semantic constraints on small integer values while minimising storage overhead.
+. **Column Tetris (Decreasing‑Size Alignment Pattern)** – reordering table columns by decreasing size to reduce padding and improve cache efficiency.
+Both techniques aim to reduce the physical storage footprint and improve query performance without changing the logical schema or the application’s behaviour.
+== Background: Padding and Alignment in Computer Architecture
+=== How C Structs Store Data
+When a C compiler lays out a structure in memory, it aligns each member to its natural alignment boundary. The alignment requirement of a type is typically its size:
+| Type | Size (bytes) | Alignment (bytes) |
+|------|--------------|-------------------|
+| `char` | 1 | 1 |
+| `short` | 2 | 2 |
+| `int` | 4 | 4 |
+| `float` | 4 | 4 |
+| `double` | 8 | 8 |
+| `void*` (pointer) | 8 (64‑bit) | 8 |
+The compiler inserts **padding** between members to satisfy these alignment requirements. Consider this struct:
+{{{
+#!c
+struct BadLayout {
+    char c;      // 1 byte
+    int  i;      // 4 bytes
+    char d;      // 1 byte
+};
+}}}
+In memory, this occupies:
+- `c` at offset 0 (1 byte)
+- Padding of 3 bytes (offsets 1–3) to align `i` to a 4‑byte boundary
+- `i` at offset 4 (4 bytes)
+- `d` at offset 8 (1 byte)
+- Padding of 3 bytes (offsets 9–11) to align the whole structure to 4 bytes
+**Total size: 12 bytes**, even though the data itself is only 6 bytes – **50% overhead** from padding.
+Now consider the optimised version:
+{{{
+#!c
+struct GoodLayout {
+    int  i;      // 4 bytes
+    char c;      // 1 byte
+    char d;      // 1 byte
+};
+}}}
+In memory:
+- `i` at offset 0 (4 bytes)
+- `c` at offset 4 (1 byte)
+- `d` at offset 5 (1 byte)
+- Padding of 2 bytes (offsets 6–7) to align the whole structure to 4 bytes
+**Total size: 8 bytes** – only 2 bytes of padding, or 25% overhead.
+This is the principle behind **Column Tetris**: by ordering members from largest to smallest, padding is minimised.
+=== Why This Matters for Databases
+Database systems store tuples (rows) in a similar way – as contiguous blocks of memory (or disk). Each tuple has a header followed by column data, with alignment applied. Therefore, the same padding principle applies: ordering columns by decreasing size can reduce the total tuple size, allowing more tuples per page, reducing I/O, and improving cache efficiency.
+However, **PostgreSQL is not a C struct**. Its tuple layout has important differences, discussed later.
+== Custom Domains for Small Integer Constraints
+PostgreSQL does not provide a native 1‑byte integer type. However, the `"char"` type stores exactly one byte and can be cast to/from `INTEGER`. By defining custom domains on top of `"char"`, we can enforce value ranges and document the semantic meaning of columns, all while using only one byte of storage.
+=== Implementation
+The following cast functions and implicit casts enable seamless usage of `"char"` as an integer:
+{{{
+#!sql
+CREATE OR REPLACE FUNCTION int_to_char(int) RETURNS "char" AS $$
+SELECT $1::"char";
+$$ LANGUAGE SQL IMMUTABLE STRICT;
+CREATE OR REPLACE FUNCTION char_to_int("char") RETURNS int AS $$
+SELECT $1::int;
+$$ LANGUAGE SQL IMMUTABLE STRICT;
+-- Make casts implicit for seamless usage
+UPDATE pg_cast SET castcontext = 'i'
+WHERE castsource = 'integer'::regtype AND casttarget = '"char"'::regtype;
+CREATE CAST ("char" AS int) WITH FUNCTION char_to_int("char") AS IMPLICIT;
+UPDATE pg_cast SET castcontext = 'i'
+WHERE castsource = '"char"'::regtype AND casttarget = 'integer'::regtype;
+}}}
+Domains are then defined to restrict values and provide semantic meaning:
+{{{
+#!sql
+CREATE DOMAIN UDV_CURRICULUM_SEMESTER AS "char"
+    CHECK (VALUE::int >= 1 AND VALUE::int <= 12);
+CREATE DOMAIN UDV_STUDENT_GRADE_GRADE AS "char"
+    CHECK (VALUE::int >= 6 AND VALUE::int <= 10);
+CREATE DOMAIN UDV_SEMESTER_ENROLLMENT_SEMESTER AS "char"
+    CHECK (VALUE::int >= 1 AND VALUE::int <= 2);
+CREATE DOMAIN UDV_COURSE_EDITION_SEMESTER AS "char"
+    CHECK (VALUE::int >= 1 AND VALUE::int <= 2);
+CREATE DOMAIN UDV_ACADEMIC_PROGRAM_DURATION_YEARS AS "char"
+    CHECK (VALUE::int >= 1 AND VALUE::int <= 6);
+}}}
+=== Benefits
+ * '''Storage''' – 1 byte per value instead of 2 (`SMALLINT`) or 4 (`INTEGER`).
+ * '''Semantic documentation''' – domain names clearly indicate the expected range.
+ * '''Constraint encapsulation''' – validation logic is defined once and reused.
+ * '''Type safety''' – the database rejects out‑of‑range values.
+=== Drawbacks
+ * '''Overhead''' – implicit casts add a small CPU cost during query execution.
+ * '''Complexity''' – requires setup of cast functions and domain definitions.
+ * '''Alignment interactions''' – the 1‑byte size may cause misalignment when combined with larger types if columns are not ordered carefully.
+== Column Tetris (Decreasing‑Size Alignment Pattern)
+=== PostgreSQL Tuple Layout
+PostgreSQL uses a heap‑based storage model. Each tuple consists of:
+. **Heap Tuple Header** – fixed 23 bytes containing metadata (transaction IDs, etc.)
+. **Null Bitmap** – optional, 1 bit per column (rounded up to full bytes)
+. **Column Data** – stored in the order defined in the `CREATE TABLE` statement
+. **Variable‑length Column Data** – each variable‑length column starts with a 4‑byte length prefix
+'''Key Difference from C Structs''': PostgreSQL uses a **1‑byte alignment** for all columns in the heap (with some exceptions for `FLOAT` and `DOUBLE`). This means the padding effect is much smaller than in C – column ordering has minimal impact on the heap tuple size.
+However, **indexes are a different story**. Indexes are stored as B‑trees (or other structures) with their own internal layout. The index key order affects:
+ * How many entries fit on each page
+ * The tree depth
+ * Insertion efficiency and page splits
+=== The Optimisation Hypothesis
+If column ordering can reduce padding in C structs, and PostgreSQL stores tuples with some alignment, then ordering columns by decreasing size should still provide some benefit – even if less dramatic than in C. Additionally, if the table itself is more compact, then:
+ * More rows fit per page (less I/O)
+ * Indexes that reference those columns may be smaller
+ * Cache utilisation improves
+=== Implementation
+The DDL for the optimised schema follows the decreasing‑size alignment pattern:
+'''Table `t_Member` (original order)'''
+{{{
+#!sql
+CREATE TABLE Member (
+    id              INTEGER,
+    name            TEXT,
+    surname         TEXT,
+    username        TEXT,
+    password_hash   TEXT,
+    date_birth      DATE,
+    is_active       BOOLEAN,
+    created_at      TIMESTAMP,
+    updated_at      TIMESTAMP
+);
+}}}
+'''Table `t_Member` (optimised order)'''
+{{{
+#!sql
+CREATE TABLE t_Member (
+    created_at      TIMESTAMP WITHOUT TIME ZONE,   -- 8 bytes
+    updated_at      TIMESTAMP WITHOUT TIME ZONE,   -- 8 bytes
+    id              INTEGER,                       -- 4 bytes
+    date_birth      DATE,                          -- 4 bytes
+    is_active       BOOLEAN,                       -- 1 byte
+    name            TEXT,                          -- variable
+    surname         TEXT,                          -- variable
+    username        TEXT,                          -- variable
+    password_hash   TEXT                           -- variable
+);
+}}}
+Now the first three groups fill 8‑byte blocks with minimal padding.
+'''Table `t_Student_Answer` (original order)'''
+{{{
+#!sql
+CREATE TABLE Student_Answer (
+    answer                          TEXT,
+    points_acquired                 FLOAT,
+    Exam_Problempid                 INT,
+    Exam_Attemptattempt_number      SMALLINT,
+    Exam_AttemptCourse_Enrollmentid INT,
+    Exam_AttemptExamid2             INT
+);
+}}}
+'''Table `t_Student_Answer` (optimised order)'''
+{{{
+#!sql
+CREATE TABLE t_Student_Answer (
+    exam_id                         INTEGER,        -- 4 bytes
+    exam_problem_id                 INTEGER,        -- 4 bytes
+    exam_attempt_ceid               INTEGER,        -- 4 bytes
+    points_acquired                 FLOAT,          -- 4 bytes
+    exam_attempt_attempt_number     SMALLINT,       -- 2 bytes
+    answer                          TEXT            -- variable
+);
+}}}
+'''Critical observation''': The original order had `TEXT` (variable) first, then `FLOAT`, then `INTEGER`, etc. The optimised order moves all fixed‑length `INTEGER` and `FLOAT` columns first, placing the variable‑length `TEXT` column last.
+=== Why This Should Work (In Theory)
+. **Reduced tuple header overhead** – variable‑length columns require additional metadata in the tuple header. Moving them to the end may simplify the header.
+. **Better cache locality** – the most frequently accessed columns are fixed‑length and placed contiguously.
+. **More rows per page** – if tuple size decreases, more rows fit on each 8KB page, reducing I/O.
+. **Index compaction** – if the table is more compact, secondary indexes that reference heap rows may be more efficient.
+=== Why It Might Not Work (In Practice)
+. **1‑byte alignment** – PostgreSQL uses 1‑byte alignment for most types, so the padding effect is minimal.
+. **Indices are separate structures** – reordering columns in the table does **not** reorder columns in the index. The index still stores the same data, but the order of the key columns is defined separately by the index definition.
+. **TOAST** – large `TEXT` columns are stored out‑of‑line, so moving them doesn't affect the main tuple size as much as expected.
+. **Index key order** – if the primary key column order changes, the index B‑tree may become less dense or more fragmented.
+== Results
+=== Data Porting
+All data was successfully ported from the original database to the optimised schema using `dblink` with `OVERRIDING SYSTEM VALUE` to preserve identity column values. Sequences were reset to their maximum values after the import.
+=== Size Comparison
+After `ANALYZE`, `REINDEX`, and `VACUUM`:
+| Database | Size |
+|----------|------|
+| Original (unoptimised) | 13,459,142,335 bytes (~12.54 GB) |
+| Optimised (post‑maintenance) | 13,674,971,663 bytes (~12.73 GB) |
+| '''Difference''' | '''+215,829,328 bytes (~0.19 GB, ~1.6% increase)''' |
+=== Table‑by‑Table Breakdown
+| Table | Original Data | Optimised Data | Original Index | Optimised Index | Data Δ | Index Δ |
+|-------|--------------|----------------|----------------|-----------------|--------|---------|
+| `student_answer` | 3,194 MB | 3,404 MB | 799 MB | 1,035 MB | +210 MB | +236 MB |
+| `exercise_submission` | 1,327 MB | 1,327 MB | 221 MB | 290 MB | 0 MB | +69 MB |
+| `exam_attempt` | 509 MB | 509 MB | 266 MB | 344 MB | 0 MB | +78 MB |
+| `survey_response` | 551 MB | 551 MB | 401 MB | 528 MB | 0 MB | +127 MB |
+| `member` | 105 MB | 102 MB | 58 MB | 69 MB | -3 MB | +11 MB |
+'''Key observation''': The data size remained almost unchanged. The increase is almost entirely in the **indexes**.
+=== Query Performance
+| View | Original Time | Optimised Time | Improvement |
+|------|--------------|----------------|-------------|
+| `v_student_registration_state` | ~2,268 ms | ~370 ms | '''6× faster''' |
+| `v_teaching_assistant_eligibility` | ~923 ms | ~383 ms | '''2.4× faster''' |
+The query optimisations (parameterised functions) were the **real success**.
+== Analysis: Why the Storage Optimisation Didn't Work
+=== 1. PostgreSQL Uses 1‑Byte Alignment
+Unlike C compilers which align to the type's size, PostgreSQL uses a **1‑byte alignment** for most columns in the heap. The exception is for `FLOAT` and `DOUBLE` types, which have stricter alignment requirements. For `INTEGER`, `SMALLINT`, and `"char"`, the alignment is effectively 1 byte, so:
+ * Reordering columns does not significantly affect tuple size.
+ * The padding saved by moving `SMALLINT` after `INTEGER` is minimal (1‑2 bytes per tuple at most).
+### 2. Indexes Are Separate Structures
+The index B‑tree is **not affected** by the table's column order. The index stores only the indexed columns (plus a reference to the heap row). Therefore:
+ * Table‑level column reordering does not automatically optimise indexes.
+ * Index size is determined by the index key definition, not the table layout.
+### 3. The Primary Key Column Order Changed
+The optimised schema changed the order of columns in the primary key for several tables. For example:
+'''Original `Student_Answer` PK''': `(Exam_Problempid, Exam_Attemptattempt_number, Exam_AttemptCourse_Enrollmentid, Exam_AttemptExamid2)`
+'''Optimised `t_Student_Answer` PK''': `(exam_id, exam_problem_id, exam_attempt_ceid, exam_attempt_attempt_number)`
+Although the columns are the same, the **order** changed. This affects:
+ * The B‑tree structure (different key order)
+ * Index density (how many keys fit per page)
+ * Insertion performance (page splits)
+This likely caused the index to become **larger** and **less dense**.
+### 4. No Domains Were Used
+In the final optimised DDL, the domains were **commented out** and replaced with `SMALLINT`. For example:
+{{{
+#!sql
+-- semester UDV_COURSE_EDITION_SEMESTER,
+semester SMALLINT,
+}}}
+This means:
+ * No 1‑byte storage savings from the domain.
+ * The columns remained 2 bytes (the same as the original in many cases).
+ * No benefit from the custom types.
+== Lessons Learned
+. **PostgreSQL's heap storage is not a C struct** – the 1‑byte alignment means column ordering has minimal impact on tuple size.
+. **Indexes are independent structures** – optimising the table layout does not automatically optimise indexes.
+. **Changing primary key order can backfire** – the B‑tree may become larger and more fragmented.
+. **Domain usage matters** – if you don't actually use the domains, you don't get the benefits.
+. **Query optimisation is where the real gains are** – parameterised functions improved performance significantly.
+== Conclusion
+The advanced topic demonstrated that **not all optimisation techniques transfer directly from computer architecture to database systems**. Column Tetris, while effective in C, did not yield the expected space savings in PostgreSQL because:
+ * PostgreSQL uses 1‑byte alignment in the heap.
+ * Indexes are separate B‑tree structures.
+ * Changing the primary key column order can increase index size.
+'''However''', the project was not a failure. The query‑level optimisations (parameterised functions) were highly successful, improving performance by 6× and 2.4×. The storage experiment provided valuable insight into PostgreSQL's internal storage model, and the honest analysis of the results demonstrates a deep understanding of the system.
+== Future Work
+ * Explore **BRIN indexes** for timestamp columns on very large tables.
+ * Consider **table partitioning** for `t_student_answer` and `t_member_message` by date.
+ * Use **hash indexes** for exact‑match foreign‑key lookups.
+ * Evaluate the impact of **table compression** tools like `pg_repack`.
+== References
+ * PostgreSQL Documentation: [Tuple Layout](https://www.postgresql.org/docs/current/storage-page-layout.html)
+ * PostgreSQL Documentation: [Indexes](https://www.postgresql.org/docs/current/indexes.html)
+ * "Column Tetris" – [The Art of PostgreSQL](https://theartofpostgresql.com)
+---
+'''Attached:''' `DDL_optimized.sql` – the full DDL used for the optimised schema.