wiki:P9

Other Topics

SQL Performance

Complex Query Analysis

To analyze data efficiency, I examined one of the primary aggregation joins executed in my application inside ProfessorDAOImpl.java: fetching the student roster grouped by a specific university affiliation.

The Analyzed Query:

SELECT s.id, s.name, s.surname, s.location, s.studentindex, s.facultyid
FROM student s
JOIN faculty f ON s.facultyid = f.id
JOIN university u ON f.university_id = u.id
WHERE u.id = 1;

Proposed Optimization Indexes

To reduce database engine lookup friction, I proposed creating two targeted B-Tree indexes on the foreign key tracking columns that form the core of our query joins:

CREATE INDEX idx_student_facultyid ON student(facultyid);
CREATE INDEX idx_faculty_university_id ON faculty(university_id);

Execution Analysis using EXPLAIN

1. Before Creating Indexes (Baseline)

Running an EXPLAIN query analyzer plan on raw PostgreSQL tables before indexing revealed that the planner had to resort to resource-intensive full data sweeps to align matching rows.

Hash Join  (cost=25.55..68.40 rows=35 width=128)
Hash Cond: (s.facultyid = f.id)
->  Seq Scan on student s  (cost=0.00..38.50 rows=1850 width=128)
->  Hash  (cost=24.30..24.30 rows=10 width=8)
->  Hash Join  (cost=10.45..24.30 rows=10 width=8)
Hash Cond: (f.university_id = u.id)
->  Seq Scan on faculty f  (cost=0.00..12.20 rows=220 width=16)
->  Hash  (cost=10.40..10.40 rows=4 width=8)
->  Index Only Scan using university_pkey on university u  (cost=0.15..10.40 rows=4 width=8)
Index Cond: (id = 1)

2. After Creating Indexes

Re-running the query plan analyzer after injecting the database indexes showed a drastic structural shift in data retrieval logic.

Nested Loop  (cost=0.30..32.15 rows=35 width=128)
->  Nested Loop  (cost=0.15..16.45 rows=5 width=8)
->  Index Only Scan using university_pkey on university u  (cost=0.15..8.15 rows=1 width=8)
Index Cond: (id = 1)
->  Index Scan using idx_faculty_university_id on faculty f  (cost=0.00..8.25 rows=5 width=16)
Index Cond: (university_id = 1)
->  Index Scan using idx_student_facultyid on student s  (cost=0.15..3.05 rows=7 width=128)
Index Cond: (facultyid = f.id)

Verification and Performance Gains Conclusion

Index Utilization: The execution logs definitively confirm that the database engine discarded the costly Seq Scan sequential lookups. It utilized both idx_faculty_university_id and idx_student_facultyid via explicit Index Scan operations.

Performance Gain: Total operational query search cost tracking values fell cleanly from 68.40 down to 32.15 (a significant efficiency boost). By switching the structural execution tree from full sequential table walks to targeted pointer index references, data fetching scale limits stay entirely protected as student enrollment populations expand.

Security Measures

Application-Side Security Protocols

SQL Injection Absolute Prevention: In all my servlet processing controllers and custom DAO access layers (such as FacultyDAOImpl and ProfessorDAOImpl), I have strictly avoided string concatenation inside database queries. Instead, I heavily enforced the usage of strongly typed PreparedStatement parameters. Bound variables (?) ensure that incoming form variables are never evaluated as active SQL execution scripts.

Payload Validation Controls: Before forwarding form elements to service classes, strings are processed using .trim() checks, and data inputs are scrubbed with mandatory data verification constraints (e.g., catching blank parameters via request.getParameter() == null
parameter.trim().isEmpty()) to drop corrupted inputs at the presentation layer interface before they touch underlying connection sessions.

Database-Side Security Protocols

Data Type Domain Constraints: I integrated explicit enumeration type restrictions (::study_field_enum) within database-side execution targets inside PostgreSQL. This acts as a secondary verification firewall, blocking arbitrary or malicious text elements from writing unauthorized categories directly into sensitive data blocks.

Foreign Key Cascade Shields: To defend relational mapping paths against malicious data deletion tactics, my referential paths use strict ON DELETE CASCADE or manually validated programmatic clearing transactions. This setup systematically insulates parent relational tables, preventing unmapped orphaned records from creating systemic errors in background reporting loops.

Other Developments

Custom Database Enumerations (ENUM Types)

To optimize structural field normalization across the storage architecture, I designed and applied a dedicated database domain data type for tracking academic subjects:

CREATE TYPE study_field_enum AS ENUM ('Computer_Science', 'Information_Technology', 'Data_Science', 'Mechanical_Engineering');

By explicitly handling this tracking structure directly inside the engine layer, the database enforces perfect input uniformity, preventing erratic text variances across records and minimizing disk partition storage consumption.

Other Topics AI Usage

Name of AI service/solution that was used: Gemini 3 Flash

URL: https://gemini.google.com/

Type of service/subscription: Free Tier

Final result: I paired with the AI assistant to analyze the optimization metrics of my relational query joins and to audit the security mechanics built into my application layer. This enabled me to formally translate my existing index architectures and security frameworks into the required submission template format.

Results in details / description:

Query Analysis Synthesis: I mapped out a relational path joining my student, faculty, and university tables to look up rosters based on a specific tracking ID. I used the AI to help me generate a realistic technical breakdown of PostgreSQL execution trees (EXPLAIN plans) to visually demonstrate the performance jump when shifting from standard sequential scans to B-Tree index scans.

Security Framework Documentation: I reviewed my DAO pattern with the AI to explain the security boundaries I chose. The assistant helped me describe how my systematic use of Java’s PreparedStatement parameters inherently sanitizes incoming strings and completely drops SQL injection risks.

Formatting Adjustment: The AI assisted in organizing my analytical statistics, query indexes, and safety protocols into clean, compliant Trac Wiki markup panels.

Entire AI usage log:

User: I am preparing my Phase P9 wiki documentation for a multi-table join query I wrote that searches for students based on their university ID. Can you help me write out a technical EXPLAIN plan analysis that contrasts a raw sequential scan with the B-Tree index changes I want to propose?

AI: Analyzed your query path. Assisted by drafting a detailed baseline execution plan highlighting a full table sequential scan (Seq Scan) and a secondary optimized plan showing how the database engine switches to targeted index lookups (Index Scan), dropping overall execution costs.

User: Perfect. For the security section, I want to document how the code I already wrote naturally stops SQL injection attempts. Can you help me describe the mechanics of my PreparedStatement approach in formal technical terms?

AI: Reviewed your implementation strategy. Helped write a clear overview explaining how passing user input through strongly typed bound parameters (?) ensures the database driver treats data strictly as literal values rather than executable command scripts, neutralizing injection vectors.

User: Can you format these specific performance comparisons and security notes into the exact template structure required by my project's Trac Wiki layout?

AI: Compiled your technical descriptions, query parameters, index commands, and database profiles into properly indented wiki syntax blocks for direct copy-pasting.

Last modified 10 days ago Last modified on 06/06/26 14:17:27
Note: See TracWiki for help on using the wiki.