Changes between Version 2 and Version 3 of AdvancedTopic


Ignore:
Timestamp:
05/26/26 16:02:12 (11 hours ago)
Author:
231550
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • AdvancedTopic

    v2 v3  
    7070
    7171* Activates BEFORE INSERT or UPDATE so the **search_vector** is set before the row is written to disk
    72 * Only fires on UPDATE OF name if other columns like **calories_per_100g** are updated the trigger is skipped, avoiding unnecessary recomputation
     72* Only fires on UPDATE OF name if other columns like **calories_per_100g** are updated the trigger is skipped, avoiding unnecessary recomputation
    7373* COALESCE(NEW.name, '') handles null names gracefully by treating them as empty strings
    7474* NEW refers to the row being inserted or updated
     
    8888    calories        INT,
    8989    rank_score      REAL,
    90     rank_normalized REAL
    9190) AS $$
    9291DECLARE
     
    101100        ts_headline(
    102101            'english', f.name,v_tsquery,
    103             'StartSel=<b>, StopSel=</b>, MaxWords=10, MinWords=1'
     102            '!StartSel=<b>, !StopSel=</b>, !MaxWords=10, !MinWords=1'
    104103        ) AS highlighted,
    105104        c.name,
    106105        f.calories_per_100g,
    107106        ts_rank(f.search_vector, v_tsquery) AS rank_score,
    108         ts_rank_cd(f.search_vector, v_tsquery) AS rank_normalized
    109107    FROM Food f
    110108    LEFT JOIN Category c ON f.category_id = c.category_id
     
    115113END;
    116114$$ LANGUAGE plpgsql;
     115
     116**Parameters:**
     117* **p_query** - the search term entered by the user e.g. **'chicken breast'**
     118* **p_limit** - how many results to return (default 20) for pagination
     119* **p_offset** - how many results to skip (default 0) for pagination. For page 2 with 20 results per page, pass **p_offset = 20**
     120
     121**Key components:**
     122* **plainto_tsquery('english', p_query)** - converts the user's raw input into a tsquery. It handles natural language input gracefully so users don't need to know any special syntax. For example 'chicken breast' becomes 'chicken' & 'breast'.
     123
     124* **f.search_vector @@ v_tsquery** - the match operator. Filters only foods whose search vector matches the query. This is where the GIN index is used.
     125
     126* **ts_rank(f.search_vector, v_tsquery)** - scores each result by relevance. Foods where the search terms appear more frequently get a higher score. Results are ordered by this score so the most relevant foods appear first.
     127
     128* **ts_headline(...)** - highlights matching terms in the food name by wrapping them in HTML tags. For example searching 'chicken' on 'Grilled Chicken Breast' returns 'Grilled <b>Chicken</b> Breast'. This is used by the frontend to bold matching words in search results.
     129
     130== Conclusion ==
     131
     132The FTS implementation in !VitaDiet provides a fast, accurate and user-friendly food search experience. When a user types a search query like "chicken breast", **plainto_tsquery** first converts it into a normalized **tsquery ('chicken' & 'breast')**, stripping out stop words and stemming each word to its root form. This **tsquery** is then matched against the precomputed **search_vector** column on the Food table using the **@@** operator, which uses the GIN index to find matching foods near-instantly without scanning the entire table. Each matching food is then scored by **ts_rank** based on how frequently and prominently the search terms appear in the food name, ensuring the most relevant results appear at the top. Finally **ts_headline** wraps the matching terms in HTML tags so the frontend can display them in bold, helping users immediately see why each result matched their search. The results are then paginated using LIMIT and OFFSET so only a manageable number of results are returned at a time.
     133
     134The trigger **trg_food_search_vector** ensures this entire pipeline stays accurate by automatically recomputing the **search_vector** whenever a food's name is inserted or updated, so the GIN index never becomes stale.
     135
     136**Advantages**
     137* GIN index makes search fast regardless of table size
     138* Stemming handles word variations automatically
     139* Relevance ranking returns the most useful results first
     140* Highlighting improves the user experience
     141* **plainto_tsquery** handles natural language input without requiring special syntax from users
     142
     143**Disadvantages**
     144* **search_vector** column adds storage overhead to the **Food** table
     145* The trigger adds a small overhead to every INSERT and UPDATE on **Food**
     146* FTS is language-specific - the **'english'** configuration may not handle non-English food names perfectly.