Changes between Version 4 and Version 5 of AnalyticalStatisticalQuerying


Ignore:
Timestamp:
08/27/25 16:23:15 (6 days ago)
Author:
221507
Comment:

This change was done regarding the use of LIMIT, such that we avoid using hard limits for the queries 3, 5 and 6. In query 3, we use weighted sum to calculate a score with which we order the purchased tickets. In 5, we use the mean of the number of passengers to retrieve routes that have a larger number of passengers compared to it. In the 6th query, instead of arbitrarily selecting the top 3 routes, the query uses a relative threshold (80% of the maximum) and this is the simplest approach of all.

Legend:

Unmodified
Added
Removed
Modified
  • AnalyticalStatisticalQuerying

    v4 v5  
    7575
    7676== 3. Top user-purchased routes ==
    77 This SQL query retrieves the top three routes most frequently purchased by a specified user, based on ticket counts, including route details and transport company, sorted by tickets purchased in descending order. Useful for fast recommendation of routes per user.
    78 
    79 {{{
    80 SELECT
    81     r.route_id,
    82     l_from.name AS from_location,
    83     l_to.name AS to_location,
    84     COUNT(t.ticket_id) AS tickets_purchased,
    85     to_.company_name AS transport_company
    86 FROM ticket t
    87 JOIN trip tr ON t.trip_id = tr.trip_id
    88 JOIN route r ON tr.route_id = r.route_id
    89 JOIN location l_from ON r.from_location_id = l_from.location_id
    90 JOIN location l_to ON r.to_location_id = l_to.location_id
    91 JOIN transport_organizer to_ ON r.transport_organizer_id = to_.transport_organizer_id
    92 WHERE  t.account_id = 300 -- this can be changed arbitrarily
    93 GROUP BY r.route_id, l_from.name, l_to.name, to_.company_name
    94 ORDER BY tickets_purchased DESC
    95 LIMIT 3;
     77This SQL query calculates a weighted score for routes based on ticket purchases by a specific account over the past three months (with weights of 0.5, 0.3, and 0.2 for the most recent to oldest month), returning routes with scores at or above the average, ordered by weighted score in descending order.
     78
     79
     80{{{
     81WITH route_weighted_usage AS (
     82    SELECT
     83        r.route_id,
     84        l_from.name AS from_location,
     85        l_to.name AS to_location,
     86        to_.company_name AS transport_company,
     87        SUM(
     88            CASE
     89                WHEN DATE_TRUNC('month', t.date_purchased) = DATE_TRUNC('month', CURRENT_DATE)
     90                    THEN 0.5
     91                WHEN DATE_TRUNC('month', t.date_purchased) = DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month')
     92                    THEN 0.3
     93                WHEN DATE_TRUNC('month', t.date_purchased) = DATE_TRUNC('month', CURRENT_DATE - INTERVAL '2 months')
     94                    THEN 0.2
     95                ELSE 0
     96            END -- this will be used as a weighted sum to give edge to recently bought tickets
     97        ) AS weighted_score
     98    FROM ticket t
     99    JOIN trip tr ON t.trip_id = tr.trip_id
     100    JOIN route r ON tr.route_id = r.route_id
     101    JOIN location l_from ON r.from_location_id = l_from.location_id
     102    JOIN location l_to ON r.to_location_id = l_to.location_id
     103    JOIN transport_organizer to_ ON r.transport_organizer_id = to_.transport_organizer_id
     104    WHERE t.account_id = 300 AND t.date_purchased >= CURRENT_DATE - INTERVAL '3 months'
     105    GROUP BY r.route_id, l_from.name, l_to.name, to_.company_name
     106)
     107SELECT
     108    route_id,
     109    from_location,
     110    to_location,
     111    transport_company,
     112    weighted_score
     113FROM route_weighted_usage
     114WHERE weighted_score >= (SELECT AVG(weighted_score) FROM route_weighted_usage)
     115ORDER BY weighted_score DESC;
    96116}}}
    97117