I have two tables: Sites (<100k records) and Visitors (>50 million records). The queries below produce identical results, but the first takes 150 milliseconds and the second takes over a minute.
Query #1:
WITH neighbours AS (
    SELECT s2."Id", s2."PolyId", s2."Distance"
    FROM "Sites" s, "Sites" s2
    WHERE s."Id" = 19656 AND s2."Id" <> s."Id"
    ORDER BY s2."Distance"
    LIMIT 10
)
SELECT AVG("VisitsEstimate") AS "Visits", v."Id"
FROM "Visitors" v
RIGHT OUTER JOIN neighbours ON neighbours."PolyId" = v."Id"
WHERE v."Month" BETWEEN '2022-01' AND '2022-09'
GROUP BY v."Id"
Query #2:
WITH avg_visits AS (
    SELECT AVG("Visits") AS "Visits", v."Id"
    FROM "Visitors" v
    WHERE v."Month" BETWEEN '2022-01' AND '2022-09'
    GROUP BY v."Id"
)
SELECT s2."Id", s2."PolyId", s2."Distance"
FROM "Sites" s, "Sites" s2
LEFT OUTER JOIN avg_visits ON avg_visits."Id" = s2."PolyId"
WHERE s."Id" = 19656 AND s2."Id" <> s."Id"
ORDER BY s2."Distance"
LIMIT 10
Why do the two queries perform so differently? Does the CTE in query #2 run in parallel to the main query, so that no filters are applied to it before the results are returned?
Is there a way to improve the performance of #2 to match #1? Due to the processing limitations of my workflow, #2 is much more usable.