Channel: Active questions tagged cte - Database Administrators Stack Exchange

PostgreSQL Query Performance: Sorting Array Length vs. CTE

I'm seeing a significant performance difference between two PostgreSQL queries that I'm trying to understand and optimize. The table holds around 2 TB of data, and both queries use a sequential scan. The first query executes quickly, while the second takes more than 2 hours to complete. Here are the queries and some details:

1st Query

WITH cte_mytable AS (
   SELECT *, coalesce(array_length(tags, 1), 0) as size FROM my_table limit 100
)
SELECT
  user_id, ...some other columns
FROM
  cte_mytable
WHERE
  cte_mytable.user_id=user_id
ORDER BY
  size desc;

Explain log:

|QUERY PLAN                                                                                 |
|-------------------------------------------------------------------------------------------|
|Sort  (cost=13.02..13.27 rows=100 width=593)                                               |
|  Sort Key: cte_my_table.size DESC                                                         |
|  ->  Subquery Scan on cte_my_table  (cost=0.00..9.70 rows=100 width=593)                  |
|        Filter: ((cte_my_table.user_id)::text IS NOT NULL)                                 |
|        ->  Limit  (cost=0.00..8.70 rows=100 width=593)                                    |
|              ->  Seq Scan on my_table  (cost=0.00..197608426.93 rows=2272637194 width=593)|

2nd Query

SELECT *
FROM my_table
ORDER BY array_length(tags, 1) desc
LIMIT 100;

Explain log:

|QUERY PLAN                                                                                         |
|---------------------------------------------------------------------------------------------------|
|Limit  (cost=217229180.49..217229192.16 rows=100 width=593)                                        |
|  ->  Gather Merge  (cost=217229180.49..438195445.87 rows=1893864328 width=593)                    |
|        Workers Planned: 2                                                                         |
|        ->  Sort  (cost=217228180.47..219595510.88 rows=946932164 width=593)                       |
|              Sort Key: (array_length(tags, 1)) DESC                                               |
|              ->  Parallel Seq Scan on my_table  (cost=0.00..181037114.05 rows=946932164 width=593)|
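
As far as I can tell from the two plans, the Limit node in the first query sits below the Sort, so only 100 rows ever reach the sort, while in the second query the Sort sits below the Limit and has to see every row. Here is a rough Python sketch of how I read that difference. It is only a toy model, not what PostgreSQL actually executes: the in-memory `table` list, its size, and both function names are made up for illustration, and NULL handling is ignored.

```python
import random

def limit_then_sort(rows, n):
    """Toy model of the 1st query: keep the first n rows scanned, then sort them."""
    first_n = rows[:n]  # LIMIT inside the CTE: nothing past the first n rows is read
    return sorted(first_n, key=len, reverse=True)

def sort_then_limit(rows, n):
    """Toy model of the 2nd query: order every row by array length, then keep n."""
    return sorted(rows, key=len, reverse=True)[:n]

random.seed(0)
# Hypothetical stand-in for my_table.tags: 10,000 arrays with 0 to 20 elements each.
table = [["t"] * random.randint(0, 20) for _ in range(10_000)]

q1 = limit_then_sort(table, 100)  # looks at 100 rows only
q2 = sort_then_limit(table, 100)  # has to look at all 10,000 rows

print(len(q1), len(q2))
print(max(len(t) for t in q1), min(len(t) for t in q2))
```

If that reading is right, the two queries are not equivalent: the first returns whichever 100 rows the scan happens to produce first, ordered among themselves, while the second returns the 100 rows with the longest tags arrays in the whole table.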

Version

PostgreSQL 14.6

Can someone help me understand why the second query is so much slower than the first?

