Channel: Active questions tagged cte - Database Administrators Stack Exchange

How to eliminate duplicates made by recursive CTE [closed]


I have the following schema in PostgreSQL 12 (objects and their relations are versioned by date ranges):

    CREATE TABLE tmp_deps (
        id bigint,
        code text,
        name text,
        start_date date,
        end_date date
    );

    CREATE TABLE tmp_dep_parents (
        department_id bigint,
        parent_id bigint,
        start_date date,
        end_date date
    );

    INSERT INTO tmp_deps (id, code, name, start_date, end_date)
    VALUES
        (1, '11111111', 'Root Dep', '1970-01-01', '9999-12-31'),
        (2, '11111112', 'Sub Root Dep', '1970-01-01', '9999-12-31'),
        (3, '11111113', 'Dep 1', '1970-01-01', '2021-02-23'),
        (3, '11111113', 'Dep 1', '2021-02-24', '9999-12-31'),
        (4, '11111114', 'Dep 2', '1970-01-01', '2020-01-01'),
        (4, '11111114', 'Dep 2', '2020-01-02', '9999-12-31'),
        (5, '11111115', 'Sub Dep 1', '1970-01-01', '9999-12-31'),
        (6, '11111116', 'Sub Dep 2', '1970-01-01', '9999-12-31');

    INSERT INTO tmp_dep_parents (department_id, parent_id, start_date, end_date)
    VALUES
        (2, 1, '1970-01-01', '9999-12-31'),
        (3, 2, '1970-01-01', '2005-12-31'),
        (3, 1, '2006-01-01', '2006-12-31'),
        (3, 2, '2007-01-01', '9999-12-31'),
        (4, 2, '1970-01-01', '2010-05-01'),
        (4, 3, '2010-05-02', '2010-05-03'),
        (4, 2, '2010-05-04', '9999-12-31'),
        (5, 3, '1970-01-01', '9999-12-31'),
        (6, 4, '1970-01-01', '9999-12-31');

"Versioned by date ranges" means that both an object's attributes and its parent link can differ from one period to the next. For example, the object with id=1 could be named "Root Department" from 1 Jan 2000 to 31 Dec 2021 and "Super Department" from 1 Jan 2022 to 31 Dec 9999.

Likewise, the object with id=3 could belong to the object with id=2 from 1 Jan 1970 to 15 May 2003 and to the object with id=1 from 16 May 2003 to 31 Dec 9999.
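To make the versioning concrete, here is a minimal sketch of a point-in-time lookup against the two tables defined above. It assumes that both ends of each date range are inclusive (as the sample data suggests); the date 2006-06-01 is chosen only for illustration.

```sql
-- Which version of department 3, and which parent link, was in effect on a given day?
-- Assumption: start_date and end_date are both inclusive.
SELECT
    dep.id,
    dep.name,
    dp.parent_id
FROM tmp_deps dep
LEFT JOIN tmp_dep_parents dp
       ON dp.department_id = dep.id
      AND DATE '2006-06-01' BETWEEN dp.start_date AND dp.end_date
WHERE dep.id = 3
  AND DATE '2006-06-01' BETWEEN dep.start_date AND dep.end_date;
```

With the sample rows above, this should return a single row: department 3 ("Dep 1") with parent_id 1, because the link to parent 1 covers 2006-01-01 through 2006-12-31.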


The hierarchy of objects can change at any time. I am using a recursive CTE to create a materialized view, and I noticed duplicates when I filter objects and their relations by date. I also need to be able to view historical data, which is why I can't simply filter objects and their relations by date inside the CTE.

My CTE is:

    WITH RECURSIVE deps AS (
        SELECT
            dep.id,
            dep.start_date AS dep_start_date,
            dep.end_date AS dep_end_date,
            dep.name,
            dep.code,
            1::int AS "depth",
            dep.parent_id,
            dep.dep_par_start_date,
            dep.dep_par_end_date
        FROM all_deps dep
        WHERE dep.code = '11111111'
        UNION
        SELECT
            dep.id,
            dep.start_date AS dep_start_date,
            dep.end_date AS dep_end_date,
            dep.name,
            dep.code,
            s.depth + 1 AS "depth",
            dep.parent_id,
            dep.dep_par_start_date,
            dep.dep_par_end_date
        FROM all_deps dep
        INNER JOIN deps s ON dep.parent_id = s.id
    ),
    all_deps AS (
        SELECT
            dep.*,
            dp.parent_id,
            dp.start_date AS dep_par_start_date,
            dp.end_date AS dep_par_end_date
        FROM tmp_deps dep
        LEFT JOIN tmp_dep_parents dp ON dp.department_id = dep.id
    )
    SELECT * FROM deps

The problem with duplicates occurs when I add the "depth" column to the CTE output. As I understand it, the duplicates are only hidden when I remove "depth" from the CTE (UNION then collapses the otherwise identical rows), but I can still get wrong data. My CTE contains all versions of objects and their relations, and I don't understand how to make the CTE follow the correct hierarchy path.
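One possible direction, offered only as a sketch and not as a confirmed fix, is to carry a validity interval down the recursion and intersect it at every step, so that a child row survives only for the period in which its whole parent chain existed. The column names valid_from and valid_to are hypothetical, the ranges are assumed to be inclusive on both ends, and the original dep_* and dep_par_* date columns are dropped here for brevity (they could be carried along as extra columns).

```sql
WITH RECURSIVE all_deps AS (
    SELECT
        dep.id,
        dep.code,
        dep.name,
        dp.parent_id,
        -- Period in which this department version AND this parent link both hold.
        GREATEST(dep.start_date, COALESCE(dp.start_date, dep.start_date)) AS valid_from,
        LEAST(dep.end_date, COALESCE(dp.end_date, dep.end_date)) AS valid_to
    FROM tmp_deps dep
    LEFT JOIN tmp_dep_parents dp ON dp.department_id = dep.id
),
deps AS (
    SELECT id, code, name, parent_id, 1 AS depth, valid_from, valid_to
    FROM all_deps
    WHERE code = '11111111'
    UNION ALL
    SELECT
        c.id,
        c.code,
        c.name,
        c.parent_id,
        p.depth + 1,
        -- Narrow the child's period to the part that overlaps the parent path.
        GREATEST(c.valid_from, p.valid_from),
        LEAST(c.valid_to, p.valid_to)
    FROM all_deps c
    JOIN deps p ON c.parent_id = p.id
    -- Drop combinations whose periods never overlap.
    WHERE GREATEST(c.valid_from, p.valid_from) <= LEAST(c.valid_to, p.valid_to)
)
SELECT *
FROM deps
WHERE DATE '2021-06-09' BETWEEN valid_from AND valid_to
ORDER BY depth, id;
```

The idea is that department 5 can then appear at depth 3 only for 2006 (when department 3 hung directly under the root) and at depth 4 for the other periods, so a filter on a single date should leave at most one path per department. The final WHERE clause could instead be applied when selecting from a materialized view built on deps, which keeps the full history available.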


I want to see the departments and their hierarchy as of 15 May 2020, 9 June 2021, or any other date, which is why I put all data states (versions) into the CTE.

The materialized view is created from the recursive CTE:

    CREATE MATERIALIZED VIEW departments_mat_view AS
    WITH RECURSIVE deps AS (
        SELECT
            dep.id,
            dep.start_date AS dep_start_date,
            dep.end_date AS dep_end_date,
            dep.name,
            dep.code,
            1::int AS "depth",
            dep.parent_id,
            dep.dep_par_start_date,
            dep.dep_par_end_date
        FROM all_deps dep
        WHERE dep.code = '11111111'
        UNION
        SELECT
            dep.id,
            dep.start_date AS dep_start_date,
            dep.end_date AS dep_end_date,
            dep.name,
            dep.code,
            s.depth + 1 AS "depth",
            dep.parent_id,
            dep.dep_par_start_date,
            dep.dep_par_end_date
        FROM all_deps dep
        INNER JOIN deps s ON dep.parent_id = s.id
    ),
    all_deps AS (
        SELECT
            dep.*,
            dp.parent_id,
            dp.start_date AS dep_par_start_date,
            dp.end_date AS dep_par_end_date
        FROM tmp_deps dep
        LEFT JOIN tmp_dep_parents dp ON dp.department_id = dep.id
    )
    SELECT * FROM deps

And here is the SELECT from the materialized view:

    SELECT *
    FROM departments_mat_view
    WHERE
        dep_start_date <= '2021-06-09'
        AND dep_end_date >= '2021-06-09'
        AND (dep_par_start_date IS NULL OR dep_par_start_date <= '2021-06-09')
        AND (dep_par_end_date IS NULL OR dep_par_end_date >= '2021-06-09')

Here is the current output:

    "id"  "dep_start_date"  "dep_end_date"  "name"          "code"    "depth"  "parent_id"  "dep_par_start_date"  "dep_par_end_date"
    1     "1970-01-01"      "9999-12-31"    "Root Dep"      11111111  1        NULL         NULL                  NULL
    2     "1970-01-01"      "9999-12-31"    "Sub Root Dep"  11111112  2        1            "1970-01-01"          "9999-12-31"
    3     "2021-02-24"      "9999-12-31"    "Dep 1"         11111113  3        2            "2007-01-01"          "9999-12-31"
    4     "2020-01-02"      "9999-12-31"    "Dep 2"         11111114  3        2            "2010-05-04"          "9999-12-31"
    5     "1970-01-01"      "9999-12-31"    "Sub Dep 1"     11111115  3        3            "1970-01-01"          "9999-12-31"
    5     "1970-01-01"      "9999-12-31"    "Sub Dep 1"     11111115  4        3            "1970-01-01"          "9999-12-31"
    6     "1970-01-01"      "9999-12-31"    "Sub Dep 2"     11111116  4        4            "1970-01-01"          "9999-12-31"
    6     "1970-01-01"      "9999-12-31"    "Sub Dep 2"     11111116  5        4            "1970-01-01"          "9999-12-31"
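As a quick way to spot the affected rows, the following diagnostic sketch lists every department that shows up at more than one depth for the chosen date. It reuses the view and filter from above and adds nothing beyond standard aggregates; the depths column alias is just illustrative.

```sql
-- List departments that appear at more than one depth as of 2021-06-09.
SELECT
    id,
    array_agg(DISTINCT depth ORDER BY depth) AS depths
FROM departments_mat_view
WHERE
    dep_start_date <= '2021-06-09'
    AND dep_end_date >= '2021-06-09'
    AND (dep_par_start_date IS NULL OR dep_par_start_date <= '2021-06-09')
    AND (dep_par_end_date IS NULL OR dep_par_end_date >= '2021-06-09')
GROUP BY id
HAVING count(DISTINCT depth) > 1;
```

For the output shown above, this would flag ids 5 (depths 3 and 4) and 6 (depths 4 and 5), which are exactly the rows reached through a parent path that was not valid on that date.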

The desired output is:

    "id"  "dep_start_date"  "dep_end_date"  "name"          "code"    "depth"  "parent_id"  "dep_par_start_date"  "dep_par_end_date"
    1     "1970-01-01"      "9999-12-31"    "Root Dep"      11111111  1        NULL         NULL                  NULL
    2     "1970-01-01"      "9999-12-31"    "Sub Root Dep"  11111112  2        1            "1970-01-01"          "9999-12-31"
    3     "2021-02-24"      "9999-12-31"    "Dep 1"         11111113  3        2            "2007-01-01"          "9999-12-31"
    4     "2020-01-02"      "9999-12-31"    "Dep 2"         11111114  3        2            "2010-05-04"          "9999-12-31"
    5     "1970-01-01"      "9999-12-31"    "Sub Dep 1"     11111115  4        3            "1970-01-01"          "9999-12-31"
    6     "1970-01-01"      "9999-12-31"    "Sub Dep 2"     11111116  4        4            "1970-01-01"          "9999-12-31"
