2

I have a representation of a (binary) tree as a table in PostgreSQL. Each row is a node of the tree and each row has a parent column and a color.

The nodes at he leaves of the tree have a color (their color column is not null and can be either green or red).

My goal is to color the whole tree. With these rules:

  • If a parent has one child, then the parent's color is the child's color.
  • If a parent has two children and both have the same color then the parent color is its children color.
  • If a parent has two children and they have different color, then the parent color is gray.

How can I write a recursive query in PostgreSQL for this algorithm?

Example of a tree:

CREATE TEMPORARY TABLE family (
  node text
, parent text
, color text
);

INSERT INTO family (parent, node, color)
VALUES
  (null , 'A'   , null)
, ('A'  , 'AB'  , null)
, ('A'  , 'AC'  , null)
, ('AB' , 'ABD' , null)
, ('ABD', 'ABDE', 'green')
, ('AC' , 'ACF' , null)
, ('AC' , 'ACG' , null)
, ('ACF', 'ACFH', 'red')
, ('ACF', 'ACFI', 'green')
, ('ACG', 'ACGJ', 'red')
, ('ACG', 'ACGK', 'red')
;

This tree, fully colored:

A colored tree

2 Answers 2

1

A recursive CTE does not allow aggregation in the recursive term.

I suggest to update the original table one level at a time, top-down:

DO
$do$
DECLARE
   _max_depth int := (SELECT max(length(node)) - 1 FROM family);
   _depth int;
BEGIN
   FOR _depth IN REVERSE _max_depth .. 1 LOOP
      UPDATE family p
      SET    color = c.color
      FROM  (
         SELECT parent, CASE WHEN min(color) = max(color) THEN min(color) ELSE 'gray' END AS color
         FROM   family
         WHERE  length(node) = _depth + 1
         GROUP  BY 1
         ) c
      WHERE  p.node = c.parent
      AND    p.color IS DISTINCT FROM c.color;  -- optional
   END LOOP;
END
$do$;

fiddle

If you don't want to touch the original table, operate on a copy of the table, maybe a TEMPORARY table for best performance.

I use length(node) as indicator of the nesting level. If that's unreliable, replace it with a reliable nesting level. Either way, if the table is big and there are many levels, an index on the level indicator will help performance. For the demo that would be:

CREATE INDEX family_level_idx ON family (length(node));

The added WHERE clause p.color IS DISTINCT FROM c.color skips empty updates. See:

Related:

Sign up to request clarification or add additional context in comments.

Comments

1

You can use a recursive cte, aggregating after you perform the join on the parent node in a subquery:

with recursive cte(parent, node, color) as (
   select t.parent, t.node, case when t.c1 = t.c2 then t.c1 else 'gray' end    
   from (select f1.parent, f1.node, max(f.color) c1, min(f.color) c2 
         from family f 
         join family f1 on f1.node = f.parent where f.color is not null 
         group by f1.parent, f1.node) t
   union all
   select t1.parent, t1.node, case when t1.c1 = t1.c2 then t1.c1 else 'gray' end  
   from (select t.parent, t.node, max(t.color) c1, min(t.color) c2 
         from (select f.parent, f.node, c.color from cte c 
               join family f on f.node = c.parent) t 
         group by t.parent, t.node) t1
)
select c.parent, c.node, c.color from cte c
union all
select f.parent, f.node, f.color from family f where f.color is not null

See fiddle.

3 Comments

Thanks, but something seems to be wrong with this algorithm. Here's a sample of my actual data: dbfiddle.uk/wynbYEbp Node '91.73.128.0/17' must be gray, but it is repeated three times in the result set, once blue, other times red.
@homam Does your dataset represent a complete binary tree?
No, the table contains many distinct trees. So there are several roots.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.