Search Our Knowledge Base…

Search Our Knowledge Base…

Data Engineering

All you need to know about Recursive CTEs in SQL

All you need to know about Recursive CTEs in SQL

Written By: Sajagan Thirugnanam and Austin Levine

Last Updated on October 6, 2024

The purpose of this article is to help you gain a deep understanding of how recursive tables work. After reading this article, you should be able to construct a recursive table and have a comprehensive grasp of the underlying logic.

Definition of CTE

First things first, though. Before we dive into the world of recursive CTEs, it is essential to ensure that you understand what CTE stands for. CTE is an abbreviation for Common Table Expression, which represents a temporary result set. It is similar to a subquery in terms of logic; however, it is much simpler to read.

You can define a CTE by using the WITH clause. The WITH clause is followed by a list of one or more CTEs, each of which is defined by a SELECT statement.

There are two types of CTEs: non-recursive CTEs and recursive CTEs. Let's explore them separately.

Non-Recursive CTE

The general form we described before it refers mainly to non-recursive CTEs. To recap, it begins with the keyword "WITH." You then provide a name for your CTE, followed by the "AS" keyword, and you define the CTE within parentheses. The second part of the syntax consists of a simple SELECT statement, written immediately after the recursive CTE, without any commas, semicolons, or similar punctuation marks. As mentioned earlier, the CTE is used in another query just like any other table, and this is precisely what the SELECT statement accomplishes. Here is the general syntax:

WITH cte_name AS (cte_query_definition) SELECT * FROM cte_name;

Source: mariadb.com

And here is an example:

WITH recent_orders AS ( SELECT customer_id, order_date FROM orders WHERE order_date >= CURRENT_DATE - INTERVAL 1 MONTH ) SELECT customer_id, COUNT(*) AS num_orders FROM recent_orders GROUP BY customer_id

Recursive CTE

The main difference of a recursive CTE is that it reference itself in the SELECT statement and it eventually creates a loop. Other than that the overall syntax is the same., except from the fact that after the With clause, you need to also add the word RECURSIVE. When it does is it constantly repeats itself and it only stops when it returns all the results.

WITH RECURSIVE cte_name AS ( cte_query_definition (the anchor member) UNION ALL cte_query_definition (the recursive member) ) SELECT * FROM   cte_name;

The primary purpose of recursive CTEs is to be used when you need to query hierarchical data. A good example would be having an employees table and wanting to determine the hierarchy of each employee based on their superiors. Let's assume we have an employees table that includes the following information:

  • employee_id

  • manager_id

  • name

WITH RECURSIVE company_hierarchy AS ( SELECT employee_id, manager_id, name, 1 FROM employees WHERE manager_id IS NULL UNION ALL SELECT e.employee_id, e.manager_id, e.name, ch.level + 1 FROM employees e JOIN company_hierarchy ch ON e.manager_id = ch.employee_id ) SELECT * FROM company_hierarchy;

The final outcome of the above table would be a table with all the employees and their respective managers along with their names and the hierarchy level.

Recursive CTEs can be used to solve a wide variety of problems, such as:

  • Traversing hierarchical data structures

  • Finding all paths between two nodes in a graph

  • Computing transitive relationships

  • Calculating cumulative totals

In conclusion, recursive Common Table Expressions (CTEs) in SQL open up a world of possibilities for handling hierarchical data structures and solving complex problems with elegant, efficient solutions.