Hello. In this tutorial, we will learn the common table expressions CTE in Postgres which help break down complex SQL queries. For illustration, we will use the PostgreSQL database running on docker.
SQL stands for Structured Query Language and is used to extract and organize data stored in relational databases like MySQL, PostgreSQL, Oracle, etc. A relational database consists of rows and columns that allow fetching specific information from databases that can be used later for analysis. In real-time SQL manages a large amount of data that is written and read simultaneously and any query that reaches the SQL server is processed into three parts –
- Parsing – Process to check the SQL query syntax
- Binding – Process to check the SQL query semantics
- Optimization – Process to generate the SQL query execution plan
1.1 Usage of SQL
Structured Query Language (popularly known as SQL) is commonly used by data analysts and data science professionals and is helpful too –
- Execute queries against the database
- Retrieve data from the database
- Insert new records into the database
- Update existing records in the database
- Created stored procedures, functions, and materialized views in the database
- Create users and grant permissions
- Set permissions on tables, stored procedures, functions, and materialized views
1.2 What is a CTE
A common table expression popularly called CTE is a temporary result set that you can refer to with another SQL statement including
DELETE. The below statement represents the CTE syntax –
with cte_name (column_list) as ( CTE_query_definition ) statement;
In this syntax –
- Specify the optional column list followed by the CTE keyword
- Inside the CTE body specify the query that returns a result set
- Use the CTE like a table or a view in the statement
2. Postgres CTE
Let us dive into some practice implementation on the PostgreSQL database.
2.1 Pre-requirement – Postgres Setup
Usually, setting up the database is a tedious step but with Docker, it is a simple process. You can watch the video available at this link to understand the Docker installation on Windows OS. Once done open the terminal and trigger the below command to set and run PostgreSQL.
-- command to run postgres on docker – -- remember to change the password -- docker run -d -p 5433:5432 -e POSTGRES_PASSWORD= --name postgres postgres -- command to stop the Postgres docker container -- docker stop postgres -- command to remove the Postgres docker container -- docker rm postgres
Remember to enter the password of your choice. If everything goes well the postgresql database server would be up and running on a port number –
5433 and you can connect with the Dbeaver GUI tool for connecting to the server.
2.2 Setting up pre-requisite data
To play with the CTE I have exported the sample
dvdrental database into my Postgres instance running on docker. You can use the same by downloading the data dump from this link and exporting it into the PostgreSQL instance.
2.3 Postgres CTE
Use the below SQL to understand the CTE. Here the query definition inside the CTE will be responsible to calculate the rentals, invested amount, sales, profit and loss amount, etc for each record. Once the data is calculated we will use the statement to get the total of all the films. The column names given in the statement should match the ones used inside the query definition of CTE. You can explicitly trigger the
select statement written inside the CTE block to explore the SQL query and understand the result set.
Query for cte
-- understanding common table expression (cte) in postgresql using the dvdrental database with ctedata as ( select f.title, f.film_id, f.replacement_cost, f.rental_rate, count(distinct( i1.inventory_id )) as numberofcds, count (r.rental_id) as numberofrentals, count(distinct( i1.inventory_id )) * ( f.replacement_cost * 0.33 ) as invested, count(r.rental_id) * f.rental_rate as sales, ( count(r.rental_id) * f.rental_rate ) - count(distinct( i1.inventory_id )) * ( f.replacement_cost * 0.33 ) as profitorloss from film f join inventory i1 on ( i1.film_id = f.film_id ) join rental r on ( r.inventory_id = i1.inventory_id ) group by f.title, f.film_id, f.replacement_cost, f.rental_rate ) select sum(cte.invested) as invested_amount, sum(cte.sales) as gross_sales, sum(cte.profitorloss) as profit_or_loss from ctedata cte;
If everything goes well the request for the sql query will be shown on the console.
In this tutorial, we learned the Postgres CTE brief and a practical implementation of it. You can download the SQL scripts used in this tutorial from the Downloads section.
This was a tutorial to understand the Postgres CTE in SQL.
You can download the full source code of this example here: Postgres cte