SQL Subquery
Hello. In this tutorial, we will learn subqueries in sql and will use the postgresql database.
1. Introduction
SQL stands for Structured Query Language and is used to extract and organize data stored in relational databases like MySQL, PostgreSQL, Oracle, etc. A relational database consists of rows and columns that allow fetching specific information from databases that can be used later for analysis. In real-time SQL manages a large amount of data that is written and read simultaneously and any query that reaches the SQL server is processed into three parts –
- Parsing – Process to check the SQL query syntax
- Binding – Process to check the SQL query semantics
- Optimization – Process to generate the SQL query execution plan
1.1 Usage of SQL
Structured Query Language (popularly known as SQL) is commonly used by data analysts and data science professionals and is helpful too –
- Execute queries against the database
- Retrieve data from the database
- Insert new records into the database
- Update existing records into the database
- Created stored procedures, functions, and materialized views in the database
- Create users and grant permissions
- Set permissions on tables, stored procedures, functions, and materialized views
1.2 What is SubQuery?
A Subquery is also known as an inner query or inner select while the statement containing a subquery is also called an outer query or outer select. The inner query is executed first before its parent query and the results of an inner query are passed to the parent query. A subquery follows certain rules i.e.
- Must be enclosed within parentheses
- Can be used with operators like >, <, =, IN, ANY, or ALL
- An ORDER By command cannot be used in the subquery
1.2.1 Types of Sub queries
There are different types of subqueries –
- Single row subquery – Return zero or one row
- Multiple rows subquery – Return one or more rows
- Insert into sql query via Select subquery
- Delete via Select subquery
- Correlated subquery – The subquery depends on the other query of its values
2. Subquery in sql
Let us dive into some practice implementation on the postgresql database.
2.1 Pre-requirement – Postgres Setup
Usually, setting up the database is a tedious step but with Docker, it is a simple process. You can watch the video available at this link to understand the Docker installation on Windows OS. Once done open the terminal and trigger the below command to set and run postgresql.
Docker commands
-- command to run postgres on docker -- -- remember to change the password -- docker run -d -p 5433:5432 -e POSTGRES_PASSWORD= --name postgres postgres -- command to stop the Postgres docker container -- docker stop postgres -- command to remove the Postgres docker container -- docker rm postgres
Remember to enter the password of your choice. If everything goes well the postgresql database server would be up and running on a port number – 5433
and you can connect with the Dbeaver GUI tool for connecting to the server.
2.2. Creating a database
To practice this tutorial I will be using the learning
database. You can below command to create a database.
Create database sql query
-- create database create database learning
2.3 Creating a Student table and inserting data
Create a student
table in the learning
database. You can use the below script to create a table, insert some data into it and view it.
Student sql script
drop table student; create table student( id serial primary key, student_name varchar(50), city_name varchar(50) ); insert into student (student_name, city_name) values ('Ram', 'Agra'); insert into student (student_name, city_name) values ('Aarav', 'Mumbai'); insert into student (student_name, city_name) values ('Abhilash', 'Hyderabad'); insert into student (student_name, city_name) values ('Tessa', 'Delhi'); insert into student (student_name, city_name) values ('Daniel', 'Chennai'); insert into student (student_name, city_name) values ('Jane', 'Agra'); insert into student (student_name, city_name) values ('Nolan', 'Jaipur'); insert into student (student_name, city_name) values ('Ankit', 'Jaipur'); insert into student (student_name, city_name) values ('Robin', 'Hyderabad'); insert into student (student_name, city_name) values ('Tony', 'Agra'); select * from student s;
Once the data is viewed in the student
table it will be shown as below.
2.4 Creating a Student Marks table and inserting data
Create a student_marks
table in the learning
database. You can use the below script to create a table, insert some data into it and view it.
Student marks sql script
drop table student_marks; create table student_marks( id serial primary key, science int, maths int, english int ); insert into student_marks (science, maths, english) values (67, 60, 73); insert into student_marks (science, maths, english) values (4, 9, 16); insert into student_marks (science, maths, english) values (8, 20, 55); insert into student_marks (science, maths, english) values (38, 34, 37); insert into student_marks (science, maths, english) values (49, 25, 8); insert into student_marks (science, maths, english) values (71, 52, 94); insert into student_marks (science, maths, english) values (34, 30, 70); insert into student_marks (science, maths, english) values (39, 55, 92); insert into student_marks (science, maths, english) values (72, 91, 49); insert into student_marks (science, maths, english) values (97, 40, 93); select * from student_marks sm;
Once the data is viewed in the student_marks
table it will be shown as below.
2.5 Playing with Subqueries
Let us play around with the subqueries in sql.
2.5.1 Single row subquery
Single row subquery return one or more rows. The below sql query will get the 1 record from the student_marks
table for the student whose name is Ram and who lives in Agra city. The single row subquery makes use of the =
operator.
Single row subquery
-- SINGLE ROW SUBQUERY -- returns zeros or one row -- q: find student marks whose full name is 'Ram' and lives in 'Agra'; select * from student_marks sm where sm.id = (select s.id from student s where s.student_name='Ram' and s.city_name='Agra');
Once the sql query is executed the below result will be shown.
2.5.2 Multiple rows subquery
Multiple rows subquery return multiple rows. The below sql query will get multiple records from the student_marks
table for the students who live in Agra city. The multiple rows subquery makes use of the IN
keyword.
Multiple rows subquery
-- MULTIPLE ROWS SUBQUERY -- returns multiple rows -- q: find student marks who lives in 'Agra' select * from student_marks sm where sm.id in (select s.id from student s where s.city_name='Agra');
Sometimes the multiple rows subquery can use the NOT IN
or ANY
keyword. Once the sql query is executed the below result will be shown.
2.5.3 Insert into sql query via Select subquery
Insert into sql query via Select subquery is used to select the data from one table and insert it into another table. In the below query we will select the data from the student_marks
table and insert it into student_marks_bk
table.
Insert into sql query via Select subquery
-- INSERT INTO SUBQUERY -- to select data from one table and insert it into another table create table student_marks_bk( id int, science int, maths int, english int, total int ); -- insert into a table using select -- since the 'total' column is not present in the 'student_marks_bk' hence in the select query for 'student_marks' we are having a calculated column 'science+maths+english' -- to copy the total data in the 'student_marks_bk' table insert into student_marks_bk(id, science, maths, english, total) select id, science, maths, english, science+maths+english from student_marks; select * from student_marks_bk smb;
Once the sql query is executed the below result will be shown. The values in the total
column is a calculated column (i.e. addition of values in the science
, maths
, and english
columns of the student_marks
table).
2.5.4 Delete via Select subquery
In the below sql query we will perform the deletion of records from the student_marks_bk
table based on select subquery.
Delete via Select subquery
-- delete from 'student_marks_bk' where calculated column 'science+maths+english' value is < 150; delete from student_marks_bk smb where smb.id in (select sm.id from student_marks sm where (sm.science+sm.maths+sm.english) < 150); select * from student_marks_bk smb;
Once the sql query is executed the below result will be shown. The delete query will delete the records from the student_marks_bk
table where the total is less than 150.
2.5.5 Correlated subquery
Correlated subquery uses row-by-row processing i.e. each subquery is executed once for every row of the outer query. To understand the correlated sql query I will be using an employee
table that has two columns i.e. id
and salary
. The below sql query will fetch the third highest salary from the table.
Correlated subquery
-- q: find 3rd highest salary from employee select * from employee e1 where 3=(select count(distinct salary) from employee e2 where e1.salary<=e2.salary);
Once the sql query is executed the below result will be shown.
3. Summary
In this tutorial, we learned the brief of sql, subqueries, and its practical implementation via different examples. You can download the sql scripts and docker commands used in this tutorial from the Downloads section.
4. Download the Scripts
This was a tutorial to understand subqueries in sql.
You can download the full source code of this example here: Subquery in sql