SQL Count Function
In this article, we will discuss the COUNT function in SQL, different ways of using it, and how to combine it with other keywords like GROUP BY
and HAVING
.
1. Introduction
Many times, the information we want from a database table is not available in a single row but in a group of rows or in the result of a mathematical calculation. That is, the data in the group of rows need to be combined or a value expression calculated to obtain a single value of useful business information. To extract data in such cases, SQL
(Structured Query Language) provides us with operations that are called aggregate functions: AVG
, COUNT
, MIN
, MAX
, SUM
. All these aggregate functions return a single value.
2. Salient Features
COUNT
is an aggregate function that counts the number of rows in each group. The most basic aggregation is when you specify a *, which will tell you how many rows are there in the result set, including rows containing duplicate and Null values.
If you specify a column name, COUNT
tells how many non-null values are present in the column.
The GROUP BY
clause runs after the WHERE
clause has been evaluated. Here, we want the database server to group the result set into groups based on some column(s). These are specified using the GROUP BY
keywords. The system will collect the rows with the same values in these specified columns, into a group. It is important to remember that the GROUP BY
clause refers to columns generated by the FROM
and WHERE
clauses and cannot use any expression that appears in the SELECT
clause.
For any queries, we will need to get extra columns along with columns generated by aggregate functions. These columns are added to the SELECT
clause and should also be added in the GROUP BY
clause.
We need not specify columns in the GROUP BY
clause. Instead, we can use ordinal positions that correspond to each column position in the SELECT
part of the query.
While the COUNT
function determines the number of elements in each group, sometimes we may not want the duplicate values of a column from all members of the group. To achieve this, we use the DISTINCT
keyword along with the column name. However, we cannot use DISTINCT
with COUNT(*)
.
Sometimes, want to do second-level filtering on the grouped rows. For this, we use the HAVING
operator to apply filter condition(s) on the grouped result set. The HAVING BY
clause can be used without a GROUP BY
clause in which case, the HAVING BY
clause filters all rows returned from the WHERE
clause as if they are a single group. One key point here is that the WHERE
clause cannot have aggregate functions whereas the HAVING
clause can contain.
The ORDER BY
keywords can be put at the end of a SQL statement. It sorts the data based on the columns specified to it. The default sorting is ascending order, so if we want to get the result set in reverse order, we use the keyword DESC
.
3. Tables and Data
We will use three tables to demonstrate the usage of queries using COUNT
. The tables are: departments
and employees
pertaining to the HR function and a book_sales
table of a book_store
database.
The following diagram shows the structure of these tables.
In the employees
table, each employee (including managers) has his/her own employee_id
and they have their manager’s employee_id
in the manager_id
column. The department managers have their manager_id
as NULL
.
The book_store
database has six titles and monthly sales of each book are recorded in the book_sales
table. The month column is an integer, so 1 is used for January and 12 for December. The quantity
column has the sales number and if there are no sales of a particular book in the month, the quantity
is zero. In effect, there are six rows every month, one per book.
4. SQL Count Function – Examples
In this section, we demonstrate the usage of the COUNT
function in SELECT
queries on the data in the employees
and departments
tables as well as the book_sales
table. For each example, we first state the business requirement and then show its implementation as a SELECT
query. Along with some notes, screenshots of the result set are given below each query.
4.1 How many employees are there in each department?
SELECT department_id "Department Id", COUNT(*) "No. of employees" FROM employees GROUP BY department_id;
4.2 Which departments have employee count more than three?
For this requirement, we need to use GROUP BY
and HAVING
clauses.
SELECT department_id "Department Id", COUNT(*) "No. of employees" FROM employees GROUP BY department_id HAVING COUNT(*) > 3;
4.3 Department and employee count
Let’s say the business requirement is: We need department names with employee count arranged in decreasing order of employee count. For this requirement, we have to use the DESC keyword.
SELECT name Department, COUNT(employee_id) "No. of employees" FROM employees e INNER JOIN departments d ON e.department_id = d.department_id GROUP BY e.department_id ORDER BY COUNT(employee_id) DESC;
4.4 Department and manager names
Suppose management asks that we want a more refined report than the previous one. Instead of department_id
s, we want department names with department manager names. This requirement needs two JOIN
s. We need to take two instances of the employees
table and join each with a separate instance of the departments
table.
SELECT e1.first_name "Manager first name", e1.last_name "Manager last name", d1.name Department, COUNT(e2.employee_id) "No. of employees" FROM employees e1 JOIN departments d1 ON e1.department_id = d1.department_id, employees e2 JOIN departments d2 on e2.department_id = d2.department_id WHERE d1.department_id = d2.department_id AND e1.manager_id IS NULL AND e2.manager_id IS NOT NULL GROUP BY e1.first_name, e1.last_name, e2.department_id ORDER BY COUNT(e2.employee_id) DESC;
4.5 Zero-sale months
Let’s say we are asked to find out, for each book, how many months were there with no sales? For this case, the required rows can be retrieved by checking if quantity
is zero.
SELECT book_id, COUNT(quantity) "Zero sales months" FROM book_sales WHERE quantity = 0 GROUP BY book_id;
4.6 Monthly sales
Suppose we are asked to report in each month, how many of the six books were sold and what are the total copies sold in that month. We combine COUNT
with SUM
aggregate function.
SELECT month Month, COUNT(quantity) "Num books sold", SUM(quantity) "Monthly sales" FROM book_sales WHERE quantity > 0 GROUP BY month;
4.7 Months in which books were sold and annual sales
If the management tells us they want to know for each book, the number of months in which sales happened and its annual sales reported with book title in decreasing order of sales. For this requirement, we combine COUNT(*)
with SUM
and use DESC
.
SELECT title Title, COUNT(*) "Sale months", SUM(quantity) "Annual sales" FROM book_sales bs JOIN books b ON b.book_id = bs.book_id WHERE quantity > 0 GROUP BY b.book_id ORDER BY SUM(quantity) DESC;
4.8 Last quarter sales
Let’s say the requirement is to get sales for the last quarter only. To get the groups for the last quarter of the year, we need to use the filter month
greater than 9.
SELECT month Month, title Title, SUM(quantity) "Monthly sales" FROM book_sales bs JOIN books b ON b.book_id = bs.book_id GROUP BY month, title HAVING month > 9 ORDER BY month, SUM(quantity) DESC
5. Download the Source Code
This article was a tutorial on various use cases of the SQL COUNT function. The SQL
source code for creating the tables, inserting the sample data, and all queries discussed in this article are available in a zip file.
You can download the full source code of this example here: SQL Count Function
Last updated on Feb. 24th, 2022
Very informative Thanks Mahboob ji for sharing