SQL Data Types Tutorial
1. Introduction
SQL is the language used to communicate with Relational Databases such as MySQL, Oracle, MS SQL Server, PostgreSQL, etc. In this post, we will examine the various SQL Data Types and provide examples for MySQL, which is the most widely used open-source database. This post requires some knowledge in SQL and a tutorial for a complete understanding of SQL is found here.
Table Of Contents
2. Prerequisites
Install MySQL
To run the examples we will have to install the MySQL Community Server and use the MySQL Command-Line Client to interact with it. The MySQL version we will use is 5.7.26. From this link, download the MySQL Community Server 5.7.26 for your operating system and follow the installation instructions. Once MySQL is installed, you should be prompted to change the root password. Do not forget that password as it will be the only way to login to MySQL. Finally, MySQL should automatically start as a service.
Create Test Database
The MySQL Command-Line Client comes with the MySQL Community Server so you won’t have to install any additional application. To run the examples of this post, we will execute SQL commands through the MySQL Command-Line Client. Before running any example, we would have to login to MySQL and create a new test database and table, in which we will run all our examples.
Login to MySQL
Open a terminal window and run the following command to login to MySQL:
Login to MySQL
$ mysql -u root -p
This login command uses the root user, which is the default root user for MySQL. After that, you will be prompted to enter the password you set when you installed MySQL. Upon successful login you will see the following welcome message:
Output
Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 4 Server version: 5.7.26 MySQL Community Server (GPL) Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
Congratulations, you successfully logged in to MySQL and you can use the MySQL Command-Line Client to execute SQL commands. The next step would be to create a test database.
Create a Test Database
The test database will be used to run the examples of this post. In the MySQL Command-Line Client run:
Create new empty database
mysql> create database test; Query OK, 1 row affected (0.00 sec)
The above command will create a database named test and output that the operation was successful. To confirm that the database was created run:
Show all databases
mysql> show databases; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | performance_schema | | sys | | test | +--------------------+ 5 rows in set (0.00 sec)
The show databases command shows all the databases created for MySQL. The output shows that our test database was successfully created. Finally, to use the test database we need to run the following command:
Switch to test database
mysql> use test; Database changed
We have now switched to the test database and we are ready to start running examples in it.
3. SQL Basics
Before we dive into the SQL Data Types let’s see some basics for SQL. SQL stands for Structured Query Language and allows you to do CRUD (Create, Read, Update, Delete) operations on database tables. A table is a collection of data and is made up of rows and columns. The rows are used to store data and the columns hold the data type. In this post, we will focus on the data types which are essentially the column part of a table. Each table must have a unique identifier for the rows, also called the primary key.
Creating a table
Let’s now create the table that we will use in the examples below. The table simply represents the articles of a blog. In the MySQL Command-Line Client run:
Create a test table
mysql> CREATE TABLE articles ( id INT AUTO_INCREMENT, title VARCHAR(64) NOT NULL, posted_by VARCHAR(64) NOT NULL, posted_date DATE, active BIT(1) NOT NULL, last_update_date DATETIME, rating_percentage DOUBLE, views BIGINT, no_of_comments SMALLINT, category ENUM('JAVA', 'QUARTZ'), body TEXT, metadata JSON, PRIMARY KEY (id) );
In the table above, the id is the primary key which auto increments during an insert. We will examine the rest of the columns in the following section.
View the table structure
To view the structure of the articles table we just created, run the following command:
Show articles table structure
mysql> describe articles; +-------------------+-----------------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +-------------------+-----------------------+------+-----+---------+----------------+ | id | int(11) | NO | PRI | NULL | auto_increment | | title | varchar(64) | NO | | NULL | | | posted_by | varchar(64) | NO | | NULL | | | posted_date | date | YES | | NULL | | | active | bit(1) | NO | | NULL | | | last_update_date | datetime | YES | | NULL | | | rating_percentage | double | YES | | NULL | | | views | bigint(20) | YES | | NULL | | | no_of_comments | smallint(6) | YES | | NULL | | | category | enum('JAVA','QUARTZ') | YES | | NULL | | | body | text | YES | | NULL | | | metadata | json | YES | | NULL | | +-------------------+-----------------------+------+-----+---------+----------------+
This command is very useful when we want to view the data type for a table.
Inserting test data into table
Finally, we have to insert some articles into the table we created in the previous section. Open the MySQL Command-Line Client and run:
Insert data into articles table
mysql> insert into articles(title, posted_by, posted_date, active, last_update_date, rating_percentage, views, no_of_comments, category, body, metadata) values ('Java Microservices', 'bob', '2019-04-01', 1, '2019-03-29 05:10:23', 85.23, 453, 5, 'JAVA', 'blah', '{"total_author_posts": 15, "tags": ["Java", "Microservices"]}'), ('Quartz Best Practices', 'tom', '2018-11-05', 1, '2018-11-04 15:43:00', 76.3, 7834, 28, 'QUARTZ', 'blah', '{"total_author_posts": 4, "tags": ["Quartz"]}'), ('Java HashMap', 'tom', '2015-04-24', 0, '2015-04-20', 34, 6543, 2, 'JAVA', 'blah', '{"tags": ["Java"]}');
We added three rows to the test table. Notice that we didn’t add any value in the id column, as it auto increments as we said before. To confirm that run:
View articles data
mysql> select id, title from articles; +----+-----------------------+ | id | title | +----+-----------------------+ | 1 | Java Microservices | | 2 | Quartz Best Practices | | 3 | Java HashMap | +----+-----------------------+ 3 rows in set (0.00 sec)
That returns 3 rows that have an auto-incremented id. We have finished creating a test table and insert some test data into it. In the following section, we will see the SQL Data Types and use the table we created in our examples.
4. SQL Data Types
Like any programming language, SQL also has many data types. If we could divide the SQL data types into categories then those would be:
- Numeric
- Character
- Date and Time
- JSON / XML
Each relational database does not have support for all the data types that we will see in the following sections. We will examine the most widely used ones and focus on the equivalent data types of MySQL in the code examples.
4.1 Numeric
The numeric data types hold numbers with or without scale and are divided into 4 main categories:
- BIT
- INTEGER
- FIXED POINT
- FLOATING POINT
Let’s see in more detail each one of those in the following sections.
BIT
The BIT(M) data type stores bit values. It usually ranges from 0-1 but in MySQL, it ranges from 1-64. The BIT(1) can also act as a boolean (true or false).
INTEGER
There are various integer data types that should be used depending on the range needed for each use case. The following table shows all the integer data types and their range and required storage.
DATA TYPE | RANGE FROM | RANGE TO | STORAGE |
TINYINT | 0 | 255 | 1 Byte |
SMALLINT | -32,768 | 32,767 | 2 Bytes |
MEDIUMINT | -8,388,608 | 8,388,608 | 3 Bytes |
INT | -2,147,483,648 | 2,147,483,648 | 4 Bytes |
BIGINT | -9,223,372,036,854,775,808 | 9,223,372,036,854,775,808 | 8 Bytes |
FIXED POINT
The DECIMAL(M, D) is a fixed-point data type that has fixed precision and scale. M
is the total number of digits (the precision) and D
is the number of digits after the decimal point (the scale). In MySQL, the maximum number of digits (M
) for DECIMAL is 65 and the maximum number of supported decimals (D
) is 30.
FLOATING POINT
The floating-point data types are the DOUBLE(M, D) and FLOAT(M, D). The DOUBLE is a small (single-precision) floating-point number, whereas the FLOAT is a normal-size (double-precision) floating-point number. The following table shows the ranges and required storage for those data types:
DATA TYPE | RANGE FROM | RANGE TO | STORAGE |
FLOAT | -3.402823466E+38 | 3.402823466E+38 | 4 Bytes |
DOUBLE | -1.7976931348623157E+308 | 1.7976931348623157E+308 | 8 Bytes |
Example
Let’s go back to our articles table we created before and examine the numeric columns of it:
- id INT
- active BIT(1)
- rating_percentage DOUBLE
- views BIGINT
- no_of_comments SMALLINT
The id is the primary key and is of type INT. Note that if we had too many records into this table then having declared that as INT would not be sufficient. The active column is a BIT(1) which acts as a boolean. The rating_percentage is a DOUBLE and takes values with precision. The view is BIGINT as we might reach a huge number of viewers. Finally, the no_of_comments is a SMALLINT as we might need to limit the comments for our articles.
To view all those columns open the MySQL Command-Line Client and run:
View numeric columns
mysql> select id, active+0, rating_percentage, views, no_of_comments from articles; +----+----------+-------------------+-------+----------------+ | id | active+0 | rating_percentage | views | no_of_comments | +----+----------+-------------------+-------+----------------+ | 1 | 1 | 85.23 | 453 | 5 | | 2 | 1 | 76.3 | 7834 | 28 | | 3 | 0 | 34 | 6543 | 2 | +----+----------+-------------------+-------+----------------+ 3 rows in set (0.00 sec)
This query returns all the numeric columns of our articles table. Note the active+0 we used here. This is because the BIT is stored as a binary in MySQL so we have to convert it to text in order to view it properly.
4.2 Character
The Character data type is a synonym for the String data type. Each character data type can be stored as a string or binary and has an equivalent Unicode data type. The following table shows a list of Character data types:
DATA TYPE | UNICODE DATA TYPE | BINARY | STORAGE |
CHAR | NCHAR | BINARY | Fixed-length – uses the same amount of storage space per entry |
VARCHAR | NVARCHAR | VARBINARY | Variable-length – uses the amount necessary to store the actual text |
TEXT | NTEXT | BLOB | Variable-huge-length – uses the amount necessary to store the actual text |
CHAR – NCHAR – BINARY
The length of the CHAR is fixed to the length that you declare when you create the table. In MySQL, the length can be any value from 0 to 255. The NCHAR is the Unicode version of CHAR and the BINARY is the binary version of it.
VARCHAR – NVARCHAR – VARBINARY
Values in the VARCHAR are variable-length strings. In MySQL, the length can be specified as a value from 0 to 65,535. The NVARCHAR is the Unicode version of VARCHAR and the VARBINARY is the binary version of it.
TEXT – NTEXT – BLOB
The TEXT is a variable-length string that stores huge strings in a non-binary format. The NTEXT is the Unicode version of TEXT and the BLOB is the binary version of it.
ENUM
Another character data type is the ENUM. The ENUM is a list of string values that are defined during the table creation. You can’t set an ENUM to a value that is not defined in the list.
Example
Let’s go back to our articles table we created before and examine the character columns of it:
- title VARCHAR(64)
- posted_by VARCHAR(64)
- body TEXT
- category ENUM(‘JAVA’, ‘QUARTZ’)
The title and posted_by are VARCHAR with a maximum length of 64 characters. The body is the actual article body and should be a huge string that’s why it’s declared a TEXT. The category is an ENUM which can take only 2 values, JAVA or QUARTZ.
To view all those columns open the MySQL Command-Line Client and run:
View character columns
mysql> select id, title, posted_by, category, body from articles; +----+-----------------------+-----------+----------+------+ | id | title | posted_by | category | body | +----+-----------------------+-----------+----------+------+ | 1 | Java Microservices | bob | JAVA | blah | | 2 | Quartz Best Practices | tom | QUARTZ | blah | | 3 | Java HashMap | tom | JAVA | blah | +----+-----------------------+-----------+----------+------+ 3 rows in set (0.00 sec)
The result set shows all the character columns. Note here that the body column is shown as a string and not as a binary.
Let’s try to change the category value of the first row to a value that is not defined in the ENUM and see the error we’ll get:
Change the value of ENUM
mysql> update articles set category = 'Microservices' where id = 1; ERROR 1265 (01000): Data truncated for column 'category' at row 1
As we see MySQL returns an error when we try to set the value of an ENUM data type to a non defined value.
4.3 Date and Time
The DATE and TIME data types are very important when it comes to saving dates and times. The following table shows the different date and time data types:
DATA TYPE | FORMAT |
DATE | YYYY-MM-DD |
DATETIME | YYYY-MM-DD HH:MI:SS |
TIME | HH:MI:SS |
YEAR | YYYY |
DATE
The DATE stores year, month and day in YYYY-MM-DD format.
DATETIME
The DATETIME stores year, month, day, hour, minute and second in YYYY-MM-DD HH:MI:SS format.
TIME
The TIME stores hour, minute and second in HH:MI:SS format.
YEAR
The YEAR stores year in 2-digit (YY) or 4-digit (YYYY) format.
Example
Let’s go back to our articles table we created before and examine the date and time columns of it:
- posted_date DATE
- last_update_date DATETIME
The posted_date is a DATE as we are not interested in the time of the posted date. The last_update_date is a DATETIME as it shows the exact date and time the article was updated.
To view all those columns open the MySQL Command-Line Client and run:
View date and time columns
mysql> select id, posted_date, last_update_date from articles; +----+-------------+---------------------+ | id | posted_date | last_update_date | +----+-------------+---------------------+ | 1 | 2019-04-01 | 2019-03-29 05:10:23 | | 2 | 2018-11-05 | 2018-11-04 15:43:00 | | 3 | 2015-04-24 | 2015-04-20 00:00:00 | +----+-------------+---------------------+ 3 rows in set (0.00 sec)
The result set returns the 3 rows with the date and time columns. Note that the last_update_date of the third row didn’t have time when we inserted it, but MySQL formats that column with time and returns 00:00:00.
4.4 JSON / XML
The JSON and XML are special SQL data types that store JSON or XML documents respectively. The JSON data type provides automatic validation of JSON documents and optimized storage format. You can validate a JSON before inserting it here. The XML data type is not supported by MySQL but you can use a TEXT data type to store XML.
Example
Let’s go back to our articles table we created before and examine the JSON column of it:
View JSON column
mysql> select id, title, metadata from articles; +----+-----------------------+---------------------------------------------------------------+ | id | title | metadata | +----+-----------------------+---------------------------------------------------------------+ | 1 | Java Microservices | {"tags": ["Java", "Microservices"], "total_author_posts": 15} | | 2 | Quartz Best Practices | {"tags": ["Quartz"], "total_author_posts": 4} | | 3 | Java HashMap | {"tags": ["Java"]} | +----+-----------------------+---------------------------------------------------------------+ 3 rows in set (0.00 sec)
The above JSON values are valid, otherwise, they wouldn’t be inserted. Let’s try and update the metadata column with an invalid JSON and see what will happen:
Update JSON to invalid one
mysql> update articles set metadata = '{"tags: Java"}' where id = 1; ERROR 3140 (22032): Invalid JSON text: "Missing a colon after a name of object member." at position 14 in value for column 'articles.metadata'.
As we see, MySQL does not allow us to insert an invalid JSON in a JSON column and returns an error.
5. SQL Data Types – Best Practices
In this section, we will explore best practices for choosing the appropriate SQL Data Type. By doing that you will get better performance of disk, memory and CPU. Let’s see below some of those best practices.
Choose the smallest data type
Make sure you always choose the smallest data type for a table column. That will be beneficial due to less space on the disk, in memory, and in the CPU cache. If you are in doubt as to which data type is the smallest for your use case, then always choose the smallest that you are sure you won’t exceed. You can always change the data type, but you should avoid doing that as it is a time-consuming operation. For example, in the articles table, we used the SMALINT data type for the no_of_comments columns as we assumed that we wouldn’t have more than 32,767 comments for an article.
Choose the simplest data type
Always choose the simplest SQL data type. By that, we mean that integer is cheaper to compare than character and date and time are better than storing a DateTime as a character. This will lower the CPU required for operations such as comparing or sorting.
Avoid NULL data type values
Many developers choose to make a table column NULL, as they are not 100% sure if it can take a non-null value during an insert. It’s harder for a database engine to optimize queries that refer to NULL columns, because they make indexes and comparisons more complicated, use more storage space and require special processing.
Avoid using ENUM
At first glance, ENUM seems to be a great data type to use, as it has a predefined list it can take values from and as such it’s easy for developers to know beforehand which value to use. However, there will be times that you would want to add new values to the ENUM list. That is because, when adding a new ENUM value the database will have to rebuild the entire table which might take a lot of time and will slow the overall performance of the database.
6. Conclusion
In this post, we examined the various SQL Data Types: Numeric, Character, Date and Time, JSON / XML. We also created a test database and table using the open-source MySQL database and provided several code examples. Finally, we covered best practices for choosing the right SQL Data Type to improve the performance of a database.
7. Download the SQL commands
You can download the SQL commands of the above examples here: SQL Data Types Tutorial
Last updated on Nov. 07th, 2021