sql

MySQL Replication Tutorial

1. Introduction

MySQL is one of the most used SQL databases in many of the world-enterprise applications due to its ability to manage transactions and the ACID-behaviour which is built into its core. One of the reasons MySQL is so popular is due to the easiness it provides for its usage and its related commands. In the open source web application framework LAMP (which consist of Linux, Apache, MySQL and PHP), MySQL server is a central & important component. The MySQL Database server is written using C and C++ which internally uses a lexical analyser to parse and understand the SQL queries.

Today, when the systems have become distributed, scalable and highly fault tolerant, we cannot afford a fault in our databases like the DB servers going down and there is no way this is managed automatically. In this lesson, we will study about MySQL Replication, a concept through which we can make sure that even if the MySQL DB for a system goes down, it can shift to its replica and manage data from that as well, without letting users know what happened in the garage. This is one of the reasons that most of the systems today make use of MySQL Replication. MySQL Replication can be done for various reasons, including but not limited to:

  1. To ensure we have a backup of the data directly from our database
  2. To run analytics or check the data without disturbing the main database
  3. Scaling out the DB for better performance

As MySQL replication has so many uses, we must perform it in our production systems. If you want to find out what is MySQL Server and how to use it, read this lesson. Let’s get started.

2. MySQL Setup

For this lesson, we have made two new servers with different IPs which we will be using as Master and Slave in our replica set. To proceed further, we need to setup MySQL Server and Client tools on both of them. We can do this be executing the following command:

Installing MySQL Server and Client

sudo apt-get install mysql-server mysql-client

Once we run this command, the mentioned utilities will be installed on the server. We will do the same installation on both the servers we made. While installing, we also set a MySQL root password:

MySQL Replication - Setting Root Password
Setting Root Password

Once the installation process is completed, we can confirm if the MySQL server is up and running with the following command:

Check MySQL Service status

sudo service mysql status

We will see an output like this:

MySQL Replication - Check MySQL Server Status
Check MySQL Server Status

Now that we know that the MySQL server is up and running, we can connect to it using the username and password we supplied to it during the installation process. We can use the following command to do so:

Logging into MySQL

mysql -u root -p

When we press enter, MySQL Server will silently wait for us to provide the password. When we type it, the password will not be echoed back to the terminal due to security reasons. We will see the following prompt once we have logged into the MySQL command line:

MySQL Replication - MySQL Login
MySQL Login

When we are inside the MySQL command prompt, we can use the given command to show the databases which are present in the system and to ensure that MySQL is running fine:

Show All databases

show databases;

Here is what we get back with this command:

MySQL Replication - Check MySQL Databases
Check MySQL Databases

In the output, MySQL just displays the list of MySQL default databases that are used for administration purpose. Once we can see the Active status on both the servers, we can move ahead with the configuration for Master and Slave databases.

3. Master MySQL Server Configuration

Now that we have active MySQL installation, we can start with the configuration of the master database first. To do this, we need to add configuration in the master MySQL config file. To open it with the nano editor on Ubuntu, use the following command:

Edit config file

sudo nano /etc/mysql/mysql.conf.d/my.cnf

This file contains many options with which we can modify and configure the behaviour of the MySQL server we are running on the system. Firstly, we need to find the bind-address property in the file which will look like this:

Bind Address property

# Instead of skip-networking the default is now to listen only on
# localhost which is more compatible and is not less secure.
bind-address            = 127.0.0.1

We will modify this IP to the current server IP itself so that it looks like:

Updated Bind Address property

bind-address            = <server_ip_here>

The next property we need to find is server-id in the file which will look like this:

Server ID property

# note: if you are setting up a replication slave, see README.Debian about
#       other settings you may need to change.
#server-id              = 1

This will be a commented property which we will uncomment:

Updated Server ID property

server-id              = 1

Next important property to look for is log_bin. This property informs about the file where the replica set details are actually kept. This looks like:

Log Bin property

#log_bin                        = /var/log/mysql/mysql-bin.log

This is the file where the slave logs about the changes it has accomodated in it from the master database. We will just uncomment the property for now. Finally, we will edit the binlog_do_db property which informs the slave database server about which DB to replicate across the Slave database. We can include more than one database by repeating this line for all of the databases we will need:

DB to backup

binlog_do_db            = jcg_database

Here is a glimpse of the config file which shows some of the properties we updated:

MySQL Replication - Updated config file
Updated config file

Once all of the properties are done, we can save the file and make sure that MySQL server is restarted so that these changes are reflected in the server. To restart MySQL server, run the following command:

Restart MySQL

sudo service mysql restart

Once the MySQL Server restartes, the next changes we need to do is inside the MySQL shell itself. So log back in the MySQL command line again.

We need to grant some privileges to the Slave DB so that it can access and replicate the data from the DB we mentioned in the config file, jcg_database. We will execute the following commands to grant the privileges to the Slave DB:

Grant Privileges

GRANT REPLICATION SLAVE ON *.* TO 'root'@'%' IDENTIFIED BY 'password';

Now, flush the privileges with the command:

Flush Privileges

FLUSH PRIVILEGES;

Now, switch to the database we want to replicate after creating it:

Use new DB

mysql> CREATE SCHEMA jcg_database;
Query OK, 1 row affected (0.00 sec)

mysql> USE jcg_database;
Database changed

Now, lock the database to stop any new changes happening in it with the following command:

Apply Read Lock

FLUSH TABLES WITH READ LOCK;

Please note that we made some new tables and inserted some data before we applied this lock. Now, type the following command:

Check Master Status

SHOW MASTER STATUS;

Here is what we get back with this command:

MySQL Replication - Master DB Status
Master DB Status

Note the position as this is the place from where the slave DB will start replicating the database. Now, if we do any changes to DB, it will unlock automatically, so do not do any new changes in same window. The next part is a little tricky. Open a new terminal window or tab (without closing the current tab) and log into the MySQL server and execute the following command:

Dump MySQL

mysqldump -u root -p --opt jcg_database > jcg_database.sql

Here is what we get back with this command:

MySQL Replication - MySQL Dump
MySQL Dump

Now we we can exit the new tab we opened separately and return to older tab. On that tab, unlock the DB and exit from MySQL:

Unlock and Quit

UNLOCK TABLES;
QUIT;

With that, we are done with all the configuration needed to be done on master database.

4. Slave MySQL Server Configuration

Now, we are ready to start configuring our Slave database which will replicate the data. We can log in to the Slave server and open MySQL command line in it. For now, we only need to create a DB with the same name which we want to replicate and exit from the MySQL terminal:

MySQL Replication - MySQL Slave DB
MySQL Slave DB

Now, we need to import the original database into Slave MySQL server with the SQL file we made. make sure that you bring that file on this new server and run the following command to import it into slave MySQL DB:

Import DB

mysql -u root -p jcg_database < /root/jcg_database.sql

Once you hit enter, the database content and metadata will be imported into the slave DB. Once that is done, we can configure the Slave MySQL DB as well:

Configure DB

nano /etc/mysql/mysql.conf.d/mysqld.cnf

We need to make sure that some properties in this config file are setup as well. The firt property is server-id. It is currently set to 1 which is the default value. Set it to some other value:

Server ID for Slave

server-id               = 2

Here are some other properties which we need to modify as well:

Other properties for Slace

relay-log               = /var/log/mysql/mysql-relay-bin.log

log_bin                 = /var/log/mysql/mysql-bin.log

binlog_do_db            = jcg_database

We will have to add relay-log property as it is not present in the config file by default. Once this is done, we need to restart the Slave MySQL DB as well so that the configuration changes we did can come into effect. Execute the following command to do this:

Restart MySQL

sudo service mysql restart

Once the MySQL Server restartes, the next changes we need to do is inside the MySQL shell itself. So log back in the MySQL command line again.

In the MySQL shell, execute the following command:

Enable Replication

CHANGE MASTER TO MASTER_HOST='<master-server-ip>',MASTER_USER='root', MASTER_PASSWORD='hello123', MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=  1306;

This command completes various steps at a single time, which are:

  • It informs current MySQL server that it is Slave to given MySQL Master Server
  • It provides Slace with login credentials to Master Server
  • It informs about the position from which Slave needs to start the replication process and also the log file details

We can now finally activate the slave server with the following command:

Activate MySQL Slave Server

START SLAVE;

We can now see some master details with the following command:

MySQL Master Status

SHOW SLAVE STATUS\G;

Here is what we get back with the above command:

MySQL Master Status Info

mysql> SHOW SLAVE STATUS\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 206.189.133.122
                  Master_User: root
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 1306
               Relay_Log_File: mysql-relay-bin.000002
                Relay_Log_Pos: 320
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1306
              Relay_Log_Space: 527
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 3b890258-be5d-11e8-88c2-422b77d8526c
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)

If there is an issue in connecting, you can try starting slave with a command to skip over it:

MySQL Master Status

SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1; SLAVE START; 

Now, we are done with MySQL replication and you should see the data being replicated across MySQL servers. To test the replication, you can try inserting some data into the Master database and check if the data is replicated to the slave database as well.

5. Replication lag

MySQL replication makes use of two threads to complete replication across master and slave databases:

  1. IO_THREAD
  2. SQL_THREAD

The IO_THREAD connects to the master MySQL server, reads the binary log to track and change events in the database, copies them to the local relay log file from where SQL_THREAD of Slave database reads and track the changes and replicate them to the Slave database. If we observe any replication lag, we must first identify if this lag is from Slave’s IO_THREAD or Slave’s SQL_THREAD.

Usually, I/O thread doesn’t cause any significant replication delay as it is just reading the binary logs from the master database but there are factors which can affect its performance like network connectivity, network latency etc. and how fast is the communication network line between the two servers. If there are high amount fo writes on Master, the replication could be slow due to bandwidth issues.

On the other side of the coin, if it is the SQL Thread on Slave which is causing the delay, the most probable reason for the same is that the SQL queries from the Master DB are taking too long to execute on Slave database which piles up the changes which need to be done. Also, take into account that slave is single threaded prior to MySQL 5.6, which would be another reason for delays on the slave SQL_THREAD.

6. Advantages of Replication

To reiterate what we stated earlier briefly, let us state here some explicit advantages of MySQL replication in a production environment:

  • Performance: A Slave server can be easily used to provide READ support to any client asking for data. This means that the load on the Master database is reduced a lot as there are no reads being done on it.
  • Backup Performance: If there are any backup tasks which are run, it can be run via the Slave database as data is being replicated. This means that backup jobs will not affect the Master database at all.
  • Disaster Recovery: In the events of Master database going completely offline, the Slave database can quickly take its place and start performing write operations, if configured in that manner. This will allow minimal site downtime as the master server is rebuilt and restored.

7. Disadvantages of Replication

Although everything sounded so good about MySQL Replication, there are some downsides about the replication process as well, which are stated here:

  • Complexity: An application with a lot of Slaves for replication can create a maintenance nightmare if not managed correctly but this can be easily overcome
  • Performance: To complete the replication process, the binary logs need to be written to the disk. Although this can be minimal but it still needs to be considered while looking at overall server performance. This can be solved by making binary logs written to a separate partition of the disk to limit IO performance issues.

8. Limitations of Replication

There are some limitations (or replication design) points which needs to be stated explicitly here:

  • Replication is not a backup for application logic and any changes which are done on the Master database will always replicate to Slave database and there is no way this can be limited. If a user deletes data on the master database, it will be deleted on the Slave database as well.
  • In case of multiple Slaves, performance is not increased but decreased as well because the database connections are now distributed across multiple servers and the risk of issues in case of any server fails will increase.

9. Types of MySQL Replication

Essentially, MySQL support three different methods to replicate data from the master server to a slave server. All of these ways make use of the binary logs but it differs in the way the logs are written to it. Here are the ways replication is done:

  1. Statement-based replication: With this method, with each change in the database, the SQL statements for those changes are stored in the binary log file. The slave will read these SQL statements and execute them on its own MySQL database to produce the exact same copy fo data from the master server. This is the default replication method in MySQL 5.1.11 and earlier and MySQL 5.1.29 onwards.
  2. Row-based replication: In this method, the binary log file stores all of the record-level changes which occur in the master database tables. The slave server reads this data and updates its own records based on master’s data to produce an exact replica of the master database.
  3. Mixed-format replication: In this method, the server will dynamically select between statement-based replication and row-based replication, depending on some conditions like using a user-defined function (UDF), using an INSERT command with the DELAYED clause, using temporary tables, or using a statement that uses system variables. This is the default replication method in MySQL 5.1.12 to MySQL 5.1.28.

In a use-case when you are not sure about which replication method to move forward with, it’s best to go with Statement-based replication as it is used most commonly and simplest to carry out. Though, if you have a write-heavy system, Statement-based replication is not recommended as it applies table locks as well. In that case, you can go with Row-based replication method which requires a lot less table locks.

10. Effect on Performance

As mentioned, replication can affect the performance of your database but it can be minimal. The impact on the master server from replication is typically very small compared to all of the other things it has to do because the master only really has to accomplish two significant things in a replication environment:

  • formulate and write events to the binary log on the local hard drive
  • send a copy of every event it writes to the binary log to every connected slave

We cannot consider writing the binary log to be a cost of replication, because we should always have binary logging turned on even if we’re not replicating. It’s an extremely valuable troubleshooting and recovery tool.

The cost of sending the replication events to the slave(s) is also negligible because the slave is responsible for maintaining a persistent TCP connection to the master, which only has to copy the data onto the socket as the events occur. Beyond that, the master neither knows nor cares whether or when the slave gets around to executing them.

A partial exception to that last statement is semi-synchronous replication, which is not the default. In this mode, the master waits for at least one slave to acknowledge receipt and persistent storage (though not actual execution) of the binary log events from each transaction, before the master returns control to the client on each commit.

But in every case, the master is not responsible for actually executing the updates on the slave — it just sends one of two things to the slave: either a copy of the actual input query that ran (in statement-based mode) or the data for the rows actually inserted/updated/deleted by each query (in row-based mode). In mixed mode, the query optimizer will decide which format to use on a per-event basis.

11. MySQL Replication – Conclusion

MySQL Replication is an excellent choice (besides its disadvantages and limitations) when it comes to making sure that a production system runs with failover reliability which makes it a fault-tolerant system, which is a must when it comes to distributed and highly available systems today. The replication is so important to be done that even the cloud distributors like AWS support it natively.

This lesson provides with important configuration changes which need to be done for a system to replicate its data across a single slave. We can set up any number of slaves without disturbing the master server as there is no configuration on the Master server which relates or binds it to any Slave server. We also provided some knowledge about the advantages, disadvantages and limitations MySQL replication has which were important to be stated explicitly.

Shubham Aggarwal

Shubham is a Java Backend and Data Analytics Engineer with more than 3 years of experience in building quality products with Spring Boot, MongoDB, Elasticsearch, MySQL, Docker, AWS, Git, PrestoDB tools and I have a deep knowledge and passion towards analytics, Micro-service based architecture, design patterns, antipatterns and software design thinking.
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

2 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Yatin
5 years ago

Thank you for the tutorial. It is great! :)

Back to top button