MongoDB is one of the most widely used NoSQL databases. Drivers exist for almost every mainstream programming language, so it is worth learning the one standard mechanism they all share for connecting to the database. In this article, we discuss that common item, which you need to understand and use when establishing a connection to MongoDB irrespective of the programming language.
This article discusses how to form the connection string used to connect to MongoDB from any programming language. Like most databases, MongoDB uses a connection string to specify connection parameters such as the host, port, username, and password. MongoDB also supports connecting to multiple hosts at once by listing several of them in a single connection string. We discuss each of these aspects in turn.
2. Standard connection string
In this section, we discuss the standard connection string format that is followed by the different libraries and drivers. A list of the drivers and libraries used to connect to MongoDB is available in the MongoDB documentation. The standard connection string has the following format:
mongodb://[username:password@]host1[:port1][,...hostN[:portN]][/[database][?options]]
The connection string above has a total of five parameters, discussed below:
[username:password@]: This optional part of the connection string specifies the username and password used to authenticate. The password is written as plain text, and the server verifies it against the credentials it stores, which are kept in hashed form rather than as plain text. If the username or password contains characters such as @, :, / or %, they must be percent-encoded. Unless stated otherwise, the credentials are checked against the database specified in the URL. We cover this further in the authentication options section.
[host1][:port1]: The second parameter is the host and port of the MongoDB server we are connecting to. The host can be an IP address or a domain or subdomain name that resolves to a MongoDB server. The port is optional and defaults to 27017. A typical host and port specification is mongodb://localhost:27017.
The URL above is almost a complete connection string. Notice, however, that we have not yet specified which database to connect to. Just as a MySQL server holds several schemas, a MongoDB server holds several databases. The database is specified in the URL itself, after the host and port, for example mongodb://localhost:27017/database.
Here, database is the name of the database to connect to. Note that the username and password specified above need to be valid for this database.
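The parts described above can be assembled programmatically. The following is a minimal Python sketch that uses only the standard library; the function name build_uri, the host db.example.com, the database inventory, and the credentials are all hypothetical values chosen for illustration, not part of any MongoDB driver.

```python
from urllib.parse import quote_plus


def build_uri(host, port=27017, database=None, username=None, password=None):
    """Assemble mongodb://[username:password@]host[:port][/database]."""
    credentials = ""
    if username is not None and password is not None:
        # Percent-encode so characters such as '@', ':' and '/' in the
        # credentials cannot be mistaken for URL delimiters.
        credentials = f"{quote_plus(username)}:{quote_plus(password)}@"
    path = f"/{database}" if database else ""
    return f"mongodb://{credentials}{host}:{port}{path}"


# Hypothetical values, for illustration only.
uri = build_uri("db.example.com", database="inventory",
                username="app_user", password="p@ss:word/1")
print(uri)
```

The resulting string is what you would normally hand to a driver's client constructor. Encoding the credentials first is the important step, since an unencoded @ or / in a password would otherwise change how the rest of the string is parsed.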
3. Complex connection string
MongoDB supports replication as well as sharding, and the connection string changes depending on which of them you use. Before we look at how to write such a connection string, let us understand the two terms: sharding and replication.
3.1 What is sharding and replication?
Replication keeps automatically maintained copies of a database. In MongoDB, a group of instances that hold the same data is called a replica set, and it can be set up with a small amount of configuration. Because each member holds the same data, the set protects against data loss and lets another member take over if the primary fails. By default, the application reads from and writes to the primary only, so the other members do not share the load unless a read preference directs reads to them.
Sharding is different. Sharding splits the data across several instances, called shards, so that each shard holds only a subset of the data and the shards share the load between them. Each shard is usually deployed as a replica set. The application does not connect to individual shards; instead, its requests go through mongos routers, which forward each request to the shards that hold the relevant data.
3.2 Modifying connection string to use multiple MongoDB hosts
To connect to a replica set or a sharded cluster, we specify several hosts in the connection string. These are the IPs or domain names of the replica set members or, for a sharded cluster, of the mongos routers. Additional hosts are added as a comma-separated list, for example mongodb://host1:27017,host2:27017,host3:27017/database.
Note that the database name is specified once, after all the hosts have been listed, so the same database name applies to every host. In this manner, a single connection string can point to multiple instances of one MongoDB deployment.
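To make the comma-separated host list concrete, here is a small Python sketch that builds such a string and splits it back into host and port pairs. The helper names build_multi_host_uri and parse_hosts and the IP addresses are hypothetical. The parser is deliberately simplified: it assumes plain host names or IPv4 addresses and does not handle IPv6 literals.

```python
DEFAULT_PORT = 27017


def build_multi_host_uri(hosts, database):
    """hosts is a list of (host, port) pairs; the database comes last."""
    seed_list = ",".join(f"{host}:{port}" for host, port in hosts)
    return f"mongodb://{seed_list}/{database}"


def parse_hosts(uri):
    """Return the (host, port) pairs listed in a mongodb:// string."""
    prefix = "mongodb://"
    if not uri.startswith(prefix):
        raise ValueError("not a mongodb:// connection string")
    # Keep only the part before the database name and the options.
    authority = uri[len(prefix):].split("/", 1)[0].split("?", 1)[0]
    # Drop the optional username:password@ section.
    host_part = authority.rsplit("@", 1)[-1]
    pairs = []
    for item in host_part.split(","):
        host, _, port = item.partition(":")
        pairs.append((host, int(port) if port else DEFAULT_PORT))
    return pairs


# Hypothetical addresses, for illustration only.
uri = build_multi_host_uri(
    [("10.0.0.1", 27017), ("10.0.0.2", 27017), ("10.0.0.3", 27018)],
    "inventory",
)
print(uri)
print(parse_hosts(uri))
```

A real driver performs this parsing itself; the sketch only shows how the single string carries every host while naming the database once.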
4. Connection string options
The sections above cover the parameters of the URL that identify the database we connect to. In this section, we discuss the options that configure the connection itself. These options are passed in the same way as query string parameters in a URL. Before we see how to specify them, let us look at the options that are available.
replicaSet: This option specifies the name of the replica set, if the MongoDB deployment uses one. When specifying it, it is advisable to provide a seed list of at least two hosts so that the connection still succeeds if one of them is down.
ssl: As the name indicates, this option controls whether the connection uses the SSL/TLS protocol. A value of “true” means that the connection must be encrypted. Newer drivers also accept tls as an equivalent option name.
connectTimeoutMS: This option specifies the connection timeout in milliseconds, that is, how long the driver tries to establish a connection before giving up. The default depends on the driver: some never time out, while others apply a finite default, so check your driver's documentation.
socketTimeoutMS: As the name indicates, this option specifies the socket timeout in milliseconds. The socket is the channel over which the driver and the server exchange data, so this timeout controls how long the driver waits for a send or receive on an established connection before treating the attempt as failed. In many drivers the socket never times out by default.
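These options are appended to the connection string as query string parameters: a ? follows the database name, and the options are written as name=value pairs separated by &, for example mongodb://host1:27017,host2:27017/database?replicaSet=rs0&connectTimeoutMS=10000.
Thus, the connection string resembles an ordinary URL with a query string.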
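Because the options follow ordinary query string rules, the standard library can write and read them. The sketch below is illustrative only; the helper names with_options and read_options, the host names, the replica set name rs0, and the timeout values are assumptions made for the example.

```python
from urllib.parse import parse_qs, urlencode


def with_options(base_uri, options):
    """Append connection options to a URI that already names a database."""
    return f"{base_uri}?{urlencode(options)}"


def read_options(uri):
    """Return the options of a connection string as a plain dict of strings."""
    query = uri.partition("?")[2]
    # parse_qs returns a list per key; keep the last value of each option.
    return {name: values[-1] for name, values in parse_qs(query).items()}


# Hypothetical hosts and values, for illustration only.
uri = with_options(
    "mongodb://db1.example.com:27017,db2.example.com:27017/inventory",
    {
        "replicaSet": "rs0",
        "ssl": "true",
        "connectTimeoutMS": 10000,
        "socketTimeoutMS": 30000,
    },
)
print(uri)
print(read_options(uri))
```

Note that every value comes back as a string after parsing, so a caller has to convert numeric options such as connectTimeoutMS before using them.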
5. Advanced connection options
In this section, we look at a few advanced options that can be used in the connection string. Like the drivers of other databases, MongoDB drivers support connection pooling. For other databases, the connection pool is normally configured through functions or configuration files. In MongoDB, however, the pool parameters can be specified directly in the connection string, using the same syntax as the options shown above. Let us first understand the available connection pool parameters.
5.1 Connection Pool parameters
maxPoolSize: This parameter sets the maximum size of the connection pool, that is, the largest number of connections that may be open at any point in time. The default value is 100.
minPoolSize: Similar to maxPoolSize, this parameter defines the minimum number of connections that are kept open throughout the execution of the application.
maxIdleTimeMS: A connection that has been created in the pool but is not currently in use is called an idle connection. This parameter defines how long, in milliseconds, a connection may stay idle in the pool before it is removed and closed.
waitQueueMultiple: The number of requests that can wait for a free connection is limited, and this parameter is the multiplier that sets the limit. The number of requests that can wait at any point in time equals the maximum pool size multiplied by the value of this parameter. For example, with maxPoolSize=50 and waitQueueMultiple=4, up to 200 requests can wait. The default depends on the driver.
waitQueueTimeoutMS: This parameter specifies how long, in milliseconds, a request may wait in the queue for a connection before it fails with a timeout.
All of the above parameters are added to the connection string in the same way as the connection options, for example mongodb://localhost:27017/database?maxPoolSize=50&minPoolSize=5&waitQueueTimeoutMS=2000.
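The relationship between maxPoolSize and waitQueueMultiple is simple arithmetic, which the following Python sketch works through. The function name wait_queue_capacity and the option values are hypothetical. The fallback of 100 for maxPoolSize matches the documented default, while the fallback of 5 for waitQueueMultiple is only an assumption for illustration, since that default differs between drivers.

```python
from urllib.parse import parse_qs


def wait_queue_capacity(uri, default_pool=100, default_multiple=5):
    """Return how many requests may wait for a connection.

    capacity = maxPoolSize * waitQueueMultiple
    default_multiple is an assumed value; check your driver's documentation.
    """
    query = uri.partition("?")[2]
    options = {name: values[-1] for name, values in parse_qs(query).items()}
    max_pool = int(options.get("maxPoolSize", default_pool))
    multiple = int(options.get("waitQueueMultiple", default_multiple))
    return max_pool * multiple


# Hypothetical option values, for illustration only.
uri = ("mongodb://localhost:27017/inventory"
       "?maxPoolSize=50&minPoolSize=5&maxIdleTimeMS=60000"
       "&waitQueueMultiple=4&waitQueueTimeoutMS=2000")
print(wait_queue_capacity(uri))  # 50 * 4 = 200
```

With these values, a 201st concurrent request would be rejected immediately, while the first 200 would each wait up to waitQueueTimeoutMS for a free connection.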
5.2 Read & Write Concern parameters
The next set of parameters that can be passed in the connection string configure how data is read and written. These parameters are explained below.
w: This option specifies the level of acknowledgement requested for a write operation. It accepts a number, the string majority, or the name of a custom write concern. The value 0 requests no acknowledgement, 1 requests acknowledgement from the primary alone, a larger number requests acknowledgement from that many replica set members, and majority requests acknowledgement from a majority of the members.
wtimeoutMS: This option specifies how long, in milliseconds, to wait for the requested write acknowledgement before the operation returns an error. It is relevant only when the write must be acknowledged by more than one member.
readConcernLevel: This option allows you to define the level of isolation for reads. There are five different levels that can be defined. These are explained below.
local: The query returns data from the instance with no guarantee that the data has been written to a majority of the replica set members (i.e. it may be rolled back). In other words, the data is available on the instance being queried but may not yet be replicated and persisted on the other replicas.
available: The query returns data from the instance with no guarantee that the data has been written to a majority of the replica set members (i.e. it may be rolled back). Unlike local, on a sharded cluster this level may also return orphaned documents. This is the default for reads against secondaries if the reads are not associated with causally consistent sessions.
majority: The query returns data that has been acknowledged by a majority of the replica set members. The documents returned by the read operation are durable, even in the event of failure, which makes this level more consistent than local or available.
linearizable: The query returns data that reflects all successful majority-acknowledged writes that completed before the read started. The read may wait for concurrently executing writes to propagate to a majority of the replica set members before it returns results. This level is therefore preferable in an environment where high data consistency is important.
snapshot: This mode is available for multi-document transactions. If the transaction is not part of a causally consistent session and it commits with write concern majority, its operations are guaranteed to have read from a snapshot of majority-committed data. Thus, this read concern level returns data that is majority committed and will not be rolled back.
These parameters are specified in the same way as the connection pool parameters shown above.
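As an illustration, the Python sketch below assembles the read and write concern options and rejects a read concern level that is not one of the five described above. The function name concern_options and the chosen values are hypothetical; a real driver performs its own validation.

```python
from urllib.parse import urlencode

# The five read concern levels described in this section.
READ_CONCERN_LEVELS = {"local", "available", "majority", "linearizable", "snapshot"}


def concern_options(w, wtimeout_ms, read_concern_level):
    """Return the query string portion for the read and write concern."""
    if read_concern_level not in READ_CONCERN_LEVELS:
        raise ValueError(f"unknown read concern level: {read_concern_level!r}")
    return urlencode({
        "w": w,                                  # a number or "majority"
        "wtimeoutMS": wtimeout_ms,               # milliseconds
        "readConcernLevel": read_concern_level,
    })


# Hypothetical values, for illustration only.
uri = "mongodb://localhost:27017/inventory?" + concern_options("majority", 5000, "majority")
print(uri)
```

Pairing w=majority with readConcernLevel=majority is a common combination when an application needs to read back only data that cannot be rolled back, at the cost of higher latency.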
5.3 Authentication options
The next set of options are the authentication options for the database. The first parameter is the source of authentication.
authSource: Credentials can be verified either by a MongoDB database or by an external system such as LDAP or Kerberos. For database-based authentication, set this parameter to the name of the database that holds the user's credentials. In case of external methods being used for authentication, the parameter's value to be passed will be
authMechanism: This parameter specifies the authentication mechanism to use. Its value is one of the authentication mechanisms, such as those listed below:
- SCRAM-SHA-256 (Added in MongoDB 4.0)
- MONGODB-CR (Removed in MongoDB 4.0)
- GSSAPI (Kerberos)
- PLAIN (LDAP SASL)
gssapiServiceName: Set the Kerberos service name when connecting to Kerberized MongoDB instances. This value must match the service name set on MongoDB instances. The default value of the service name is mongodb for all the clients and MongoDB instances wherein the service name is not specified.
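For database-based mechanisms such as SCRAM, the database that credentials are checked against is resolved in a fixed order: the authSource option if present, otherwise the database named in the connection string, otherwise admin. The Python sketch below mirrors that order. The function name resolve_auth_source and the sample credentials are hypothetical, and the sketch deliberately ignores external mechanisms such as GSSAPI and PLAIN, which follow different rules.

```python
from urllib.parse import parse_qs


def resolve_auth_source(uri):
    """Return the database that credentials would be checked against.

    Order: authSource option, then the database in the URI, then "admin".
    Covers database-based mechanisms only (an assumption of this sketch).
    """
    prefix = "mongodb://"
    if not uri.startswith(prefix):
        raise ValueError("not a mongodb:// connection string")
    before_query, _, query = uri[len(prefix):].partition("?")
    database = before_query.partition("/")[2]
    options = parse_qs(query)
    if "authSource" in options:
        return options["authSource"][-1]
    return database or "admin"


# Hypothetical credentials and names, for illustration only.
print(resolve_auth_source(
    "mongodb://app_user:secret@localhost:27017/inventory"
    "?authSource=admin&authMechanism=SCRAM-SHA-256"))
print(resolve_auth_source("mongodb://app_user:secret@localhost:27017/inventory"))
print(resolve_auth_source("mongodb://app_user:secret@localhost:27017"))
```

This is why a user created in the admin database usually needs authSource=admin when the connection string names a different database.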
The connection string plays a central role in configuring a MongoDB connection. MongoDB accepts most of its connection parameters through the connection string, and the driver parses the string to set several connection-level options as well. This reduces configuration management to a single string. Understanding the MongoDB connection string is therefore especially important for large-scale applications.