MongoDB Data Models Example
1. Introduction
This is an in-depth article on how to create MongoDB data models. MongoDB is a NoSQL document database. It provides a query language for retrieving data, along with operational and administrative procedures. A document in MongoDB is a data structure made up of field and value pairs, similar to a JSON object. Field values can be other documents, arrays, and arrays of documents.
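As a minimal sketch of that structure, the plain JavaScript object below mirrors a document whose values include a nested document, an array, and an array of documents. The field names and values are illustrative assumptions and do not come from a real collection.

```javascript
// Hypothetical person document; all field names and values are illustrative.
const personDocument = {
  person: "john smith",                    // simple field and value pair
  contact: { email: "john@example.com" },  // value that is itself a document
  tags: ["employee", "engineer"],          // value that is an array
  addresses: [                             // value that is an array of documents
    { city: "Los Angeles", country: "USA" },
    { city: "Chicago", country: "USA" }
  ]
};

// Nested values are reached with ordinary property and index access.
const firstCity = personDocument.addresses[0].city;
const emailDomain = personDocument.contact.email.split("@")[1];
console.log(firstCity, emailDomain);
```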
2. MongoDB Data Models
2.1 Prerequisites
MongoDB needs to be installed for the MongoDB Data Models example.
2.2 Download
You can download MongoDB from the MongoDB website for Linux, Windows, or macOS.
2.3 Setup
On macOS, you first need to tap the MongoDB formula repository so that Homebrew adds it to its formula list. The command below adds the MongoDB formula repository:
Brew Tap
brew tap mongodb/brew
After tapping the repository, you can install MongoDB with the following command:
Brew Command
brew install mongodb-community@4.0
2.4 MongoDB CommandLine
After installation, you can run MongoDB from the command line. The following command starts the mongod process with the given configuration file:
Mongod Command
mongod --config /usr/local/etc/mongod.conf
The output of the executed command is shown below.
2.5 Mongo DB Operations
After starting the mongod process, you can invoke the mongo shell on the command line using the command below:
Mongo Shell
mongo
The output of the executed command is shown below.
2.5.1 Create Database
You can use the use database_name command to switch to a database. If the database exists, the shell starts using it. If it does not exist, MongoDB creates it when you first store data in it. The command below switches to the “octopus” database:
Create Database
use octopus
The output of the executed command is shown below.
2.5.2 Drop Database
You can use the dropDatabase() command to drop an existing database. This command deletes the database that is currently selected. If you run it without first selecting a database with use, the default ‘test’ database is deleted.
Run Command
db.dropDatabase()
The output of the executed command is shown below.
2.5.3 Create Document
You can use the createCollection command to create a collection, which holds a set of documents. Documents are then created inside this collection. The command below creates the “persons” collection:
Create Collection
db.createCollection("persons")
The output of the executed command is shown below.
You can use the insertOne method to create a document and store it in a collection. If the collection does not exist in the database, it is created before the document is inserted. If the _id field is not specified in the command, a unique ObjectId is assigned to the document.
The command below is used to create a person document:
Insert Document
db.persons.insertOne( { person: "john smith", id: 1, ssn: 345675431, gender: "male" } )
The output of the executed command is shown below.
2.5.4 Read Document
You can use the find() method to query data from the collection. By default, this method prints the documents in a compact, unformatted way. You can chain the pretty() method to show formatted results.
The command below is used to query the collection for the documents in the database:
Query Document
db.persons.find().pretty()
The output of the executed command is shown below.
2.5.5 Update Document
You can use the updateOne() method to update a document in a collection. This method updates the values of the specified fields in the first document that matches the filter.
The command below is used to update the person document in the persons collection which is stored in the octopus database:
Update Document
db.persons.updateOne( { person: "john smith" }, { $set: { "id": 2, "ssn": 323455678 }, $currentDate: { lastModified: true } } )
The output of the executed command is shown below.
2.5.6 Delete Document
You can use the remove() method to delete documents from the collection. This method accepts two parameters: the deletion criteria and a justOne flag. If an empty criteria document {} is passed, all the documents are deleted from the collection.
The command below is used to delete the person document in the persons collection which is stored in the octopus database:
Delete Document
db.persons.remove({'person':'john smith'})
The output of the executed command is shown below.
2.6 Mongo DB Data Model
2.6.1 Relationships
Relationships represent the way documents are related to each other. They can be modeled through the embedded and referenced approaches. The relationship types can be One to One (1:1), One to Many (1:N), Many to One (N:1), and Many to Many (N:M).
We will start by looking at the Person document. The sample data for the person document is shown below:
Person Document
{ "_id":ObjectId("52eecd85242f436000001"), "person": "Tom Hanks", "id": "987654321", "ssn": "345982341", "gender": "male" }
Another document of type department is shown below:
Department Document
{ "_id":ObjectId("82aacd85242f436000011"), "department": "HR", "id": "9", "location": "Los Angeles", "country": "USA" }
A one-to-one relationship between Person and Department is modeled as below using the embedded approach:
Person Document
{
  "_id": ObjectId("52eecd85242f436000001"),
  "person": "Tom Hanks",
  "id": "987654321",
  "ssn": "345982341",
  "gender": "male",
  "department": {
    "department": "HR",
    "id": "9",
    "location": "Los Angeles",
    "country": "USA"
  }
}
Using references, person to department relationship can be modeled as below:
Person to Department using references
Person Document
{
  "_id": ObjectId("52eecd85242f436000001"),
  "person": "Tom Hanks",
  "id": "987654321",
  "ssn": "345982341",
  "gender": "male"
}
Department Document
{
  "_id": ObjectId("82aacd85242f436000011"),
  "department": "HR",
  "person_id": "52eecd85242f436000001",
  "id": "9",
  "location": "Los Angeles",
  "country": "USA"
}
One to Many relationship between person and address is shown below using the embedded approach.
Person to Address One to Many
{
  "_id": ObjectId("52ffc33cd85242f436000001"),
  "person": "Tom Hanks",
  "id": "987654321",
  "ssn": "345982341",
  "gender": "male",
  "address": [
    {
      "street": "92 A, Windsor Apt",
      "code": 123456,
      "city": "Los Angeles",
      "state": "California",
      "country": "USA"
    },
    {
      "street": "25 Franklin Apt",
      "code": 456789,
      "city": "Chicago",
      "state": "Illinois",
      "country": "USA"
    }
  ]
}
Using references, person to address relationship can be modeled as below:
Person to Address – References
Person Document
{
  "_id": ObjectId("52ffc33cd85242f436000001"),
  "person": "Tom Hanks",
  "id": "987654321",
  "ssn": "345982341",
  "gender": "male"
}
Address Document 1
{
  "person_id": "52ffc33cd85242f436000001",
  "street": "92 A, Windsor Apt",
  "code": 123456,
  "city": "Los Angeles",
  "state": "California",
  "country": "USA"
}
Address Document 2
{
  "person_id": "52ffc33cd85242f436000001",
  "street": "25 Franklin Apt",
  "code": 456789,
  "city": "Chicago",
  "state": "Illinois",
  "country": "USA"
}
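With the referenced approach, reading a person's addresses takes a second lookup keyed on person_id. The sketch below imitates that lookup over in-memory arrays rather than a live database. The helper name addressesFor is hypothetical. Against a real deployment the same step would be a query on the address collection filtered by person_id.

```javascript
// In-memory stand-ins for the person and address documents shown above.
const persons = [
  { _id: "52ffc33cd85242f436000001", person: "Tom Hanks" }
];
const addresses = [
  { person_id: "52ffc33cd85242f436000001", street: "92 A, Windsor Apt", city: "Los Angeles" },
  { person_id: "52ffc33cd85242f436000001", street: "25 Franklin Apt", city: "Chicago" }
];

// Hypothetical helper: return every address whose person_id points at the given person.
function addressesFor(personId) {
  return addresses.filter((address) => address.person_id === personId);
}

// First lookup finds the person, second lookup follows the reference.
const tom = persons.find((p) => p.person === "Tom Hanks");
const tomCities = addressesFor(tom._id).map((address) => address.city);
console.log(tomCities);
```

The child documents hold the reference here, so adding a new address never touches the person document.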
A group document can have a many-to-many relationship with the person document. A sample group document is shown below:
Group Document
{ "_id":ObjectId("22avxd85242f436000001"), "group": "Group1", "type": "Engineers" }
A many-to-many relationship between Person and Group is shown below using the embedded approach:
Person to Group using embedded
{
  "_id": ObjectId("52ffc33cd85242f436000001"),
  "person": "Tom Hanks",
  "id": "987654321",
  "ssn": "345982341",
  "gender": "male",
  "groups": [
    {
      "_id": ObjectId("22avxd85242f436000001"),
      "group": "Group1",
      "type": "Engineers"
    },
    {
      "_id": ObjectId("35kfsd85242f436000001"),
      "group": "Group2",
      "type": "Managers"
    }
  ]
}
Using references, Person to Group many to many relationship is shown below:
Person to Group using references
Person Document
{
  "_id": ObjectId("52ffc33cd85242f436000001"),
  "person": "Tom Hanks",
  "id": "987654321",
  "ssn": "345982341",
  "gender": "male",
  "groups": [
    ObjectId("22avxd85242f436000001"),
    ObjectId("35kfsd85242f436000001")
  ]
}
Group Document 1
{
  "_id": ObjectId("22avxd85242f436000001"),
  "group": "Group1",
  "type": "Engineers"
}
Group Document 2
{
  "_id": ObjectId("35kfsd85242f436000001"),
  "group": "Group2",
  "type": "Managers"
}
The relationship between a manager and a person can be modeled as a parent-to-child relationship. It is shown below using the embedded approach:
Manager to Person using Embedded
{
  "_id": ObjectId("52ffc33cd85242f436000001"),
  "manager": "John Smith",
  "id": "987652321",
  "ssn": "245982341",
  "gender": "male",
  "persons": [
    {
      "_id": ObjectId("52ffc33cd85242f436000001"),
      "person": "Tom Hanks",
      "id": "987654321",
      "ssn": "345982341",
      "gender": "male"
    },
    {
      "_id": ObjectId("83eec33cd85242f436000001"),
      "person": "Roger Harper",
      "id": "387654321",
      "ssn": "324982341",
      "gender": "male"
    }
  ]
}
The parent-to-child relationship between Manager and Person using the references approach is shown below:
Manager to Person using references
{ "_id":ObjectId("52ffc33cd85242f436000001"), "manager": "John Smith", "id": "987652321", "ssn": "245982341", "gender": "male", "persons":[ ObjectId("52ffc33cd85242f436000001"), ObjectId("83eec33cd85242f436000001") ] }
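When the parent holds an array of references, each identifier has to be resolved to its document before the children can be read. The sketch below does this with an in-memory Map standing in for the persons collection. The helper name resolveReports is hypothetical, and plain strings are used in place of ObjectId values to keep the example self-contained.

```javascript
// In-memory index of person documents keyed by _id (strings stand in for ObjectId values).
const personsById = new Map([
  ["52ffc33cd85242f436000001", { person: "Tom Hanks" }],
  ["83eec33cd85242f436000001", { person: "Roger Harper" }]
]);

// Manager document that stores only references to its persons.
const manager = {
  manager: "John Smith",
  persons: ["52ffc33cd85242f436000001", "83eec33cd85242f436000001"]
};

// Hypothetical helper: follow each reference and skip any that no longer resolve.
function resolveReports(managerDoc, index) {
  return managerDoc.persons
    .map((id) => index.get(id))
    .filter((doc) => doc !== undefined);
}

const reportNames = resolveReports(manager, personsById).map((doc) => doc.person);
console.log(reportNames);
```

Dangling references are skipped rather than raised as errors in this sketch. An application may prefer to treat them as data integrity problems.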
2.6.2 JSON Schema
JSON Schema is used to specify validation rules for a collection. A sample schema for the persons collection is shown below.
Persons Schema
db.createCollection("persons", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: [ "name", "dob" ],
      properties: {
        name: {
          bsonType: "string",
          description: "should be a string and is required"
        },
        gender: {
          bsonType: "string",
          description: "should be a string and is not required"
        },
        dob: {
          bsonType: "int",
          minimum: 2017,
          maximum: 3017,
          exclusiveMaximum: false,
          description: "should be an integer in [ 2017, 3017 ] and is required"
        }
      }
    }
  }
})
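To make the effect of these rules concrete, the sketch below checks plain objects against a simplified copy of the schema. It is an illustration only and is not how MongoDB evaluates $jsonSchema. The validate function and the reduced rule format (type, minimum, maximum) are assumptions made for this example.

```javascript
// Simplified stand-in for the schema above; this is NOT MongoDB's validator.
const personSchema = {
  required: ["name", "dob"],
  properties: {
    name: { type: "string" },
    gender: { type: "string" },
    dob: { type: "int", minimum: 2017, maximum: 3017 }
  }
};

// Hypothetical helper: return a list of rule violations for one document.
function validate(doc, schema) {
  const errors = [];
  for (const field of schema.required) {
    if (!(field in doc)) errors.push(`${field} is required`);
  }
  for (const [field, rule] of Object.entries(schema.properties)) {
    if (!(field in doc)) continue; // optional fields may be absent
    const value = doc[field];
    if (rule.type === "string" && typeof value !== "string") {
      errors.push(`${field} should be a string`);
    }
    if (rule.type === "int") {
      if (!Number.isInteger(value)) {
        errors.push(`${field} should be an integer`);
      } else if (value < rule.minimum || value > rule.maximum) {
        errors.push(`${field} should be in [${rule.minimum}, ${rule.maximum}]`);
      }
    }
  }
  return errors;
}

const okErrors = validate({ name: "Tom Hanks", dob: 2018 }, personSchema);
const badErrors = validate({ name: 42, dob: 1990 }, personSchema);
console.log(okErrors, badErrors);
```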
2.6.3 Design
You can use embedded data models when two entities have a “contains” relationship. Embedded data models can also be used for one-to-many relationships. These data models provide good performance for read operations, because related data can be requested and retrieved in a single database operation. They also allow related data to be updated in a single write operation.
Referenced data models can be used when embedding would result in duplication of data. These data models are also used to represent many-to-many relationships and to model hierarchical data sets.
2.6.4 Sharding
Sharding is a method for distributing data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high-throughput operations. Vertical and horizontal scaling are the two methods for scaling a system, and sharding is a form of horizontal scaling. A sharded cluster consists of shards, query routers (mongos), and config servers. A shard holds a subset of the sharded data. The query router provides an interface between clients and the cluster. Config servers store the metadata and configuration settings of the cluster.
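The idea of routing each document to one shard based on a key can be sketched as below. This is a deliberately simplified illustration. The shard names are placeholders, the FNV-1a hash is chosen only for the example, and a real cluster assigns ranges of shard key values to shards instead of taking a modulo.

```javascript
// Placeholder shard names; a real cluster tracks shards in its config servers.
const shards = ["shardA", "shardB", "shardC"];

// 32-bit FNV-1a string hash, used here only for illustration.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value as an unsigned 32-bit integer
  }
  return hash;
}

// Hypothetical router: the same shard key value always maps to the same shard.
function shardFor(shardKeyValue) {
  return shards[fnv1a(String(shardKeyValue)) % shards.length];
}

const target = shardFor("987654321");
console.log(target);
```

Because the mapping depends only on the key, a router can send a query that includes the shard key to a single shard instead of broadcasting it to all of them.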
2.6.5 Security
MongoDB has security features such as authentication, authorization, access control, encryption, and secure deployment. MongoDB Atlas can be used to encrypt data both in transit and at rest. User and role management makes it easy to grant and monitor access.
2.6.6 Replication
Replication in MongoDB means maintaining copies of the same data set on multiple servers. A replica set is a group of mongod processes that maintain the same data set. Replica sets provide redundancy and high availability, and offer fault tolerance against the failure of a single database server. Replication can also increase read capacity, because clients can send read operations to different servers. Keeping copies in different data centers can improve data locality and availability. You can also maintain additional replicas for disaster recovery, reporting, and backup.
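The increase in read capacity can be pictured with the small routing sketch below. It is an assumption-laden illustration and not driver code. The host names, the route function, and the readFromSecondaries option are hypothetical. In real drivers this behavior is controlled by the read preference setting, and reads go to the primary by default.

```javascript
// Hypothetical replica set members; host names are placeholders.
const members = [
  { host: "db1.example.net", role: "primary" },
  { host: "db2.example.net", role: "secondary" },
  { host: "db3.example.net", role: "secondary" }
];

let nextSecondary = 0; // round-robin position among secondaries

// Writes always go to the primary; reads rotate across secondaries only when allowed.
function route(operation, { readFromSecondaries = false } = {}) {
  const primary = members.find((m) => m.role === "primary");
  const secondaries = members.filter((m) => m.role === "secondary");
  if (operation === "write" || !readFromSecondaries || secondaries.length === 0) {
    return primary.host;
  }
  const target = secondaries[nextSecondary % secondaries.length];
  nextSecondary += 1;
  return target.host;
}

const targets = [
  route("write"),
  route("read"),
  route("read", { readFromSecondaries: true }),
  route("read", { readFromSecondaries: true })
];
console.log(targets);
```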
2.6.7 Storage
MongoDB provides different types of storage engines. The storage engine is the component that manages how data is stored in memory and on disk. The journal is used to recover the database in the event of a hard shutdown. Configurable options let you balance the journal between performance and reliability, and can be chosen based on the use case.
2.7 Best Practices
MongoDB Operational best practices can be obtained from this link. The performance related best practices can be accessed from the MongoDB website.
2.8 Error Handling
Programming with Java using MongoDB is covered on Java Code Geeks at this link. MongoException is the parent exception class in the MongoDB Java driver. WriteConcernException is the exception for a write failure. MongoException.CursorNotFound is the exception related to a cursor that was not found or has timed out. By default, an idle cursor times out after 10 minutes. MongoException.Network is related to network errors. The networking configuration needs to specify whether to retry, the number of retries, and the time to wait before a retry.
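The retry settings mentioned above (number of retries and wait time) can be sketched generically as follows. This is plain JavaScript for Node.js and does not use any MongoDB driver API. The names withRetry, sleepSync, retries, and waitMs are hypothetical, and the failing operation is simulated.

```javascript
// Block the current thread for roughly `ms` milliseconds (works in Node.js).
function sleepSync(ms) {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

// Hypothetical helper: run `operation`, retrying up to `retries` times after a failure.
function withRetry(operation, { retries = 3, waitMs = 100, sleep = sleepSync } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return operation(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < retries) sleep(waitMs); // wait before the next attempt
    }
  }
  throw lastError; // all attempts failed
}

// Simulated flaky call that fails twice before succeeding.
let calls = 0;
const result = withRetry(() => {
  calls += 1;
  if (calls < 3) throw new Error("simulated network error");
  return "ok";
}, { retries: 3, waitMs: 1 });
console.log(result, calls);
```

In practice only errors that are safe to retry, such as transient network failures, should be retried. This sketch retries on any thrown error to stay short.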
2.9 Database Conventions
The naming conventions for the document structure can be accessed from the MongoDB site.
3. Download the Source Code
You can download the full source code of this example here: MongoDB Data Models Example