MongoDB - Architectural Best Practices

We would like to provide some guidelines for the use of MongoDB in deployments on our platform.  If you have selected one of engineered servers several of these recommendations have already been implemented for you and I will denote that in each of the appropriate areas.  I will cover selecting a deployment strategy to prepare for your MongoDB installation,  the actual installation, and lastly some operational considerations and best practices.

Deployment Strategy

When planning your deployment of MongoDB you should consider several key areas.   The most important will be your current and anticipated data set size.  This will be the primary driver for your choice of individual physical node resource needs as well as guide your sharding plans.   The next most important thing to consider is the importance of your data and how tolerant you will be of the possibility of lost or lagging data (especially in replicated scenarios).  The last area for you to cover will of course be testing your deployment strategy.  I will be discussing some specific tools that can help you with load testing a potential deployment strategy to see if it will meet your needs.

Memory sizing

MongoDB (like many data oriented applications) works best when the data set can reside in memory.  Nothing performs better than a MongoDB instance that does not require disk I/O.  Whenever possible select a platform that has more available RAM than your working data set size.  If your data set exceeds the available RAM for a single node, then consider using sharding to increase the amount of available RAM in a cluster to accommodate the larger data set. This will maximize the overall performance of your deployment.   Page faults can indicate that you may be exceeding available RAM in your deployment and you may need to increase your available RAM.

Disk Type

If speed is not your primary concern, or if you have a data set that is far larger than any available in memory strategy can support, then selecting the proper disk type for your deployment is important.  IOPS will be key in selecting your disk type and obviously the higher the IOPS the better the performance of MongoDB.  Local disks should be used if possible as network storage can cause high latency and poor performance for your deployment.  Whenever possible it is also advised that you utilize RAID 10 when creating disk arrays.

CPU

Clock speed and the amount of available processors becomes a consideration if you anticipate using map reduce.  However, it has been noted that when running a MongoDB instance with the majority of the data being in memory, clock speed can have a major impact on overall performance.  If you are running under these circumstances and would like to maximize your operations per second, consider a deployment strategy that includes a CPU with a high clock/bus speed.

Replication

Replication provides high availability of your data if node fails in your cluster.   It should be standard to replicate with at least 3 nodes in any MongoDB deployment.  The most common configuration for replication with 3 nodes is a 2x1 deployment having 2 primary nodes in a single data center with a backup server in a secondary data center.


Sharding

If you anticipate a large data set then it is advised that you deploy a sharded MongoDB deployment.  Sharding allows you to partition your data set across multiple nodes.  You may allow MongoDB to automatically distribute the data across nodes in the cluster or you may elect to define a shard key and create range based sharding for that key.  Sharding may also help write performance so it is also possible that you may elect to shard even if your data set is small but requires a high amount of updates or inserts.  It is important to note that when you deploy a sharded set, MongoDB will require 3 and only 3 config server instances which are specialized Mongo runtimes to track the current shard configuration.  Loss of one of these nodes will cause the cluster to go into a read only mode for the configuration only and will require that all nodes be brought back online before any configuration changes can be made.

Write Safety Mode

There are several write safety modes that govern how MongoDB will handle the persistence of the data to disk.  It is important to consider which strategy best fits your needs for both data integrity and performance.  The following write safety modes are available:

  • None – This mode provides a deferred writing strategy that is non-blocking.  This will allow for high performance however there is a small opportunity in the case of a node failing that data can be lost.  There is also the possibility that data written to one node in a cluster will not be immediately available on all nodes in that cluster for read consistency.  The ‘None’ strategy will additionally not provide any sort of protection in the case of network failures.  This makes this mode highly unreliable and should only be used when performance is a priority and data integrity is not a concern at all.
  • Normal – This is the default for MongoDB if you do not select any other mode.  It provides a deferred writing strategy that is non-blocking.  This will allow for high performance however there is a small opportunity in the case of a node failing that data can be lost.  There is also the possibility that data written to one node in a cluster will not be immediately available on all nodes in that cluster for read consistency.
  • Safe – This mode will block until MongoDB has acknowledged that it has received the write request but will not block until the write is actually performed.  This provides a better level of data integrity and will ensure that read consistency is achieved within a cluster.
  • Journal Safe – Journals provide a recovery option for MongoDB.  Using this mode will ensure that the data has been acknowledged and a Journal update has been performed before returning.
  • Fsync – This mode provides the highest level of data integrity and blocks until a physical write of the data has occurred.  This comes with a degradation in performance and should be used only if data integrity is the primary concern for your application.


Testing the Deployment

It is key that once you have determined your deployment strategy that you test it with a data set similar to your production data.  10Gen has several tools it provides to help you with load testing your deployment.  The console has a tool named ‘benchrun’  which can execute operations from within a javascript test harness.  It will return operation information as well as latency numbers for each given operation.  If more detailed information is required about the MongoDB instance you should consider using the mongostat command or MMS to monitor your deployment during the testing.  For more information on these tools please use to the following references:

    JS Benchmarking Harness for MongoDB
    MongoStat Overview
    10gen's MongoDB Monitoring Service

Installation

There are several considerations when doing the installation of MongoDB that can help create both a stable and performance oriented solution.  10Gen recommends the use CentOS (64 bit) if at all possible.  32 bit operating systems as well as Windows operating systems provide a poor deployment platform and should be avoided.  32 bit  operating systems have file size limits that cause issues, and Windows can cause performance issues if virtual memory begins to be utilized by the OS to make up for a lack of RAM in your deployment.  SoftLayer has provided CentOS 64 bit operating systems by default for all engineered server deployments.

In addition to the selection of CentOS 64 bit, it is also recommended that you make the following alterations to the base OS installation to maximize performance:

  • Set SSD Read Ahead Defaults to 16 Blocks – SSD drives have excellent seek times allowing for shrinking the Read Ahead to 16 blocks. Spinning disks might require slight buffering so these have been set to 32 blocks.
  • noatime – Adding the noatime option eliminates the need for the system to make writes to the file system for files which are simply being read — or in other words: Faster file access and less disk wear.
  • Turn NUMA Off in BIOS – Linux, NUMA and MongoDB tend not to work well together. If you are running MongoDB on NUMA hardware, we recommend turning it off (running with an interleave memory policy). If you don’t, problems will manifest in strange ways like massive slow downs for periods of time or high system CPU time.
  • Set ulimit – We have set the ulimit to 64000 for open files and 32000 for user processes to prevent failures due to a loss of available file handles or user processes.Use ext4 – We have selected ext4 over ext3. We found ext3 to be very slow in allocating files (or removing them). Additionally, access within large files is poor with ext3.
Once again, these alterations have been provided by default on all SoftLayer engineered servers.

It is also highly recommended that the Journal and Data volumes be distinct physical volumes.  If the Journal and Data directories reside on a single physical volume, flushes to the Journal will interrupt the access of data and provide spikes of high latency within your MongoDB deployment.

Operations

Once a MongoDB deployment has been promoted to production there are a few recommendations for monitoring and optimizing performance.  It is always recommended that you have the MMS agent running on all instances of MongoDB to help monitor the health and performance of the deployment.  This tool is also very useful if you are a subscriber of 10Gen and provides useful debugging data to 10Gen during support interactions.  In addition to MMS, as mentioned in the deployment selection section of this document, the mongostat command can also provide runtime information about the performance of a MongoDB node.

If you have performance issues indicated by either of these tools, it is possible that sharding or indexing may help to alleviate these performance issues.  These 2 solutions are a first line option when certain performance behaviors are encountered.

  • Indexes -  Indexes should be created for a MongoDB deployment if monitoring tools indicate that field based queries are performing poorly.  Always use indexes when you are querying data based on distinct fields to help boost performance.
  • Sharding -  Sharding can be leveraged when the overall performance of the node is suffering because of a large operating data set.  Be sure to shard before you get in the red – the system only splits chunks for sharding on insert or update so if you wait too long to shard you may have some uneven distribution for a period of time or forever depending on your data set and sharding key strategy.

Conclusion

These best practices should help guide you to a successful MongoDB deployment, however they do not represent every possible scenario you may encounter.  It is always best to either leverage the 10Gen subscription to access support directly from 10Gen to assist you or to use their documentation on the MongoDB website directly itself to help with specific issues and concerns you might have.

www.mongodb.org