As part of its native replication, MongoDB maintains multiple copies of data in a construct called a replica set.
So, what is a replica set? A replica set in MongoDB is a group of mongod (the primary daemon process for the MongoDB system) processes that maintain the same data set. Put simply, it is a group of MongoDB servers operating in a primary/secondary failover fashion. Replica sets provide redundancy and high availability.
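As a rough sketch of how this looks in practice (the replica set name and hostnames below are hypothetical), a three-member replica set is typically initiated once from mongosh:

```javascript
// Each host runs a mongod started with --replSet rs0.
// Run once from a mongosh session connected to one of the members.
rs.initiate({
  _id: "rs0", // must match the --replSet name
  members: [
    { _id: 0, host: "mongo1.example.net:27017" },
    { _id: 1, host: "mongo2.example.net:27017" },
    { _id: 2, host: "mongo3.example.net:27017" }
  ]
})
```

After initiation, the members hold an election and one of them becomes the primary.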
A replica set is a fully self-healing shard (see the previous post for an introduction to sharding) that helps prevent database downtime. Replica failover is fully automated, eliminating the need for manual administrator intervention.
An administrator can configure the number of replicas in a MongoDB replica set. Configuring more replicas increases data availability and protection against database downtime. Consider the types of failures you may experience: the failure of multiple machines, a rack failure, a network partition, the loss of a data center, etc.
Replica Set Architecture
A replica set needs at least three healthy members to operate. One member serves as the primary (there can only be one!) and the remaining members serve as secondaries, though there may be a need for an arbiter.
The primary is the only member in the replica set that receives write operations. These operations are recorded in the primary’s oplog, which secondary members replicate and apply to their own data sets.
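For the curious, the oplog can be inspected directly: it lives in a capped collection in the local database. A mongosh sketch (the exact shape of the output varies by server version):

```javascript
// Show the three most recent oplog entries on a data-bearing member.
db.getSiblingDB("local").oplog.rs.find().sort({ $natural: -1 }).limit(3)
```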
All replica set members can accept read operations. However, by default, an application directs its read operations to the primary member.
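If slightly stale reads are acceptable, the default can be overridden with a read preference. A sketch in mongosh (the orders collection is hypothetical):

```javascript
// Route this query to a secondary when one is available,
// otherwise fall back to the primary.
db.orders.find().readPref("secondaryPreferred")

// The same preference can be set for an entire connection, e.g.:
// mongodb://mongo1.example.net,mongo2.example.net/?readPreference=secondaryPreferred
```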
Each replica set has a single primary; if the primary becomes unavailable, an election determines a new one. Elections occur after a replica set is initiated, but also whenever the primary becomes unavailable. Elections are a necessary process that allows the replica set to recover normal operations without manual intervention.
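To win an election, a candidate must receive votes from a majority of the voting members. As an illustrative sketch (not MongoDB's internal implementation), the required majority for n voting members is simply "strictly more than half":

```javascript
// Votes needed to win an election among n voting members.
function majority(n) {
  return Math.floor(n / 2) + 1;
}

console.log(majority(3)); // 2
console.log(majority(5)); // 3
```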
A secondary maintains a copy of the primary’s data set by applying operations from the primary’s oplog to its own data set. This is done asynchronously.
An administrator can configure as many secondaries as desired. The more secondaries, spread out over availability zones or regions, the less likely it is that the cluster will experience downtime.
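Growing the set is a single command per member. A mongosh sketch with a hypothetical hostname (the new mongod must already be running with the same --replSet name):

```javascript
// Add a new secondary to the running replica set.
rs.add("mongo4.example.net:27017")
```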
An arbiter is a member that participates in elections (in order to break a tie) but does not replicate any data. If a replica set has an even number of members, an arbiter should be added to act as a tie-breaker; otherwise an election may fail to produce a majority, and a new primary will not be elected.
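Adding an arbiter looks much like adding a secondary; the hostname below is hypothetical:

```javascript
// The arbiter votes in elections but stores no data,
// so it can run on a small machine.
rs.addArb("arbiter.example.net:27017")
```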
This requirement was put in place to prevent a split-brain scenario, in which two secondary members in a cluster are unable to communicate with each other and each vote for themselves. This would cause both to become primaries, leading to data inconsistency issues…among other problems.
Alternatively, an arbiter can be avoided entirely by simply adding another secondary so that there is an odd number of voting cluster members.
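A quick back-of-the-envelope illustration of why odd member counts are preferred: fault tolerance is the number of members that can fail while a majority — and therefore a primary — is still possible. Note that going from three members to four buys nothing:

```javascript
// Members that can be lost while the set can still elect a primary.
function faultTolerance(n) {
  const majority = Math.floor(n / 2) + 1;
  return n - majority;
}

console.log(faultTolerance(3)); // 1
console.log(faultTolerance(4)); // 1 — the fourth member adds no tolerance
console.log(faultTolerance(5)); // 2
```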