This page provides answers to some frequently asked questions we have encountered. If you have additional questions please contact MongoDB Support.
Can I change the load level while mongosync is syncing?
Yes, you can adjust the cluster workload level during a migration by following the steps in Reconfigure mongosync During Synchronization.
Can I perform reads or writes to my destination cluster while mongosync is syncing?
mongosync combines and reorders writes from the source to the destination during synchronization, and temporarily modifies collection characteristics. As a result, mongosync can't guarantee that the destination matches the source, or even a stale version of the source, at any point while the sync is running, even if the sync is paused. To safely accept traffic to the migrated collections on the destination cluster, wait for the migration to commit. To learn more, see Mid-sync Considerations.
Performing writes to your migrated collections on the destination cluster during synchronization results in undefined behavior. mongosync blocks writes on the destination cluster by default. To learn more about write-blocking, see Write Blocking and start.
Upon commit, it is only safe to write to your migrated collections on the destination cluster when canWrite is true. To check the value of canWrite, call the progress endpoint.
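As a sketch, you could poll the progress endpoint and extract canWrite before routing writes to the destination. This assumes mongosync is listening on its default port of 27182; the hostname and the sample response below are illustrative only:

```shell
# Query the progress endpoint (adjust host and port for your deployment):
#   curl -s http://localhost:27182/api/v1/progress
# A truncated sample response, for illustration only:
response='{"progress":{"state":"COMMITTED","canWrite":true,"info":"change event application"}}'

# Extract canWrite; only write to migrated collections when it is "true".
canWrite=$(printf '%s' "$response" | sed -n 's/.*"canWrite":\([a-z]*\).*/\1/p')
echo "canWrite=$canWrite"
```

In an automated cutover script, you would loop on the live curl call until canWrite becomes true rather than parsing a canned response.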
To learn more about permissible reads and writes during synchronization, see Reads and Writes.
Note
Index builds on the destination cluster are treated as writes while mongosync is syncing.
Can I use mongosync for continuous synchronization between two MongoDB clusters?
For example, can I maintain a secondary cluster for Disaster Recovery, Analytics, or other similar use cases?
No, mongosync is designed to support one-time migrations. mongosync must commit in order to safely accept read or write traffic to the migrated collections on the destination cluster. Therefore, Disaster Recovery, Analytics, and other similar use cases are not supported workflows for mongosync.
For more information, see Can I perform reads or writes to my destination cluster while mongosync is syncing?.
Why are the destination cluster indexes larger than the source cluster indexes?
The following factors may contribute to an increase in index size on destination clusters:
mongosync inserts and removes data during a migration, which can cause data to be stored inefficiently on disk.
By default, mongosync builds indexes before copying data.
mongosync copies data in _id order. If an index is not correlated with _id, the index size can become large. For more information, see the MongoDB Manual FAQ: Indexes page.
Use the following methods to mitigate an increase in index size:
Restart the migration with the buildIndexes parameter set to never. When the migration finishes, manually build indexes on the destination cluster.
After the migration, perform a rolling initial sync on the destination cluster.
After the migration, run the compact database command on the destination cluster. This rebuilds indexes and releases unneeded disk space to the OS, but may impact cluster performance.
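As an illustration of the compact option, the helper below builds the command for one collection; the database and collection names (mydb, orders) and the destination hostname are placeholders. Run it per collection, ideally during a maintenance window:

```shell
# Build the compact command for one collection; pipe the result to mongosh
# against the destination cluster to run it, for example:
#   compact_cmd mydb orders | mongosh "mongodb://destination.example.net:27017"
compact_cmd() {
  printf 'db.getSiblingDB("%s").runCommand({ compact: "%s" })\n' "$1" "$2"
}

compact_cmd mydb orders
```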
Can mongosync run on its own hardware?
Yes, mongosync can run on its own hardware. mongosync does not have to run on the servers that host your MongoDB instances. When mongosync runs on its own hardware, it can use an operating system (OS) that is different than the OS on the source or destination clusters.
What hardware specifications should the destination cluster have?
For most migrations, the destination cluster should have higher hardware specifications than the source cluster, including the following properties:
CPU
Memory
Disk I/O
These hardware specifications ensure that the destination cluster can handle mongosync writes and that the sync can keep up with the source cluster's workload.
The destination cluster must have enough disk storage to accommodate the logical data size being migrated and the destination oplog entries from the initial sync. For example, to migrate 10 GB of data, the destination cluster must have at least 10 GB available for the data and another 10 GB for the insert oplog entries from the initial sync.
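The sizing rule above amounts to simple arithmetic. The 10 GB figure is the example from this section; substitute your own logical data size:

```shell
# Rough destination disk estimate: the logical data size plus an equal
# amount for the insert oplog entries written during the initial sync.
data_gb=10                      # logical data size being migrated
oplog_gb=$data_gb               # initial-sync insert oplog entries
required_gb=$((data_gb + oplog_gb))
echo "minimum destination disk: ${required_gb} GB"
```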
To use embedded verification, you must have a larger oplog on the destination. If you enable the embedded verifier and reduce the size of the destination oplog, the embedded verifier might not be able to keep up, causing mongosync to error.
If you need to reduce the overhead of the destination oplog entries and the embedded verifier is disabled, you can:
Use the oplogSizeMB setting to lower the destination cluster's oplog size.
Use the oplogMinRetentionHours setting to lower or remove the destination cluster's minimum oplog retention period.
Should I increase the size of the oplog in the source cluster?
mongosync applies operations in the oplog on the source cluster to the data on the destination cluster after the collection copy phase. When operations that mongosync has not applied roll off the oplog on the source cluster, the sync fails and mongosync exits.
Note
mongosync does not replicate applyOps operations made on the source cluster during sync to the destination cluster.
If you anticipate syncing a large data set, or if you plan to pause synchronization for an extended period of time, you might exceed the oplog window. Use the oplogSizeMB setting to increase the size of the oplog on the source cluster.
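As a sketch, the 50 GB target below is illustrative. The oplogSizeMB setting applies when mongod starts; on a running replica set member, the equivalent live adjustment is the replSetResizeOplog admin command, which takes a size in megabytes:

```shell
# Target source oplog size of 50 GB (illustrative; choose a size whose oplog
# window covers your expected sync duration plus any planned pauses).
oplog_gb=50
oplog_mb=$((oplog_gb * 1024))

# Pipe this to mongosh connected directly to each source replica set member:
printf 'db.adminCommand({ replSetResizeOplog: 1, size: %d })\n' "$oplog_mb"
```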
Which connection string options does mongosync allow?
mongosync requires readConcern: "majority" and writeConcern: "majority".
If the readConcern is not majority, mongosync returns an error:
Invalid URI option, read concern must be majority
If the writeConcern is not majority, mongosync returns an error:
Invalid URI option, write concern must be majority
mongosync accepts all other connection string options.
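For illustration, a minimal pre-flight check of a connection string before launching mongosync; the hostnames and replica set name are placeholders. In the URI, read and write concern are set with the readConcernLevel and w options:

```shell
# Hypothetical source URI; readConcernLevel and w must both be "majority".
SRC_URI="mongodb://src1.example.net:27017,src2.example.net:27017/?replicaSet=rs0&readConcernLevel=majority&w=majority"

# Fail fast before launching mongosync rather than waiting for its URI error.
for opt in "readConcernLevel=majority" "w=majority"; do
  case "$SRC_URI" in
    *"$opt"*) echo "ok: $opt" ;;
    *)        echo "missing required option: $opt" >&2; exit 1 ;;
  esac
done
```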
Does mongosync support network compression?
mongosync supports network compression and enables it by default. However, the source and destination clusters must be configured with at least one compressor in common.
For more information about network compression configuration options, see the --networkMessageCompressors option in the Database Manual.
Which security and authentication options are supported?
mongosync uses a standard MongoDB connection string to connect to the source and destination clusters.
LDAP and X509 are supported. For available authentication options, see Authentication on Self-Managed Deployments.
Does mongosync automatically restart on error?
mongosync does not automatically restart on error. However, you can write a script or use your operating system's process manager (for example, systemd) to restart the mongosync process.
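For instance, a minimal systemd unit sketch; the binary path, config path, and unit details are placeholders to adapt to your installation. Writing the unit to a local file first lets you review it before installing:

```shell
# Draft a unit that restarts the mongosync process on failure; review it,
# then install under /etc/systemd/system and enable it:
#   sudo cp mongosync.service /etc/systemd/system/
#   sudo systemctl daemon-reload && sudo systemctl enable --now mongosync
cat > mongosync.service <<'EOF'
[Unit]
Description=mongosync migration process
After=network-online.target

[Service]
# Placeholder path and arguments; point these at your real binary and config.
ExecStart=/usr/local/bin/mongosync --config /etc/mongosync.conf
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
echo "wrote mongosync.service"
```

Because mongosync is stateless and resumes from metadata on the destination cluster, restarting it with the same parameters is sufficient to continue the sync.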
The mongosync binary is stateless. The metadata for restarting is stored on the destination cluster.
A mongosync operation can be resumed if mongosync becomes unavailable during synchronization. When mongosync becomes available again, restart the mongosync process with the same parameters. mongosync resumes the operation from where it stopped when mongosync became unavailable.
Note
Starting in mongosync 1.7.3, mongosync can take at least two minutes to respond when you resume or restart a sync operation. During this time, any calls to the progress endpoint might fail. If a progress call fails, it is safe to retry.
Can the source or destination be a replica set with arbiters?
Yes, the replica set can have arbiters. The source replica set must have more than two non-arbiter nodes, and you must sync from a non-arbiter node. Use the source cluster's connection string to specify a read preference for a non-arbiter, data-bearing node.
What if I see a Slow Operation Warning?
Slow operation warnings can occur during the initial sync or the application of a change event when there is a slow read operation on the source cluster or a slow write operation on the destination cluster. The warning may indicate network congestion or resource strain on the source or destination cluster.
While these warnings do not indicate failures in themselves, slow operations can cause operation timeout errors in mongosync and migration failures.
If you see slow operation warnings, check CPU, memory, and network usage on the source and destination clusters. If the clusters are underprovisioned for your needs, consider upgrading the cluster hardware.
Should I stop a migration if the logs contain the word "error" or "failure"?
No, logs that contain the word "error" or "failure" show non-fatal errors and do not signal that you need to stop mongosync early. These logs do not indicate that mongosync is failing or corrupting data. If a fatal error occurs, mongosync stops the sync and writes a fatal log entry.
What if I see a lot of duplicate key errors in the logs?
Duplicate key errors are a normal part of the sync process. These errors can occur if:
You insert a document on the source cluster after mongosync starts. mongosync may directly copy the document and redundantly apply the insert change event for the document later.
You stop and resume mongosync. This can lead to duplicate insertions when mongosync restarts.
mongosync encounters a transient error and reattempts an insertion that may have already succeeded.
What should I do if mongosync returns a fatal error?
A fatal error indicates a problem that must be corrected and requires the migration to be restarted. After addressing the error, delete all migrated data on the destination cluster, including the mongosync_reserved_for_internal_use database. Then, relaunch mongosync and start a new migration.
Does mongosync support TTL indexes?
Yes, mongosync supports syncing TTL indexes from the source cluster to the destination cluster.
Can I customize chunk distributions when syncing into a sharded cluster?
No, you can't configure mongosync to customize chunk distributions on a destination sharded cluster. mongosync samples each collection during initialization to determine how to distribute documents efficiently across the destination cluster’s shards after migration.