How to Migrate Data In MongoDB
This article explains how to migrate data from an offline or live MongoDB instance using oplog replay, and how to mitigate connection switch latency with existing utilities.

Table of Contents
- Migrating from an offline database in MongoDB
- Creating a backup
- Restoring the backup
- Migrating from an online database in MongoDB
- Initial Migration with Oplog Capture
- Restore the data with oplog replay
- Mitigating database connection switch latency
The goal of this post is to walk through the ways of migrating data in MongoDB, which can also help you write scripts that change your database by adding new documents or modifying existing ones.
If you're coming here for the first time, please take a look at the prequel Self-Hosted MongoDB.
Alright then, picking up where we left off, let's get started with data migration in MongoDB.
Now, the basic steps to migrate data from one MongoDB to another would be:
- Create a zipped backup of the existing data
- Restore the backup into the new DB
This is very straightforward when the source database is offline, because we know that no documents will be created or updated during the migration process. Let's look at the simple migration first before diving into the live scenario.
Migrating from an offline database in MongoDB
Creating a backup
We're going to use an existing utility program, mongodump, to create the database backup.
Run this command on the source database server:
# To filter documents, replace --forceTableScan with --query='json' (the two cannot be combined).
mongodump --host="hostname:port" \
  --username="username" --password="password" \
  --authenticationDatabase "admin" \
  --db="db name" --collection="collection name" \
  --forceTableScan -v --gzip --out ./dump
--host: The source MongoDB hostname along with the port. It defaults to localhost:27017. If you have a connection string, you can use the option --uri="mongodb://username:password@host1[:port1]..." instead.
--username: Specifies a username to authenticate to a MongoDB database that uses authentication.
--password: Specifies a password to authenticate to a MongoDB database that uses authentication.
--authenticationDatabase: Specifies the authentication database where the specified --username has been created. If you do not specify an authentication database or a database to export, mongodump assumes the admin database holds the user's credentials.
--db: Specifies the database to take a backup from. If you do not specify a database, mongodump collects from all databases in this instance. Alternatively, you can specify the database directly in the URI connection string, i.e. mongodb://username:password@uri/dbname. Providing a connection string while also using --db and specifying conflicting information will result in an error.
--collection: Specifies a collection to back up. If you do not specify a collection, this option copies all collections in the specified database or instance to the dump files.
--query: Provides a JSON document as a query that optionally limits the documents included in the output of mongodump. You must enclose the query document in single quotes ('{ ... }') to ensure that it does not interact with your shell environment. The query must be in Extended JSON v2 format (either relaxed or canonical/strict mode), including enclosing the field names and operators in quotes, e.g. '{ "created_at": { "$gte": { "$date": "..." } } }'. To use the --query option, you must also specify the --collection option.
--forceTableScan: Forces mongodump to scan the data store directly. Typically, mongodump saves entries as they appear in the index of the _id field. If you specify a query with --query, mongodump will use the most appropriate index to support that query. Hence, you cannot use --forceTableScan with the --query option.
--gzip: Compresses the output. If mongodump outputs to the dump directory, this option compresses the individual files. The files have the suffix .gz.
--out: Specifies the directory where mongodump will write BSON files for the dumped databases. By default, mongodump saves output files in a directory named dump in the current working directory.
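The two rules above (--query needs --collection, and --query cannot be combined with --forceTableScan) are easy to get wrong in a script. Here is a minimal sketch of a pre-flight helper that enforces them. The function name dump_flags, the collection name orders and the status field are hypothetical, and the helper only prints the flags; it does not run mongodump.

```shell
# Hypothetical helper: prints the filter-related mongodump flags,
# enforcing the two rules described above. It does not run mongodump.
#   $1 = collection name (may be empty), $2 = query JSON (may be empty)
dump_flags() {
  collection=$1
  query=$2
  if [ -n "$query" ] && [ -z "$collection" ]; then
    echo "error: --query requires --collection" >&2
    return 1
  fi
  if [ -n "$collection" ]; then
    printf '%s ' "--collection=$collection"
  fi
  if [ -n "$query" ]; then
    # With a query, mongodump chooses an index itself, so no --forceTableScan.
    printf '%s\n' "--query='$query'"
  else
    printf '%s\n' "--forceTableScan"
  fi
}

# Example call with made-up names.
dump_flags orders '{ "status": "active" }'
```

The printed flags can then be pasted into (or appended to) the mongodump command shown earlier.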
Restoring the backup
We will use a utility program called mongorestore for restoring the database backup.
Copy the backup directory dump to the new database instance and run the following command:
mongorestore --uri="mongodb://user:password@host:port/?authSource=admin" \
  --drop --noIndexRestore --gzip -v ./dump
Replace the credentials with the new database credentials. Unlike in the previous step, the authentication database is specified in the URI string (authSource) rather than with the --authenticationDatabase option.
Also, pass --gzip only if it was used while creating the backup.
--drop: Before restoring the collections from the dumped backup, drops the collections from the target database. It does not drop collections that are not in the backup.
--noIndexRestore: Prevents mongorestore from restoring and building indexes as specified in the corresponding mongodump output.
If you want to change the name of the database while restoring, you can do so using the --nsFrom="old_name.*" --nsTo="new_name.*" options. However, this won't work if you migrate with oplogs, which is a requirement when migrating from an online instance.
Migrating from an online database in MongoDB
The main challenge with migrating from an online database is that you cannot pause updates during the migration. Here is an overview of the steps:
- Run an initial bulk migration with oplog capture
- Run a sync job to mitigate the database connection switch latency
Now, to capture oplogs, a replica set must be initialized in the source and destination databases. This is because the oplogs are captured from the local.oplog.rs namespace, which is created after initializing a replica set. You can follow this guide to configure a replica set.
Initial Migration with Oplog Capture
Oplogs, in simple words, are the operation logs MongoDB records for each write operation in the database. Each entry describes a change to part of a document, so replaying the entries in order reproduces the database state. We are going to use these oplogs to capture any updates made to our old database during the migration process.
Run the mongodump program with the following options:
mongodump --uri=".../?authSource=admin" \
  --forceTableScan --oplog \
  --gzip -v --out ./dump
--oplog: Creates a file named oplog.bson as part of the mongodump output. The oplog.bson file, located in the top level of the output directory, contains oplog entries that occur during the mongodump operation. This file provides an effective point-in-time snapshot of the state of our database instance.
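Before moving on to the restore, it can be worth confirming that the dump really contains the oplog file. A small sketch follows. The function name has_oplog is hypothetical, and checking for oplog.bson.gz as well as oplog.bson is an assumption about how a given tool version names the file when --gzip is used, not something stated above.

```shell
# Hypothetical check: succeeds if the dump directory holds an oplog capture.
# Accepting oplog.bson.gz in addition to oplog.bson is an assumption.
has_oplog() {
  dump_dir=$1
  [ -f "$dump_dir/oplog.bson" ] || [ -f "$dump_dir/oplog.bson.gz" ]
}

if has_oplog ./dump; then
  echo "oplog capture found in ./dump"
else
  echo "no oplog file in ./dump; was --oplog passed to mongodump?"
fi
```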
Restore the data with oplog replay
In order to replay the oplogs, a special role is required. Let's create and assign the role to the database user being used for migration.
Create the role
// Run in the mongo shell against the admin database (the role is granted with db: "admin").
db.createRole({
  role: "interalUseOnlyOplogRestore",
  privileges: [
    {
      resource: { anyResource: true },
      actions: [ "anyAction" ]
    }
  ],
  roles: []
})
Assign the role
db.grantRolesToUser(
  "admin",
  [{ role: "interalUseOnlyOplogRestore", db: "admin" }]
);
Now you can restore using the mongorestore program with the following options:
mongorestore --uri="mongodb://admin:.../?authSource=admin" \
  --oplogReplay \
  --gzip -v ./dump
The above command uses the same user, admin, to whom the role was granted.
--oplogReplay: After restoring the database dump, replays the oplog entries from a bson file and restores the database to the point-in-time backup captured with the mongodump --oplog command.
Mitigating database connection switch latency
Alright, so far we are done with most of the heavy lifting. The only thing that remains is maintaining consistency between the databases during the connection switch in our application servers.
If you're running MongoDB version 3.6+, it's better to go for the Change Stream approach, which is an event-based mechanism introduced to capture changes in your database in an optimized way. Here is an article that covers it: An Introduction to Change Streams
Check out the generic sync script, which you can run as a CRON job every minute.
Update the variables in this script and run it as:
$ ./delta-sync.sh from_epoch_in_milliseconds
# from_epoch_in_milliseconds is automatically picked with every iteration if not supplied
Or you can set up a cron job to run this every minute.
* * * * * ~/delta-sync.sh
The output can be monitored with the following command (I'm running RHEL 8, refer to your OS guide for cron output)
$ tail -f /var/log/cron | grep CRON
This is a sample sync log.
CMD (~/cron/dsync.sh)
CMDOUT (INFO: Updated log registry to use new timestamp on next run.)
CMDOUT (INFO: Created sync directory: /home/ec2-user/cron/dump/2020-11-03T19:01:01Z)
CMDOUT (Fetching oplog in range [2020-11-03T19:00:01Z - 2020-11-03T19:01:01Z])
CMDOUT (2020-11-03T19:01:02.319+0000#011dumping up to 1 collections in parallel)
CMDOUT (2020-11-03T19:01:02.334+0000#011writing local.oplog.rs to /home/ec2-user/cron/dump/2020-11-03T19:01:01Z/local/oplog.rs.bson.gz)
CMDOUT (2020-11-03T19:01:04.943+0000#011local.oplog.rs 0)
CMDOUT (2020-11-03T19:01:04.964+0000#011local.oplog.rs 0)
CMDOUT (2020-11-03T19:01:04.964+0000#011done dumping local.oplog.rs (0 documents))
CMDOUT (INFO: Dump success!)
CMDOUT (INFO: Replaying oplogs...)
CMDOUT (2020-11-03T19:01:05.030+0000#011using write concern: &{majority false 0})
CMDOUT (2020-11-03T19:01:05.054+0000#011will listen for SIGTERM, SIGINT, and SIGKILL)
CMDOUT (2020-11-03T19:01:05.055+0000#011connected to node type: standalone)
CMDOUT (2020-11-03T19:01:05.055+0000#011mongorestore target is a directory, not a file)
CMDOUT (2020-11-03T19:01:05.055+0000#011preparing collections to restore from)
CMDOUT (2020-11-03T19:01:05.055+0000#011found collection local.oplog.rs bson to restore to local.oplog.rs)
CMDOUT (2020-11-03T19:01:05.055+0000#011found collection metadata from local.oplog.rs to restore to local.oplog.rs)
CMDOUT (2020-11-03T19:01:05.055+0000#011restoring up to 4 collections in parallel)
CMDOUT (2020-11-03T19:01:05.055+0000#011replaying oplog)
CMDOUT (2020-11-03T19:01:05.055+0000#011applied 0 oplog entries)
CMDOUT (2020-11-03T19:01:05.055+0000#0110 document(s) restored successfully. 0 document(s) failed to restore.)
CMDOUT (INFO: Restore success!)
You can stop this script after verifying that no more oplogs are being created, i.e., once the source DB has gone offline.
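The sync script itself is linked rather than reproduced here. As a rough sketch of the idea visible in the sample log (dump a time window of local.oplog.rs from the source, then replay it on the target), something like the following could work. It is not the author's script: the names SRC_URI, DST_URI, WORK_DIR, oplog_query, ms_to_s and sync_window are hypothetical, and it leaves out --gzip and the timestamp registry for brevity.

```shell
#!/bin/sh
# Hypothetical delta-sync sketch, not the script referenced above.
# SRC_URI, DST_URI and WORK_DIR are placeholder variables you would set.

# Print an Extended JSON v2 filter for oplog entries with
# from_seconds <= ts < to_seconds.
oplog_query() {
  printf '{"ts":{"$gte":{"$timestamp":{"t":%d,"i":0}},"$lt":{"$timestamp":{"t":%d,"i":0}}}}\n' "$1" "$2"
}

# Convert epoch milliseconds (the unit of the script argument) to seconds.
ms_to_s() {
  echo $(( $1 / 1000 ))
}

# Dump one oplog window from the source and replay it on the destination.
#   $1 = window start in epoch milliseconds
sync_window() {
  from_s=$(ms_to_s "$1")
  to_s=$(date +%s)
  out="$WORK_DIR/$to_s"
  mongodump --uri="$SRC_URI" --db=local --collection=oplog.rs \
    --query="$(oplog_query "$from_s" "$to_s")" --out "$out" || return 1
  # mongorestore --oplogReplay expects oplog.bson at the top of the directory.
  mkdir -p "$out/replay" &&
    mv "$out/local/oplog.rs.bson" "$out/replay/oplog.bson" &&
    mongorestore --uri="$DST_URI" --oplogReplay "$out/replay"
}
```

Calling sync_window with the previous run's epoch in milliseconds performs one iteration. Because oplog operations are idempotent, a window that slightly overlaps the previous one is safe to replay.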
This concludes the complete self-hosted MongoDB data migration guide. If you want to learn more about MongoDB, here is a useful resource on how to use MongoDB as a datasource in GoLang.
