MongoDB Below 3.4
MongoDB is a popular open-source NoSQL database that stores and retrieves data in a flexible and scalable manner. ClickPipes supports the integration of MongoDB as both the source and target database for building data pipelines. This article provides a comprehensive guide on how to add MongoDB (3.4 and earlier) to ClickPipes, enabling you to leverage its scalability, flexibility, querying, and indexing capabilities for your data processing needs.
Supported Versions and Architectures
Category | Description |
---|---|
Version | MongoDB 3.2 and 3.4. For MongoDB above 3.4, please choose the data source named MongoDB. |
Architecture | ● As a source: Supports replica set and sharded cluster architectures. Additionally, full and incremental data synchronization can be performed from the secondary nodes of a sharded cluster. ● As a target: Supports single node, replica set, and sharded cluster architectures. |
Supported Data Types
Category | Data Types |
---|---|
Strings and Code | String, JavaScript, Symbol |
Numeric Types | Double, Int32, Int64, Decimal128 |
Document and Array | Document, Array |
Binary and ObjectId | Binary, ObjectId |
Boolean Type | Boolean |
Date and Timestamp | Date, Timestamp |
Special Types | Min Key, Max Key, Null |
SQL Operations for Sync
DML: INSERT, UPDATE, DELETE
When MongoDB is used as a target database, you can select the write strategy through the advanced settings of the task node: in case of insert conflicts, you can choose to convert to an update or discard the record; in case of update failures, you can choose to convert to an insert or just log the issue.
Prerequisites
As a Source Database
Make sure that the source database is deployed as a replica set or a sharded cluster. If it is a standalone instance, you can configure it as a single-member replica set to enable the Oplog. For more information, see Convert a Standalone to a Replica Set.
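As a rough sketch of the standalone conversion mentioned above: assuming the mongod has been restarted with a replica set name (the hypothetical `rs0` here) and is reachable at a hypothetical `localhost:27017`, the configuration passed to `rs.initiate()` could be built as follows. The helper name `singleMemberConfig` is illustrative and not part of ClickPipes or MongoDB.

```javascript
// Build a replica set configuration document with exactly one member.
// setName must match the --replSet value (or replication.replSetName)
// that mongod was started with.
function singleMemberConfig(setName, host) {
  return {
    _id: setName,
    members: [{ _id: 0, host: host }]
  };
}

// Hypothetical values; replace with your own set name and address.
var config = singleMemberConfig("rs0", "localhost:27017");

// In the mongo shell, connected to the restarted mongod:
//   rs.initiate(config);
//   rs.status();   // inspect the member state afterwards
```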
Size the Oplog so that it can retain at least 24 hours' worth of operations. For detailed instructions, see Change the Size of the Oplog.
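One way to check the 24-hour requirement is to look at the output of the shell helper `db.getReplicationInfo()`, which reports the oplog size and the time span between its first and last entries. The sketch below assumes that output shape; `checkOplogWindow` is a hypothetical helper name.

```javascript
// Decide whether the oplog currently spans at least `requiredHours` of operations.
// `info` is expected to have the shape returned by db.getReplicationInfo(),
// in particular the fields timeDiffHours and logSizeMB.
function checkOplogWindow(info, requiredHours) {
  var hours = info.timeDiffHours;
  return {
    windowHours: hours,
    logSizeMB: info.logSizeMB,
    sufficient: hours >= requiredHours
  };
}

// In the mongo shell, on a replica set member:
//   printjson(checkOplogWindow(db.getReplicationInfo(), 24));
```

Keep in mind that a short window on an Oplog that has not yet filled up (used size well below the configured size) mostly reflects how recently the replica set was created, so the result is most meaningful on a deployment that has been running under typical load.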
Create an account and grant it permissions according to your permission management requirements, using one of the following options.
- Grant Read Access to Specific Databases
use admin
db.createUser(
  {
    user: "clickpipes",
    pwd: "my_password",
    roles: [
      { role: "read", db: "database_name" },
      { role: "read", db: "local" },
      { role: "read", db: "config" },
      { role: "clusterMonitor", db: "admin" },
    ]
  }
)
- Grant Read Access to All Databases
use admin
db.createUser(
  {
    user: "clickpipes",
    pwd: "my_password",
    roles: [
      { role: "readAnyDatabase", db: "admin" },
      { role: "clusterMonitor", db: "admin" },
    ]
  }
)
Tip: In sharded cluster architectures, a shard server cannot retrieve user permissions from the config database. You therefore need to create the corresponding users and grant permissions on the primary node of each shard.
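When read access is needed on several specific databases, the role list from the first option can be generated rather than typed by hand. This is only a sketch: the helper name `sourceReadRoles` and the database names `sales` and `inventory` are hypothetical, while the `local`, `config`, and `clusterMonitor` entries mirror the example above.

```javascript
// Build the roles array for a source-side user that reads specific databases.
function sourceReadRoles(databases) {
  var roles = databases.map(function (name) {
    return { role: "read", db: name };
  });
  roles.push({ role: "read", db: "local" });
  roles.push({ role: "read", db: "config" });
  roles.push({ role: "clusterMonitor", db: "admin" });
  return roles;
}

var userDoc = {
  user: "clickpipes",
  pwd: "my_password",
  roles: sourceReadRoles(["sales", "inventory"]) // hypothetical database names
};

// In the mongo shell:
//   db.getSiblingDB("admin").createUser(userDoc);
```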
For sharded cluster architectures, to improve data synchronization performance, ClickPipes will create a thread for each shard to read data. Before configuring data synchronization/development tasks, you also need to perform the following operations:
- Stop the Balancer to avoid data inconsistency caused by chunk migrations.
- Clean up orphaned documents to avoid _id conflicts.
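The two operations above could be carried out roughly as follows in the mongo shell. The loop follows the pattern of calling the `cleanupOrphaned` admin command repeatedly, continuing from `stoppedAtKey` until no key is returned. The helper name `cleanupOrphans` and the namespace `demodata.orders` are hypothetical, and the command runner is passed in as a parameter so that the loop does not depend on a live connection.

```javascript
// Repeatedly run cleanupOrphaned over one namespace until the whole key range is covered.
// runAdminCommand is a function that executes an admin command and returns its reply.
function cleanupOrphans(runAdminCommand, namespace) {
  var nextKey = {};
  var passes = 0;
  while (nextKey != null) {
    var result = runAdminCommand({
      cleanupOrphaned: namespace,
      startingFromKey: nextKey
    });
    if (result.ok != 1) {
      throw new Error("cleanupOrphaned failed for " + namespace);
    }
    passes += 1;
    // stoppedAtKey is absent once the last range has been processed.
    nextKey = result.stoppedAtKey;
  }
  return passes;
}

// On a mongos:
//   sh.stopBalancer();
//   sh.getBalancerState();   // should report false once the balancer is disabled
//
// On the primary of each shard (not through mongos):
//   cleanupOrphans(function (cmd) { return db.adminCommand(cmd); }, "demodata.orders");
```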
As a Target Database
Grant the readWrite role on the target database (for example, demodata) and the clusterMonitor role, which is used for data validation:
use admin
db.createUser(
  {
    user: "clickpipes",
    pwd: "my_password",
    roles: [
      { role: "readWrite", db: "demodata" },
      { role: "clusterMonitor", db: "admin" },
    ]
  }
)
When using MongoDB version 3.2, you also need to grant read permissions for the local database.
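The version-dependent rule above can be captured in a small helper that adds the extra `local` read role only for 3.2 servers. This is a sketch under the assumption that the version string comes from the shell's `db.version()`; the name `targetRoles` is hypothetical.

```javascript
// Build the roles array for a target-side user.
// serverVersion is a string such as "3.2.22" or "3.4.24".
function targetRoles(dbName, serverVersion) {
  var roles = [
    { role: "readWrite", db: dbName },
    { role: "clusterMonitor", db: "admin" }
  ];
  var parts = serverVersion.split(".");
  if (parts[0] === "3" && parts[1] === "2") {
    // MongoDB 3.2 additionally needs read access to the local database.
    roles.push({ role: "read", db: "local" });
  }
  return roles;
}

// In the mongo shell:
//   db.getSiblingDB("admin").createUser({
//     user: "clickpipes",
//     pwd: "my_password",
//     roles: targetRoles("demodata", db.version())
//   });
```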
Add MongoDB Below 3.4 Data Source
In the left navigation bar, click Connections.
Click Create on the right side of the page.
In the pop-up dialog, search for and select MongoDB Below 3.4.
On the redirected page, fill in the MongoDB connection information as described below.
- Connection Settings
  - Name: Enter a unique name that has business significance.
  - Type: Supports MongoDB as a source or target database.
  - Connection Mode: Choose how you want to connect:
    - URI Mode: Provide the database connection URI, with the username and password included in the connection string. For example:
      mongodb://admin:password@192.168.0.100:27017/mydb?replicaSet=xxx&authSource=admin
    - Standard Mode: Fill in the database address, database name, account, password, and other connection string parameters.
- Advanced Settings
  - Use TLS/SSL Connection: Choose according to your business needs:
    - TLS/SSL Connection: ClickPipes will connect to a separate server in the network that provides a TLS/SSL channel to the database. If your database is located in an inaccessible subnet, try this method: upload the client private key file, provide the private key password, and choose whether to validate the server certificate.
    - Direct Connection: ClickPipes Cloud will connect directly to the database, and you need to set up security rules to allow access.
Click Test. Once it passes, click Save.
Tip: If the connection test fails, follow the prompts on the page to troubleshoot and resolve the issue.
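For the URI Mode connection string described above, usernames and passwords that contain characters such as `@`, `:`, or `/` need to be percent-encoded before they are placed in the URI. The following sketch assembles a URI of the same shape as the example; `buildMongoUri` and all values passed to it are hypothetical.

```javascript
// Assemble a mongodb:// connection string, percent-encoding the credentials.
function buildMongoUri(opts) {
  var credentials =
    encodeURIComponent(opts.user) + ":" + encodeURIComponent(opts.password);
  var query = [];
  if (opts.replicaSet) {
    query.push("replicaSet=" + encodeURIComponent(opts.replicaSet));
  }
  if (opts.authSource) {
    query.push("authSource=" + encodeURIComponent(opts.authSource));
  }
  return "mongodb://" + credentials + "@" + opts.hosts.join(",") +
    "/" + opts.database + (query.length ? "?" + query.join("&") : "");
}

// Hypothetical values for illustration only.
var uri = buildMongoUri({
  user: "admin",
  password: "p@ss:word/1",
  hosts: ["192.168.0.100:27017"],
  database: "mydb",
  replicaSet: "rs0",
  authSource: "admin"
});
```

The resulting string can then be pasted into the URI field of the connection form.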