@grouparoo/mongo

Last Updated: 2021-08-02

Grouparoo's MongoDB plugin enables you to import data from one or more MongoDB databases.

This guide will show you how to work with the MongoDB plugin to create a Source to import your data.

Install the Mongo Plugin

To work with the Mongo plugin, you must first install it in an existing Grouparoo project. You can do this using the install command from our CLI:

$ grouparoo install @grouparoo/mongo

This adds the package to your package.json file as a dependency and lists it in the grouparoo.plugins section of the same file, which enables the plugin.

// package.json
{
  // ...
  "dependencies": {
    "@grouparoo/mongo": "...",
    // ...
  },
  "grouparoo": {
    "plugins": [
      "@grouparoo/mongo",
      // ...
    ]
  }
}

Once the plugin is installed, you'll be working primarily with the CLI's configuration commands to get everything set up.

Create a Mongo App

With Grouparoo, an App is how we establish a connection with a source or destination. Add this connection by generating an App:

$ grouparoo generate mongo:app my_mongo_app

This will generate a file at config/apps/my_mongo_app.js. Open this file and edit the connection details to match your desired configuration. Here is an example of what this config object will look like after generation:

// config/apps/my_mongo_app.js
exports.default = async function buildConfig() {
  return [
    {
      class: "app",
      id: "my_mongo_app",
      name: "my_mongo_app",
      type: "mongo",
      options: {
        uri: "...",
        database: "...",
      },
    },
  ];
};

Mongo App Options

Here are the Mongo-specific options available to you in the options section of the config file:

uri [required]

MongoDB Connection String.

database [required]

The database name - e.g. "data_warehouse".
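Because the App config is a plain JavaScript file, the two required options can be computed instead of hard-coded. The following is a minimal sketch, not part of the plugin: the helper name mongoAppOptions and the environment variable names MONGO_URI and MONGO_DATABASE are assumptions made for illustration, and the connection values are placeholders.

```javascript
// Hypothetical helper: builds the `options` object for a Mongo App from
// environment variables. MONGO_URI and MONGO_DATABASE are assumed names,
// not something the plugin defines.
function mongoAppOptions(env = process.env) {
  const missing = ["MONGO_URI", "MONGO_DATABASE"].filter((key) => !env[key]);
  if (missing.length > 0) {
    // Fail early so a missing required option is caught before validation.
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return { uri: env.MONGO_URI, database: env.MONGO_DATABASE };
}

// Example with an explicit environment object (placeholder values):
const options = mongoAppOptions({
  MONGO_URI: "mongodb://localhost:27017",
  MONGO_DATABASE: "data_warehouse",
});
console.log(options);
```

The returned object could then be used as the options value in config/apps/my_mongo_app.js, which keeps credentials out of the config file itself.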

Validating & Applying Your Config

You can validate your config at any time using the validate command:

$ grouparoo validate

And you can apply that config (save it to your Grouparoo application's database) using the apply command:

$ grouparoo apply

Create a Mongo Source

The Mongo Source is a specific type of Source that we call a Columnar Source, which means it imports data from a column-based mechanism, like a database. Columnar Sources can take one of two forms:

  • A Columnar Table Source targets specific fields within a single collection and can apply aggregation methods to those fields.
  • A Columnar Query Source provides the ability to write custom MongoDB Query Language (MQL) code to extract data from one or more collections and import the result into Grouparoo.

Create a Mongo Table Source

You can generate a Mongo Table Source using the generate command. You must specify a parent, which should match the id of the App you created.

This is the simplest form of Generator you can use for Table Sources:

$ grouparoo generate mongo:table:source users --parent my_mongo_app

This generates a file at config/sources/users.js. Open this file and edit the options to match your desired configuration. Table Sources are among the most complex config objects generated, so there is a lot going on in these files.

Here is a filled out version of a common use case. We'll step through the unique pieces below.

exports.default = async function buildConfig() {
  return [
    // --- Source ---
    {
      class: "source",
      id: "users",
      name: "users",
      type: "mongo-table-import",
      appId: "my_mongo_app",
      options: {
        table: "users",
      },
      mapping: {
        id: "user_id",
      },
    },
    // --- Schedule ---
    {
      id: "users_schedule",
      name: "users_schedule",
      class: "schedule",
      sourceId: "users",
      recurring: true,
      recurringFrequency: 1000 * 60 * 15,
      options: {
        column: "updated_at",
      },
    },
  ];
};
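The recurringFrequency value in the Schedule above, 1000 * 60 * 15, reads as a duration in milliseconds, which works out to 15 minutes. As a small sketch under that assumption, a helper (the name minutesToMs is hypothetical) can make the intent easier to read:

```javascript
// Hypothetical helper, assuming recurringFrequency is expressed in
// milliseconds, as the 1000 * 60 * 15 expression suggests.
function minutesToMs(minutes) {
  return minutes * 60 * 1000;
}

const recurringFrequency = minutesToMs(15);
console.log(recurringFrequency === 1000 * 60 * 15); // true
```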

Table Source Options

Here are the options for a Mongo Table Source:

table [required]

Name of the collection in the Mongo database.

Table Source Mappings

Defining Mappings is a critical part of the process, as it tells Grouparoo with which Profile to associate imported data.

For example, let's say your database has a field named email and that maps directly to a unique Property on the Profile in Grouparoo called emailAddress. In that case, your mapping would look like this:

mapping: {
  email: "emailAddress",
}

Configuring your First Source

Before you can define a Mapping, you must have a primary key for Profiles. In most cases, this is an ID or an e-mail address. Grouparoo will automatically determine what field is used for a primary key based on the mapping and automatically generate one Profile per unique value in that field.
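To make the "one Profile per unique value" idea concrete, here is an illustrative sketch in plain JavaScript. It is not Grouparoo's implementation: the function name groupByMappedField, the sample documents, and the shape of the returned objects are all assumptions made for the example. It only shows how a mapping such as { email: "emailAddress" } can group source documents by the mapped field.

```javascript
// Illustrative only: not Grouparoo's implementation.
// `mapping` has the shape { sourceField: "grouparooPropertyId" }.
function groupByMappedField(documents, mapping) {
  const [sourceField, propertyId] = Object.entries(mapping)[0];
  const profiles = new Map();
  for (const doc of documents) {
    const key = doc[sourceField];
    if (key === undefined || key === null) continue; // nothing to match on
    if (!profiles.has(key)) {
      profiles.set(key, { [propertyId]: key, documents: [] });
    }
    profiles.get(key).documents.push(doc);
  }
  return [...profiles.values()];
}

// Made-up sample data: two documents share the same email.
const documents = [
  { email: "ada@example.com", plan: "pro" },
  { email: "bob@example.com", plan: "free" },
  { email: "ada@example.com", plan: "team" },
];

const profiles = groupByMappedField(documents, { email: "emailAddress" });
console.log(profiles.length); // 2
```

Three documents produce two groups because two of them share the same value in the mapped field.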

Table Source Schedule

Every generated Source config file includes a Schedule, commented out by default. If you want to import data from the Source on a schedule (which is the typical behavior), uncomment it and fill in the necessary values.

You can read more about the common options here. The Mongo-specific options (those in the options object) are:

column [required]

The name of the field to use as the high watermark.
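A high watermark is a general incremental-import technique: remember the largest value seen in the chosen field, and on the next run only pick up documents with a larger value. The sketch below shows that idea in plain JavaScript. It is a generic illustration, not Grouparoo's code; importSince and the sample data are hypothetical.

```javascript
// Illustrative only: a generic high-watermark pass, not Grouparoo's code.
// Returns the documents newer than `watermark` plus the new watermark.
function importSince(documents, column, watermark) {
  const fresh = documents.filter(
    (doc) => watermark === null || doc[column] > watermark
  );
  let next = watermark;
  for (const doc of fresh) {
    if (next === null || doc[column] > next) next = doc[column];
  }
  return { fresh, watermark: next };
}

// Made-up sample data using the updated_at field from the example Schedule.
const users = [
  { user_id: 1, updated_at: new Date("2021-08-01T00:00:00Z") },
  { user_id: 2, updated_at: new Date("2021-08-02T00:00:00Z") },
];

// First run: no watermark yet, so every document is imported.
const first = importSince(users, "updated_at", null);
console.log(first.fresh.length); // 2

// A new document arrives; the second run only picks up that one.
users.push({ user_id: 3, updated_at: new Date("2021-08-03T00:00:00Z") });
const second = importSince(users, "updated_at", first.watermark);
console.log(second.fresh.map((doc) => doc.user_id)); // [ 3 ]
```

This is why a field such as updated_at, which increases whenever a document changes, is a natural choice for the column option.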

Table Source Properties

After you generate a Source, you'll likely want to add Properties to it. You can do this through the CLI:

$ grouparoo generate mongo:table:property first_name --parent users

The Property generator will drop individual files in the config/properties directory. Edit these files to match your desired configuration.

Table Source Property Options

The Property config object has several options. Some are shared across all Properties, while others are specific to the type of Property generated. A Mongo Table Source has a few unique options, which can be found in the options object in the config file. They are:

column [required]

The name of the field to use for the Property.

aggregationMethod [required] (default: "exact")

The type of aggregation method to use when extracting the data. The available options will be added to the generated config file as a comment.

sort [required] (default: null)

Table Source Property Filters

A Mongo Table Source also provides the ability to filter your data via the filters option. This is a series of rules that will filter data in the database collection to find the appropriate value for each Profile for a given Property.

For example, let's say you had a property called lifetime_value which summed all the purchases for a given user. Your Source is a purchases collection that has a state field set to either successful or returned. You may only want to include successful purchases. Your filters config might look like this:

{
  filters: [{ key: "state", op: "equals", match: "successful" }],
}

The available operators (op) will be added to the generated config file as a comment near the filters section.
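To illustrate what the lifetime_value example computes, here is a plain JavaScript sketch that applies the same filter rule to a few made-up purchases and sums the remaining prices. It is not how the plugin evaluates filters: applyFilters is a hypothetical helper, and it models only the "equals" operator shown above.

```javascript
// Illustrative only. Only the "equals" operator from the example is modeled;
// the generated config file lists the operators that are actually available.
function applyFilters(documents, filters) {
  return documents.filter((doc) =>
    filters.every(({ key, op, match }) => {
      if (op !== "equals") {
        throw new Error(`Operator not modeled in this sketch: ${op}`);
      }
      return doc[key] === match;
    })
  );
}

// Made-up purchases for a single user.
const purchases = [
  { user_id: 1, price: 20, state: "successful" },
  { user_id: 1, price: 35, state: "returned" },
  { user_id: 1, price: 5, state: "successful" },
];

const filters = [{ key: "state", op: "equals", match: "successful" }];

// Sum the price of the purchases that pass every filter rule.
const lifetimeValue = applyFilters(purchases, filters).reduce(
  (sum, purchase) => sum + purchase.price,
  0
);
console.log(lifetimeValue); // 25
```

The returned purchase is excluded, so only the two successful purchases contribute to the total.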

Create a Mongo Query Source

You can generate a Mongo Query Source using the generate command. You must specify a parent, which should match the id of the App you created.

$ grouparoo generate mongo:query:source users --parent my_mongo_app

A Query Source is a more flexible way to build properties. With a Query Source, you can add custom MongoDB Query Language (MQL) commands to your Properties, which could pull data from one or more collections in your database.

Query Source Options

A Query Source has no unique options of its own. The query logic lives in its Schedule and its Properties instead.

Query Source Schedule

Like a Table Source Schedule, a Query Source Schedule is included with the generated config file, commented out. It has a couple of unique options:

query

A MongoDB Query Language (MQL) query that tells Grouparoo which Profiles to check each time the Schedule runs.

propertyId

The id of the Grouparoo Property whose data is returned by options.query.

Here's an example:

{
  options: {
    query: [
      {
        $match: {
          updatedAt: {
            $gt: "new Date(ISODate().getTime() - 1000 * 60 * 60 * 24 * 2)",
          },
        },
      },
      {
        $project: {
          _id: 1,
        },
      },
    ],
    propertyId: "userId",
  },
}
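The expression 1000 * 60 * 60 * 24 * 2 in the $match stage is two days in milliseconds, so the query selects documents updated within the last two days and returns only their _id. The sketch below reproduces that selection in plain JavaScript against made-up data. The names cutoffDate and profilesToCheck are hypothetical, and this is an illustration of the arithmetic rather than of how the plugin runs the query.

```javascript
// Two days expressed in milliseconds, matching the expression in the query.
const TWO_DAYS_MS = 1000 * 60 * 60 * 24 * 2;

// Hypothetical helper: the earliest updatedAt value that still qualifies.
function cutoffDate(now = new Date()) {
  return new Date(now.getTime() - TWO_DAYS_MS);
}

// Hypothetical helper mirroring the $match and $project stages in plain JS.
function profilesToCheck(documents, now = new Date()) {
  const cutoff = cutoffDate(now);
  return documents
    .filter((doc) => doc.updatedAt > cutoff) // $match: updatedAt > cutoff
    .map((doc) => ({ _id: doc._id })); // $project: keep only _id
}

const now = new Date("2021-08-02T12:00:00Z");
console.log(cutoffDate(now).toISOString()); // 2021-07-31T12:00:00.000Z

// Made-up documents: only the second one was updated inside the window.
const sample = [
  { _id: "a", updatedAt: new Date("2021-07-30T00:00:00Z") },
  { _id: "b", updatedAt: new Date("2021-08-01T00:00:00Z") },
];
console.log(profilesToCheck(sample, now)); // [ { _id: 'b' } ]
```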

Query Source Properties

Query Sources are a little simpler than Table Sources when it comes to Properties. You can generate a Property using the CLI:

$ grouparoo generate mongo:query:property lifetime_value --parent users

This will drop a file at config/properties/lifetime_value.js. Edit this file to match your desired configuration.

There is one unique option for Query Source Properties:

query

The query to extract the Property. You can use mustache variables, such as {{userId}}, to reference other Properties by id. Any Property you have created in Grouparoo can be used this way.

Here's an example that sums the values in the price field for documents in which the user_id field's value matches the value of the Grouparoo Profile's userId field (i.e. userId is the id for the Property in Grouparoo):

{
  options: {
    query: [
      {
        $match: {
          user_id: {{userId}},
        },
      },
      {
        $group: {
          _id: null,
          total: {
            $sum: "$price",
          },
        },
      },
      {
        $project: {
          _id: 0,
        },
      },
    ],
  },
}
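The {{userId}} placeholder in the query above is filled in per Profile before the query runs. As a rough sketch of that substitution step, the code below replaces mustache placeholders in a query template with values from a Profile. The helper renderQuery is hypothetical, and encoding values with JSON.stringify is an assumption made for the example; how the plugin actually renders and escapes values is not described here.

```javascript
// Hypothetical helper: replaces {{key}} placeholders with JSON-encoded
// values. How the plugin itself encodes values is an assumption here.
function renderQuery(template, profileProperties) {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_, key) => {
    if (!(key in profileProperties)) {
      throw new Error(`No Property with id "${key}"`);
    }
    return JSON.stringify(profileProperties[key]);
  });
}

// A one-stage template using the same placeholder as the example above.
const template = '[{ "$match": { "user_id": {{userId}} } }]';

const rendered = renderQuery(template, { userId: 42 });
console.log(rendered); // [{ "$match": { "user_id": 42 } }]

// Once rendered, the text parses as an ordinary pipeline array.
const pipeline = JSON.parse(rendered);
console.log(pipeline[0].$match.user_id); // 42
```

Throwing on an unknown placeholder is a design choice for the sketch: it surfaces a mistyped Property id immediately instead of sending a malformed query to the database.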

Mongo Next Steps

Once you have the plugin installed, App created, and a Source configured, you are ready to validate, apply, then import your data!