@grouparoo / redshift

Last Updated: 2021-10-12

Grouparoo's Redshift Plugin enables you to import data from one or more Redshift databases or to export data to a Redshift database.

This guide will show you how to work with the Redshift Plugin to create a Source to import your data, or to create a Destination to export your data.

Install the Redshift Plugin

To work with the Redshift Plugin, you must first install it in an existing Grouparoo project. You can do this by running grouparoo config and choosing to add the new App, or by using the install command from the Grouparoo CLI: grouparoo install @grouparoo/redshift

Both methods of installation add the package to your package.json file as a dependency, and also add the Plugin to the grouparoo.plugins section of that same file, which enables it:

// package.json
{
  // ...
  "dependencies": {
    "@grouparoo/redshift": "...",
    // ...
  },
  "grouparoo": {
    "plugins": [
      "@grouparoo/redshift"
      // ...
    ]
  }
}
Once the Plugin is installed, you'll be working primarily with the CLI's configuration commands to get everything set up.

Create a Redshift App

With Grouparoo, an App is how we establish a connection with a Source or Destination. Add this connection by generating an App:

$ grouparoo generate redshift:app my_redshift_app

This will generate a file at config/apps/my_redshift_app.js. Open this file and edit the connection details to match your desired configuration. Here is an example of what this config object will look like after generation:

// config/apps/my_redshift_app.js
exports.default = async function buildConfig() {
  return [
    {
      class: "app",
      id: "my_redshift_app",
      name: "my_redshift_app",
      type: "redshift",
      options: {
        host: "localhost",
        port: 5439,
        database: "...",
        schema: "public",
        user: "...",
        password: "...",
      },
    },
  ];
};

Redshift App Options

Here are the Redshift-specific options available to you in the options section of the config file:

host (default: localhost)

Database host.

port (default: 5439)

Port on which to open database connection.

database [required]

The database name - e.g. "data_warehouse".

schema (default: public)

Schema used to read tables. The Plugin works inside a single schema.


user

The user to connect to the database - e.g. "grouparoo_user". If you are connecting to localhost, leave as undefined.


password

The database password - e.g. "P@assword". If you don't have a password, leave as undefined.

Validating & Applying Your Config

You can validate your config at any time using the validate command:

$ grouparoo validate

And you can apply that config (save it to your Grouparoo application's database) using the apply command:

$ grouparoo apply

Create a Redshift Source

The Redshift Source is a specific type of Source that we call a Columnar Source, which means it imports data from a column-based mechanism, like a database. Columnar Sources can take one of two forms:

  • A Columnar Table Source targets specific columns within a single table and can perform aggregation methods on that column.
  • A Columnar Query Source provides the ability to write custom SQL code to extract data from one or more tables and import the result into Grouparoo.

Create a Redshift Table Source

You can generate a Redshift Table Source using the generate command. You must specify a parent, which should match the id of the App you created.

This is the simplest form of Generator you can use for Table Sources:

$ grouparoo generate redshift:table:source users --parent my_redshift_app

This generates a file at config/sources/users.js. Open this file and edit the options to match your desired configuration. Table Sources are among the most complex config objects generated, so there is a lot going on in these files.

Here is a filled out version of a common use case. We'll step through the unique pieces below.

// config/sources/users.js
exports.default = async function buildConfig() {
  return [
    {
      // --- Source ---
      class: "source",
      id: "users",
      name: "users",
      type: "redshift-table-import",
      appId: "my_redshift_app",
      options: {
        table: "users",
      },
      mapping: {
        id: "user_id",
      },
    },
    {
      // --- Schedule ---
      class: "schedule",
      id: "users_schedule",
      name: "users_schedule",
      sourceId: "users",
      recurring: true,
      recurringFrequency: 1000 * 60 * 15, // 15 minutes, in milliseconds
      options: {
        column: "updated_at",
      },
    },
  ];
};

Table Source Options

Here are the Redshift-specific options for a Table Source:

table [required]

Name of the table in the Redshift database.

Table Source Mappings

Defining Mappings is a critical part of the process, as it tells Grouparoo with which Record to associate imported data.

For example, let's say your database has a column named email and that maps directly to a unique Property on the Record in Grouparoo called emailAddress. In that case, your Mapping would look like this:

mapping: {
  email: "emailAddress",
},

Configuring your First Source

Before you can define a Mapping, you must have a primary key for Records. In most cases, this is an ID or an email address. Grouparoo determines which field is used as the primary key based on the Mapping, and generates one Record per unique value in that field.

Table Source Schedule

A Schedule is attached to every Source config file by default, commented out. If you want to import data from the source on a Schedule (which is the typical behavior), then you should remove the comments and apply the necessary values.

You can read more about the common options here. The Redshift-specific options (those in the options object) are:

column [required]

The name of the column to use as the High Watermark.

Table Source Properties

After you generate a Source, you'll likely want to add Properties to it. You can do this through the CLI:

$ grouparoo generate redshift:table:property first_name --parent users

The Property Generator will drop individual files in the config/properties directory. Edit these files to match your desired configuration.

Table Source Property Options

The Property config object has several options. Some are shared across all Properties, while others are specific to the type of Property generated. Properties on a Redshift Table Source have a few unique options. These can be found in the options object in the config file. They are:

column [required]

The name of the column to use for the Property.

aggregationMethod [required] (default: "exact")

The type of aggregation method to use when extracting the data. The available options will be added to the generated config file as a comment.

sort [required] (default: null)

The column used to order rows when the aggregation method requires sorting.
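To see these options in context, here is a sketch of what a generated Table Source Property config file might look like. The field values here are illustrative; your generated file will include comments listing the valid choices for each option:

```javascript
// config/properties/first_name.js (illustrative sketch)
exports.default = async function buildConfig() {
  return [
    {
      class: "property",
      id: "first_name",
      name: "first_name",
      type: "string", // the Grouparoo Property type
      sourceId: "users", // the id of the parent Source
      options: {
        column: "first_name",
        aggregationMethod: "exact",
        sort: null,
      },
      filters: [],
    },
  ];
};
```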

Table Source Property Filters

A Redshift Table Source also provides the ability to Filter your data via the filters option. This is a series of rules that will Filter data in the database table to find the appropriate value for each Record for a given Property.

For example, let's say you had a Property called lifetime_value which summed all the purchases for a given user. Your Source is a purchases table that has a state column set to either successful or returned. You may only want to include successful purchases. Your filters config might look like this:

filters: [{ key: "state", op: "eq", match: "successful" }],

The available operators (op) will be added to the generated config file as a comment near the filters section.

Create a Redshift Query Source

You can generate a Redshift Query Source using the generate command. You must specify a parent, which should match the id of the App you created.

$ grouparoo generate redshift:query:source users --parent my_redshift_app

A Query Source is a more flexible way to build properties. With a Query Source, you can add custom SQL commands to your Properties, which could pull data from one or more tables in your database.

Query Source Options

The unique thing about the way a Query Source works is that there aren't any unique options for the Source itself.
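For reference, this means a generated Query Source config file is relatively small. A sketch is below; the exact type string and any additional fields come from the generator, so treat your generated file as authoritative:

```javascript
// config/sources/users.js (illustrative sketch of a Query Source)
exports.default = async function buildConfig() {
  return [
    {
      class: "source",
      id: "users",
      name: "users",
      type: "redshift-query-import", // assumed type string; check your generated file
      appId: "my_redshift_app",
      options: {}, // no Source-level options for a Query Source
    },
  ];
};
```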

Query Source Schedule

Like a Table Source Schedule, a Query Source Schedule is included (commented out) in the generated config file. It has a couple of unique options:


query [required]

A SQL query that tells Grouparoo which Records to check each time the interval occurs.


propertyId [required]

The id of the Grouparoo Property whose data is returned by options.query.

Here's an example:

options: {
  query: "SELECT id FROM users WHERE updated_at >= (NOW() - INTERVAL '2 day')",
  propertyId: "userId",
},

Query Source Properties

Query Sources are a little simpler than Table Sources when it comes to Properties. You can generate a Property using the CLI:

$ grouparoo generate redshift:query:property lifetime_value --parent users

This will drop a file at config/properties/lifetime_value.js. Edit this file to match your desired configuration.

There is one unique option for Query Source Properties:


query [required]

The query used to extract the Property's value. You can use mustache variables to represent the keys of other Properties in the system, using the id of any other Property you created in Grouparoo.

Here's an example that sums the values in the price column for rows in which the user_id column's value matches the value of the Grouparoo Record's userId field (i.e. userId is the id for the Property in Grouparoo):

options: {
  query: "SELECT SUM(price) FROM purchases WHERE user_id = {{userId}}",
},
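Putting it together, a full Query Source Property config file might look roughly like the following. The surrounding fields and the type value are illustrative; defer to the file the generator produces:

```javascript
// config/properties/lifetime_value.js (illustrative sketch)
exports.default = async function buildConfig() {
  return [
    {
      class: "property",
      id: "lifetime_value",
      name: "lifetime_value",
      type: "float", // assumed type for a summed currency value
      sourceId: "users", // the id of the parent Query Source
      options: {
        query: "SELECT SUM(price) FROM purchases WHERE user_id = {{userId}}",
      },
    },
  ];
};
```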

Create a Redshift Destination

Redshift is one of the few integrations that can act as both a Source and a Destination. You can use Grouparoo to export data to a Redshift database. Like a Source, a Destination requires that you've already configured an App describing the connection details for your Redshift database.

Once your App exists, you can generate the config file for the Destination:

$ grouparoo generate redshift:destination my_exports

This will generate a config file at config/destinations/my_exports.js in your Grouparoo project. Edit this file to match your desired configuration.

The Redshift Destination has a few unique options:


table [required]

The name of the table which Grouparoo will use to export Records.


primaryKey [required]

The primary key for the specified table.


groupsTable [required]

Grouparoo also exports Group memberships. This is the table used to export Group data.


groupForeignKey [required]

The foreign key that the groups table uses to reference the records table.


groupColumnName [required]

The column name used to store the Group name values.
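As a sketch, a filled-out Destination config file might look like the following. The table names, the type string, and the mapping fields are illustrative assumptions; the generated file will include the full set of fields and explanatory comments:

```javascript
// config/destinations/my_exports.js (illustrative sketch)
exports.default = async function buildConfig() {
  return [
    {
      class: "destination",
      id: "my_exports",
      name: "my_exports",
      type: "redshift-export", // assumed type string; check your generated file
      appId: "my_redshift_app",
      options: {
        table: "output_users",
        primaryKey: "id",
        groupsTable: "output_groups",
        groupForeignKey: "user_id",
        groupColumnName: "group_name",
      },
      // map Grouparoo Properties to columns in the destination table
      mapping: {
        id: "user_id",
        email: "emailAddress",
      },
    },
  ];
};
```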

Redshift Next Steps

Once you have the Plugin installed, App created, and a Source or Destination configured, you are ready to validate, apply, then import or export your data!