@grouparoo / clickhouse

Last Updated: 2021-11-15

Grouparoo's ClickHouse Plugin enables you to import data from one or more ClickHouse databases or to export data to a ClickHouse database.

This guide will show you how to work with the ClickHouse Plugin to create a Source to import your data, or to create a Destination to export your data.

Install the ClickHouse Plugin

To work with the ClickHouse Plugin, you must first install it in an existing Grouparoo project. You can do this using grouparoo config and choosing to add the new App. You can also install a plugin with the Grouparoo CLI and use the install command from our CLI: grouparoo install @grouparoo/clickhouse

Both methods of installation add the package to your package.json file as a dependency, and also drops the Plugin in the grouparoo.plugins section in that same file, which enables it.

// package.json
// ...
"dependencies": {
"@grouparoo/clickhouse": "...",
// ...
"grouparoo": {
"plugins": [
// ...

Once the Plugin is installed, you'll be working primarily with the CLI's configuration commands to get everything set up.

Create a ClickHouse App

With Grouparoo, an App is how we establish a connection with a Source or Destination. Add this connection by generating an App:

$ grouparoo generate clickhouse:app my_clickhouse_app

This will generate a file at config/apps/my_clickhouse_app.js. Open this file and edit the connection details to match your desired configuration. Here is an example of what this config object will look like after generation:

// config/apps/my_clickhouse_app.js
exports.default = async function buildConfig() {
return [
class: "app",
id: "my_clickhouse_app",
name: "my_clickhouse_app",
type: "clickhouse",
options: {
host: localhost,
port: 8123,
database: "...",
user: default,
password: "...",

ClickHouse App Options

Here are the ClickHouse-specific options available to you in the options section of the config file:

host [required] (default: localhost)

Database HTTP host.

port [required] (default: 8123)

HTTP Port on which to open database connection.

database [required]

The database name - e.g. "data_warehouse".

user (default: default)

The user to connect to the database - e.g. "grouparoo_user". If you are connecting to localhost, leave as default.


The database password - e.g. "P@assword". If you don't have a password, leave as undefined.

Validating & Applying Your Config

You can validate your config at any time using the validate command:

$ grouparoo validate

And you can apply that config (save it to your Grouparoo application's database) using the apply command:

$ grouparoo apply

Create a ClickHouse Source

The ClickHouse Source is a specific type of Source that we call a Columnar Source, which means it imports data from a column-based mechanism, like a database. Columnar Sources can take one of two forms:

  • A Columnar Table Source targets specific columns within a single table and can perform aggregation methods on that column.
  • A Columnar Query Source provides the ability to write custom SQL code to extract data from one or more tables and import the result into Grouparoo.

Create a ClickHouse Table Source

You can generate a ClickHouse Table Source using the generate command. You must specify a parent, which should match the id of the App you created.

This is the simplest form of Generator you can use for Table Sources:

$ grouparoo generate clickhouse:table:source users --parent my_clickhouse_app

This generates a file at config/sources/users. Open this file and edit the options to match your desired configuration. Table Sources are among the most complex config objects generated. There is a lot going on in these files.

Here is a filled out version of a common use case. We'll step through the unique pieces below.

exports.default = async function buildConfig() {
return [
// --- Source ---
class: "source",
id: "users",
name: "users",
type: "clickhouse-table-import",
appId: "my_clickhouse_app",
options: {
table: "users",
mapping: {
id: "user_id",
// --- Schedule ---
id: "users_schedule",
name: "users_schedule",
class: "schedule",
sourceId: "users",
recurring: true,
recurringFrequency: 1000 * 60 * 15,
options: {
column: "updated_at",

Table Source Options

Here are the options for a ClickHouse

table [required]

Name of the table in the ClickHouse database.

Table Source Mappings

Defining Mappings is a critical part of the process, as it tells Grouparoo with which Record to associate imported data.

For example, let's say your database has a column named email and that maps directly to a unique Property on the Record in Grouparoo called emailAddress. In that case, your Mapping would look like this:

mapping: {
email: "emailAddress",

Configuring your First Source

Before you can define a Mapping, you must have a primary key for Records. In most cases, this is an ID or an e-mail address. Grouparoo will automatically determine what field is used for a primary key based on the Mapping and automatically generate one Record per unique value in that field.

Table Source Schedule

A Schedule is attached to every Source config file by default, commented out. If you want to import data from the source on a Schedule (which is the typical behavior), then you should remove the comments and apply the necessary values.

You can read more about the common options here. The ClickHouse-specific options (those in the options object) are:

column [required]

The name of the column to use as the High Watermark.

Table Source Properties

After you generate a Source, you'll likely want to add Properties to it. You can do this through the CLI:

$ grouparoo generate clickhouse:table:property first_name --parent users

The Property Generator will drop individual files in the config/properties directory. Edit these files to match your desired configuration.

Table Source Property Options

The Property config object has several options. Some are share across all Properties, while others are more specific to the type of Property generated. A ClickHouse Table Source has a few unique options. These can be found in the options object in the config file. They are:

column [required]

The name of the column to use for the Property.

aggregationMethod [required] (default: "exact")

The type of aggregation method to use when extracting the data. The available options will be added to the generated config file as a comment.

sort [required] (default: null)

Table Source Property Filters

A ClickHouse Table Source also provides the ability to Filter your data via the filters option. This is a series of rules that will Filter data in the database table to find the appropriate value for each Record for a given Property.

For example, let's say you had a Property called lifetime_value which summed all the purchases for a given user. Your Source is a purchases table that has a state column set to either successful or returned. You may only want to include successful purchases. Your filters config might look like this:

filters: [{ key: "state", op: "eq", match: "successful" }],

The available operators (op) will be added to the generated config file as a comment near the filters section.

Create a ClickHouse Query Source

You can generate a ClickHouse Query Source using the generate command. You must specify a parent, which should match the id of the App you created.

$ grouparoo generate clickhouse:query:source users --parent my_clickhouse_app

A Query Source is a more flexible way to build properties. With a Query Source, you can add custom SQL commands to your Properties, which could pull data from one or more tables in your database.

Query Source Options

The unique thing about the way a Query Source works is that there aren't any unique options for the Source itself.

Query Source Schedule

Like a Table Source Schedule, a Query Source Schedule is included with the generated config file, commented. It has a couple unique options:


A SQL query to return that tells Grouparoo which Records to check each time the interval occurs.


The id of the Grouparoo Property whose data is returned by options.query.

Here's an example:

options: {
query: "SELECT id FROM users WHERE updated_at >= (NOW() - INTERVAL '2 day')",
propertyId: "userId"

Query Source Properties

Query Sources are a little simpler than Table Sources when it comes to Properties. You can generate a Property using the CLI:

$ grouparoo generate clickhouse:query:property lifetime_value --parent users

This will drop a file at config/properties/lifetime_value.js. Edit this file to match your desired configuration.

There is one unique option for Query Source Properties:


The query to extract the Property. You can use mustache variables to represent the keys of other Properties in the system. You can use the id of any other Properties you created in Grouparoo.

Here's an example that sums the values in the price column for rows in which the user_id column's value matches the value of the Grouparoo Record's userId field (i.e. userId is the id for the Property in Grouparoo):

options: {
query: "SELECT SUM(price) from purchases where user_id = {{userId}}";

Create a ClickHouse Destination

ClickHouse is one of the few integrations that can be both a Source and a Destination. You can use Grouparoo to export data to a ClickHouse database. Like a Source, a Destination requires that you've already configured an App to describe the connection details to you ClickHouse database.

Once your App exists, you can generate the config file for the Destination:

$ grouparoo generate clickhouse:destination my_exports

This would generate a config file at config/destinations/my_export.jsin your Grouparoo project. Edit this file to match your desired configuration.

The ClickHouse Destination has a few unique options:


The name of the table which Grouparoo which use to export Records.


The primary key for the specified table.


Grouparoo also exports Group memberships. This is the table used to export Group data.


The foreign key that the groups table uses to reference the records table.


The column name used to store the Group name values.

ClickHouse Next Steps

Once you have the Plugin installed, App created, and a Source or Destination configured, you are ready to validate, apply, then import or export your data!