Data Flow

Last Updated: 2021-06-14

The Grouparoo Application's main job is to collect data from Sources, store that data to build a robust Profiles, build Groups of Profiles, and finally syndicate this data with Destinations, your other marketing tools.

This document is place to centrally describe how all of this works!

Core Data Models

Grouparoo Data Models

Profiles and Profile Properties

The main object in Grouparoo is a Profile. A profile by itself is simply an id and the time it was created and last updated. It is a "join table" to collect many Profile Properties, which act as a key-value store for the data about Profiles. For example, a common Profile Property would be key='email', value='evan@grouparoo.com', type='string'. We also store when each Profile Property was created and updated.

Groups

Groups are a collection of Profiles.

There are 2 types of Groups: "Manual" and "Calculated".

  • Manual Groups contain profiles that a Grouparoo user has added to the group. The membership of Manual Groups does not change.
  • Calculated Groups are constantly recalculated against Profile Properties. These dynamic lists of Profiles are how you build up cohorts that change over time. You may have a group of Abandoned Carts Last Week that checks a Profile Property of "last_abandoned_card_date" and makes sure it was within the last 7 days. You might also choose to segment your customers by the language they speak with a "Locale" Profile Property.

Profiles can be in many groups at once.

Apps

Apps are how you define connections to your tools like your databases, email vendors, CRMs, etc. With each App you tell Grouparoo how to connect so we can import and export your data.

Properties

Properties describe each Profile Property and how they work. It's a bit like a "Schema" for the Profiles:

  • What Profile Properties exist? You cannot just add any Profile Property to a Profile. You need to define it first.
  • Is this Profile Property unique? If it is, Grouparoo will make sure that no other Profile can have the same value for that Profile Property. Common examples would be email or user_id.
  • What is the type of this Profile Property? A number, a string? We actually stringify all of our Profile Property data for searching, but wen we render it back out again, we convert it. This also helps us know how to build search queries for Calculated Groups.
  • How is this Profile Property to be retrieved? By which App? We talk more about this below.

Every Profile Property also includes a reference to an app (perhaps "MySQL-Datawarehouse") and a statement/query/option of how to get it for each Profile. Grouparoo will always be keeping your data in sync so we always need to know how to retrieve a Profile's Profile Properties.

  • For an App "MySQL-Datawarehouse" which is linked to the Property for "First Name (string)" you might provide a SQL statement like select first_name from users where users.id={{ user_id }}.
  • You can also have more complex queries, like you could connect the App "MySQL-Datawarehouse" to the Property for "ltv (number)" which might do select (sum(purchases.total) - sum(refunds.total)) as ltv from users join purchases on purchases.user_id = users.id join refunds.user_id = users.id where users.id={{ user_id }}
  • For a an app CSV responsible for "VIP (boolean)" you might choose "column" from a dropdown indicating that any column labeled "VIP" should be allowed to set this Profile Property

Data into Grouparoo

Sources, Schedules, and Runs

When you add an App to Grouparoo, you can also configure if you want Grouparoo to periodically check that app for new data. This might be accomplished by checking a database table for new/updated rows or asking an API for recent changes. This is called a Schedule. You can create many Schedules from a single App, against each Source. An App can have many Sources, often correlating to a logical collection in the App, e.g. Tables in a Database.

Perhaps the App "MySQL-Datawarehouse" has 4 Schedules:

  1. "MySQL-Datawarehouse:users". This Schedule checks the "users" table every 10 minutes for new or updated records
  2. "MySQL-Datawarehouse:carts". This Schedule checks the "carts" table every hour for only new carts
  3. "MySQL-Datawarehouse:purchase". This Schedule checks the "purchase" table every hour for only new purchases.
  4. "MySQL-Datawarehouse:refunds". This Schedule checks the "refunds" table every hour for only new refunds

Each App works a little differently, but you will be asked what to check for and how often. Each instance of a periodic check is called a Run. When new data is found, the Run will produce Imports.

Imports

Imports are created when a Schedules' Run finds new or updated data. Either way, an Import tells Grouparoo that something has changed that we should take note of.

Most Imports trigger the update of a profile. An Import can be as simple as {user_id: 3, firstName: 'Evan'}. This tells us that we should update the Profile we have for user #3, and if we don't have a Profile for User #3, we should create one.

We know that the Import's data we get should be saved to the related profile if:

  1. The Import's payload contains a unique Profile Property as defined by the Properties
  2. The other data key-value pairs are also defined by the Properties.

Imports are stored in Grouparoo for later use. You can choose to delete old Imports after a period of time.

Profile Hydration

Any time an Import is recorded by Grouparoo, and that Import can be linked to a Profile, Grouparoo will fully update that profile, checking all the Apps that are connected to it via Profile Properties and asking for new data. You can see which Profile Properties should be updated by checking if they are in the pending state or not. In this way, you can be sure that your Profiles are always up to date, and we can recover quickly from any new or missed data. We call this step "Profile Hydration".

This means that if an Import comes in from the "MySQL-Datawarehouse:purchase" Import Schedules, Grouparoo will automatically check all Profile Properties, including "ltv", "first_name" and everything else.

Data out of Grouparoo

Destinations

When configuring your Destinations, you'll be asked which Groups and Profile Properties you want to send (or all of them). You may want to sync "email" and "first_name" to your email tool for everyone in the Group "lapsed customers" and "USA Customers" so you can send them a coupon as part of your re-engagement campaign. Oh, and the Group "lapsed customers" is Dynamic, based on LTV>0, and last_purchase at least 2 months ago, while "USA Customers" is Dynamic where locale=en-us... all of which Grouparoo is keeping up-to-date for you.

To accomplish this you, would create an Destination for the App "Mailchimp", toggle on the "lapsed-customers" and "USA Customers" groups, and finally the requested Profile Properties of "email" and "first_name".

When Grouparoo sends data via a Destination, Exports are created.

Exports

The Grouparoo App for Mailchimp will handle using the Mailchimp API to send the Export for you. Exports keep track of the exact data which was sent to the Destination. Grouparoo will try to send data in bulk when possible, handle retries, and skip any updates that won't result in any new information being sent to the Destination.

You can choose to delete old Exports after a period of time.

Plugins and the Grouparoo Marketplace

Plugins are Grouparoo's way of providing more Apps to Grouparoo, allowing you to import and export data from more and more sources and destinations.