Monitoring and Performance

Last Updated: 2022-01-24

Monitoring

To ensure that you have a robust Grouparoo deployment, we recommend setting up monitoring at each of the debugging steps above so that you can be alerted to problems as they start to happen. Here are some suggestions for monitoring the Grouparoo Application Tier:

Monitor the Application

  • Monitor the /api/v1/status/private endpoint from within your firewall. Doing so will let you know that the application is running in general. This endpoint reports on the resque queue length and the size of the backlog of jobs that need to be worked. The queue getting longer than normal can be an early indicator of a problem. This route requires authentication, so to monitor it, you'll need to create an APIKey with the system:read scope, e.g.: curl "http://localhost:3000/api/v1/status/private?apiKey=abc123". A sketch of an automated check against this endpoint appears after this list. Learn more about the Grouparoo REST API here.
  • Monitor the /api/v1/status/public endpoint from outside of your firewall (or from the vantage point of your end users). Doing so will ensure that your end users can use the Grouparoo application.
  • Monitor the Grouparoo application for errors. Use a tool like New Relic or Sentry to report and log errors to a central place. Check how often these errors happen. Be sure to check the Background Worker errors listed in the /resque interface, and retry or delete them accordingly. At this time, Grouparoo provides plugins for the New Relic (@grouparoo/newrelic) and Sentry (@grouparoo/sentry) Application Performance Monitoring (APM) and Error Tracking services.
  • Send metrics to a central place to monitor changing trends. The @grouparoo/prometheus plugin can send metrics to your Prometheus server to create dashboards and alert on anomalies.
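As referenced in the first item above, here is a minimal sketch of what an automated check against the private status endpoint might look like, written in TypeScript for Node 18+ (built-in fetch). The API key value, the alert threshold, and the shape of the JSON response (in particular, the field holding the resque backlog count) are assumptions; verify them against what your /api/v1/status/private endpoint actually returns.

// check-status.ts: a sketch of a private status probe; the response field names are assumptions.
const STATUS_URL = "http://localhost:3000/api/v1/status/private";
const API_KEY = process.env.GROUPAROO_API_KEY ?? "abc123"; // an APIKey with the system:read scope
const MAX_PENDING_JOBS = 1000; // illustrative alert threshold

async function checkStatus(): Promise<void> {
  const response = await fetch(`${STATUS_URL}?apiKey=${API_KEY}`);
  if (!response.ok) {
    // The app answered, but the status route failed: treat this as an alert condition.
    throw new Error(`status endpoint returned HTTP ${response.status}`);
  }

  const body: any = await response.json();
  // Hypothetical field path: adjust to wherever your payload reports the resque backlog.
  const pendingJobs: number | undefined = body?.metrics?.resque?.pendingJobs;

  if (pendingJobs !== undefined && pendingJobs > MAX_PENDING_JOBS) {
    console.error(`ALERT: resque backlog is ${pendingJobs} jobs (> ${MAX_PENDING_JOBS})`);
    process.exitCode = 1;
  } else {
    console.log(`OK: resque backlog is ${pendingJobs ?? "unknown"} jobs`);
  }
}

checkStatus().catch((error) => {
  console.error(`ALERT: could not reach the private status endpoint: ${error}`);
  process.exitCode = 1;
});

Running a check like this every minute from cron or from your monitoring system gives you an early warning when the backlog starts to grow.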

Monitor the Application and Data Tiers

  • Monitor the hardware performance of the servers running the Grouparoo Application Tier. Is there enough RAM? Is the CPU constantly pegged at 100% usage? Your hosting provider should provide tools to monitor your servers. If you are deploying with a tool like Docker or Kubernetes, there are ways to aggregate this information across all instances of the container.
  • Monitor the number of rows in your database. Are there more (or fewer) Records and Exports than you expect? Also be sure to monitor your query performance. Grouparoo is a write-heavy application and requires a reasonably fast "disk" backing the Postgres database.
  • Monitor the memory consumption of Redis. Redis is an in-memory database, and if it gets full, it will either refuse to accept new data or start deleting old data. Keeping Redis’ data volume small involves tuning the batch sizes of your Grouparoo application and ensuring you have enough Application Background Workers (and a data tier that can support them) to work through all the tasks created. A sketch of a Redis memory check follows this list.
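To make the Redis point above concrete, here is a minimal sketch of a memory probe using the ioredis client; the REDIS_URL default and the 512 MB alert threshold are illustrative values, not recommendations.

// redis-memory-check.ts: a sketch of a Redis memory probe using ioredis.
import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL ?? "redis://localhost:6379");
const MAX_USED_BYTES = 512 * 1024 * 1024; // illustrative 512 MB ceiling

async function checkRedisMemory(): Promise<void> {
  // INFO MEMORY returns a block of key:value lines, including used_memory in bytes.
  const info = await redis.info("memory");
  const match = info.match(/used_memory:(\d+)/);
  const usedBytes = match ? parseInt(match[1], 10) : NaN;

  if (!Number.isFinite(usedBytes)) {
    throw new Error("could not parse used_memory from INFO MEMORY");
  }

  if (usedBytes > MAX_USED_BYTES) {
    console.error(`ALERT: Redis is using ${usedBytes} bytes (> ${MAX_USED_BYTES})`);
    process.exitCode = 1;
  } else {
    console.log(`OK: Redis is using ${usedBytes} bytes`);
  }
}

checkRedisMemory()
  .catch((error) => {
    console.error(`ALERT: Redis check failed: ${error}`);
    process.exitCode = 1;
  })
  .finally(() => {
    redis.disconnect();
  });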

Grouparoo Performance

Say you have 1 million Records to sync from 1 Source to 1 Destination. On average, it might take 0.1 seconds to import a Record and 0.1 seconds to export a Record. Assuming you aren't going to hit any rate limits, and you have WORKERS=5, importing and exporting all the Records will take ~11 hours for the first Run.

Subsequent Runs will be much faster. Grouparoo caches Record Properties and the high watermarks from Schedule Runs, so if nothing has changed, it will not need to re-import or re-export those Records until they do change.

1,000,000 Records * (0.1s + 0.1s)
--------------------------------- = ~11 hours
60 s/min * 60 min/hr * 5 workers
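If you want to plug your own numbers into this estimate, the helper below sketches the same arithmetic; the Record count, per-Record timings, and worker count are whatever you expect in your own deployment.

// sync-estimate.ts: total time is records * (import + export seconds), spread across parallel workers.
function estimateFirstRunHours(
  records: number,
  importSeconds: number,
  exportSeconds: number,
  workers: number
): number {
  const totalSeconds = (records * (importSeconds + exportSeconds)) / workers;
  return totalSeconds / (60 * 60);
}

// The example above: 1,000,000 Records, 0.1s import, 0.1s export, WORKERS=5.
console.log(estimateFirstRunHours(1_000_000, 0.1, 0.1, 5).toFixed(1), "hours"); // ~11.1 hours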

If your Grouparoo sync is going slower than the above example, here are suggestions to speed things up. These are listed in order of likelihood to help.

  1. Ensure that your Sources are adequately provisioned for repeated large reads. Check your logs, query times, and vendor credits.
  2. Ensure the Grouparoo database is fast enough.
  3. Slowly increase the Grouparoo Batch Sizes.
  4. Check if Grouparoo is hitting rate limits when exporting Records to your Destinations. Is there a larger plan available with more throughput?
  5. Gradually increase the number of WORKERS "threads" the Grouparoo application uses. The number of workers can be thought of as internal processing "threads", each working in parallel to import and export your data. Increasing the number of WORKERS may speed up Grouparoo, provided your Sources and Destinations can handle the increased volume and load. If not, increasing the number of WORKERS might actually slow things down. Keep to an upper limit of 20 WORKERS per server.
  6. If you have increased the number of WORKERS past 20, and your Sources and Destinations are not under heavy load, and the Grouparoo Postgres database isn't slowing down, you can start to investigate horizontally scaling the Grouparoo application itself. The steps to do this will differ based on your hosting provider, but you will end up with more than one instance of the Grouparoo application running at once, sharing the same data tier. The Grouparoo application is designed to be horizontally scaled, with a shared task queue and workers. A sketch of such a deployment follows this list.
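As referenced in the last item, below is a minimal configuration sketch of what a horizontally scaled deployment could look like with Docker Compose: two identical application instances pointed at the same Postgres and Redis data tier. The image name, credentials, and environment values are illustrative assumptions; align the variable names and values with your own Grouparoo configuration and however you actually build the application.

# docker-compose.yml: an illustrative two-instance Grouparoo deployment sharing one data tier.
version: "3"
services:
  grouparoo-1:
    image: your-registry/grouparoo-app:latest # illustrative image name
    environment:
      PORT: "3000"
      WORKERS: "10"
      DATABASE_URL: "postgresql://grouparoo:password@postgres:5432/grouparoo"
      REDIS_URL: "redis://redis:6379/0"
  grouparoo-2:
    image: your-registry/grouparoo-app:latest
    environment:
      PORT: "3000"
      WORKERS: "10"
      DATABASE_URL: "postgresql://grouparoo:password@postgres:5432/grouparoo"
      REDIS_URL: "redis://redis:6379/0"
  postgres:
    image: postgres:13
    environment:
      POSTGRES_USER: grouparoo
      POSTGRES_PASSWORD: password
      POSTGRES_DB: grouparoo
  redis:
    image: redis:6

Because both application containers point at the same Redis task queue, the Background Workers in each instance cooperate on the same backlog of imports and exports.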