Monitoring and Performance
Last Updated: 2022-01-24
To ensure that you have a robust Grouparoo deployment, we recommend that you set up monitoring along each of the debugging steps above so that you are alerted to problems as they start to happen. Here are some suggestions for monitoring the Grouparoo Application Tier:
- Monitor the `/api/v1/status/private` endpoint from within your firewall. Doing so will let you know that the application is running in general. This endpoint reports on the Resque queue length and the size of the backlog of jobs that need to be worked. A queue that is longer than normal can be an early indicator of a problem. This route requires authentication, so to monitor it, you'll need to create an APIKey with the appropriate permission and include it in the request, e.g. `curl "http://localhost:3000/api/v1/status/private?apiKey=abc123"`. Learn more in the Grouparoo REST API documentation.
- Monitor the `/api/v1/status/public` endpoint from outside of your firewall (or from the vantage point of your end users). Doing so will ensure that your end users can use the Grouparoo application.
- Monitor the Grouparoo application for errors. Use a tool like New Relic or Sentry to report and log errors to a central place, and check how often these errors happen. Be sure to check the Background Worker errors listed in the `/resque` interface, and retry or delete them accordingly. At this time, Grouparoo provides plugins for the Application Monitoring (APM) and Error Tracking services of New Relic (`@grouparoo/newrelic`) and Sentry (`@grouparoo/sentry`).
- Send metrics to a central place to monitor changing trends. The `@grouparoo/prometheus` plugin can send metrics to your Prometheus server to create dashboards and alert on anomalies.
- Monitor the hardware performance of the servers running the Grouparoo Application Tier. Is there enough RAM? Is the CPU constantly pegged to 100% usage? Your hosting provider should provide tools to monitor your servers. If you are deploying with a tool like Docker or Kubernetes, there are ways to aggregate this information for all instances of the container.
- Monitor the number of rows in your database. Are there more Records or fewer Exports than you expect? Also be sure to monitor your query performance. Grouparoo is a write-heavy application and requires a reasonably fast "disk" backing the Postgres database.
- Monitor the memory consumption of Redis. Redis is an in-memory database, and if it gets full, it will either refuse to accept new data or start deleting old data, depending on its eviction policy. Keeping Redis’ data volume small involves tuning the batch sizes of your Grouparoo application and ensuring that you have enough Application Background Workers to run all the tasks created, along with a data tier that can support them.
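Two of the checks above, queue backlog and Redis memory, can be sketched as small helper functions. This is a minimal illustration, not part of Grouparoo: the function names and the `factor` and `floor` thresholds are hypothetical, and because this page does not show the response format of the status endpoint, the queue check takes plain integers that you would extract yourself. The Redis helper assumes the text output of the Redis `INFO memory` command, which includes the `used_memory` and `maxmemory` fields.

```python
from typing import List, Optional
from urllib.parse import urlencode


def private_status_url(base_url: str, api_key: str) -> str:
    """Build the private status URL with the API key as a query parameter."""
    query = urlencode({"apiKey": api_key})
    return f"{base_url.rstrip('/')}/api/v1/status/private?{query}"


def queue_is_backing_up(recent_lengths: List[int], current_length: int,
                        factor: float = 2.0, floor: int = 100) -> bool:
    """Return True when the queue is well above its recent average.

    `factor` and `floor` are hypothetical thresholds; tune them for your workload.
    """
    if current_length < floor:
        return False  # Small queues are not worth an alert.
    if not recent_lengths:
        return False  # No baseline yet, so there is nothing to compare against.
    baseline = sum(recent_lengths) / len(recent_lengths)
    return current_length > factor * baseline


def redis_memory_ratio(info_text: str) -> Optional[float]:
    """Return used_memory / maxmemory from `INFO memory` output.

    Returns None when maxmemory is 0 (no limit set) or a field is missing.
    """
    fields = {}
    for line in info_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue  # Skip blank lines and section headers such as "# Memory".
        key, _, value = line.partition(":")
        fields[key] = value
    try:
        used = int(fields["used_memory"])
        limit = int(fields["maxmemory"])
    except (KeyError, ValueError):
        return None
    if limit == 0:
        return None
    return used / limit
```

A scheduled job could call these on an interval and raise an alert when the queue check returns `True` or the memory ratio approaches 1.0.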
Say you have 1 million Records to sync from 1 Source to 1 Destination. On average, it might take 0.1 seconds to import a Record and 0.1 seconds to export a Record. Assuming you aren't going to hit any rate limits, and you have `WORKERS=5`, importing and exporting all the Records will take ~11 hours for the first Run.
Subsequent Runs will be much faster. Grouparoo caches Record Properties and the high watermarks from Schedule Runs, so if nothing has changed, it will not need to re-import or re-export those Records until they do change.
    1,000,000 Records * (0.1s + 0.1s)
    --------------------------------- = ~11 hours
    60 s/min * 60 min/hr * 5 workers
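The first-Run estimate can also be expressed as a small function, which makes it easy to plug in your own numbers. This is a rough sketch: the function name is hypothetical, and it assumes that work divides evenly across workers and that no rate limits are hit.

```python
def estimate_sync_hours(records: int, import_seconds: float,
                        export_seconds: float, workers: int) -> float:
    """Estimate hours for a first Run, assuming work splits evenly across workers."""
    if workers <= 0:
        raise ValueError("workers must be a positive integer")
    total_seconds = records * (import_seconds + export_seconds)
    return total_seconds / (60 * 60 * workers)


# The example from the text: 1 million Records, 0.1s import, 0.1s export, 5 workers.
print(round(estimate_sync_hours(1_000_000, 0.1, 0.1, 5), 1))  # prints 11.1
```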
If your Grouparoo sync is going slower than the above example, here are suggestions to speed things up. These are listed in order of likelihood to help.
- Ensure that your Sources are adequately provisioned for repeated large reads. Check your logs, query times, and vendor credits.
- Ensure the Grouparoo database is fast enough.
- Slowly increase the Grouparoo Batch Sizes.
- Check if Grouparoo is hitting rate limits when exporting Records to your Destinations. Is there a larger plan available with more throughput?
- Gradually increase the number of `WORKERS` the Grouparoo application uses. Workers can be thought of as internal processing "threads", each working in parallel to import and export your data. Increasing the number of `WORKERS` may speed up Grouparoo, provided your Sources and Destinations can handle the increased volume and load. If not, increasing the number of `WORKERS` might actually slow things down. Keep to an upper limit of 20 `WORKERS` per server.
- If you have increased the number of `WORKERS` past 20, your Sources and Destinations are not under heavy load, and the Grouparoo Postgres database isn't slowing down, you can start to investigate horizontally scaling the Grouparoo application itself. The steps to do this will differ based on your hosting provider, but you will end up with more than one instance of the Grouparoo application running at once, sharing the same data tier. The Grouparoo application is prepared to be horizontally scaled with a shared task queue and workers.
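To get a feel for when horizontal scaling becomes relevant, the sketch below works backwards from a target duration to a worker count and a server count. It is a planning aid under strong assumptions: the function name is hypothetical, throughput is assumed to scale linearly with workers (which, as noted above, only holds while your Sources, Destinations, and database keep up), and the per-server cap is the limit of 20 mentioned in the list.

```python
import math
from typing import Tuple

MAX_WORKERS_PER_SERVER = 20  # upper limit per server suggested above


def plan_capacity(records: int, seconds_per_record: float,
                  target_hours: float) -> Tuple[int, int]:
    """Return (total_workers, servers) needed to finish within target_hours.

    Assumes linear scaling, so treat the result as a starting point only.
    """
    if target_hours <= 0:
        raise ValueError("target_hours must be positive")
    total_seconds = records * seconds_per_record
    total_workers = max(1, math.ceil(total_seconds / (target_hours * 3600)))
    servers = math.ceil(total_workers / MAX_WORKERS_PER_SERVER)
    return total_workers, servers


# 1 million Records at 0.2s each (import plus export), finished within 1 hour.
print(plan_capacity(1_000_000, 0.2, 1))  # prints (56, 3)
```

In practice, increase workers gradually and watch the monitors described earlier rather than jumping straight to the computed number.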