The Grouparoo Blog


Defer Transaction Side-Effects in Node.js

Tagged in Engineering 
By Evan Tahler on 2021-01-21

Steps of a simple sync

At Grouparoo, we use Actionhero as our Node.js API server and Sequelize for our Object Relational Mapping (ORM) tool - making it easy to work with complex records from our database. Within our Actions and Tasks, we often want to treat the whole execution as a single database transaction - either all the modifications to the database will succeed or fail as a unit. This is really helpful when a single activity may create or modify many database rows.

Why do we need Transactions?

Take the following example from a prototypical blogging site. When a user is created (POST /api/v1/user), we also create their first post and send them a welcome email. All examples in this post are written in Typescript, but the concepts work the same for Javascript.

import { action } from "actionhero";
import { User, Post } from "../models";

export class UserCreate extends Action {
  constructor() {
    super();
    this.name = "user:create";
    this.description = "create a user and their first post";
    this.inputs = {
      firstName: { required: true },
      lastName: { required: true },
      password: { required: true },
      email: { required: true },
    };
  }

  async run({ params }) {
    const user = await User.create(params);
    await user.updatePassword(params.password);
    await user.sendWelcomeEmail();

    const post = await Post.create({
      userId: user.id,
      title: "My First Post",
      published: false,
    });

    return { userId: user.id, postId: post.id };
  }
}

In this example, we:

  1. Create the user record
  2. Update the user’s password
  3. Send the welcome email
  4. Create the first post for the new user
  5. Return the IDs of the new records created

This works as long as nothing fails mid-action. What if we couldn’t update the user’s password? The new user record would still be in our database, and we would need a try/catch to clean up the data. If not, when the user tries to sign up again, they would have trouble as there would already be a record in the database for their email address.

To solve this cleanup problem, you could use transactions. Using Sequelize’s Managed Transactions, the run method of the Action could be:

async run({ params }) {
  return sequelize.transaction(async (t) => {
    const user = await User.create(params, {transaction: t});
    await user.updatePassword(params.password, {transaction: t} );
    await user.sendWelcomeEmail();

    const post = await Post.create({
      userId: user.id,
      title: 'My First Post',
      published: false,
    }, {transaction: t})

    return { userId: user.id, postId: post.id };
  })
}

Managed Transactions in Sequelize are very helpful - you don’t need to worry about rolling back the transaction if something goes wrong! If there’s an error throw-n, it will rollback the whole transaction automatically.

While this is safer than the first attempt, there are still some problems:

  1. We have to remember to pass the transaction object to every Sequelize call
  2. We need to ensure that every method we call which could read or write to the database needs to use the transaction as well, and take it as an argument (like user.updatePassword()... that probably needs to write to the database, right?)
  3. Sending the welcome email is not transaction safe.

Sending the email as-written will happen even if we roll back the transaction because of an error when creating the new post… which isn’t great if the user record wasn’t committed! So what do we do?

Automatically Pass Transactions to all Queries: CLS-Hooked

The solution to our problem comes from a wonderful package called cls-hooked. Using the magic of AsyncHooks, this package can tell when certain code is within a callback chain or promise. In this way, you can say: "for all methods invoked within this async function, I want to keep this variable in scope". This is pretty wild! If you opt into using Sequelize with CLS-Hooked, every SQL statement will check to see if there is already a transaction in scope... You don't need to manually supply it as an argument!

From the cls-hooked readme:

CLS: "Continuation-Local Storage"

Continuation-local storage works like thread-local storage in threaded programming, but is based on chains of Node-style callbacks instead of threads.

There is a performance penalty for using cls-hooked, but in our testing, it isn’t meaningful when compared to await-ing SQL results from a remote database.

Using cls-hooked, our Action's run method now can look like this:

// Elsewhere in the Project

const cls = require('cls-hooked');
const namespace = cls.createNamespace('actionhero')
const Sequelize = require('sequelize');
Sequelize.useCLS(namespace);
new Sequelize(....);

// Our Run Method

async run({ params }) {
  return sequelize.transaction(async (t) => {
    const user = await User.create(params);
    await user.updatePassword(params.password);
    await user.sendWelcomeEmail();

    const post = await Post.create({
      userId: user.id,
      title: 'My First Post',
      published: false,
    })

    return { userId: user.id, postId: post.id };
  })
}

Ok! We have removed the need to pass transaction to all queries and sub-methods! All that remains now is the user.sendWelcomeEmail() side-effect. How can we delay this method until the end of the transaction?

CLS and Deferred Execution

Looking deeper into how cls-hooked works, we can see that it is possible to tell if you are currently in a namespace, and to set and get values from the namespace. Think of this like a session... but for the callback or promise your code is within! With this in mind, we can write our run method to be transaction-aware. This means that we can use a pattern that knows to run a function in-line if we aren’t within a transaction, but if we are, defer it until the end. We’ve wrapped utilities to do this within Grouparoo’s CLS module.

With the CLS module you can write code like this:

// from the Grouparoo Test Suite: Within Transaction

test("in a transaction, deferred jobs will be run afterwords", async () => {
  const results = [];
  const runner = async () => {
    await CLS.afterCommit(() => results.push("side-effect-1"));
    await CLS.afterCommit(() => results.push("side-effect-2"));
    results.push("in-line");
  };

  await CLS.wrap(() => runner());
  expect(results).toEqual(["in-line", "side-effect-1", "side-effect-2"]);
});

You can see here that once you CLS.wrap() an async function, you can defer the execution of anything wrapped with CLS.afterCommit() until the transaction is complete. The order of the afterCommit side-effects is deterministic, and in-line happens first.

You can also take the same code and choose not apply CLS.wrap() to it to see that it still works, but the order of the side-effects has changed:

// from the Grouparoo Test Suite: Without Transaction

test("without a transaction, deferred jobs will be run in-line", async () => {
  const results = [];
  const runner = async () => {
    await CLS.afterCommit(() => results.push("side-effect-1"));
    await CLS.afterCommit(() => results.push("side-effect-2"));
    results.push("in-line");
  };

  await runner();
  expect(results).toEqual(["side-effect-1", "side-effect-2", "in-line"]);
});

CLSAction and CLSTask

Now that it is possible to take arbitrary functions and delay their execution until the transaction is complete, we can use these techniques to make a new type of Action and Task that has this functionality built in. We call these CLSAction and CLSTask. These new classes extend the regular Actionhero Action and Task classes, but provide a new runWithinTransaction method to replace run, which helpfully already uses CLS.wrap(). This makes it very easy for us to opt-into an Action in which automatically runs within a Sequelize transaction, and can defer it's own side-effects!

Putting everything together, our new transaction-safe Action looks like this:

// *** Define the CLSAction Class ***

import { Action } from "actionhero";
import { CLS } from "../modules/cls";

export abstract class CLSAction extends Action {
  constructor() {
    super();
  }

  async run(data) {
    return CLS.wrap(async () => await this.runWithinTransaction(data));
  }

  abstract runWithinTransaction(data): Promise<any>;
}
// *** Use the CLSAction Class ***

import { CLSAction } from "../classes";
import { User, Post } from "../models";

export class UserCreate extends CLSAction {
  constructor() {
    super();
    this.name = "user:create";
    this.description = "create a user and their first post";
    this.inputs = {
      firstName: { required: true },
      lastName: { required: true },
      password: { required: true },
      email: { required: true },
    };
  }

  async runWithinTransaction({ params }) {
    const user = await User.create(params);
    await user.updatePassword(params.password);
    await CLS.afterCommit(user.sendWelcomeEmail);

    const post = await Post.create({
      userId: user.id,
      title: "My First Post",
      published: false,
    });

    return { userId: user.id, postId: post.id };
  }
}

If the transaction fails, the email won’t be sent, and all the models will rolled back. There won't be any mess to clean up 🧹!

Summary

The cls-hooked module is a very powerful tool. If applied globally, it unlocks the ability to produce side-effects throughout your application worry-free. Perhaps your models need to enqueue a Task every time they are created... now you can if you cls.wrap() it! You can be sure that task won’t be enqueued unless the model was really saved and committed. This unlocks powerful tools that you can use with confidence.



Stay up to date

We will let you know about product updates and new content.




Get Started with Grouparoo

Start syncing your data with Grouparoo Cloud

Start Free Trial

Or download and try our open source Community edition.

Stay up to date

We will let you know about product updates and new content.