Databases

Last Updated: 2020-08-01

This document is a collection of the guidelines Grouparoo uses when dealing with Models and Databases internally.

Grouparoo Migration Guidelines

  • No sequential IDs. We use app-generated unique IDs for primary keys (UUIDs). We are working with large datasets and database that sometimes can't count.
  • Every table getsguid, createdAt and updatedAt, even if it seems silly now... we will need it eventually!
  • No enums, use VARCHAR(191). Migrating an enum is too hard and database-specific. We can use the ENUM type in the model for validations
  • No default values for strings. Define defaults in the model. Do mark columns as non-null.
  • No foreign key constraints in the database... do these checks in the model. But, it's probably a good idea to add a search index to these
  • All guids/uuids are 40 characters long - a real V4 UUID (36 characters) + 3 letter prefix + underscore, ie: app_168c4564-e389-4fbd-8338-db04d62022ba

Grouparoo Model Guidelines

  • No private methods. This breaks compatibility between Typescript-exported model definitions (to JS) and the TS files themselves. We require this compatibility as plugins both source and export to core... so a single runtime's require path may be core/dist -> plugin/src or even core/dist -> plugin/dist -> core/src
    • Instead, Pretend it's 1999 and prefix methods you want to be private with _, i.e. async _privateMethod()
  • New models need to be enumerated a few places:
    • Index.ts
    • SpecHelper.ts
    • modules/plugin.ts
  • In order to keep models smaller, we make the distinction between the Model itself and ModelOps
    • Table/Column definitions, Class methods, Setters, Getters, and Validations belong on the Model
    • Methods that mutate the Model (or more than one Model), or interact with a Plugin, belong in a ModelOp
    • It is appropriate for convenience to have a Model method call out to a ModelOp (ie: Profile.import => ProfileOpt.import(profile))

Querying

  • As we are working with larger volumes of data, we cannot assume that our querying in batches (via limit & offset) won't change out from under us. To that end, using offset is a code smell. We should instead be ordering by some non-changing column (guid, createdAt, etc) and storing the highest value as a highWaterMark for the next subsequent query.
-- bad, what if updated_at changed for some of the users as we query?
select * from users order by updated_at asc limit 100 offset 100

-- good
select * from users order by uuid asc limit 100;
-- next time
select * from users where uuid >= previousValue order by uuid asc limit 100;
  • Do not order by rand() or random(). It's very slow on large datasets. There are other alternatives possible, but may require more than one query.

Other Notes

  • If you are loading a model in an initializer in the @grouparoo/core project, you need to load it from Core's main export, not the Model file itself. ie: import {Profile} from '../index' not import {Profile} from '../models/Profile'.