Designing the perfect Typescript schema validation library

March 8th, 2020


Colin McDonnell

@vriad

There are a handful of wildly popular validation libraries in the Javascript ecosystem with thousands of stars on GitHub.

When I started my quest to find the ideal schema declaration/validation library, I assumed the hardest part would be identifying the Most Great option from a sea of excellent ones.

But as I dug deeper into the many beloved tools in the ecosystem, I was surprised that none of them could provide the features and developer experience I was looking for.

tl;dr I made a new Typescript validation library that has static type inference and the best DX this side of the Mississippi. To jump straight to the README, head over to https://github.com/vriad/zod

A pinch of context

I'm building an API for a healthcare application with Typescript. Naturally I want this mission-critical medical software to be rock-solid, so my goal is to build a fully end-to-end typesafe data layer.

All data that passes from the client to the server AND server to client should be validated at runtime. Moreover, both the client and server should have static type definitions of all payloads, so I can catch more errors at compile-time.

I also don't hate myself, so I don't want to keep my static types and runtime type validators in sync by hand as my data model changes. Which means we need a tool that supports 🚀StAtIc TyPe InFeReNcE!!🚀 (vuvuzela sounds)

Let's look at our options!

The existing options

Joi (14.3k stars)

Doesn't support static type inference. Boo.

Yup (8.6k stars)

Yup is jquense's popular, Joi-inspired validation library that was implemented first in vanilla JS, with Typescript typings added later.

Yup supports static type inference! But with a small caveat: the typings are wrong.

For instance, the yup package treats all object properties as optional by default.

const schema = yup.object({
  asdf: yup.string(),
});
schema.validate({}); // passes

Yet the inferred type indicates that all properties are required. 😕

type SchemaType = yup.InferType<typeof schema>;

// returns { asdf: string }
// should be { asdf?: string }

Yup also mis-infers the type of "required" arrays.

const numList = yup.array().of(yup.string()).required();

// .required() is used to indicate a non-empty list
numList.validateSync([]); // fails

// yet the inferred type doesn't reflect this
type NumList = yup.InferType<typeof numList>;
// returns string[]
// should be [string,...string[]]
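
For reference, the tuple type [string, ...string[]] is how Typescript spells "at least one element". Here is a minimal sketch in plain Typescript (no Yup involved; NonEmpty, isNonEmpty, and parseNonEmpty are hypothetical names I made up for illustration) of a parser whose return type agrees with its runtime check:

```typescript
// Hypothetical helper type: an array with at least one element.
type NonEmpty<T> = [T, ...T[]];

// Type guard: narrows T[] to NonEmpty<T> when the runtime check passes.
function isNonEmpty<T>(items: T[]): items is NonEmpty<T> {
  return items.length > 0;
}

// Returns the input typed as non-empty, or throws.
function parseNonEmpty<T>(items: T[]): NonEmpty<T> {
  if (!isNonEmpty(items)) {
    throw new Error("Array cannot be empty");
  }
  return items;
}

const tags = parseNonEmpty(["a", "b"]);
const firstTag: string = tags[0]; // index 0 is known to exist at the type level
```

The point is only that the static type and the runtime behavior say the same thing, which is exactly what the Yup typings above fail to do.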

Finally, Yup doesn't explicitly support generic union and intersection types. This is a recurring source of frustration for the Yup community.

These may sound like nitpicks. But it's very uncool for the inferred type not to reflect the actual behavior of the validator it came from.

Moreover, there are no plans to fix these issues in Yup's type inference because those changes would not be backwards compatible. See more about this here.

io-ts (2.7k stars)

Finally, a library that was designed for Typescript from the ground up!

Look, io-ts is an excellent library. Its creator, gcanti, has done more than anyone to bring proper higher-order functional programming to Typescript with his fp-ts library.

But in my situation, and I think many others', io-ts prioritizes functional programming purity at the expense of developer experience. Functional purity is a valid and admirable design goal, but it makes io-ts particularly hard to integrate into an existing codebase with a more procedural or object-oriented bias. It's difficult to progressively integrate io-ts without entirely refactoring your code with a more functional flavor.

For instance, consider how to define an object schema with optional properties in io-ts:

const A = t.type({
  foo: t.string,
});

const B = t.partial({
  bar: t.number,
});

const C = t.intersection([A, B]);

type C = t.TypeOf<typeof C>;
/*
{
  foo: string;
  bar?: number | undefined
}
*/

You must define the required props with a call to t.type({...}), define your optional props with a call to t.partial({...}), and merge them with t.intersection().

Spoiler alert: Here's the equivalent in my new library:

const C = z.object({
  foo: z.string(),
  bar: z.number().optional(),
});

type C = z.TypeOf<typeof C>;
/* {
  foo: string;
  bar?: number | undefined
} */

io-ts also requires the use of gcanti's functional programming library fp-ts to parse results and handle errors. From the io-ts docs, here is the most basic example of how to run a validation:

import * as t from 'io-ts';
import { pipe } from 'fp-ts/lib/pipeable';
import { fold } from 'fp-ts/lib/Either';

// failure handler
const onLeft = (errors: t.Errors): string => `${errors.length} error(s) found`;

// success handler
const onRight = (s: string) => `No errors: ${s}`;

pipe(t.string.decode('a string'), fold(onLeft, onRight));
// => "No errors: a string"

The functional approach here is alienating for developers who aren't accustomed to it.

Again: fp-ts is a fantastic resource for developers looking to keep their codebase strictly functional. But depending on fp-ts necessarily comes with a lot of intellectual overhead; a developer has to be familiar with functional programming concepts, fp-ts's nomenclature, and the Either monad to do a simple schema validation. It's alienating for devs who don't have a functional background, and it drives them to libraries like Yup, which returns incorrect types.
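
If you haven't met Either before: it's a tagged union of a failure branch ("left") and a success branch ("right"), and fold collapses both branches into one result. The sketch below is a hand-rolled stand-in written for this post, not fp-ts's real types or implementation; MiniEither, decodeString, and foldEither are hypothetical names.

```typescript
// A hand-rolled stand-in for the Either idea (NOT fp-ts's real types).
type MiniEither<E, A> =
  | { tag: "left"; error: E } // failure branch
  | { tag: "right"; value: A }; // success branch

// Hypothetical decoder: succeeds only for strings.
function decodeString(input: unknown): MiniEither<string[], string> {
  return typeof input === "string"
    ? { tag: "right", value: input }
    : { tag: "left", error: ["expected string, got " + typeof input] };
}

// fold: handle both branches and collapse them into a single result.
function foldEither<E, A, R>(
  result: MiniEither<E, A>,
  onLeft: (error: E) => R,
  onRight: (value: A) => R
): R {
  return result.tag === "left" ? onLeft(result.error) : onRight(result.value);
}

const okMessage = foldEither(
  decodeString("a string"),
  (errors) => `${errors.length} error(s) found`,
  (s) => `No errors: ${s}`
);
// okMessage === "No errors: a string"
```

Nothing here is hard once you've seen it, but it is a fair amount of vocabulary to absorb before you can check that a value is a string.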

Introducing Zod

So, in the tradition of many a nitpicky programmer, I decided to build my own library from scratch. What could go wrong?

The final result: https://github.com/vriad/zod.

Zod is a validation library designed for optimal developer experience. It's a Typescript-first schema declaration library with rigorous (and correct!) inferred types, incredible developer experience, and a few killer features missing from the existing libraries.

  • Uses Typescript generic inference to statically infer the types of your schemas
  • Eliminates the need to keep static types and runtime validators in sync by hand
  • Has a composable, declarative API that makes it easy to define complex types concisely
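
The first two bullets come down to one idea: a schema object carries its output type as a generic parameter, and a conditional type pulls that parameter back out. Below is a rough sketch of that idea in plain Typescript. It is not Zod's source code; MiniSchema, MiniTypeOf, miniString, miniNumber, and miniObject are hypothetical names invented for illustration.

```typescript
// A schema is a parser that remembers its output type T.
interface MiniSchema<T> {
  parse(input: unknown): T;
}

// Extract T from a schema, in the spirit of z.TypeOf<>.
type MiniTypeOf<S> = S extends MiniSchema<infer T> ? T : never;

const miniString: MiniSchema<string> = {
  parse(input) {
    if (typeof input !== "string") throw new Error("Non-string type: " + typeof input);
    return input;
  },
};

const miniNumber: MiniSchema<number> = {
  parse(input) {
    if (typeof input !== "number") throw new Error("Non-number type: " + typeof input);
    return input;
  },
};

// The object schema's output type is computed from the schemas of its fields.
function miniObject<Shape extends Record<string, MiniSchema<unknown>>>(
  shape: Shape
): MiniSchema<{ [K in keyof Shape]: MiniTypeOf<Shape[K]> }> {
  const fields: Record<string, MiniSchema<unknown>> = shape;
  return {
    parse(input) {
      if (typeof input !== "object" || input === null) {
        throw new Error("Non-object type");
      }
      const record = input as Record<string, unknown>;
      const output: Record<string, unknown> = {};
      for (const key of Object.keys(fields)) {
        output[key] = fields[key].parse(record[key]); // every field is required
      }
      return output as { [K in keyof Shape]: MiniTypeOf<Shape[K]> };
    },
  };
}

const personSchema = miniObject({ fullName: miniString, age: miniNumber });
type Person = MiniTypeOf<typeof personSchema>; // { fullName: string; age: number }

const person: Person = personSchema.parse({ fullName: "Ada", age: 36 });
```

The Person type is never written by hand; it falls out of the schema definition, so the two can't drift apart.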

Zod was also built around a few core principles intended to make all declarations as non-magical and developer-friendly as possible:

  • Fields are required unless explicitly marked as optional (just like Typescript!)
  • Schemas are immutable; methods (e.g. .optional()) return a new instance.
  • Zod schemas operate on a "Parse, don't validate!" basis!
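
To make the last two principles concrete, here is a small sketch. TinySchema is a hypothetical class written for this post, not Zod's implementation; it only illustrates what "return a new instance" and "parse, don't validate" mean in practice.

```typescript
// A tiny schema class (hypothetical, not Zod's source) to illustrate two principles.
class TinySchema<T> {
  private readonly check: (input: unknown) => input is T;

  constructor(check: (input: unknown) => input is T) {
    this.check = check;
  }

  // "Parse, don't validate": hand back the typed value instead of a boolean.
  parse(input: unknown): T {
    if (!this.check(input)) throw new Error("Invalid input");
    return input;
  }

  // Immutability: build and return a new schema; `this` is left untouched.
  optional(): TinySchema<T | undefined> {
    const check = this.check;
    return new TinySchema<T | undefined>(
      (input: unknown): input is T | undefined => input === undefined || check(input)
    );
  }
}

const requiredText = new TinySchema<string>(
  (input: unknown): input is string => typeof input === "string"
);
const optionalText = requiredText.optional();

const parsedText: string = requiredText.parse("fish"); // typed as string, not unknown
```

After calling .optional(), requiredText still rejects undefined; only the new optionalText schema accepts it.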

To jump straight to the README, head over to https://github.com/vriad/zod. If you're feeling frisky, leave a star 🌟👍

Primitives

import * as z from 'zod';

const stringSchema = z.string(); // => ZodType<string>
const numberSchema = z.number(); // => ZodType<number>
const booleanSchema = z.boolean(); // => ZodType<boolean>
const undefinedSchema = z.undefined(); // => ZodType<undefined>
const nullTypeSchema = z.null(); // => ZodType<null>

Parsing

// every ZodType instance has a .parse() method
const stringSchema = z.string();
stringSchema.parse('fish'); // => "fish"
stringSchema.parse(12); // throws Error('Non-string type: number');

Type inference

Like in io-ts, you can extract the Typescript type of any schema with z.TypeOf<>.

const A = z.string();
type A = z.TypeOf<typeof A>; // string

const u: A = 12; // TypeError
const v: A = 'asdf'; // compiles

We'll include examples of inferred types throughout the rest of the documentation.

Objects

// all properties are required by default
const dogSchema = z.object({
  name: z.string(),
  age: z.number(),
  neutered: z.boolean(),
});

type Dog = z.TypeOf<typeof dogSchema>;
/* equivalent to:
type Dog = {
  name: string;
  age: number;
  neutered: boolean;
} 
*/

const cujo = dogSchema.parse({
  name: 'Cujo',
  age: 4,
  neutered: true,
}); // passes, returns Dog

const fido: Dog = {
  name: 'Fido',
  age: 2,
}; // TypeError: missing required property 'neutered'

Arrays

const dogsList = z.array(dogSchema);

dogsList.parse([{ name: 'Cujo', age: 3, neutered: true }]); // passes
dogsList.parse([]); // passes

Plus you can explicitly define a non-empty array schema, something io-ts doesn't support.

// Non-empty lists

const nonEmptyDogsList = z.array(dogSchema).nonempty();
nonEmptyDogsList.parse([]); // throws Error("Array cannot be empty")

Unions (including nullable and optional types)

Zod includes a built-in z.union method for composing "OR" types.

const stringOrNumber = z.union([z.string(), z.number()]);

stringOrNumber.parse('foo'); // passes
stringOrNumber.parse(14); // passes

Unions are the basis for defining nullable and optional values.

/* Optional Types */
// "optional string" === the union of string and undefined
const A = z.union([z.string(), z.undefined()]);
A.parse(undefined); // => passes, returns undefined
type A = z.TypeOf<typeof A>; // string | undefined

There is also a shorthand way to make a schema "optional":

const B = z.string().optional(); // equivalent to A

const C = z.object({
  username: z.string().optional(),
});
type C = z.TypeOf<typeof C>; // { username?: string | undefined };

/* Nullable Types */
const D = z.union([z.string(), z.null()]);

const E = z.string().nullable(); // equivalent to D
type E = z.TypeOf<typeof E>; // string | null

You can create unions of any two schemas.

/* Custom Union Types */
const F = z.union([z.string(), z.number()]).optional().nullable();
F.parse('tuna'); // => tuna
F.parse(42); // => 42
F.parse(undefined); // => undefined
F.parse(null); // => null
F.parse({}); // => throws Error!

type F = z.TypeOf<typeof F>; // string | number | undefined | null;
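
One plausible way to think about what a union does at runtime: try the first schema, and if it rejects the input, try the next one. The sketch below uses plain functions rather than Zod; ParseFn, unionOf, and optionalOf are hypothetical names, and Zod's actual implementation may well differ.

```typescript
// A parser returns the typed value or throws (a convention assumed for this sketch).
type ParseFn<T> = (input: unknown) => T;

const asString: ParseFn<string> = (input) => {
  if (typeof input !== "string") throw new Error("Non-string type: " + typeof input);
  return input;
};

const asNumber: ParseFn<number> = (input) => {
  if (typeof input !== "number") throw new Error("Non-number type: " + typeof input);
  return input;
};

const asUndefined: ParseFn<undefined> = (input) => {
  if (input !== undefined) throw new Error("Expected undefined");
  return undefined;
};

// Try the first parser; if it throws, fall back to the second.
function unionOf<A, B>(first: ParseFn<A>, second: ParseFn<B>): ParseFn<A | B> {
  return (input) => {
    try {
      return first(input);
    } catch (firstFailure) {
      return second(input); // if this throws too, the whole union fails
    }
  };
}

// "Optional" is just a union with undefined.
function optionalOf<T>(inner: ParseFn<T>): ParseFn<T | undefined> {
  return unionOf(inner, asUndefined);
}

const stringOrNumberFn = unionOf(asString, asNumber); // ParseFn<string | number>
const maybeString = optionalOf(asString); // ParseFn<string | undefined>
```

Notice that the return type A | B mirrors the runtime behavior: whichever branch accepts the input decides the value you get back.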

Intersections

Intersections are useful for creating "logical AND" types.

const a = z.union([z.number(), z.string()]);
const b = z.union([z.number(), z.boolean()]);

const c = z.intersection(a, b);
type c = z.TypeOf<typeof c>; // => number

const neverType = z.intersection(z.string(), z.number());
type Never = z.TypeOf<typeof neverType>; // => never

This is particularly useful for defining "schema mixins" that you can apply to multiple schemas.

const HasId = z.object({
  id: z.string(),
});

const BaseTeacher = z.object({
  name: z.string(),
});

const Teacher = z.intersection(BaseTeacher, HasId);

type Teacher = z.TypeOf<typeof Teacher>;
// { id: string; name: string };
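
Here is a rough sketch of the mixin idea outside of Zod: an intersection of two object parsers runs both against the same input and combines what they return. ParseFn, readField, and intersectionOf are hypothetical helpers made up for this example, and the sketch only handles object schemas.

```typescript
type ParseFn<T> = (input: unknown) => T;

// Read one property from an unknown value, insisting that the value is an object.
function readField(input: unknown, key: string): unknown {
  if (typeof input !== "object" || input === null) {
    throw new Error("Non-object type");
  }
  return (input as Record<string, unknown>)[key];
}

const parseHasId: ParseFn<{ id: string }> = (input) => {
  const id = readField(input, "id");
  if (typeof id !== "string") throw new Error("id must be a string");
  return { id };
};

const parseBaseTeacher: ParseFn<{ name: string }> = (input) => {
  const teacherName = readField(input, "name");
  if (typeof teacherName !== "string") throw new Error("name must be a string");
  return { name: teacherName };
};

// Intersection of two object parsers: both must accept the same input.
function intersectionOf<A extends object, B extends object>(
  left: ParseFn<A>,
  right: ParseFn<B>
): ParseFn<A & B> {
  return (input) => ({ ...left(input), ...right(input) });
}

const parseTeacher = intersectionOf(parseBaseTeacher, parseHasId);
const teacher = parseTeacher({ id: "t1", name: "Ms. Frizzle" });
// teacher has type { name: string } & { id: string }
```

If either side rejects the input, the whole intersection throws, which is the "logical AND" behavior described above.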

Object merging

In the examples above, the return value of z.intersection is an instance of ZodIntersection, a generic class that wraps the two schemas passed in as arguments.

But if you're trying to combine two object schemas, there is a shorthand:

const Teacher = BaseTeacher.merge(HasId);

The benefit of using this shorthand is that the returned value is a new object schema (ZodObject), instead of a generic ZodIntersection instance. This way, you're able to fluently chain together many .merge calls:

// chaining mixins
const Teacher = BaseTeacher.merge(HasId).merge(HasName).merge(HasAddress);

Tuples

Tuples differ from arrays in that they have a fixed number of elements, and each element can have a different type.

const athleteSchema = z.tuple([
  // takes an array of schemas
  z.string(), // name
  z.number(), // jersey number
  z.object({
    pointsScored: z.number(),
  }), // statistics
]);

type Athlete = z.TypeOf<typeof athleteSchema>;
// type Athlete = [string, number, { pointsScored: number }]
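
If you're curious how a per-position type like that can be inferred, a mapped type over a tuple of parsers does the trick. This is a sketch under my own assumptions (ParseFn, TupleOutput, and tupleOf are hypothetical names), not a description of Zod's internals.

```typescript
type ParseFn<T> = (input: unknown) => T;

// Map a tuple of parsers to the tuple of their output types.
type TupleOutput<P extends ParseFn<unknown>[]> = {
  [I in keyof P]: P[I] extends ParseFn<infer T> ? T : never;
};

function tupleOf<P extends ParseFn<unknown>[]>(...parsers: P): ParseFn<TupleOutput<P>> {
  return (input) => {
    if (!Array.isArray(input) || input.length !== parsers.length) {
      throw new Error("Expected a tuple of length " + parsers.length);
    }
    const elements: unknown[] = input;
    // Element i is checked by parser i, so each position can have its own type.
    const checked = parsers.map((parser, index) => parser(elements[index]));
    return checked as unknown as TupleOutput<P>;
  };
}

const asString: ParseFn<string> = (input) => {
  if (typeof input !== "string") throw new Error("Non-string type: " + typeof input);
  return input;
};

const asNumber: ParseFn<number> = (input) => {
  if (typeof input !== "number") throw new Error("Non-number type: " + typeof input);
  return input;
};

const parseAthlete = tupleOf(asString, asNumber); // ParseFn<[string, number]>
const [athleteName, jerseyNumber] = parseAthlete(["Michael Jordan", 23]);
```

Unlike an array parser, this rejects inputs with too few or too many elements.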

Recursive types

You can define a recursive schema in Zod, but because of a limitation of Typescript, its type can't be statically inferred. If you need a recursive Zod schema, you'll need to write the type definition manually and provide it to Zod as a "type hint".

interface Category {
  name: string;
  subcategories: Category[];
}

const Category: z.ZodType<Category> = z.lazy(() => {
  return z.object({
    name: z.string(),
    subcategories: z.array(Category),
  });
});

Category.parse({
  name: 'People',
  subcategories: [
    {
      name: 'Politicians',
      subcategories: [{ name: 'Presidents', subcategories: [] }],
    },
  ],
}); // passes
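
Why the lazy wrapper? A schema can't be passed as an argument while it is still being defined, so the definition has to be deferred until first use. The sketch below shows that deferral with plain functions; ParseFn, lazyOf, arrayOf, and CategoryNode are hypothetical names, and this is my own illustration rather than Zod's code.

```typescript
type ParseFn<T> = (input: unknown) => T;

// The hand-written type plays the role of the "type hint".
interface CategoryNode {
  label: string;
  children: CategoryNode[];
}

// Defer building the parser until the first call, then reuse it.
function lazyOf<T>(build: () => ParseFn<T>): ParseFn<T> {
  let cached: ParseFn<T> | undefined;
  return (input) => {
    if (!cached) cached = build();
    return cached(input);
  };
}

function arrayOf<T>(item: ParseFn<T>): ParseFn<T[]> {
  return (input) => {
    if (!Array.isArray(input)) throw new Error("Non-array type");
    return input.map((element) => item(element));
  };
}

const parseCategory: ParseFn<CategoryNode> = lazyOf<CategoryNode>(() => {
  // This body runs on first use, after parseCategory has been assigned,
  // so it is safe to pass parseCategory to arrayOf here.
  const parseChildren = arrayOf(parseCategory);
  return (input: unknown): CategoryNode => {
    if (typeof input !== "object" || input === null) throw new Error("Non-object type");
    const record = input as Record<string, unknown>;
    const label = record.label;
    if (typeof label !== "string") throw new Error("label must be a string");
    return { label, children: parseChildren(record.children) };
  };
});

const parsedCategory = parseCategory({
  label: "People",
  children: [{ label: "Politicians", children: [] }],
});
```

Without lazyOf, the call arrayOf(parseCategory) would run while parseCategory is still uninitialized and fail at startup.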

Function schemas

Zod also lets you define "function schemas". This makes it easy to validate the inputs and outputs of a function without intermixing your validation code and "business logic".

You can create a function schema with z.function(args, returnType), which accepts the following arguments.

  • args: ZodTuple The first argument is a tuple (created with z.tuple([...])) and defines the schema of the arguments to your function. If the function doesn't accept arguments, you can pass an empty tuple (z.tuple([])).
  • returnType: ZodType The second argument is the function's return type. This can be any Zod schema.

const args = z.tuple([
  z.object({ nameStartsWith: z.string() }),
  z.object({ skip: z.number(), limit: z.number() }),
]);

const returnType = z.array(
  z.object({
    id: z.string(),
    name: z.string(),
  })
);

const FetcherFactory = z.function(args, returnType);

z.function returns a higher-order "function factory". Every "factory" has a .validate() method which accepts a function as input and returns a new function. The returned function automatically validates both its inputs and return value against the schemas provided to z.function. If either is invalid, the function throws. This lets you confidently execute business logic in a "validated function" without worrying about invalid inputs or return types, mixing your validation and business logic, or writing duplicative types for your functions. Here's an example.

const validatedQueryUser = FetcherFactory.validate((filters, pagination) => {
  // the arguments automatically have the appropriate types
  // as defined by the args tuple passed to `z.function()`
  // without needing to provide types in the function declaration

  filters.nameStartsWith; // autocompletes
  filters.ageLessThan; // TypeError

  // Typescript statically verifies that value returned by
  // this function is of type { id: string; name: string; }[]

  return 'salmon'; // TypeError
});

const users = validatedQueryUser(
  {
    nameStartsWith: 'John',
  },
  {
    skip: 0,
    limit: 20,
  }
); // => returns { id: string; name: string; }[]

This is particularly useful for defining HTTP or RPC endpoints that accept complex payloads that require validation. Moreover, you can define your endpoints once with Zod and share the code with both your client and server code to achieve end-to-end type safety.
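
Conceptually, a "validated function" is just a wrapper that parses the arguments on the way in and the return value on the way out. Here is a minimal sketch of that wrapper, assuming the arguments and return value are each described by a parser function. ParseFn and validatedFunction are hypothetical names; this collapses Zod's factory and .validate() into a single helper for brevity and is not Zod's implementation.

```typescript
type ParseFn<T> = (input: unknown) => T;

// Wrap an implementation so that inputs and output are both checked at runtime.
function validatedFunction<Args extends unknown[], Result>(
  parseArgs: ParseFn<Args>,
  parseResult: ParseFn<Result>,
  implementation: (...args: Args) => Result
): (...args: Args) => Result {
  return (...args) => {
    const checkedArgs = parseArgs(args); // throws on invalid inputs
    const output = implementation(...checkedArgs);
    return parseResult(output); // throws on an invalid return value
  };
}

// Hypothetical parsers for a function of type (limit: number) => string[].
const parseLimitArgs: ParseFn<[number]> = (input) => {
  if (!Array.isArray(input) || input.length !== 1 || typeof input[0] !== "number") {
    throw new Error("Expected arguments of type [number]");
  }
  return [input[0]];
};

const parseStringList: ParseFn<string[]> = (input) => {
  if (!Array.isArray(input) || !input.every((item) => typeof item === "string")) {
    throw new Error("Expected string[]");
  }
  return input;
};

// `limit` is inferred as number; no annotations needed in the implementation.
const firstLetters = validatedFunction(parseLimitArgs, parseStringList, (limit) =>
  ["a", "b", "c"].slice(0, limit)
);

const twoLetters = firstLetters(2); // ["a", "b"]
```

The business logic stays a plain function; the wrapper owns all of the validation.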

Onwards and upwards

I think there's a LOT of room for improvement in how we handle the complexity of data modeling, validation, and transport as developers. Typescript, GraphQL, and Prisma are huge steps towards a future where our tooling can provide guarantees of data integrity. But there's still a long way to go.

If you like what you see, head over to https://github.com/vriad/zod and leave a star.










