Experimenting with TypeScript 4.0's Variadic Tuple Types (Variadic Kinds)

I wrote some code over 2 years ago that couldn't be properly typed with either Flow or TypeScript, but with the introduction of Variadic Tuple Types coming in TypeScript 4.0, I decided to give this piece of code a second look.

We have a function called innerJoin which takes in 2+N arguments:

  1. A comparator function
  2. A merge function
  3. A variadic list of arrays that all have to be sorted the same way: ...arrs

The inner join function loops through all of the arrays "at the same time", looks for items that need to be merged based on some "join predicate" (the comparator function) and then calls the merge function with all of those items to generate a "merged" item that will go on the end result array.

Here's an example of how to use this function in a very simple test case:

function idComparator(a: { id: number }, b: { id: number }) {
    if (a.id < b.id) {
        return -1;
    } else if (a.id === b.id) {
        return 0;
    } else {
        return 1;
    }
}

const array1 = [
    {
        id: 1,
        name: "David",
    },
    {
        id: 2,
        name: "Miguel",
    },
];

const array2 = [
    {
        id: 1,
        country: "Portugal",
    },
    {
        id: 2,
        country: "Portugal",
    },
    {
        id: 3,
        country: "USA",
    },
];

expect(
    innerJoin(idComparator, (a, b) => ({ ...a, ...b }), array1, array2)
).toEqual([
    {
        id: 1,
        name: "David",
        country: "Portugal",
    },
    {
        id: 2,
        name: "Miguel",
        country: "Portugal",
    },
]);

As you can see, a match was found for items 1 and 2, which resulted in a "merge" and item 3 was only present in one of the arrays, so it doesn't appear in the end result. A visualization of this algorithm can be seen here:

So, how was this function implemented? (The comments should explain everything you need to know)

inner-join.js

(Notice that this is a JavaScript file which means that any potential type errors on it will go unnoticed)

// This file's type definitions live in inner-join.d.ts, as we need to separate
// the type definition from the type implementation.

import _ from "lodash";

// All the arrays have to be sorted the same way. If not, the behavior
// of this function is *undefined*.
//
// A good visualization of this algorithm can be found here:
// http://sqlity.net/wp-content/uploads/2012/12/merge-join-algorithm.gif.
export default function innerJoin(comparator, merge, ...arrs) {
    const output = [];

    let iterators = _.fill(Array(arrs.length), 0);

    // We should only continue looping if none of the pointers is
    // at the end of their respective array. This is because we can
    // only match if we have a value from each array.
    const canContinue = () =>
        _.every(arrs, (arr, index) => iterators[index] < arr.length);

    // Gets all the current values by grabbing the value we are
    // currently pointing to for each array.
    const getAllValues = () =>
        _.map(arrs, (arr, index) => arr[iterators[index]]);

    // Tests whether all current values match by comparing all of them
    // against the first element.
    const allValuesEqual = () =>
        _.every(
            arrs,
            (arr, index) =>
                comparator(arrs[0][iterators[0]], arr[iterators[index]]) === 0
        );

    while (canContinue()) {
        const values = getAllValues();
        const comparison = allValuesEqual();

        if (comparison) {
            output.push(merge(...values));
            iterators = _.map(iterators, it => ++it);
        } else {
            let minimumIndex = 0;

            let i;
            for (i = 1; i < arrs.length; i++) {
                if (
                    comparator(
                        arrs[i][iterators[i]],
                        arrs[minimumIndex][iterators[minimumIndex]]
                    ) === -1
                ) {
                    minimumIndex = i;
                }
            }

            iterators[minimumIndex]++;
        }
    }

    return output;
}

As you can see on the very first line of the inner-join.js file, the type definitions for this file live elsewhere: in inner-join.d.ts. Let's see what that file looks like:

inner-join.d.ts

export default function innerJoin<T1, T2, T3>(
    comparator: (first: T1, second: T2) => -1 | 0 | 1,
    merge: (first: T1, second: T2) => T3,
    arr1: Array<T1>,
    arr2: Array<T2>
): Array<T3>;

export default function innerJoin<T1, T2, T3, T4>(
    comparator: (first: T1, second: T2 | T3) => -1 | 0 | 1,
    merge: (first: T1, second: T2, third: T3) => T4,
    arr1: Array<T1>,
    arr2: Array<T2>,
    arr3: Array<T3>
): Array<T4>;

The innerJoin function is generic over an arbitrary number of types, so typing it alongside its implementation without variadic tuple types is not possible. So, we had to move its type definitions to a separate file. On the separate file, we hand-type the function's type definition for 4 and 5 arguments. If we ever need to call this function with more arguments, we'll have to manually add more type definitions in this file.

While this isn't great, it's also not a huge problem since we can easily generate 20 or 30 of these type definitions and not have to worry about it for a long time. Some open source libraries like reselect have been doing this as well for a long time.

Rewriting our code using TypeScript 4.0's Variadic Tuple Types

The first thing we have to define is what innerJoin is generic over. That's easy, it is generic over 2 "things":

  • A tuple type corresponding to the types of all the arrays that will be passed in (e.g., if generic over [A, C], then it must receive as arguments [Array<A>, Array<C>].
  • The output type of the join (e.g., if generic over M, it must return Array<M>).

We can easily express this as:

export default function innerJoin<
    T extends unknown[], 
    MergedType
>(
    ...
): MergedType { ... }

Now let's think about the input types:

  • The comparator function which receives 2 arguments: an element from the first array and an element from either one of the remaining arrays. This function has to return 0, -1 or 1.
  • The merge function which receives N arguments: on element from each of the input arrays. This function has to return an instance of MergedType.
  • arrs: the list of input arrays.

The comparator is a function that receives an argument with type T[0] (the type of the first element in the tuple) and another argument with the union type of all the other types in T[1...N]. We can use DropFirstInTuple and then use the A[number] syntax. What this syntax means is that we want to get the type of what we can index out of A with a number — if A is a tuple, then A[number] will correspond to the union type of all the element types of A). Here's all of that together:

export default function innerJoin<
    T extends unknown[], 
    MergedType
>(
    comparator: (t1: T[0], tOther: DropFirstInTuple<T>[number]) => -1 | 0 | 1,
    ...
): MergedType { ... }

(The DropFirstInTuple type was written by Keegan Leitz, and I found it here)

The merge function is fairly simple to type:

export default function innerJoin<
    T extends unknown[], 
    MergedType
>(
    comparator: (t1: T[0], tOther: DropFirstInTuple<T>[number]) => -1 | 0 | 1,
    merge: (...t: T) => MergedType,
    ...
): MergedType { ... }

The list of input arrs is a little bit more challenging to type:

export default function innerJoin<
    T extends unknown[], 
    MergedType
>(
    comparator: (t1: T[0], tOther: DropFirstInTuple<T>[number]) => -1 | 0 | 1,
    merge: (...t: T) => MergedType,
    ...arrs: { [K in keyof T]: Array<T[K]> }
): MergedType { ... }

{ [K in keyof T]: Array<T[K]> } can be explained as follows:

For each key of T (T is a tuple type, so the keys are numbers like 0, 1, 2, etc.), we need the value to be an Array<T[K]>.

To exemplify, if T is [A, C] then { [K in keyof T]: Array<T[K]> } will be [Array< A>, Array<C>].

Using TypeScript 4.0, we've been able to type our innerJoin function! However, there is a TypeScript error further down in the file:

...
    const getAllValues = () =>
        arrs.map((arr, index) => arr[iterators[index]]);

...

    if (comparison) {
        output.push(merge(...values));
...
Argument of type 'unknown[]' is not assignable to parameter of type 'T'.
  'unknown[]' is assignable to the constraint of type 'T', but 'T' could be instantiated with a different subtype of constraint 'unknown[]'.

So the issue is that the TypeScript type checker doesn't understand that getAllValues returns a value of type T. If we try to annotate the output of getAllValues as T, we get the same error. The core issue is that we're mapping over a tuple type and so TypeScript just can't guess that the return type of map will be of type T since that would require the type checker understanding that map is an ordered traversal over a finite set of elements.

To get around this, we need to use a (dreaded) type assertion:

// Gets all the current values by grabbing the value we are
// currently pointing to for each array.
// TYPE ASSERTION: This function leverages a type assertion which is
// something one should never do as they are not type safe. However, there
// is no other way to map over a tuple type and return another tuple type.
const getAllValues = () =>
    arrs.map((arr, index) => arr[iterators[index]]) as T;

And that's it! Here's the final inner-join.ts:

// This file's type definitions live in inner-join.d.ts, as we need to separate
// the type definition from the type implementation.

import _ from "lodash";

// CREDIT to https://github.com/kjleitz for this type:
// https://dev.to/kjleitz/comment/gb5d

// Drops the first element of a tuple. Example:
//
//   type Foo = DropFirstInTuple<[string, number, boolean]>;
//   //=> [number, boolean]
//
export type DropFirstInTuple<T extends any[]> = ((...args: T) => any) extends (arg: any, ...rest: infer U) => any ? U : T;

// All the arrays have to be sorted the same way. If not, the behavior
// of this function is *undefined*.
//
// A good visualization of this algorithm can be found here:
// http://sqlity.net/wp-content/uploads/2012/12/merge-join-algorithm.gif.
export default function innerJoin<T extends unknown[], MergedType>(
    comparator: (t1: T[0], tOther: DropFirstInTuple<T>[number]) => -1 | 0 | 1,
    merge: (...t: T) => MergedType,
    ...arrs: { [K in keyof T]: Array<T[K]> }
): Array<MergedType> {
    const output = [];

    let iterators = _.fill(Array(arrs.length), 0);

    // We should only continue looping if none of the pointers is
    // at the end of their respective array. This is because we can
    // only match if we have a value from each array.
    const canContinue = () =>
        _.every(arrs, (arr, index) => iterators[index] < arr.length);

    // Gets all the current values by grabbing the value we are
    // currently pointing to for each array.
    // TYPE ASSERTION: This function leverages a type assertion which is
    // something one should never do as they are not type safe. However, there
    // is no other way to map over a tuple type and return another tuple type.
    const getAllValues = () =>
        arrs.map((arr, index) => arr[iterators[index]]) as T;

    // Tests whether all current values match by comparing all of them
    // against the first element.
    const allValuesEqual = () =>
        _.every(
            arrs,
            (arr, index) =>
                comparator(arrs[0][iterators[0]], arr[iterators[index]]) === 0
        );

    while (canContinue()) {
        const values = getAllValues();
        const comparison = allValuesEqual();

        if (comparison) {
            output.push(merge(...values));
            iterators = _.map(iterators, it => ++it);
        } else {
            let minimumIndex = 0;

            let i;
            for (i = 1; i < arrs.length; i++) {
                if (
                    comparator(
                        arrs[i][iterators[i]],
                        arrs[minimumIndex][iterators[minimumIndex]]
                    ) === -1
                ) {
                    minimumIndex = i;
                }
            }

            iterators[minimumIndex]++;
        }
    }

    return output;
}

Conclusion

All in all, I'm super excited for Variadic Tuple Types in TypeScript as they'll allow us to express variadic functions and other patterns which we couldn't type before without manually writing down multiple type definitions.

Will I check in this code to our repository? I haven't made up my mind on this yet, because the old implementation may be easier to maintain and understand (by a new hire for instance). On the other hand. the new implementation is safer (since we're not using JavaScript) and will work "forever" since we'll never need to add new manual type definitions.

Of course, I'd love to hear any other thoughts on this matter. Just reach out on Twitter!