Skip to content

Latest commit

 

History

History
600 lines (458 loc) · 21.6 KB

todo.org

File metadata and controls

600 lines (458 loc) · 21.6 KB

Todo

fucking important [0/13]

Unions [1/3]

in terms of representation, we just need Vec<Value> for now. later we can do specialization for common like T | undefined

Normalizing unions

Typically Typescript “normalizes” unions:

// string | number | boolean
type a = string | number | boolean | string

Basically when “merging” types into a union, we check if the in-progress constructed union doesn’t already have a type that is a “super-type”.

Bad explanation. In the example above, the second “string” is ommitted because string already exists in the union. In another example:

// string | number | boolean
type a = string | number | boolean | "hello"

"hello" is ommitted because “hello” subtypes string which is already in the union.

Perplexingly, this doesn’t work for object literals:

// Simplifies to:
// type b = {
//     foo: 1;
// } | {
//     foo: 2;
// } | {
//     foo: 1;
// }
type b = { foo: 1 } | { foo: 2 } | { foo: 1 };

It even works in cases like this:

// Simplifies to:
// type e = {
//     foo: string;
//     bar: number;
// } | {
//     foo: string;
//     bar: number;
// }
type e = { foo: string, bar: number } | { foo: string, bar: number }

Which is very confusing because the objects are actually identical.

This is actually part of the “Object edge case” explained in the next section.

Object edge case

this is a weird edge case:

type Extends<A, B> = A extends B ? "extends" : "not extends";

type a = { foo: 1 | 2 };
type b = { foo: 1 } | { foo: 2 };

// extends
type result = Extends<a, b>;

if we do the algorithm on objects where we check every key of a is subtype of every variant of union b it will fail, because:

// a['foo'] ==> 1 | 2

// not extends
type result = Extends<1 | 2, 1>
// not extends
type result2 = Extends<1 | 2, 2>

there are a couple ways to solve this:

specialized subtype impl for objects vs unions
how would this look like?

first we need to know when do apply this specialization. we need to be able to tell if a is an obect and b is a union of N objects where each object has a similar field but with different values.

when we know the conditions for this specialization are met, then we dispatch to extends_object_union function.

this will iterate the keys of a. if one key has a value that is a union, we check if the value subtypes the corresponding value in b.

  • when “normalizing” unions we turn { foo: 1 } | { foo: 2 } into { foo: 1 | 2 }, or do this just before subtype checking

Exploration: product types

Another interesting property is that you can think of an “object expression with a member as a union” as having the distributed property:

type a = { foo: 1 | 2 };
type b = { foo: 1 } | { foo: 2 };

We can think of the | operator as distributing the union over the object. Similar to distribute property from math: a ⋅ (b + c) = a ⋅ b + a ⋅ c

This analogy goes quite far, records are actually product types in type theory / FP.

{ "foo" * (`1` + `2`) } { "foo" * `1`} + { "foo" * `2`}

With a -> "foo", b -> 1, c -> 2.

Let’s see with a more practical example:

type circle = { type: 'circle', radius: number };
type rectangle = { type: 'rectangle', w: number, h: number };
type shape = circle | rectangle

type a = { type: 'circle' | 'rectangle', }

Figured it out:

The key insight to understanding this behaviour is to recognize that it only works when subtyping against a discriminated union.

A discriminated union is a union of N objects who share a single similar key, whose value is a literal.

Those three subtle conditions are important:

  1. there must be one similar key, the discriminant key
  2. the discriminat key’s value must be a literal type

The union must satisfy all above conditions for it to be considered a discriminated union.

Now that we understand what a discriminated union, we can observe that this weird edge case only works on them:

type Extends<A, B> = A extends B ? "extends" : "not extends";

// `b` is discriminated union...
type a = { foo: 1 | 2 };
type b = { foo: 1 } | { foo: 2 };

// so this extends check works
type result1 = Extends<a, b>;

// `d` is NOT a discriminated union...
type c = { bar: number | string };
type d = { bar: number } | { bar: string };

// so this extends check doesn't work!
type result2 = Extends<c, d>;

So it seems like the rule is that a discriminated union can be subtyped by an object whose keys and values are merged from the discriminated union, example:

type a = { foo: 1 | 2, a: string, b: number};
type b = { foo: 1, a: string } | { foo: 2, b: number };

// this extends check works
type result1 = Extends<a, b>;

Unfortunately, this doesn’t seem to be the case, not exactly:

type a = { foo: 1 | 2, bar: string | number };
type b = { foo: 1, bar: string } | { foo: 2, bar: number };

// doesn't extend
type result1 = Extends<e, f>

Logically, we can see why. The type a allows us to put { foo: 1, bar: number }, which is not valid because its a mix-match of the properties from the two variants.

The rule here is the | operator can only distribute over the discriminant keys in a discriminant union.

If we take an algebraic approach at looking at it as sum types and product types:

foo ^ `1` * bar ^ string + foo ^ `2` * bar ^ number
(foo ^ (`1` + `2`)) (bar ^ string + bar ^ number)  // b. but this is invalid in a record
// a. cant distribute over non discriminant fields, so cant do this:
(foo ^ (`1` + `2`)) (bar ^ (string + number))

Now this does work in this situation:

type a = { foo: 1 | 2, a: string, b: number };
type b = { foo: 1, a: string } | { foo: 2, b: number };

type result1 = Extends<a, b>

The algebraic derivation:

foo^`1` * a^string + foo^2 * b^number
(foo^(`1` + `2`)) * (a^string + b^number) // Which is type of `a`

So we need to: determine if we are doing A subtypes B where: B is discriminant union, A is object with discriminant key with union of B’s values for that key. A can only have a subset of the discriminant values, e.g.:

type a = { foo: 2 | 3, b: string, c: number };
type b = { foo: 1, a: string } | { foo: 2, b: number } | { foo: 3, b: number };

type result1 = Extends<a, b>

Disjoint edge case

if two unions are disjoint:

type ExtendsCheck<A, B> = A extends B ? "yes" : "no";
type t5 = ExtendsCheck<
  "NICE" | boolean | { lmao: boolean },
  string | boolean | number | { nice: string }
>;

the result is: “yes” | “no”

Unions of objects [0/0]

Unions of objects have some slight alterations when it comes to normalizing/flattening them.

Well, first normalization usually works as you expect.

type discriminatedUnionWithNormalUnion =
  | { foo: "lol"; hi: number }
  | { foo: "bar" }
  | { foo: string };

// normalization works as expected, they all get flattened to `{ foo: string }`
function someFunc(x: discriminatedUnionWithNormalUnion) {
  if (x.foo === "lol") {
    // @ts-expect-error
    x.hi;
  }
}

However, one discrepancy is that disjoint objects get normalized into an object with no keys at all:

type discriminatedUnionWithNormalUnion2 =
  | { foo: "lol" }
  | { foo: "bar" }
  | { hi: string };

// `x` has no keys
function someFunc(x: discriminatedUnionWithNormalUnion2) {
  // this is `never`
  type lmao = keyof typeof x;
}

When objects form a union with non-object types, they are considered disjoint, even though non-object types have properties themselves:

type discriminatedUnionWithNormalUnion3 =
  | { foo: "lol"; charAt(pos: number): string }
  | { foo: "bar"; charAt(pos: number): string }
  | string;

// `x` has no keys
function someFunc3(x: discriminatedUnionWithNormalUnion2) {
  // this is `never`
  type lmao = keyof typeof x;
}

All three variants have the charAt(pos: number): string function property.

disjoint objects become object with never key

Discriminated unions [1/2]

We need to distinguish unions from discriminated unions because subtyping logic is different for discriminated unions. (See “Object edge case” and “Disjoint edge case” above).

First we need to know what precisely is a discriminated union:

A discriminated union, is a union of N object types that all contain a discriminant key/value pair. This discriminant key/value pair has three subtle but important conditions:

  1. the discriminant key must be present in all variants of the union
  2. the value of the discriminant must be a literal in all variants
  3. there can only be one discriminant key/value pair, presence of multiple means the union no longer qualifies as a discriminated union

All three of these conditions must be met.

Now we know what a discriminated union is. How does detecting one look like in code?

  1. Build the union regularly, normalizing it
  2. Check that the union contains a discriminant key/value pair satisfying the three conditions outlined above.

Once we know that the union is discriminanted, we can store some helpful properties to make operations faster. One notable would be storing the discriminated key.

Additionally, discriminated unions have deviant behaviour in subtyping logic.

Specifically, the algorithm for subtyping discriminated unions changes when you construct an object which has the discrimiant key, with a value that is the union of the possible values of the discriminant key.

Bad explanation. Look at this example:

type Extends<A, B> = A extends B ? "extends" : "not extends";

type a = { foo: 1 | 2 };
type b = { foo: 1 } | { foo: 2 };

// The result is: "extends"
type result = Extends<a, b>;

If we use the normal union subtyping algorithm. We would do:

  1. Check that a subtypes any variant of b
  2. Check that { foo: 1 | 2 } subtypes { foo: 1 }
  3. It doesn’t, because 1 | 2 does not subtype 1. Try next variant.
  4. Check that { foo: 1 | 2 } subtypes { foo: 2 }
  5. It doesn’t, because 1 | 2 does not subtype 2.

Under the algorithm, a does not subtype b. But intuitively, it feels like { foo: 1 | 2 } should be equivalent to { foo: 1 } | { foo: 2 }.

So we want this behaviour. How do we make it work?

The key insight is that the | operator distributes over the discriminant key of a discriminated union.

Detect discriminated unions and store their tag

Implement specialized discriminated union subtype logic

Flappy bird

VM checks Value extends DrawCommand and then serializes it to JS object

In the wrapper, execute draw commands

In the wrapper, call the VM function again

Keys on non-object types

all types are actually objects and have additional keys

for example number has toExponential, toFixed, etc.

so each type has like its base number of keys, so we dont have to make every type an Object, instead when we type check string, bool, number, etc. against object we just check against these known base keys

the problem is when you add more types to this base amount:

type agumentednumber = number & { foo(): void }

We have several options:

  1. put a base field on the Object struct which tells us where to get the base keys :: problem is that number, string aren’t actually objects see the Test<A, B> thing below. you can’t make an object with all the fields of a number and have it be a number

    so we want to still have our distinction

  2. or just merge the base keys onto the fields of an Object :: same problem as above
  3. create an “augmented” type which contains the type and additional fields :: Something like this:
    struct AugmentedType {
        base: Value,
        augmentation: Object
    }
        
  4. have special heap allocated variants of keyword types Number, String, Boolean
type Test<A, B> = A extends B ? "extends" : "not extends";
type fakenumber = {
  toExponential(fractionDigits?: number | undefined): string;
  toFixed(fractionDigits?: number | undefined): string;
  toLocaleString(
    locales?: string | string[] | undefined,
    options?: Intl.NumberFormatOptions | undefined
  ): string;
  toPrecision(precision?: number | undefined): string;
  toString(radix?: number | undefined): string
  valueOf(): number
};

// this returns "not extends"
type result = Test<fakenumber, number>

Also note that the object keyword

Optional fields/types

we have optional types:

type OptionalInObject = {
  foo?: boolean
}
type OptionalInTuple = [foo?: string]

how do we handle this?

first intuition is to treat them as union like: T | undefined. indeed, when you hover over OptionalInObject or OptionalInTuple, the optional fields become: foo?: boolean | undefined but we need to make sure that the semantics between foo?: boolean and this union represenation map over sufficiently enough to do this. basically what i want to know is if we can erase the concept of “optionals” completely from the IR and treat them literally as T | undefined

the issue is that an optional field actually allows you to omit the field in instantiating the type:

// is okay
const test: OptionalInObject = {}

type NOOptionalInObject = {
  foo: boolean | undefined
}
// not okay, complains we are missing `foo`
const notOkay: OptionalInObject = {}

and I made sure to check this was the same in type-level typescript:

type OptionalInObject = {
  optional?: boolean;
};
type NOOptionalInObject = {
  optional: boolean | undefined;
};

type Check<T> = T extends NOOptionalInObject ? "YAY" : "NOO";
// result is NOO
type CheckResult = Check<{}>;

so we need to preserve this optional information

Recursive function argument extends optimization

When recursively calling a function, the extends check may be redundantly called:

type FillArrayImpl<
  Count extends number,
  I extends number,
  Value,
  Array extends any[]
> = I extends Count
  ? Array
  : FillArrayImpl<Count, Add<I, 1>, Value, [Value, ...Array]>;

In the above example, when FillArrayImpl<...> is recursively called, all of its arguments will undergo a needless extends check.

We can optimize them away.

In the most trivial case, identifiers that are reused can have their extends check omitted. For example, Count is untouched and passed again to FillArrayImpl. When we first call FillArrayImpl, we check Count extends number, then we don’t touch it and then pass it again to the function.

These optimizations would require a refactor to when the extends check takes place. Right now, the checks happen inside the function. We’ll have to do them before the function.

Template literals with non-literal types

Template literals work with literal types:

// "hehe 420"
type foo = `hehe ${420}`

But when a non-literal type is added, the type stays as a template literal type:

// "hehe ${any}"
type foo = `hehe ${any}`

You can still concat new strings though:

// `hehe ${any}wow 420`
type lmao = `${foo}wow ${420}`

Index numeric literals optimization

can speed up index operations if we make specialization ops for indexing

for example myarray[0] is a common expression, specifically indexing with a number constant literal

Accessing length key on arrays

GC

Cache function extends checks

Functions have extends checks and the RHS of the extends check can be cached.

For example:

type Bar<X extends { hi: string }> = /* ... */;

Object/array constants

Idea here is that object/array constants will also contain a pointer which represents a cache. At compile time, these will be set to NULL. At runtime, we instantiate all constants and set the pointer.

The idea here is that this serves as a “cache” so we don’t have to re-instantiate object/arrays that are known at compile time.

Indexing default properties

All types have some common default properties, and then properties of their own.

Indexing needs to work on these.

Number:

  • .toString()
  • .valueOf()
  • .toExponential()
  • .toFixed()
  • .toLocaleString()
  • .toPrecision()

Boolean:

  • .valueOf()

String:

  • a bunch of shit

Indexing numeric strings

e.g. foo[0] -> foo['0']

brainstorming

use immutable / persistent data structures

all values in type-level typescript are immutable. langs with focus on immutability tend to be allocation heavy (you need to make more objects)

fp langs solve this by using persistent data structures which typically have some structural sharing mechanism to reduce allocations for copies

https://github.com/immutable-js/immutable-js/ readme links:

  • hash map tries (link)
  • vector tries (link)

there are some rust crates that implement persistent data structures, but they are designed for safe rust. they all have some reference counting shit going on. i don’t know

idea for using stack for objects

what if the key/vals for an object were a slice/window of the stack?

Make ir repr more compact

some places where we can box stuff

Everything on the stack?

What if we avoided heap allocations entirely and put everything on the stack.

This is possible because values in type-level Typescript are immutable and follow stack based rules. For example, there aren’t anything fancy like closures and captures variables that make the lifetimes of values go outside of the stack.

If this is possible, it would drastically make everything simpler, especially memory management. The stack is a form of memory management, and freeing memory is as simple as popping values of the stack. The one exception is strings, but adding reference counting to strings alone is simple.

Even things more complex like arrays and objects can live on the stack. There can be a “header” value that stores how the slice of values that make up the array/object. The only complexity is in popping the stack, this requires popping also the array elements / object fields. Same for peeking N values back in the stack.

Maybe this is not worth it when we have reference counting.

archive

Need distinction between Tuple vs. Array

Unable to distinguish arrays and single-item tuples

I realized I made a very naive error in the way arrays/tuples are represented, which means that single-items and regular arrays are indistinguishable from each other which is bad.

String interpolation

string interning

this is really important, otherwise string value equality won’t work

globals non-forward declarations all fucked lol

evaluation of order can’t be strictly top down

this a little more complicated

need to build DAG of global declarations and the declarations they depend on

What to do with this?

type Fib<X extends number> = FibIter<X, 2, 1, 0>
type FibIter<X, I, Prev, PrevPrev> = /* ... */
type Main = WriteFile<
  "./fib-result.ts",
  ToTypescriptSource<"FibonacciResult", Fib<amount>>
>;

type amount = Main;

all globals will be the roots of the stmt nodes process them first, add to globals then when actually compiling the global, need to check if we already compiled so we dont have people redefining vars

main argv

let decls

this is my strategy:

executing a let decl will add another local to the function. since we have the requirement that all exprs when finished executing will leave the stack as it was before, we can be certain that pushing this local to the stack will be after the locals of the params + any other locals form let decls:

STACK:
param1 param2 param3 letdecl1 letdecl2 letdecl3

when you enter the true branch of a let decl, we should push that local

type TestLet<Arg> = Arg extends infer val extends 0 ? val : "nope";

(the new bound var doesn’t exist in the else branch)

the only problem is then getting rid of these new let decl vars.

arrays

to store we just need a vec of types

there are three kinds of arrays:

  1. Array<T> or T[]
  2. [T, K, etc]
  3. [and: T, labeled: K, etc]

2 & 3 are actually tuples, but we should treat them the same bc they tuple extends array

also note that the labels in tuples dont matter for typechecking, they are just to make shit readable. so we can store them elsewhere and not give a fuck.

we could make a inline special representation. Vec is 24 bytes. Value is 8 usually the array type is just a Value if its a tuple with 2-3 elements we can inline it probably else just pass the Vec

we shouldnt even use vec (because of borrowck) instead we can use our own repr with ptr + len

also

Handle spread properly

Problem is spread can be in any arbitrary position in array literal, not just end:

type foo = [...Lol, 1, 2, ...Lmao, 4, 5, ...Nice]

The problem is knowing at runtime which elements are spread elements, and which are regular types and shouldn’t be spread.

One idea is to pass this information as instruction operands
You pass the indices of the spread arguments as instruction operands. I see two options:
  • pass them in as bit sets
  • pass them in as count + N indices