In terms of representation, we just need a `Vec<Value>` for now. Later we can add specializations for common cases like `T | undefined`.
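A minimal sketch of what that could look like, assuming a `Value` enum that also covers the non-union types (all names here are illustrative, not the final representation):

```rust
// Sketch only: the surrounding variants are stand-ins for whatever the real
// `Value` enum ends up containing.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    String,
    Number,
    Boolean,
    Undefined,
    StringLiteral(String),
    NumberLiteral(f64),
    // A union is just the flattened list of its variants.
    Union(Vec<Value>),
}
```

A dedicated variant for the `T | undefined` case could be added later as the specialization mentioned above.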
Typically Typescript “normalizes” unions:
// string | number | boolean
type a = string | number | boolean | string
Basically, when “merging” a type into an in-progress union, we check whether the union already contains a type that is a super-type of it; if so, the new type is dropped. In the example above, the second `string` is omitted because `string` already exists in the union. In another example:
// string | number | boolean
type a = string | number | boolean | "hello"
"hello"
is ommitted because “hello” subtypes string
which is already in the union.
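A rough sketch of that merge step, reusing the `Value` sketch above and assuming an `is_subtype(a, b)` helper (read as “a extends b”) defined elsewhere:

```rust
// Sketch: add a member to an in-progress union, skipping it when an existing
// member is already a super-type of it (e.g. `string` already covers `"hello"`).
fn add_to_union(members: &mut Vec<Value>, new_member: Value) {
    if members.iter().any(|existing| is_subtype(&new_member, existing)) {
        return;
    }
    members.push(new_member);
}
```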
Perplexingly, this doesn’t work for object literals:
// Simplifies to:
// type b = {
// foo: 1;
// } | {
// foo: 2;
// } | {
// foo: 1;
// }
type b = { foo: 1 } | { foo: 2 } | { foo: 1 };
It even happens in cases like this:
// Simplifies to:
// type e = {
// foo: string;
// bar: number;
// } | {
// foo: string;
// bar: number;
// }
type e = { foo: string, bar: number } | { foo: string, bar: number }
Which is very confusing because the objects are actually identical.
This is actually part of the “Object edge case” explained in the next section.
this is a weird edge case:
type Extends<A, B> = A extends B ? "extends" : "not extends";
type a = { foo: 1 | 2 };
type b = { foo: 1 } | { foo: 2 };
// extends
type result = Extends<a, b>;
If we run the algorithm on objects, where we check that every key of `a` is a subtype of the corresponding key in each variant of the union `b`, it will fail, because:
// a['foo'] ==> 1 | 2
// not extends
type result = Extends<1 | 2, 1>
// not extends
type result2 = Extends<1 | 2, 2>
There are a couple of ways to solve this:
- a specialized subtype impl for objects vs unions (described next)
- when “normalizing” unions, turn `{ foo: 1 } | { foo: 2 }` into `{ foo: 1 | 2 }`, or do this just before subtype checking

How would the first option look? First we need to know when to apply this specialization: we need to be able to tell that `a` is an object and `b` is a union of N objects where each object has a similar field but with different values. When we know the conditions for this specialization are met, we dispatch to an `extends_object_union` function. It iterates the keys of `a`; if a key's value is a union, we check that the value subtypes the corresponding value in `b`.
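A sketch of that first option; `ObjectTy`, `Value`, and `is_subtype` are the same assumed pieces as in the earlier sketches, and `extends_object_union` is the hypothetical name used above:

```rust
use std::collections::BTreeMap;

// Assumed object representation: field name -> value type.
struct ObjectTy {
    fields: BTreeMap<String, Value>,
}

// Sketch: specialized check for `a extends b` where `a` is a single object and
// `b` is a union of objects that share a field with different values.
fn extends_object_union(a: &ObjectTy, b_variants: &[ObjectTy]) -> bool {
    a.fields.iter().all(|(key, a_val)| {
        // The values this key has across all variants of `b`.
        let b_vals: Vec<&Value> = b_variants
            .iter()
            .filter_map(|variant| variant.fields.get(key))
            .collect();
        match a_val {
            // If `a`'s value is a union (e.g. `1 | 2`), every arm must be
            // covered by some variant's value for this key (`1` and `2`).
            Value::Union(arms) => arms
                .iter()
                .all(|arm| b_vals.iter().any(|&b_val| is_subtype(arm, b_val))),
            // Otherwise fall back to the ordinary per-variant check.
            _ => b_vals.iter().any(|&b_val| is_subtype(a_val, b_val)),
        }
    })
}
```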
Another interesting property is that you can think of an object type with a union-valued member as having the distributive property:
type a = { foo: 1 | 2 };
type b = { foo: 1 } | { foo: 2 };
We can think of the `|` operator as distributing the union over the object, similar to the distributive property from math:

a ⋅ (b + c) = a ⋅ b + a ⋅ c

This analogy goes quite far; records are actually product types in type theory / FP:

{ "foo" * (`1` + `2`) } = { "foo" * `1` } + { "foo" * `2` }

with a → "foo", b → `1`, c → `2`.
Let’s look at a more practical example:
type circle = { type: 'circle', radius: number };
type rectangle = { type: 'rectangle', w: number, h: number };
type shape = circle | rectangle
type a = { type: 'circle' | 'rectangle', }
The key insight to understanding this behaviour is to recognize that it only works when subtyping against a discriminated union.
A discriminated union is a union of N objects that share a single similar key whose value is a literal.
Three subtle conditions are important:
- the discriminant key must be present in all variants
- the discriminant key’s value must be a literal type
- there can only be one such key, the discriminant key
The union must satisfy all of the above conditions for it to be considered a discriminated union.
Now that we understand what a discriminated union is, we can observe that this weird edge case only works on them:
type Extends<A, B> = A extends B ? "extends" : "not extends";
// `b` is discriminated union...
type a = { foo: 1 | 2 };
type b = { foo: 1 } | { foo: 2 };
// so this extends check works
type result1 = Extends<a, b>;
// `d` is NOT a discriminated union...
type c = { bar: number | string };
type d = { bar: number } | { bar: string };
// so this extends check doesn't work!
type result2 = Extends<c, d>;
So it seems like the rule is that a discriminated union can be subtyped by an object whose keys and values are merged from the discriminated union’s variants. For example:
type a = { foo: 1 | 2, a: string, b: number};
type b = { foo: 1, a: string } | { foo: 2, b: number };
// this extends check works
type result1 = Extends<a, b>;
Unfortunately, this doesn’t seem to be the case, not exactly:
type a = { foo: 1 | 2, bar: string | number };
type b = { foo: 1, bar: string } | { foo: 2, bar: number };
// doesn't extend
type result1 = Extends<a, b>
Logically, we can see why: the type `a` allows us to construct `{ foo: 1, bar: number }`, which is not valid because it’s a mix-and-match of the properties from the two variants.
The rule here is that the `|` operator can only distribute over the discriminant key of a discriminated union.
If we take an algebraic approach and look at it as sum types and product types:
// b:
foo ^ `1` * bar ^ string + foo ^ `2` * bar ^ number

// but this is invalid in a record:
(foo ^ (`1` + `2`)) (bar ^ string + bar ^ number)

// a: can't distribute over non-discriminant fields, so can't do this:
(foo ^ (`1` + `2`)) (bar ^ (string + number))
Now this does work in this situation:
type a = { foo: 1 | 2, a: string, b: number };
type b = { foo: 1, a: string } | { foo: 2, b: number };
type result1 = Extends<a, b>
The algebraic derivation:
foo^`1` * a^string + foo^`2` * b^number
(foo^(`1` + `2`)) * (a^string + b^number) // which is the type of `a`
So we need to determine whether we are doing “A subtypes B” where B is a discriminated union and A is an object whose discriminant key holds a union of B’s values for that key. A may have only a subset of the discriminant values, e.g.:
type a = { foo: 2 | 3, b: string, c: number };
type b = { foo: 1, a: string } | { foo: 2, b: number } | { foo: 3, b: number };
type result1 = Extends<a, b>
If the two unions only partially overlap:
type ExtendsCheck<A, B> = A extends B ? "yes" : "no";
type t5 = ExtendsCheck<
"NICE" | boolean | { lmao: boolean },
string | boolean | number | { nice: string }
>;
the result is `"yes" | "no"`, because the conditional type distributes over each variant of the left-hand union.
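The evaluator has to reproduce this distribution. A sketch of how that could look, reusing the assumed `Value`, `is_subtype`, and `add_to_union` pieces from earlier, and ignoring the detail that the branch types may themselves mention the distributed parameter:

```rust
// Sketch: evaluate `A extends B ? T : F` when `A` is a naked type parameter
// bound to a union: run the check per variant and union the branch results.
fn eval_distributive_extends(
    lhs_variants: &[Value],
    rhs: &Value,
    then_val: &Value,
    else_val: &Value,
) -> Value {
    let mut results: Vec<Value> = Vec::new();
    for variant in lhs_variants {
        let picked = if is_subtype(variant, rhs) { then_val } else { else_val };
        add_to_union(&mut results, picked.clone());
    }
    if results.len() == 1 {
        results.pop().unwrap()
    } else {
        Value::Union(results)
    }
}
```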
Unions of objects have some slight alterations when it comes to normalizing/flattening them.
First, normalization usually works as you’d expect:
type discriminatedUnionWithNormalUnion =
| { foo: "lol"; hi: number }
| { foo: "bar" }
| { foo: string };
// normalization works as expected, they all get flattened to `{ foo: string }`
function someFunc(x: discriminatedUnionWithNormalUnion) {
if (x.foo === "lol") {
// @ts-expect-error
x.hi;
}
}
However, one discrepancy is that disjoint objects get normalized into an object with no keys at all:
type discriminatedUnionWithNormalUnion2 =
| { foo: "lol" }
| { foo: "bar" }
| { hi: string };
// `x` has no keys
function someFunc(x: discriminatedUnionWithNormalUnion2) {
// this is `never`
type lmao = keyof typeof x;
}
When objects form a union with non-object types, they are considered disjoint, even though non-object types have properties themselves:
type discriminatedUnionWithNormalUnion3 =
| { foo: "lol"; charAt(pos: number): string }
| { foo: "bar"; charAt(pos: number): string }
| string;
// `x` has no keys
function someFunc3(x: discriminatedUnionWithNormalUnion3) {
// this is `never`
type lmao = keyof typeof x;
}
All three variants have the `charAt(pos: number): string` function property.
We need to distinguish unions from discriminated unions because subtyping logic is different for discriminated unions. (See “Object edge case” and “Disjoint edge case” above).
First we need to know precisely what a discriminated union is:
A discriminated union is a union of N object types that all contain a discriminant key/value pair. This discriminant key/value pair has three subtle but important conditions:
- the discriminant key must be present in all variants of the union
- the value of the discriminant must be a literal in all variants
- there can only be one discriminant key/value pair, presence of multiple means the union no longer qualifies as a discriminated union
All three of these conditions must be met.
Now we know what a discriminated union is. What does detecting one look like in code?
- Build the union regularly, normalizing it
- Check that the union contains a discriminant key/value pair satisfying the three conditions outlined above.
Once we know that the union is discriminated, we can store some helpful properties to make operations faster. One notable example is storing the discriminant key.
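A sketch of that detection step over an already-normalized union of object variants; `ObjectTy`, `Value`, and the literal variants are the same assumptions as in the earlier sketches:

```rust
// Sketch: find a key that (1) is present in every variant, (2) always has a
// literal value, and (3) is the only key that qualifies; otherwise the union
// is not considered discriminated.
fn find_discriminant_key(variants: &[ObjectTy]) -> Option<String> {
    let first = variants.first()?;
    let mut candidates: Vec<String> = Vec::new();
    for key in first.fields.keys() {
        let qualifies = variants
            .iter()
            .all(|v| v.fields.get(key).map_or(false, is_literal));
        if qualifies {
            candidates.push(key.clone());
        }
    }
    // Condition (3): more than one candidate disqualifies the union.
    if candidates.len() == 1 {
        candidates.pop()
    } else {
        None
    }
}

// Assumed helper: literal types only.
fn is_literal(value: &Value) -> bool {
    matches!(value, Value::StringLiteral(_) | Value::NumberLiteral(_))
}
```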
Additionally, discriminated unions have deviant behaviour in the subtyping logic.
Specifically, the algorithm for subtyping against a discriminated union changes when you construct an object which has the discriminant key with a value that is a union of the possible values of that key. Look at this example:
type Extends<A, B> = A extends B ? "extends" : "not extends";
type a = { foo: 1 | 2 };
type b = { foo: 1 } | { foo: 2 };
// The result is: "extends"
type result = Extends<a, b>;
If we used the normal union subtyping algorithm, we would do:
- Check that `a` subtypes any variant of `b`:
- Check that `{ foo: 1 | 2 }` subtypes `{ foo: 1 }`. It doesn’t, because `1 | 2` does not subtype `1`. Try the next variant.
- Check that `{ foo: 1 | 2 }` subtypes `{ foo: 2 }`. It doesn’t, because `1 | 2` does not subtype `2`.
Under the algorithm, `a` does not subtype `b`. But intuitively, it feels like `{ foo: 1 | 2 }` should be equivalent to `{ foo: 1 } | { foo: 2 }`.
So we want this behaviour. How do we make it work?
The key insight is that the `|` operator distributes over the discriminant key of a discriminated union.
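One way to implement that insight is to expand the object over its discriminant key before running the ordinary union check. A sketch, reusing the assumed `ObjectTy`/`Value` types:

```rust
// Sketch: turn `{ foo: 1 | 2, ...rest }` into `{ foo: 1, ...rest } | { foo: 2, ...rest }`
// when `foo` is the discriminant key of the union we are checking against.
fn distribute_over_discriminant(obj: &ObjectTy, discriminant: &str) -> Vec<ObjectTy> {
    match obj.fields.get(discriminant) {
        Some(Value::Union(arms)) => arms
            .iter()
            .map(|arm| {
                let mut fields = obj.fields.clone();
                fields.insert(discriminant.to_string(), arm.clone());
                ObjectTy { fields }
            })
            .collect(),
        // Nothing to distribute: keep the object as a single "variant".
        _ => vec![ObjectTy { fields: obj.fields.clone() }],
    }
}
```

Each resulting variant then goes through the normal “subtypes some variant of the union” check. Because only the discriminant key is distributed, the earlier `{ foo: 1 | 2, bar: string | number }` example still fails (its `bar` union is left alone), while `{ foo: 1 | 2, a: string, b: number }` passes.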
All types are actually objects and have additional keys. For example, `number` has `toExponential`, `toFixed`, etc.
So each type has its own set of base keys. We don’t have to make every type an `Object`; instead, when we type check `string`, `bool`, `number`, etc. against an object, we just check against these known base keys.
The problem is when you add more keys on top of this base set:
type augmentednumber = number & { foo(): void }
We have several options:
- put a `base` field on the `Object` struct which tells us where to get the base keys :: problem is that `number` and `string` aren’t actually objects (see the `Test<A, B>` example below); you can’t make an object with all the fields of a number and have it be a number, so we want to keep that distinction
- or just merge the base keys onto the `fields` of an `Object` :: same problem as above
- create an “augmented” type which contains the type and additional fields :: something like this:
struct AugmentedType { base: Value, augmentation: Object }
- have special heap-allocated variants of the keyword types: `Number`, `String`, `Boolean`
type Test<A, B> = A extends B ? "extends" : "not extends";
type fakenumber = {
toExponential(fractionDigits?: number | undefined): string;
toFixed(fractionDigits?: number | undefined): string;
toLocaleString(
locales?: string | string[] | undefined,
options?: Intl.NumberFormatOptions | undefined
): string;
toPrecision(precision?: number | undefined): string;
toString(radix?: number | undefined): string
valueOf(): number
};
// this returns "not extends"
type result = Test<fakenumber, number>
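Whatever representation we pick, the check against an object can consult a static table of base keys per keyword type. A sketch (method lists abridged; the names match the usual TS lib declarations):

```rust
// Sketch: each keyword type contributes a fixed set of base property keys.
// Only key names are shown; a real table would map to signatures too.
fn base_keys(primitive: &Value) -> &'static [&'static str] {
    match primitive {
        Value::Number => &[
            "toString", "toFixed", "toExponential", "toPrecision",
            "toLocaleString", "valueOf",
        ],
        Value::Boolean => &["valueOf"],
        Value::String => &[
            // abridged
            "length", "charAt", "charCodeAt", "concat", "indexOf", "slice",
        ],
        _ => &[],
    }
}
```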
Also note that the `object` keyword
we have optional types:
type OptionalInObject = {
foo?: boolean
}
type OptionalInTuple = [foo?: string]
How do we handle this?
The first intuition is to treat them as a union like `T | undefined`. Indeed, when you hover over `OptionalInObject` or `OptionalInTuple`, the optional fields become `foo?: boolean | undefined`.
But we need to make sure that the semantics of `foo?: boolean` and this union representation map over sufficiently enough to do this. Basically what I want to know is whether we can erase the concept of “optionals” completely from the IR and treat them literally as `T | undefined`.
The issue is that an optional field actually allows you to omit the field when instantiating the type:
// is okay
const test: OptionalInObject = {}
type NOOptionalInObject = {
foo: boolean | undefined
}
// not okay, complains we are missing `foo`
const notOkay: NOOptionalInObject = {}
and I made sure to check this was the same in type-level typescript:
type OptionalInObject = {
optional?: boolean;
};
type NOOptionalInObject = {
optional: boolean | undefined;
};
type Check<T> = T extends NOOptionalInObject ? "YAY" : "NOO";
// result is NOO
type CheckResult = Check<{}>;
so we need to preserve this optional information
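A sketch of a field representation that keeps this information: optionality lives on the field, not in the value, and only lowers to `T | undefined` for reads (names are assumptions, reusing the `Value` sketch from earlier):

```rust
// Sketch: `{ foo?: boolean }` and `{ foo: boolean | undefined }` stay distinguishable.
struct Field {
    value: Value,   // `boolean` in both cases
    optional: bool, // true only for `foo?: boolean`
}

impl Field {
    // Reading an optional field behaves like `T | undefined`.
    fn read_type(&self) -> Value {
        if self.optional {
            Value::Union(vec![self.value.clone(), Value::Undefined])
        } else {
            self.value.clone()
        }
    }

    // Only the flag decides whether the key may be omitted from a literal;
    // `boolean | undefined` still requires the key to be present.
    fn may_be_omitted(&self) -> bool {
        self.optional
    }
}
```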
When recursively calling a function, the `extends` check may be redundantly performed:
type FillArrayImpl<
Count extends number,
I extends number,
Value,
Array extends any[]
> = I extends Count
? Array
: FillArrayImpl<Count, Add<I, 1>, Value, [Value, ...Array]>;
In the above example, when `FillArrayImpl<...>` is recursively called, all of its arguments undergo a needless extends check. We can optimize these away.
In the most trivial case, identifiers that are reused can have their extends check omitted. For example, `Count` is untouched and passed again to `FillArrayImpl`: when we first call `FillArrayImpl`, we check `Count extends number`, then we don’t touch it before passing it back to the function.
These optimizations would require a refactor to when the extends check takes place. Right now, the checks happen inside the function. We’ll have to do them before the function.
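A sketch of the trivial case, assuming some AST for call arguments (`Expr` and `Param` here are illustrative): an argument that is just a pass-through of a caller parameter with the same constraint can skip the re-check.

```rust
// Sketch: decide per argument whether the constraint check can be elided at a
// recursive call site.
enum Expr {
    Ident(String),
    Other, // stand-in for every other expression kind
}

struct Param {
    name: String,
    // Placeholder for the constraint type, e.g. "number" for `Count extends number`.
    constraint: Option<String>,
}

fn can_skip_constraint_check(arg: &Expr, caller_params: &[Param], callee_param: &Param) -> bool {
    match arg {
        // An untouched identifier feeding a parameter with the same constraint
        // was already checked when the caller itself was entered.
        Expr::Ident(name) => caller_params
            .iter()
            .any(|p| p.name == *name && p.constraint == callee_param.constraint),
        _ => false,
    }
}
```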
Template literals work with literal types:
// "hehe 420"
type foo = `hehe ${420}`
But when a non-literal type is added, the type stays as a template literal type:
// "hehe ${any}"
type foo = `hehe ${any}`
You can still concat new strings though:
// `hehe ${any}wow 420`
type lmao = `${foo}wow ${420}`
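A sketch of a representation that supports both behaviours: a template literal type is a list of parts, adjacent literal parts are folded eagerly, and the whole thing collapses to a plain string literal when no non-literal holes remain (this assumes a `Value::TemplateLiteral` variant alongside the ones sketched earlier):

```rust
// Sketch: `hehe ${420}` folds to the literal "hehe 420", while `hehe ${any}`
// keeps a hole and stays a template literal type.
#[derive(Clone)]
enum TemplatePart {
    Literal(String),
    Hole(Value), // a non-literal type such as `any`, `string`, `number`
}

fn push_part(parts: &mut Vec<TemplatePart>, part: TemplatePart) {
    // Merge adjacent literal chunks so concatenation stays flat.
    if let TemplatePart::Literal(next) = &part {
        if let Some(TemplatePart::Literal(prev)) = parts.last_mut() {
            prev.push_str(next);
            return;
        }
    }
    parts.push(part);
}

fn finish_template(parts: Vec<TemplatePart>) -> Value {
    // No holes left: collapse to a single string literal type.
    if parts.iter().all(|p| matches!(p, TemplatePart::Literal(_))) {
        let mut text = String::new();
        for part in &parts {
            if let TemplatePart::Literal(chunk) = part {
                text.push_str(chunk);
            }
        }
        Value::StringLiteral(text)
    } else {
        Value::TemplateLiteral(parts)
    }
}
```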
We can speed up index operations if we make specialized ops for indexing. For example, `myarray[0]` is a common expression, specifically indexing with a constant number literal.
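A sketch of what such a specialized instruction could look like (the instruction names are made up):

```rust
// Sketch: next to a generic `Index` instruction that pops both the target and
// the key off the stack, a specialized variant carries the constant numeric
// index as an operand, so `myarray[0]` never materializes a key value.
enum Instruction {
    Index,            // generic: key comes from the stack
    IndexNumLit(u32), // specialized: constant numeric index, e.g. `myarray[0]`
    // ...
}
```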
Functions have extends checks and the RHS of the extends check can be cached.
For example:
type Bar<X extends { hi: string }> = /* ... */;
The idea is that object/array constants will also contain a pointer which represents a cache. At compile time these are set to NULL; at runtime we instantiate each constant once and set the pointer. This serves as a “cache” so we don’t have to re-instantiate objects/arrays that are known at compile time.
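A sketch of a constant-pool entry along those lines, using `OnceCell` in place of a raw NULL-initialized pointer (illustrative, not the actual layout):

```rust
use std::cell::OnceCell;

// Sketch: a compile-time constant (e.g. the `{ hi: string }` on the RHS of an
// extends clause) plus a slot that is empty at compile time and filled with
// the instantiated value the first time it is needed at runtime.
struct ConstantEntry {
    descriptor: ConstDescriptor, // how to build the value, known at compile time
    cached: OnceCell<Value>,     // the "NULL until first use" cache
}

impl ConstantEntry {
    fn get_or_instantiate(&self) -> &Value {
        self.cached.get_or_init(|| instantiate(&self.descriptor))
    }
}

// Assumed elsewhere:
struct ConstDescriptor;
fn instantiate(_d: &ConstDescriptor) -> Value {
    unimplemented!()
}
```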
All types have some common default properties, and then properties of their own.
Indexing needs to work on these.
Number:
.toString()
.valueOf()
.toExponential()
.toFixed()
.toLocaleString()
.toPrecision()
Boolean:
.valueOf()
String:
- a bunch of shit
e.g. `foo[0]` -> `foo['0']`
all values in type-level typescript are immutable. langs with focus on immutability tend to be allocation heavy (you need to make more objects)
fp langs solve this by using persistent data structures which typically have some structural sharing mechanism to reduce allocations for copies
https://github.com/immutable-js/immutable-js/ readme links:
there are some rust crates that implement persistent data structures, but they are designed for safe rust. they all have some reference counting shit going on. i don’t know
what if the key/vals for an object were a slice/window of the stack?
some places where we can box stuff
What if we avoided heap allocations entirely and put everything on the stack?
This is possible because values in type-level Typescript are immutable and follow stack-based rules. For example, there is nothing fancy like closures and captured variables that would make the lifetime of a value outlive the stack.
If this is possible, it would make everything drastically simpler, especially memory management. The stack is a form of memory management, and freeing memory is as simple as popping values off the stack. The one exception is strings, but adding reference counting to strings alone is simple.
Even more complex things like arrays and objects can live on the stack. There can be a “header” value that stores how long the slice of values that makes up the array/object is. The only complexity is in popping the stack: this also requires popping the array elements / object fields. Same for peeking N values back in the stack.
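A sketch of the header idea: the header slot records how many of the preceding slots belong to the aggregate, so popping an array/object pops the header plus its elements (slot layout and names are assumptions):

```rust
// Sketch: values live directly in the evaluation stack; an aggregate is its
// elements followed by a header slot recording how many slots it owns.
enum Slot {
    Number(f64),
    Bool(bool),
    StrId(u32), // strings would be refcounted elsewhere
    ArrayHeader { len: u32 },
    ObjectHeader { fields: u32 },
}

// Popping a value must also pop the elements "underneath" a header.
fn pop_value(stack: &mut Vec<Slot>) {
    match stack.pop() {
        Some(Slot::ArrayHeader { len }) => {
            for _ in 0..len {
                pop_value(stack); // elements may themselves be aggregates
            }
        }
        Some(Slot::ObjectHeader { fields }) => {
            for _ in 0..fields {
                pop_value(stack); // field value
                pop_value(stack); // field key
            }
        }
        _ => {}
    }
}
```

Peeking N values back has to walk headers the same way, which is the complexity mentioned above.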
Maybe this is not worth it when we have reference counting.
I realized I made a very naive error in the way arrays/tuples are represented, which means that single-item tuples and regular arrays are indistinguishable from each other, which is bad.
this is really important, otherwise string value equality won’t work
Evaluation order can’t be strictly top-down. This is a little more complicated: we need to build a DAG of the global declarations and the declarations they depend on.
What to do with this?
type Fib<X extends number> = FibIter<X, 2, 1, 0>
type FibIter<X, I, Prev, PrevPrev> = /* ... */
type Main = WriteFile<
"./fib-result.ts",
ToTypescriptSource<"FibonacciResult", Fib<amount>>
>;
type amount = Main;
All globals will be the roots of the stmt nodes. Process them first and add them to the globals table. Then, when actually compiling a global, we need to check whether we already compiled it so we don’t have people redefining vars.
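A sketch of that strategy: register every global by name first, then compile on demand, memoizing what has already been compiled (types and names are illustrative):

```rust
use std::collections::{HashMap, HashSet};

// Sketch: two-pass handling of global type declarations.
struct Globals {
    decls: HashMap<String, Decl>, // pass 1: every global registered by name
    compiled: HashSet<String>,    // pass 2: compiled on demand, at most once
}

#[derive(Clone)]
struct Decl {
    dependencies: Vec<String>, // names of other globals this decl references
}

impl Globals {
    fn compile_global(&mut self, name: &str) {
        if !self.compiled.insert(name.to_string()) {
            return; // already compiled, don't redefine the var
        }
        let decl = self.decls.get(name).cloned().expect("unknown global");
        // Compile dependencies first, walking the dependency DAG depth-first.
        // Marking `name` as compiled before recursing also keeps cyclic cases
        // (like the Fib/Main/amount example) from looping forever.
        for dep in decl.dependencies {
            self.compile_global(&dep);
        }
        // ...emit code for `decl` itself here
    }
}
```

`HashSet::insert` returns `false` when the name was already present, which doubles as the "did we compile this already" check.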
This is my strategy: executing a let decl will add another local to the function. Since we have the requirement that every expr, when finished executing, leaves the stack as it was before, we can be certain that pushing this local onto the stack will land after the locals of the params plus any other locals from let decls:
STACK:
param1 param2 param3 letdecl1 letdecl2 letdecl3
when you enter the true branch of a let decl, we should push that local
type TestLet<Arg> = Arg extends infer val extends 0 ? val : "nope";
(the new bound var doesn’t exist in the else branch)
the only problem is then getting rid of these new let decl vars.
To store these we just need a vec of types.
There are three kinds of arrays:
1. `Array<T>` or `T[]`
2. `[T, K, etc]`
3. `[and: T, labeled: K, etc]`
2 & 3 are actually tuples, but we should treat them the same because a tuple extends array.
Also note that the labels in tuples don’t matter for typechecking, they are just to make shit readable. So we can store them elsewhere and not give a fuck.
We could make a special inline representation. `Vec` is 24 bytes; `Value` is 8.
Usually the array type is just a `Value`. If it’s a tuple with 2-3 elements we can probably inline it; else just pass the `Vec`.
We shouldn’t even use `Vec` (because of borrowck); instead we can use our own repr with ptr + len.
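A sketch of an inline-vs-heap representation along those lines (sizes are indicative; the heap arm uses a raw ptr + len as suggested):

```rust
// Sketch: small tuples are stored inline, longer ones behind ptr + len, and
// uniform arrays only need their element type.
enum ArrayRepr {
    // `T[]` / `Array<T>`: just the element type.
    Uniform(Value),
    // Tuples with up to 3 elements, stored inline; unused slots are padded
    // (e.g. with `Value::Undefined`).
    InlineTuple { len: u8, items: [Value; 3] },
    // Longer tuples: our own ptr + len instead of `Vec`.
    HeapTuple { ptr: *const Value, len: u32 },
}
```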
Also, a problem is that spread can be in any arbitrary position in an array literal, not just at the end:
type foo = [...Lol, 1, 2, ...Lmao, 4, 5, ...Nice]
The problem is knowing at runtime which elements are spread elements, and which are regular types and shouldn’t be spread.
- One idea is to pass this information as instruction operands: pass the indices of the spread arguments as operands (sketched below). I see two options:
- pass them in as bit sets
- pass them in as count + N indices
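A sketch of the bit-set option, assuming array literals are built by an instruction that pops its element values off the stack (names are assumptions):

```rust
// Sketch: the MakeArray instruction carries the element count plus a bit set
// marking which positions are spreads. For
// `[...Lol, 1, 2, ...Lmao, 4, 5, ...Nice]` there are 7 elements and the
// spreads sit at indices 0, 3 and 6, so the mask is 0b1001001.
struct MakeArray {
    count: u8,
    spread_mask: u64, // bit i set => element i must be spread into the result
}

fn is_spread(instr: &MakeArray, index: u8) -> bool {
    debug_assert!(index < instr.count);
    instr.spread_mask & (1u64 << index) != 0
}
```

The count + N indices variant would just trade the fixed-size mask for a variable-length operand list.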