Is Parameter Inference with Integer Subtyping still Deterministic? #87

@VonTum

Description

I've simplified the issue to make it as concrete as possible. Because Latency Counting behaves similarly to subtyping, I believe that we can leave it out completely and just express the problem in terms of integer subtyping.

Example module that has parameters:

module SplitAt #(T, int SIZE, int SPLIT_POINT) {
	interface SplitAt : T[SIZE] i -> T[SPLIT_POINT] left, T[SIZE - SPLIT_POINT] right

	for int I in 0..SPLIT_POINT {
		left[I] = i[I]
	}
	for int I in 0..SIZE - SPLIT_POINT {
		right[I] = i[I+SPLIT_POINT]
	}
}

Example use:

module use_SplitAt {
    gen int MY_ARR_SIZE = 5

    bool[MY_ARR_SIZE] myArr

    bool[3] a, bool[] b = SplitAt(myArr)
    // ^ Infers to 
    // bool[3] a, bool[2] b = SplitAt #(T: bool, SIZE: 5, SPLIT_POINT: 3)(myArr)
}

As an example, synthesizing use_SplitAt goes as follows:

  • First all modules are parsed, global resolution, etc, and "abstract typechecked" (irrelevant here)
  • We execute the generative code in use_SplitAt. So MY_ARR_SIZE is assigned '5', myArr gets the type bool[5], and an "instantiation candidate" is created for SplitAt. Represented as SplitAt#(T: bool, SIZE: var_1, SPLIT_POINT: var_2)
  • Concrete typechecking starts unifying the type value parameters, and through inverting the equations on module ports of SplitAt, it can infer var_1 and var_2.
  • As soon as all parameters of a submodule become known, it is executed (and typechecked) in turn. If execution/typechecking is successful, the concrete type parameters of its ports are injected back into the parent typing context. In this case SplitAt execution is successful, and the types for the ports bool[5] i, bool[3] left, and bool[2] right are injected back into the typing context of use_SplitAt. From this we can now assign the type of b, namely: bool[2] b
  • Once all variables are inferred, and no type errors happened, the instantiation of use_SplitAt is successful.
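
The walkthrough above can be sketched in Python (with hypothetical names, not the real compiler's representation): an instantiation candidate's parameters are inferred by inverting the port formulas, and once all parameters are known the submodule is "executed" and its concrete port types injected back.

```python
# Sketch of the SplitAt inference walkthrough (hypothetical names).

def infer_splitat(known_ports):
    """Infer SplitAt's parameters from the concrete sizes its ports connect to.

    known_ports maps port name -> concrete array size, or None if unknown.
    Port formulas: i: T[SIZE], left: T[SPLIT_POINT], right: T[SIZE - SPLIT_POINT].
    """
    size = known_ports.get("i")       # invert T[SIZE] against bool[5]
    split = known_ports.get("left")   # invert T[SPLIT_POINT] against bool[3]
    if size is None or split is None:
        return None                   # not all parameters known yet: wait
    # All parameters known: "execute" SplitAt and inject port types back.
    return {"SIZE": size, "SPLIT_POINT": split,
            "ports": {"i": size, "left": split, "right": size - split}}

# use_SplitAt: myArr is bool[5], a is bool[3], b's size is still unknown.
result = infer_splitat({"i": 5, "left": 3, "right": None})
assert result["ports"]["right"] == 2  # b infers to bool[2]
```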

More in depth

Submodule (and struct) parameter inference always happens from both known values (the size of the array in bool[3] a) and unknown values (bool[] b). This does not lead to the need for any symbolic manipulation, since concrete typing inference is only started after execution of the generative code of a module, and thus any symbols or expressions will have resolved to concrete values. (E.g. bool[MY_ARR_SIZE] myArr will have resolved to the concrete type bool[5] before concrete type checking has started.)

The typechecker is purely value-based. Any "structure" of the types (like bool[][]...) has been resolved at the earlier Abstract Typing stage and falls outside of the scope here. Concrete typing is just responsible for filling out the concrete value parameters of types. So in bool[var_1][var_2], resolving var_1 and var_2 to concrete values like 3 and 5.

Parameters of submodules are inferred by inverting the symbolic formulas on the modules ports (Like SIZE in T[SIZE] i). If it is known that i connects to bool[5] then we can infer SIZE to be 5.

Inferrable expressions

Given the next bits, I think it is pertinent to make a distinction between "inferrable" expressions and "non-inferrable" expressions.

bool[SIZE + 3] <: bool[10] can be used to infer SIZE = 7, and even bool[V * 4 + 2] <: bool[10] could be used to infer V = 2. But of course, bool[V % 4] <: bool[3] cannot infer V.
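
Inverting such inferrable expressions amounts to solving a linear form coeff * V + offset against a known size; a minimal sketch (hypothetical helper, not the real compiler):

```python
def invert_linear(coeff, offset, target):
    """Solve coeff * V + offset == target for V, if a unique integer solution exists.

    Returns V, or None when the expression is not invertible this way
    (non-linear forms like V % 4 would never reach this helper).
    """
    if coeff == 0:
        return None
    diff = target - offset
    if diff % coeff != 0:
        return None  # no integer solution: a type error, not an inference
    return diff // coeff

assert invert_linear(1, 3, 10) == 7   # SIZE + 3 <: 10   =>  SIZE = 7
assert invert_linear(4, 2, 10) == 2   # V * 4 + 2 <: 10  =>  V = 2
```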

Feature 1: use already-inferred parameters to invert more complex expressions

If we already know one parameter, then we could use that information to infer other parameters. If SIZE == 5 in T[SIZE - SPLIT_POINT] right and right connects to bool[2], then we can infer SPLIT_POINT == 3
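
Feature 1 can be sketched as substituting already-resolved variables before attempting inversion. A minimal Python sketch for the SIZE - SPLIT_POINT case (hypothetical representation):

```python
def infer_from_difference(known, lhs, rhs, target):
    """Given the constraint lhs - rhs == target, infer whichever of lhs/rhs
    is still unknown, using variables already resolved in `known`."""
    if lhs in known and rhs not in known:
        return rhs, known[lhs] - target   # SIZE known: SPLIT_POINT = SIZE - target
    if rhs in known and lhs not in known:
        return lhs, known[rhs] + target   # SPLIT_POINT known: SIZE = SPLIT_POINT + target
    return None

# right connects to bool[2], and SIZE was already inferred to 5:
var, value = infer_from_difference({"SIZE": 5}, "SIZE", "SPLIT_POINT", 2)
assert (var, value) == ("SPLIT_POINT", 3)
```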

Feature 2: Integer Subtyping

Of course subtyping could be more general than this, but integers are a useful enough use case.

Integer is defined in std/core.sus as:

__builtin__ struct int #(int MIN, int MAX) {}

Integers are defined by their minimum and maximum (inclusive) values.

Integer subtyping lets us assign an int #(MIN: 5, MAX: 10) to an int #(MIN: 0, MAX: 255) for instance.
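
This subtyping rule amounts to interval inclusion; a minimal check might look like:

```python
def int_is_subtype(min_a, max_a, min_b, max_b):
    """int #(MIN: min_a, MAX: max_a) <: int #(MIN: min_b, MAX: max_b)
    holds iff the interval [min_a, max_a] fits inside [min_b, max_b]."""
    return min_b <= min_a and max_b >= max_a

assert int_is_subtype(5, 10, 0, 255)       # int#(5, 10) <: int#(0, 255)
assert not int_is_subtype(0, 300, 0, 255)  # MAX 300 exceeds MAX 255
```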

Example:

module Ints {
	interface Ints : int #(MIN: 0, MAX: 15) a, bool c

    int b // Infers to int #(MIN: 0, MAX: 20)

    when c { // "Hardware 'if'"
        b = a
    } else {
        b = 20 // "20" is of type int #(MIN: 20, MAX: 20)
    }
}
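
In the Ints example, b must be a supertype of both branches, so its inferred bounds are the join (smallest containing interval) of int #(MIN: 0, MAX: 15) and int #(MIN: 20, MAX: 20). As a sketch:

```python
def join(a, b):
    """Smallest int #(MIN, MAX) interval that both inputs are subtypes of."""
    return (min(a[0], b[0]), max(a[1], b[1]))

# b receives either a (int #(0, 15)) or the literal 20 (int #(20, 20)):
assert join((0, 15), (20, 20)) == (0, 20)  # b infers to int #(MIN: 0, MAX: 20)
```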

Because the concrete typing system does not work on type trees, but rather just on the type parameters, we specify the subtyping rules to operate on these value variables instead.

Constraints can be of three kinds: (We use "must be" to make it clear that the constraints are applied to val_a and only constrain val_a, not val_b)

  • Exact equality: val_a == val_b
  • Less than: val_a must be <= val_b
  • Greater than: val_a must be >= val_b

Typing assignments then produce these constraints. Examples:

  • bool[val_a] <: bool[val_b] produces val_a == val_b
  • int #(MIN: min_a, MAX: max_a) <: int #(MIN: min_b, MAX: max_b) produces min_b must be <= min_a and max_b must be >= max_a

Exact equality is handled by immediate unification of the variables and can be done for any type, whereas the inequality constraints are explicitly stored and can only be performed for int parameters.

If a variable is only bounded by Less than (respectively Greater than) constraints, then once all the right sides are resolved to a value, the variable is unified with the minimum (respectively maximum) of all the right side values.
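
This resolution rule can be sketched as: collect all constraints of one kind on a variable, wait until every right-hand side is concrete, then unify the variable with the extremum (Python, hypothetical representation):

```python
def resolve_inequalities(kind, rhs_values):
    """Resolve a variable bounded only by one kind of inequality constraint.

    kind is "<=" (the variable must be <= each rhs) or ">=" (must be >= each rhs).
    Returns None while any right-hand side is still unresolved.
    """
    if any(v is None for v in rhs_values):
        return None                       # some rhs not yet resolved: wait
    return min(rhs_values) if kind == "<=" else max(rhs_values)

assert resolve_inequalities("<=", [10, 7, 12]) == 7    # unify with the minimum
assert resolve_inequalities(">=", [3, 5]) == 5         # unify with the maximum
assert resolve_inequalities("<=", [10, None]) is None  # still waiting
```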

As an aside, a consequence of this way of generating and resolving constraints is that type inference for inequalities can only go forwards (from subexpressions to their parent expressions), whereas exact-equality unification can propagate in either direction.

Desirable properties:

Property 1: Determinism

If a module typechecks, there should be exactly one obvious typing, regardless of the order in which submodule parameters are inferred, or the order in which they are executed.
If a module doesn't typecheck, this isn't necessary of course.
Of course, spurious failures due to this ordering are also disallowed.

Formal-ish Problem statement

Without Feature 1 and Feature 2, the base implementation is fully deterministic, as it's just a really simple kind of Hindley-Milner. It is when we add these features that things get murky.

Does only Feature 1 break Determinism?

With only Feature 1, I believe Determinism still holds, since any situation in which different inference orders would find differing options should still produce a type error.

How to design Inference such that Feature 2 doesn't break Determinism?

A simple example to show that Feature 2 might break determinism is:

module eq #(int MAX) {
    interface eq : int #(MIN: 0, MAX: MAX) a, int #(MIN: 0, MAX: MAX) b -> bool is_eq

    is_eq = a == b
}
module use_eq #(int MAX) {
    int #(MIN: 0, MAX: 3) a
    int #(MIN: 0, MAX: 5) b
    
    bool c = eq(a, b)
    // Might get inferred from a.MAX to be eq#(MAX: 3), in which case b doesn't typecheck
    // Or gets inferred from b.MAX to be eq#(MAX: 5), in which case a is a subtype of int#(MIN: 0, MAX: 5)
}
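
Handling inference like the subtyping constraints, i.e. collecting a "must be >=" bound from every inferrable site and taking the maximum once all sites are known, would resolve the eq example deterministically. A sketch:

```python
def infer_parameter(lower_bounds):
    """Infer a parameter constrained only by 'must be >= value' inference sites,
    by taking the maximum over all sites once every site is known."""
    if any(v is None for v in lower_bounds):
        return None
    return max(lower_bounds)

# use_eq: a forces MAX >= 3, b forces MAX >= 5.
assert infer_parameter([3, 5]) == 5  # eq#(MAX: 5); both a and b then typecheck
```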

Perhaps we could handle parameter inference the same way as integer subtyping: Less than / Greater than relations are created for inferrable variables. Of course, if there are any exact equality constraints on the same variable, they take precedence and immediately resolve the variable.

Then again, requiring that we grab the minimum / maximum from all possible places where a value might be inferred could come with its own drawbacks, and it leaves non-inferrable expressions that still contain a parameter in an awkward place.

Perhaps we can constrain the "inference" sites a bit better still: outputs could never receive a valid inferrable value. That way, the MAX parameter appearing on the output total does not prevent MAX from being inferred when the MAX bounds of both a and b are known.

module modulus_add #(int MAX) {
    interface modulus_add : int #(MIN: 0, MAX: MAX) a, int #(MIN: 0, MAX: MAX) b -> int #(MIN: 0, MAX: MAX) total

    //...
}
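
Restricting inference sites to input ports, as suggested above, might be sketched as (hypothetical port representation):

```python
def gather_inference_sites(ports, param):
    """Collect lower bounds for `param` from input ports only; output ports
    (like `total`) mention the parameter but never provide a value for it."""
    return [p["bound"] for p in ports
            if p["dir"] == "in" and p["param"] == param]

ports = [
    {"dir": "in",  "param": "MAX", "bound": 3},     # a: int #(0, 3)
    {"dir": "in",  "param": "MAX", "bound": 7},     # b: int #(0, 7)
    {"dir": "out", "param": "MAX", "bound": None},  # total: not an inference site
]
assert max(gather_inference_sites(ports, "MAX")) == 7  # MAX inferred despite total
```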

In summary:
Is the base typing system extended with Feature 2 (where inequality-constrained inference requires valid values at all inferrable sites) still Deterministic?

If Base + Feature 2 is Deterministic, does adding in Feature 1 re-break Determinism?

Say Base + Feature 2 turns out Deterministic; then if we were to add in Feature 1, that might turn previously non-inferrable sites into inferrable sites. For example, bool[A - B] <: bool[10] is not an inferrable site for A or B, but once one of them is known, the other becomes inferrable.

Metadata

Labels: Language Design (issues about the design of the language), bug