-
Notifications
You must be signed in to change notification settings - Fork 6
Description
I've simplified the issue to make it as concrete as possible. Because Latency Counting behaves similarly to subtyping, I believe that we can leave it out completely and just express the problem in terms of integer subtyping.
Example module that has parameters:
module SplitAt #(T, int SIZE, int SPLIT_POINT) {
interface SplitAt : T[SIZE] i -> T[SPLIT_POINT] left, T[SIZE - SPLIT_POINT] right
for int I in 0..SPLIT_POINT {
left[I] = i[I]
}
for int I in 0..SIZE - SPLIT_POINT {
right[I] = i[I+SPLIT_POINT]
}
}
Example use:
module use_SplitAt {
gen int MY_ARR_SIZE = 5
bool[MY_ARR_SIZE] myArr
bool[3] a, bool[ ] b = SplitAt(myArr)
// ^ Infers to
// bool[3] a, bool[2] b = SplitAt #(T: bool, SIZE: 5, SPLIT_POINT: 3)(myArr)
}
As an example, Synthesizing use_SplitAt
goes as follows:
- First all modules are parsed, global resolution, etc, and "abstract typechecked" (irrelevant here)
- We execute the generative code in
use_SplitAt
. SoMY_ARR_SIZE
is assigned '5',myArr
gets the typebool[5]
, and an "instantiation candidate" is created forSplitAt
. Represented asSplitAt#(T: bool, SIZE: var_1, SPLIT_POINT: var_2)
- Concrete typechecking starts unifying the type value parameters, and through inverting the equations on module ports of
SplitAt
, it can infervar_1
andvar_2
. - As soon as all parameters of a submodule become known, it is executed (and typechecked) in turn. If execution/typechecking is successful, the concrete type parameters of its ports are injected back into the parent typing context. In this case
SplitAt
execution is successful, and the types for the portsbool[5] i
,bool[3] left
, andbool[2] right
are injected back into the typing context ofuse_SplitAt
. From this we can now assign the type of b, namely:bool[2] b
- Once all variables are inferred, and no type errors happened, the instantiation of
use_SplitAt
is successful.
More in depth
Submodule (and struct) parameter inference always happens from both known values (size of array bool[3] a
), and unknown values (bool[] b
). This does not lead to the need for any symbolic manipulation, since concrete typing inference is only started after execution of the generative code of a module, and thus any symbols or expressions will have resolved to concrete values. (Eg: bool[MY_ARR_SIZE]
in will have resolved to the concrete type bool[5]
before concrete type checking has started)
The typechecker is purely value-based. Any "structure" of the types (like bool[][]...
) has been resolved at the earlier Abstract Typing stage and falls outside of the scope here. Concrete typing is just responsible for filling out the concrete value parameters of types. So in bool[var_1][var_2]
, resolving var_1
and var_2
to concrete values like 3 and 5.
Parameters of submodules are inferred by inverting the symbolic formulas on the modules ports (Like SIZE
in T[SIZE] i
). If it is known that i
connects to bool[5]
then we can infer SIZE
to be 5.
Inferrable expressions
I think, given the next bits that making a distinction between "inferrable" expressions and "non-inferrable" expressions is pertinent.
bool[SIZE + 3] <: bool[10]
can be used to infer SIZE = 7
, even bool[V * 4 + 2] <: bool[10]
could be used to infer V = 2
. But of course, bool[V % 4] <: bool[3]
cannot infer V
Feature 1: use already-inferred parameters to invert more complex expressions
If we already know one parameter, then we could use that information to infer other parameters. If SIZE == 5
in T[SIZE - SPLIT_POINT] right
and right
connects to bool[2]
, then we can infer SPLIT_POINT == 3
Feature 2: Integer Subtyping
Of course this would be more general, but integers are a useful enough use case
Integer is defined in std/core.sus as:
__builtin__ struct int #(int MIN, int MAX) {}
Integers are defined by their minimum and maximum (inclusive) values.
Integer subtyping lets us assign an int #(MIN: 5, MAX: 10)
to an int #(MIN: 0, MAX: 255)
for instance.
Example:
module Ints {
interface Ints : int #(MIN: 0, MAX: 15) a, bool c
int b // Infers to int #(MIN: 0, MAX: 20)
when c { // "Hardware 'if'"
b = a
} else {
b = 20 // "20" is of type int #(MIN: 20, MAX: 20)
}
}
Because the concrete typing system does not work on type trees, but rather just on the type parameters, we specify the subtyping rules to operate on such value variables rather.
Constraints can be of three kinds: (We use "must be" to make it clear that the constraints are applied to val_a
and only constrain val_a
, not val_b
)
- Exact equality:
val_a == val_b
- Less than:
val_a
must be<= val_b
- Greater than:
val_a
must be>= val_b
Typing assignments then produce these constraints. Examples:
bool[val_a] <: bool[val_b]
producesval_a == val_b
int #(MIN: min_a, MAX: max_a) <: int #(MIN: min_b, MAX: max_b)
producesmin_b
must be<= min_a
andmax_b
must be>= max_a
Exact equality is handled by immediate unification of the variables and can be done for any type, whereas the inequality constraints are explicitly stored and can only be performed for int
parameters.
If a variable is only bounded by Less than (respectively Greater than) constraints, then once all the right sides are resolved to a value, the variable is unified with the minimum (respectively maximum) of all the right side values.
As an aside, as a consequence of this way of generating and resolving constraints means that type inference for inequalities can only go forwards (from subexpressions to their parent expressions), whereas
Desirable properties:
Property 1: Determinism
If a module typechecks, there should be exactly one obvious typing, regardless of the order in which submodule parameters are inferred, or the order in which they are executed.
If a module doesn't typecheck, this isn't necessary of course.
Of course, spurious failures due to this ordering are also disallowed.
Formal-ish Problem statement
Without Feature 1 and Feature 2, the base implementation is fully deterministic, as it's just a really simple kind of Hindley-Milner. It is when we add these features that things get murky.
Does only Feature 1 break Determinism?
With only Feature 1, I believe Determinism still holds, since if any order in which differing options for inference would be found should still produce a type error.
How to design Inference such that Feature 2 doesn't break Determinism?
A simple example to show that Feature 2 might break determinism is:
module eq #(int MAX) {
interface eq : int #(MIN: 0, MAX: MAX) a, int #(MIN: 0, MAX: MAX) b -> bool is_eq
is_eq = a == b
}
module use_eq #(int MAX) {
int #(MIN: 0, MAX: 3) a
int #(MIN: 0, MAX: 5) b
bool c = eq(a, b)
// Might get inferred from a.MAX to be eq#(MAX: 3), in which case b doesn't typecheck
// Or gets inferred from b.MAX to be eq#(MAX: 5), in which case a is a subtype of int#(MIN: 0, MAX: 5)
}
Perhaps doing a similar thing for inferring parameters as for integer subtyping (as in, Less than / Greater than relations are created for inferrable variables. Of course, if there's any exact equality constraints on the same variable they'd take prescedence and immediately resolve the variable).
Then again, requiring that we grab the minimum / maximum from all possible places from where a value might be inferred could come with its own drawbacks, and leaves non-inferrable expressions still containing a parameter in an awkward place.
Perhaps we can constrain "inference" site a bit better still, as outputs could never receive a valid inferrable value. That way there being a MAX
parameter on total
does not prevent it from being inferred if both MAX
params for a
and b
are known.
module modulus_add #(int MAX) {
interface modulus_add : int #(MIN: 0, MAX: MAX) a, int #(MIN: 0, MAX: MAX) b -> int #(MIN: 0, MAX: MAX) total
//...
}
In summary:
Is the base typing system extended with Feature 2 (where inequality constrained inference requires valid values for all inferrable sites)
If Base + Feature 2 is Deterministic, does adding in Feature 1 re-break Determinism?
Say Base + Feature 2 turns out Deterministic, the if we were to add in Feature 1, then that might turn previously non-inferrable sites into inferrable sites. For example, bool[A - B] <: bool[10]
is not an inferrable site for A
or B
, but once one of them is known, the other becomes inferrable.