Skip to content

TSS Multiuse

Rachit Nigam edited this page Feb 25, 2021 · 4 revisions

Multiple Uses

Hardware components distinguish between instantiation and invocation. A hardware instance can be invoked multiple times. How do we typecheck multiple, chained invocations of the same hardware component?

We use the following interface for mult:

component mult<G, @latency(L)>(@exact(G, G+L) go,
                               @within(G, G+1) left,
                               @within(G, G+1) right) -> (
                               @exact(G+L, G+L+1) done,
                               @exact(G+L, G+L+2) out)

Note that the out signal is live for two cycles after the execution of the component.

Example (mult is a multiplier and r is a register):

// Psuedosyntax for FSMs
f:
  true ? 0 -> 1
  mult.done ? 1 -> 2
  mult.done ? 2 -> 3;

// Invoked in state 1 & 2
mult.go = f.out == 1 ^ f.out == 2;

mult.left = f.out == 1 ^ f.out == 2 ? 10;

// First invocation uses value from register.
mult.right = f.out == 1 ? r.out;
// Second invocation uses value from *previous* invocation.
mult.right = f.out == 2 ? mult.out;

This example is interesting because it demonstrates chaining where the output from the first invocation of mult is used as an input to the second invocation of mult. This style of input/output chaining is common in accumulators. For example, a dot product computation can send the two inputs every cycle and have the multiply-accumulate unit perform the multiplication and sum it with the currently accumulated value.

Key Idea: A read can be driven by any write. Check each read using every drive available.

Abstract Lifetimes

We do not know how long expressions like these stay true:

f.out == 1

In general, we'd like to abstract away the number of cycles for which this expression is true and reason about it separately.

For each expression like this, we can define an abstract lifetime f1 = [f1.s, f1.e] such that the expression becomes true at f1.s and false at f1.e.

Checking the Example

mult.go = f.out == 1 ^ f.out == 2

We define two instantiation constraints:

m0 = mult[G = f1.s]
m1 = mult[G = f2.s]

where f1 and f2 are the abstract lifetimes of f.out == 1 and f.out == 2 respectively.

Next, we generate a constraint for the lifetime of the go signal:

[m0.G, m0.G + 1] ∪ [m1.G, m1.G + 1] = f1 ∪ f2

This states that the lifetime of the go signal derived from the two invocations should be equal to the lifetime of the signals generated from the two expressions.

mult.left = f.out == 1 ^ f.out == 2 ? 10

The challenge with generating the constraint for this expression is calculating the lifetime of the RHS. The reason for this complication is because the guarded expression (10) might (and often does) have different lifetime from the guard expression (f.out == 1 ^ f.out == 2).

Given the lifetimes of the guard and the expression:

10 // {0, ∞}
f.out == 1 ^ f.out == 2 // f1 ∪ f2

the lifetime of the RHS is:

(f1 ∪ f2) ∩ {0, ∞}

In general, the lifetime of a guarded expression g ? e is lifetime(g) ∩ lifetime(e).

Using the RHS constraint, we can calculate the constraint for the left port, which needs to be driven for one cycle after the go signal is driven:

[m0.G, m0.G + 1] ∪ [m1.G, m1.G + 1] \subset f2 ∪ f3

mult.right = ...

For simplicity, we assume that there is a single assignment to each port. This requirement can be eliminated by a pass that generates a ternary expression from all the writes to a port.

mult.right = f.out == 1 ? r.out :
             f.out == 2 ? mult.out;

The challenging part of verifying this statement is the read from mult.out. Given an arbitrary statement, it is not clear which mult invocation is being used for the write. Of course, the FSM assignment makes this fact clear but again, there is no way to convey an ordering between the first invocation of the mult (m0) and the second one (m1).

Solution: Generate constraints from every available invocation of mult.out.

mult.out is available during:

L = [m0.G + m0.L, m0.G + m0.L + 1] ∪
    [m1.G + m1.L, m1.G + m1.L + 1]

Using the invocation constraints on m0 and m1, we can derive:

L = [f1.e, f1.e + 1] ∪ [f2.e, f2.e + 1]

For the expression f.out == 2 ? mult.out, we get the lifetime: L ∩ f2.

Finally, the lifetime for the complete RHS for mult.right is:

({0, ∞} ∩ f1) ∪ (L ∩ f2)

The requirement for the LHS is:

[m0.G, m0.G + 1] ∩ [m1.G, m1.G + 1]

And we need to show:

[m0.G, m0.G + 1] ∩ [m1.G, m1.G + 1] \subset ({0, ∞} ∩ f1) ∪ (L ∩ f2)

FSM Constraints

The final set of constraints are generated from the FSM definition itself:

f:
  true ? 0 -> 1
  mult.done ? 1 -> 2
  mult.done ? 2 -> 3;

To analyze these, we can rewrite them as guarded assignments:

f.in = f.out == 0 ? 1;
f.in = f.out == 1 & mult.done ? 2;
f.in = f.out == 2 & mult.done ? 3;

The analysis for most of these expressions is the same. We'll focus on this one:

f.in = f.out == 1 & mult.done ? 2;

To calculate the lifetime of the guard, we use the fact that the guard is true only when all the conjuncts are also true. Therefore, the lifetime of the guard is:

G = f1 ∩ ([m0.G + m0.L, m0.G + m0.L + 1] ∪ [m1.G + m1.L, m1.G + m1.L + 1])
  = [f1.s, f1.e] ∩ ([f1.e, f1.e + 1] ∪ [f2.e, f2.e + 1])
  = [f1.e, f1.e]

We seem to have caught a subtle bug in the go/done protocol. When using the latency-insensitive interface, if the go signal has to necessarily be kept alive for one signal—there is no way for a parent to switch off the go signal in the same cycle when the done signal becomes high. More information in this issue.

For now, we can fix this by updating the interface to be:

component mult<G, @latency(L)>(@exact(G, G+L+1) go,
                               @within(G, G+1) left,
                               @within(G, G+1) right) -> (
                               @exact(G+L, G+L+1) done,
                               @exact(G+L, G+L+2) out)

Given this new interface, we have a new equalities: f1.e = m0.G + m0.L + 1 and f2.e = m1.G + m1.L + 1. We can simplify the lifetime G to be:

G = f1 ∩ ([m0.G + m0.L, m0.G + m0.L + 1] ∪ [m1.G + m1.L, m1.G + m1.L + 1])
  = [f1.s, f1.e] ∩ ([f1.e - 1, f1.e] ∪ [f2.e - 1, f2.e])
  = [f1.e - 1, f1.e]

This shows that the guard for the following assignment is live during the cycle [f1.e - 1, f1.e].

f.in = f.out == 1 & mult.done ? 2;

This gives us a way to relate f1 and f2: since a value written to the input port of a register (f) becomes available one cycle after, we can say:

f2.s = f1.e

We don't say f1.s = f1.e + 1 because f1.s and f1.e are not intervals, they are points that define the interval. The correct way to read the equation is, "f2 starts in the cycle when f1 ends". f1 and f2 are still disjoint.

Proving Chaining Complete

To finish of this example, we can use the relationship between the FSM states to prove the chaining of mult.out correct.

We have the following equation for lifetime of mult.out:

L = [m0.G + m0.L, m0.G + m0.L + 2] ∪
    [m1.G + m1.L, m1.G + m1.L + 2]
  = [f1.e - 1, f1.e + 1] ∪ [f2.e - 1, f2.e + 1]
  = [f2.e - 1, f2.s + 1] ∪ [f2.e - 1, f2.e + 1]

Next, we intersect it with the lifetime of the guard f.out == 2:

L ∩ f2 = [f2.e - 1, f2.s + 1] ∪ [f2.e - 1, f2.e + 1] ∩ [f2.s, f1.e]
       = [f2.s, f2.s + 1]

Using this, we can prove mult.right reads valid values:

[m0.G, m0.G + 1] ∩ [m1.G, m1.G + 1] \subset ({0, ∞} ∩ f1) ∪ (L ∩ f2)