Calyx as an infra for compiler generation of programmable accelerators #892

JosseVanDelm · 2022-02-02T15:14:35Z


Calyx as an infra for compiler generation of programmable accelerators?

Hi everyone!

I'm creating this post as a follow-up to an email I sent to @sampsyo, @rachitnigam, @sgpthomas and @tissue3.
@sampsyo suggested in his reply that we continue the discussion here so others could weigh in as well.

Intro

My name is Josse, I'm a PhD researcher at KU Leuven (Belgium), where we are researching programmable heterogeneous accelerator SoCs for TinyML workloads. Part of my current research is "porting" TVM to an existing programmable embedded SoC with a RISC-V core and two coarse-grained accelerators.

Proposal

I'm looking for some type of HDL or ADL which can describe programmable compute systems at a high level, so it can be lowered to two things:

A more concrete/detailed low-level hardware description (like Calyx?) to eventually create the hardware.
A very detailed target description for generating a compiler middle-end and back-end.

Please do note that I'm not proposing an HLS flow. In an HLS flow, a single algorithm gets lowered to an efficient hardware equivalent. I'm proposing a set of hardware design intrinsics (pragmas, maybe?) which can be inserted in a hardware description to make certain compiler-use semantics explicit there, so that both programmable hardware and a compiler back-end can be generated from it.

Rationale

The development of current programmable accelerators (or even CPUs and GPUs) is separated from the development of their compilers, which makes it quite labour-intensive to:

Make acceleration hardware other than common CPUs or GPUs which can ingest common ML framework workload descriptions
Adapt ("port") currently existing compilation frameworks (like TVM or MLIR) to existing hardware
Characterize the hardware at design time for different ML workloads
Perform automated design space exploration for the hardware

Discussion

I think (and @sampsyo agrees) that this idea is quite ambitious, so I'm looking for experts that can comment on this idea so we can start narrowing this down towards a minimum viable solution.
I personally believe that the MLIR, CIRCT, and Calyx/Dahlia projects can serve as a big part of this solution.
We are currently also looking into extending TVM's VTA project as an alternative.

@sampsyo also suggested looking into the Stanford AHA group's CGRAs and their Halide-to-Hardware compiler, though this seems more like an HLS flow. Also, homogeneous flexible CGRAs are quite different from the fixed-size yet heterogeneous platforms we want to target.

Please let me know if you have any suggestions or questions! Thanks!

rachitnigam · 2022-02-02T15:26:11Z

Maintainer

Hey @JosseVanDelm, this sounds super exciting. I recommend a few resources:

The Calyx Dialect for CIRCT/MLIR. This will allow you to express Calyx programs and lower them into synthesizable hardware using the Calyx compiler.
SCFToCalyx, which allows you to lower high-level programs written in the scf dialect to Calyx.

I would recommend getting started with Calyx and following the language tutorial. We have a lot of tools to support Calyx development, and if you run into problems, you can open an issue in this repository.

The programmability question is interesting and has basically not been addressed by any research I know of, especially in the context of generating programmable accelerators. I would probably start by hand-writing a simple programmable accelerator in Calyx and then seeing what we can do to add that programmability automatically.

A good starting point, in my opinion, is the Systolic Array Generator for Calyx, which generates pretty simple, fixed-function systolic arrays. I would first take a 4x4 array and try to add some programmability to it. Once that is done, it should be possible to change the generator code to add that programmability to every systolic array it generates, making it quite powerful.
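As a rough illustration of what "adding some programmability" to a 4x4 array could mean, here is a small software model in Python. It is not Calyx code and does not reflect what the generator emits. The opcode table, the `run_array` name, and the choice of a min-plus mode next to the usual multiply-accumulate are all assumptions made up for this sketch.

```python
import math

# Hypothetical opcode table: (initial accumulator value, per-PE update rule).
OPS = {
    # Ordinary matrix multiply: acc += a * b
    "mac": (0, lambda acc, a, b: acc + a * b),
    # Min-plus (tropical) product: acc = min(acc, a + b)
    "min_plus": (math.inf, lambda acc, a, b: min(acc, a + b)),
}


def run_array(a, b, opcode="mac", n=4):
    """Model an n x n output-stationary systolic array.

    `a` is n x depth and `b` is depth x n. The opcode plays the role of a
    configuration register shared by every processing element (PE).
    """
    init, update = OPS[opcode]
    acc = [[init] * n for _ in range(n)]
    depth = len(b)  # size of the shared (reduction) dimension
    # Inputs are skewed so that a[i][k] and b[k][j] meet at PE (i, j)
    # on cycle k + i + j; the last pair arrives after depth + 2*(n-1) cycles.
    for cycle in range(depth + 2 * (n - 1)):
        for i in range(n):
            for j in range(n):
                k = cycle - i - j
                if 0 <= k < depth:
                    acc[i][j] = update(acc[i][j], a[i][k], b[k][j])
    return acc
```

The dataflow (which operands meet at which PE on which cycle) is identical in both modes; only the per-PE update rule changes. That makes an opcode like this one candidate for a small, well-defined unit of programmability to experiment with before touching the generator.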

Let us know how this goes!


sampsyo · 2022-02-02T16:32:17Z

Maintainer

Hi, @JosseVanDelm! Just wanted to copy some things I wrote via email, for completeness:

Correct me if I’m wrong, but it seems like what you’re aiming for is an ADL (i.e., a DSL for specifying hardware) that also serves as an input to a compiler that specifies the target details. So by writing one program in this ADL, you obtain two things: the hardware itself and a compiler backend.

If so, I think that’s a very important category of research! Solving it completely seems very ambitious, so the key is to narrow the scope—to focus on specific kinds of architectures, etc. Focusing on TVM specifically could be a good “narrowing.”

On a very self-serving note, I think Calyx would be a perfect infrastructure fit for this. It would not help with the front-end compiler part of things, but it would definitely help with the part of the puzzle that compiles from this hypothetical ADL to hardware!

Broadly, I think it's a really interesting question to ask what explicit support for programmability in a Calyx program would look like. I think an important step in getting started, as @rachitnigam suggested in a sibling comment, would be to try designing such a programmable accelerator so you know what "programmability" looks like exactly: is it a handful of configuration bits? More like a processor ISA? Something else entirely? Then you can think about how to take that programming interface and "bake" it into the design of the hardware that implements it.
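To make the two ends of that spectrum concrete, here is a toy Python sketch of the same post-processing unit exposed in two ways: once through a handful of configuration bits and once through a tiny instruction list. Everything in it (`ConfigBits`, `run_configured`, `run_program`, and the `SHR`/`RELU`/`ADDI` opcodes) is hypothetical and only meant to illustrate the difference in interface, not any existing accelerator.

```python
from dataclasses import dataclass


@dataclass
class ConfigBits:
    """Configuration-register style: a fixed pipeline with a few knobs."""
    relu: bool = False  # clamp negative outputs to zero
    shift: int = 0      # arithmetic right shift applied to each value


def run_configured(xs, cfg):
    # The order of operations is fixed in "hardware": shift, then optional ReLU.
    out = [x >> cfg.shift for x in xs]
    return [max(x, 0) for x in out] if cfg.relu else out


def run_program(xs, program):
    """ISA style: `program` is a list of (opcode, operand) pairs run in order."""
    out = list(xs)
    for opcode, operand in program:
        if opcode == "SHR":
            out = [x >> operand for x in out]
        elif opcode == "RELU":
            out = [max(x, 0) for x in out]
        elif opcode == "ADDI":
            out = [x + operand for x in out]
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return out
```

With configuration bits, a compiler back-end only has to pick values for a few fields. With the instruction-list style, it has to choose and order operations, which is closer to conventional code generation and needs a richer target description.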

