Perhaps the following obvious strategy would not be a bad place to start: take a decimal string This would be an alternative, for example, to starting with |
Sounds great! I would probably add hexadecimal floats (e.g., |
The Calyx ecosystem has always relied on a bespoke data conversion system for running programs. This conversion is a part of fud, and it translates between flat "binary blobs" and human-readable JSON files that look like this.
The JSON representation is better for humans and other software-y tools, and the flat binary blobs (or their hex-encoded equivalent) are necessary for driving hardware simulation (or real hardware execution). This conversion stuff is a load-bearing component for a couple of reasons:
… cat. Life would be much worse without this.

As part of our numerics efforts, the current Python tooling is showing some limitations:
So the proposal here is to build, in Rust, a new standalone converter for numerical data blobs. The new converter would—in the limit—support arbitrary-precision fixed-point and floating-point formats, and it would convert between that binary data and various human-readable text file formats.
That is, there are two orthogonal axes going on here:
We would like a tool that converts from any file/numerical format to any other file/numerical format, and which does so with correct rounding (i.e., with the best accuracy achievable given the destination format).
Some Details
Here are some file formats I think we should support:
… readmemb/writememb use, and it is also the "raw" format we would use to feed data into actual hardware.
… readmemh/writememh use. This is the only other file format (beyond its native JSON) that the current fud-embedded converter works with.

Here are some numerical formats I think we should support:
… float, double, double double, 16-bit half precision.

At the most basic level, we want a tool you can invoke like this:
That is, the 4 different formats involved in a conversion should be specified separately, with the possible exception of the text formats (plain text and JSON), where the representation comes with its own precision. For example, the decimal string "123.456" doesn't correspond to any specific numerical format; we just want to represent a number as close as possible to that decimal value.
(FWIW, this business about accurately representing text/decimal numbers is the part I am most fuzzy on. I guess we need to pick a binary format that is guaranteed to be a fully faithful representation of any such decimal number, in some range? What is the set of decimal numbers that can be exactly represented (i.e., round-tripped without error) in at most 64 bits under some strategy? We would have to figure out what to do here without going fully into "BigRational" representations if it's not necessary…)
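To make the "as close as possible" goal concrete, here is a minimal sketch of converting a decimal string to an unsigned fixed-point value with correct rounding. Everything in it is an assumption for illustration, not the proposed tool's design: the function name `decimal_to_fixed` is hypothetical, the rounding mode (round to nearest, ties to even) is one reasonable choice among several, and it only handles non-negative inputs that fit in a `u128`. The idea is to treat the string as an exact rational (digits over a power of ten) and do the scaling in integer arithmetic, so no intermediate floating-point error creeps in.

```rust
/// Convert a non-negative decimal string such as "123.456" into an unsigned
/// fixed-point integer with `frac_bits` fractional bits, rounding to nearest
/// with ties to even. Returns `None` on malformed input or overflow.
/// (Hypothetical helper; the name and signature are assumptions.)
fn decimal_to_fixed(s: &str, frac_bits: u32) -> Option<u128> {
    // Split into integer and fractional digit strings.
    let (int_part, frac_part) = match s.split_once('.') {
        Some((i, f)) => (i, f),
        None => (s, ""),
    };
    if int_part.is_empty() && frac_part.is_empty() {
        return None;
    }

    // Read all digits as one integer n, so the exact value is n / 10^k,
    // where k is the number of fractional digits.
    let mut n: u128 = 0;
    for c in int_part.chars().chain(frac_part.chars()) {
        let d = c.to_digit(10)? as u128;
        n = n.checked_mul(10)?.checked_add(d)?;
    }
    let denom = 10u128.checked_pow(frac_part.len() as u32)?;

    // The fixed-point result is round(n * 2^frac_bits / 10^k).
    let scaled = n.checked_mul(1u128.checked_shl(frac_bits)?)?;
    let q = scaled / denom;
    let r = scaled % denom;

    // Round to nearest; on an exact tie, round to the even neighbor.
    let twice = r.checked_mul(2)?;
    let round_up = twice > denom || (twice == denom && q & 1 == 1);
    Some(if round_up { q + 1 } else { q })
}
```

For instance, `decimal_to_fixed("123.456", 8)` yields `Some(31605)`, because 123.456 × 256 = 31604.736. A real converter would also need signs, exponents, and widths beyond 128 bits, which is where the "BigRational" question above comes in.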
How to Approach It
I suggest that we start with these limitations:
Once all that works, we can adopt fud's JSON format for compatibility. That will let us start using it in real Calyx executions! Then we can start addressing more of the above desiderata, including new file formats and optimizations for speed.
Related Work
I have actually been super surprised to find that there are apparently not any tools like this already out there. It seems like something the world of hardware tooling would want? I just can't find much at all on GitHub, and certainly nothing with a focus on correctness. But I'd be very interested if other people know about something I don't!
I actually made a very half-hearted attempt at implementing something along these lines long ago, in a repo called samizdat. This was a failure. I don't think I even learned very much from the experience, except that this is a deceptively interesting problem. Perhaps the only salvageable code from that effort is my little implementation of a numerical format enum, which also includes a string representation, so "f32" means single-precision float and "u4.2" means "unsigned fixed point with 4 integer and 2 fractional bits." Of course, I used the wrong parameters for fixed-point formats! So maybe even this is not very useful.
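For readers who want a picture of what such a format enum with a string representation could look like, here is a rough sketch. It is not the samizdat code: the type name `NumFormat`, the `i` prefix for signed fixed point, and the accepted float widths (16, 32, 64) are all assumptions made for illustration. It also keeps the integer-bits/fractional-bits parameterization from the "u4.2" example above, which may not be the right parameterization for the real tool.

```rust
use std::fmt;
use std::str::FromStr;

/// A numerical format, parsed from specs like "f32" or "u4.2".
/// (Hypothetical type; names and accepted specs are assumptions.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum NumFormat {
    /// IEEE-style binary float of the given total width.
    Float { bits: u32 },
    /// Fixed point with separate integer and fractional bit counts.
    Fixed { signed: bool, int_bits: u32, frac_bits: u32 },
}

/// Parse a non-empty run of ASCII digits as a bit count.
fn parse_bits(s: &str) -> Option<u32> {
    if s.is_empty() || !s.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    s.parse().ok()
}

fn parse_spec(s: &str) -> Option<NumFormat> {
    // "f16", "f32", "f64": floating point.
    if let Some(rest) = s.strip_prefix('f') {
        return match parse_bits(rest)? {
            bits @ (16 | 32 | 64) => Some(NumFormat::Float { bits }),
            _ => None,
        };
    }
    // "u<int>.<frac>" is unsigned; "i<int>.<frac>" (assumed) is signed.
    let (signed, rest) = if let Some(r) = s.strip_prefix('u') {
        (false, r)
    } else {
        (true, s.strip_prefix('i')?)
    };
    let (int, frac) = rest.split_once('.')?;
    Some(NumFormat::Fixed {
        signed,
        int_bits: parse_bits(int)?,
        frac_bits: parse_bits(frac)?,
    })
}

impl FromStr for NumFormat {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        parse_spec(s).ok_or_else(|| format!("unrecognized format spec: {s:?}"))
    }
}

impl fmt::Display for NumFormat {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match *self {
            NumFormat::Float { bits } => write!(f, "f{bits}"),
            NumFormat::Fixed { signed, int_bits, frac_bits } => {
                let prefix = if signed { 'i' } else { 'u' };
                write!(f, "{prefix}{int_bits}.{frac_bits}")
            }
        }
    }
}
```

With this, `"u4.2".parse::<NumFormat>()` gives an unsigned fixed-point format with 4 integer and 2 fractional bits, and `to_string()` turns it back into "u4.2", so the same spec could appear on a command line and in error messages.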