-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sail configuration system #865
base: sail2
Are you sure you want to change the base?
Conversation
lib/sail_config.h
Outdated
/* | ||
* This file sets the runtime JSON config file | ||
*/ | ||
void sail_config_set_file(const char *path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be done without global state and instead passing a sail_json struct to each of the functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory, but that would break existing users of the generated C. It could be an optional thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I was just thinking about potentially instantiating multiple instances of the same model with different config options (e.g. for fuzzing) which is not possible if there is only one global. But that can always be added as an incremental change. I'm not sure if this has any impact on @Timmmm's approach of using C++ classes/namespaces for different model instances.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think removing global state is probably an orthogonal thing (which would be nice to fix one day!).
For this I will just do the same thing I do for the existing global callbacks - copy & paste them into a base class of the model. It's like this:
struct ModelRuntime {
explicit ModelRuntime(Architecture arch);
...
void setup_rts();
void cleanup_rts();
...
unit load_reservation(mach_bits addr);
bool match_reservation(mach_bits addr) const;
unit cancel_reservation(unit u);
...
};
namespace {
struct RiscvModel : public ModelRuntime {
RiscvModel() : ModelRuntime(Architecture::RV32) {
}
#include "riscv_model_rv32.c"
};
} // namespace
I will just reimplement all these sail_config_
functions in ModelRuntime
. Probably not the best long term solution but I think it's fine for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think there are some nicer C++ json libraries you could wrap and include too. I opted for cJSON for now because it's easy to vendor, but it doesn't do exactly what I want (expose the raw JSON string data for integers so I can just pass them to the GMP parse functions and support arbitrary precision). I'm trying to resist the urge to write my own JSON parser.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think https://github.com/nlohmann/json is one of the most popular ones and looks to have a decently nice API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah nlohmann is nice but it also doesn't support arbitrary precision integers.
I'm trying to resist the urge to write my own JSON parser.
Ha, well I definitely wouldn't fancy doing that in C! One of my colleagues wrote one in C++ that supports uint64. I don't really understand why they didn't just use nlohmann but it has been surprisingly nice to have uint64s - especially because we generate the JSON with Python and it is very happy to output any size integer.
f997f2b
to
241e60f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't read the internal compiler code but this looks good. Some other questions:
- It doesn't seem like you can read the config as arrays/lists? That would be really helpful for PMAs. Currently we have this nonsense:
val init_pma_regions : unit -> unit
function init_pma_regions() = {
// Clear regions so this is idempotent.
pma_regions = [||];
// Iterate backwards because pma_regions is a singly linked list that we
// prepend to.
foreach (i from (plat_pma_count() - 1) downto 0) {
let attr : PMA = struct {
cacheable = plat_pma_cacheable(i),
coherent = plat_pma_coherent(i),
executable = plat_pma_executable(i),
readable = plat_pma_readable(i),
writable = plat_pma_writable(i),
readIdempotent = plat_pma_readIdempotent(i),
writeIdempotent = plat_pma_writeIdempotent(i),
relaxed = plat_pma_relaxed(i),
misalignmentCausesAccessFault = plat_pma_misalignmentCausesAccessFault(i),
misalignmentCausesAlignmentFault = plat_pma_misalignmentCausesAlignmentFault(i),
atomicSupport = plat_pma_atomicSupport(i),
reservability = plat_pma_reservability(i),
taggable = plat_pma_taggable(i),
untaggableCausesAccessFault = plat_pma_untaggableCausesAccessFault(i),
supportsCboZero = plat_pma_supportsCboZero(i),
};
assert(not(attr.misalignmentCausesAccessFault & attr.misalignmentCausesAlignmentFault));
assert(not(attr.taggable & attr.untaggableCausesAccessFault));
let region : PMA_Region = struct {
base = physaddr(plat_pma_address(i)),
size = plat_pma_size(i),
attributes = attr,
};
pma_regions = region :: pma_regions;
};
}
- How would you recommend dealing with enums? E.g.
plat_pma_atomicSupport()
above returns
enum AtomicSupport = { AMONone, AMOSwap, AMOLogical, AMOArithmetic }
I think it would be easy to use strings - we already have
mapping atomic_support_name : AtomicSupport <-> string = {
AMONone <-> "AMONone",
AMOSwap <-> "AMOSwap",
AMOLogical <-> "AMOLogical",
AMOArithmetic <-> "AMOArithmetic",
}
so could do:
... = config atomic_support_name(pma[i].atomic_support)
which is quite nice, but I vaguely recall you said something about the Sail compiler ensuring that string operations don't affect execution or something? Presumably that wouldn't work nicely with the SVA/SMT backends?
The alternative would be an integer mapping, but then that makes the JSON not very robust or readable.
lib/json/sail_config.c
Outdated
|
||
void sail_config_unwrap_int(sail_int *n, const sail_config_json config) | ||
{ | ||
char *str = cJSON_GetStringValue((cJSON *)config); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand it this means the JSON never uses JSON numbers - only strings of numbers?
I guess that makes sense, especially because cjson
's number handling looks whack (silently clamping to INT_MAX???), but it's a slight shame. The JSON spec doesn't actually put a limit on integer size so we theoretically could support arbitrary precision integers "natively". Although I guess that might make it hard to generate them from some programming languages.... And it would require using a better parser. Hmmm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, although I probably do need to use integers for the JSON schema output to work correctly. So maybe I will need to find a better parser.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Sail flag --config
to statically instantiate the config does support arbitrary precision integers.
@@ -38,6 +38,9 @@ def test_c(name, c_opts, sail_opts, valgrind, compiler='cc'): | |||
basename = os.path.splitext(os.path.basename(filename))[0] | |||
tests[filename] = os.fork() | |||
if tests[filename] == 0: | |||
if basename.startswith('config'): | |||
sail_opts += ' --c-include sail_config.h' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be included automatically?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure, the idea is to avoid breaking existing Makefiles. To include it by default everything in lib/json
would need to be in lib
but that would make it harder to swap out the JSON implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok fair enough. I guess this is one of those things where a --target-version, or --edition type flag would help.
[91m |[0m | ||
[91m |[0m [96mfail/config_mismatch.sail[0m:8.18-28: | ||
[91m |[0m 8[96m |[0m print_endline(config a.b); | ||
[91m |[0m [91m |[0m [91m^--------^[0m [91mprevious type found here[0m |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean it infers the type of a config foo.bar
from the context the first time you use it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For any type used as a configuration option, the requirement will be that it must be well-typed in the global typing context (so it cannot be dependent on runtime values in functions), and it should correspond to some precise JSON schema definition.
For the case where the configuration option is used twice as config x : T
and config x : U
, there are essentially two options.
- Globally keep track of the most specific type we've seen. So if we see
T
first then when encounteringU
we check ifT < U
orU < T
and remember the more specific type. IfT
andU
are incomparable, then the error in that tests gets raised (this is what is currently implemented) - Keep track of all the types we've seen. Interpret each config instance separately with it's own type, and rely on the JSON schema output to ensure everything makes sense. In the case where two types are incompatible, we'll just generate an impossible schema (such as
{"allOf": [{"type" : "string"}, {"type" : "integer"}]}
)
I am leaning towards switching to the second option, as it could allow lifting the well-typed in the global typing context restriction and it doesn't rely on subtyping (as it can fail when both U
and T
are existentially quantified).
@@ -0,0 +1,3 @@ | |||
{ | |||
"k" : [true, false, false, true, "not a bit"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That doesn't look like a fun bit vector syntax! Could we also support Sail bit vector literals? I.e. "0x1234"
and "0b0101"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, although I probably can't support underscores as separators due to how JSON schema works.
This will be implemented before the PR is merged. I just have an |
A couple more questions (sorry!):
It would be cool if you could do:
and have it convert JSON
|
My plan for this was to generate a JSON schema when building the model, then you can guarantee that any json passed at runtime won't violate type-safety as long as it validates against that schema. |
That assumes you actually do the schema check though right? I think it might be quite easy to not do that? Maybe it could do both - when you generate the schema, also generate type checking functions in C? I guess that can be added later quite easily without any interface changes. |
Yes. There is a slight issue that by the time we reach the C generation some type information has been lost, as the types are simplified before then. We could work around that in this case because of the additional requirement that we can generate a JSON schema that represents each type used in the configuration, but that extra information would need to be passed down somehow. I'm not sure this is worth doing in the first version, but I agree it would be nice to have. |
e57998b
to
94781a6
Compare
Since the meeting on Monday:
This is probably now in a state where it can be merged. There are a few missing things like union parsing (including allowing null for |
11687c8
to
9a396d6
Compare
This commit introduces a method for configuring Sail models using a JSON configuration file. It introduces a new construct into the language: ``` config a.b.c : T ``` which gets the value of the json object a.b.c in the configuration file and interprets it as the Sail type T Configuration options can be instantiated at compile time using ``` sail --config c.json <files> ``` which is useful for Sail targets like SMT or SystemVerilog, or they can be accessed at runtime, which allows for building a configurable emulator
The init instruction used to just have the form: ``` ctyp id = cval ``` which was always equivalent to: ``` ctyp id id = cval ``` However this doesn't work with some C and C++ constructs where we want to generate something that is only valid as an initializer, i.e. ```C const_sail_string[] = {"Hello", "World"}; ``` To fix this the init instruction has the form ``` ctyp id = init ``` where `init` can either be a plain cval as before or a special initializer form that can't be split into two instructions. Before this patch we just had to hope that the cval that compiled into an initializer wouldn't be constant propagated, but with this that is statically guaranteed.
Second, for each type [T] attached to a configuration node (which may have been inferred from context by the type system), we will also attempt to synthesise a JSON Schema. If [T] cannot be turned into a valid schema then this function raises a fatal type error exception. This schema cannot capture every possible invariant of an ISA configuration, but it does provide enough to guarantee that a configuration applied at runtime will not break type-safety assuming it validates against the schema. Parse more kinds of JSON values, including vectors and lists
This commit removes checking constraints from synonym expansion, so the type: ``` type foo('n), 'n >= 32 = bar('n) ``` when appearing as `foo(N)` will just expand directly to `bar(N)` without checking `N >= 32`. This is possible because the well-formedness check already checks this, i.e. `foo(N)` is well-formed only if `N >= 32` holds, so we don't need to check again when we do the expansion. This fixes a bug where synonym expansion wasn't handling existentials correctly.
Improve support for large integers and ranges
dea22bc
to
8389cfe
Compare
Make sure we allocate enough space for raw string parsing, and add address santizer mode to set of C tests
Looking forward to this improvement! One thing I was wondering is if cJSON supports (or if we could change it to support JSON with comments (https://code.visualstudio.com/docs/languages/json#_json-with-comments) by stripping |
Part of the reason I'm using JSON is because it's a common format I can read in all the different Sail targets, so in that sense I'm actually using it in the more intended use case as a data-interchange format. If I start relying on extensions that becomes a bit more awkward, although comments are widely supported enough that I could do it. I also want to generate a lot of the configuration from riscv-config, as it has a lot of information about WARL fields and similar. So it partly depends on whether that's successful or whether we do end up having to write our own configurations by hand. |
cJSON does ignore comments. |
In that case there isn't a problem, as the OCaml library we use (yojson) does too. I already modified cJSON to include the raw string representation of numbers, so I could pass them to the GMP parse functions. |
Looks like nlohmann/json also supports it (opt-it) so that is good: https://json.nlohmann.me/features/comments/ |
This commit introduces a method for configuring Sail models using a JSON configuration file. It introduces a new construct into the language:
which gets the value of the json object a.b.c in the configuration file and interprets it as the Sail type T
Configuration options can be instantiated at compile time using
which is useful for Sail targets like SMT or SystemVerilog, or they can be accessed at runtime, which allows for building e.g. a configurable emulator.