Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First-class compile time type manipulation #3669

Open
cjhowedev opened this issue Jul 9, 2024 · 9 comments
Open

First-class compile time type manipulation #3669

cjhowedev opened this issue Jul 9, 2024 · 9 comments

Comments

@cjhowedev
Copy link

Zig supports comptime, which is a keyword that allows for the evaluation of arbitrary expressions and entire functions at compile time. It is also used in Zig's implementation of generics, so that types can be manipulated at compile time to generate code. In particular, Zig supports the following compile time features:

  • Treating types as first-class compile-time values
  • Treating arguments as compile-time arguments as opposed to runtime arguments (that is, their value must be known at compile time)
  • Generating structs/enums at compile time
  • Emitting compiler errors from compile-time contexts, compile-time type validation
  • Manually unroll loops (compile time for) and generate runtime code at compile time

Rust has a few existing features that address compile-time code evaluation:

  • Const fns allow execution of a subset of Rust at compile time when a function is explicitly marked as const
  • Procedural macros allow for execution of code at compile time to manipulate syntactical elements. They exist at compile time but work at a domain above the type system, so are limited in the kinds of analysis they can perform (especially around types)

These are a few of the goals that could be addressed with compile-time code evaluation and type manipulation:

  • Variadic generics could be represented as a list of compile-time types without needing to worry about their runtime representation (including tuple byte layout issues in Draft RFC: variadic generics #376)
  • Reflection can be supported without any runtime type representation necessary
  • Manipulating types with regular code is a lot simpler than trying to generate code using procedural or even declarative macros in many cases
  • Compile-time code can be run to validate preconditions necessary for generated code
  • Optimized or specialized code can be generated using compile time code evaluation, without needing partial specialization, based on arbitrary conditions
  • Contracts
  • Tools (such as linters and IDEs) can run compile time code during development to ensure that types validate and produce correct code
  • Compile-time code could access compiler features that aren't available at runtime (IE, ask the compiler to check for trait conformances)

I have some ideas for how this could be implemented, but ultimately I wanted to build a path towards variadic generics through compile-time code evaluation and generation. There are many features we would need to expand on first before we could work on many of the features listed above, mostly around making const and generics more powerful in Rust. And we would need a way to manipulate types in const fns and return them to be used as generic arguments.

What are the community's thoughts about this as a future direction for Rust?

@cjhowedev
Copy link
Author

cjhowedev commented Jul 9, 2024

Here are the ideas I had for how we could go about implementing this feature by feature.

First, we could start by introducing const params to non-const functions that must be known at compile time. For a const fn all parameters must be known at compile time. For a non-const fn, some parameters may be const and others may be passed at runtime. I suspect this feature can be useful even with the existing const primitives we have today, even if limited. For example:

const myXs: [i32] = [1, 2, 5]

// Standard const fn (all variables and parameters const)
const fn my_const_fn(xs: [i32], y: i32) -> i32 {
  let sum: i32 = 0;
  for x in xs {
    sum += x;
  }
  return sum + y
}

// `my_const_fn(myXs, y)` not allowed unless y is const
// `my_const_fn(myXs, 6)` becomes `return (1 + 2 + 5) + 6 == 14` after compile time evaluation

// Non-const fn with const parameters
fn my_non_const_fn(const xs: [i32], y: i32) -> i32 {
  // Can use const parameters within const blocks
 return const {
    let mut sum: i32 = 0;
    for x in xs {
      sum += x;
    }
    sum
  } + y
}

// `my_non_const_fn(myXs, y)` is valid with non-const `y` and becomes `return (1 + 2 + 5) + y`
// `my_non_const_fn(myXs, 6)` becomes `return (1 + 2 + 5) + 6 == 14`, same as above

The next step would involve allowing the generation of runtime code within compile time contexts. This allows one to put runtime code within an inline const block:

fn runtime_in_const(const xs: [i32], ys: [i32]) -> i32 {
 const {
    for (i, x) in xs.enumerate() {
      runtime {
        if x > ys[i] {
          return x;
        }
      }
    }
  }
  return 0;
}

// runtime_in_const([1, 2], ys) would generate the following specialized function at runtime:
fn runtime_in_const_specialized(ys: [i32]) -> i32 {
  if 1 > ys[0] {
    return 1;
  }
  if 2 > ys[1] {
    return 2;
  }
  return 0;
}

The final step would be to reify types so they can be queried and manipulated in const contexts, and then provide a way to call a const fns on generic arguments to produce new type arguments. I'm still figuring out how this might look, but should start to resemble Zig.

@SichangHe

This comment was marked as off-topic.

@SichangHe
Copy link

This would be closer to what we have (except const generics only takes integers, bool and char):

// Non-const fn with const parameters
fn my_non_const_fn<const XS: &'static [i32]>(y: i32) -> i32 {
    let mut sum = 0;
    let mut index = 0;
    while index < XS.len() {
        sum += XS[index];
        index += 1;
    }
    sum + y
}

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=376058daa0d54bc0956af2512e3f50ce

@cjhowedev
Copy link
Author

There are several const-related features that should be handled first to improve the applicability of this feature. Mainly, it would be ideal if we can use existing types in unique ways to generate new types and functions.

To do this, we could make const expressions even more powerful by implementing multi-stage compilation. This would allow const expressions to generate additional stages of compilation by referencing the results of other const expression. That generated const expression is then executed as part of compilation after the expressions it references (directly or indirectly). The subexpression may then generate even more const code to produce additional compilation stages dynamically.

Ideally, we could easily interleave these const expressions with regular functions. Each function would be executed twice: once at the const level to generate all the runtime code, then again at runtime. There are other uses for this: for example, the type checker has to "symbolically execute" each generic function to generate its prototype, passing in any generic type parameters. We could make this more dynamic by allowing the const expressions to dynamically execute code as part of the type checking process.

This requires validation of the const expressions generated at each stage of compilation. Each const expression must not reference variables that are bound at a later stage of compilation. The stage of an expression is simply the highest stage of any variables it references, plus one. This allows it to recursively calculate stages for each level of compilation, and if it encounters a stage that is higher than the stage that references it, the validation fails. If we ensure there are no cyclic dependencies, we should be able to execute the entire graph of const expressions in topological order.

Another important piece will be sandboxing the code so we can ensure it is safe to execute. The simplest idea I have for this is to compile the Rust code to WebAssembly, then use wasmtime to host the sandbox for execution. The advantage of this is that WebAssembly/wasmtime is designed to run untrusted code from remote sources in a safe sandbox. And, since it only needs to run as part of the compiler anyways, it might be acceptable to have limited platform support for now. Platforms not supported by wasmtime could use cross-compilation to compile for any supported Rust target. The wasmtime sandbox would only be required at compile time.

This would all have to come after many other const features in the compiler, such as what was mentioned above.

@workingjubilee
Copy link
Member

we already allow compile-time function evaluation, it is done by compiling the code to MIR which is then executed. we know the semantics of the code because we specify and control the semantics of MIR, so we can bar evaluation of constructs that would need to be sandboxed if we so please. there is no need to introduce a webassembly runtime here.

@cjhowedev
Copy link
Author

The problem with this approach is performance and scalability, but yes we can do this for now.

@black7375
Copy link

Certainly zig's comptime looks cool.
In macros, I'd like to focus on the DSL, and have the compile-time computations be done in const.

https://news.ycombinator.com/item?id=43447324

#[derive(Debug)]
const fn List<T>() -> type {
    return struct {
        items: T[],
        len: usize,
    };
}

fn main() {
    let mut buffer: [i32; 10] = [0; 10];
    
    let list = List::<i32> {
        items: &buffer,
        len: 0
    };    
    println!("{:?}", list);
}

@LoopyAshy
Copy link

LoopyAshy commented Apr 5, 2025

Honestly this feature would be leaps and bounds of useful to me for several reasons.
One major usage I have for proc macros for example is features such as serialization and similar needs. Achieving these systems in multiple newer languages is beyond easy and I'd adore if rust supported compile time reflection and compile time loops/ifs/etc to achieve such systems for example.

struct Test {
    a: i32,
    b: f32,
    c: String
}

impl Serialize {
    fn serialize<W: Writer>(&self, wtr: &mut W);
}

impl Serialize for Test {
    fn serialize<W: Writer>(&self, wtr: &mut W) {
        const for field in Test.FIELDS {
            field.data.serialize(wtr);
        }
    }
}

this is a very sloppy example but the idea is there you could even do something like this to make it more powerful but this is quite a change

impl<T> Serialize for T where T::FIELDS.all_have_trait(Serialize) {
    fn serialize<W: Writer>(&self, wtr: &mut W) {
        const for field in Test.FIELDS {
            field.data.serialize(wtr);
        }
    }
}

this feature would remove a ton of my reasoning for needing proc macros honestly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants