An experimental implementation of R, with embellishments
Check out the live demo
cargo run
# R version 0.3.1 -- "Art Smock"
x <- function(a = 1, ...) { a + c(...) }
# function(a = 1, ...) {
# a + c(...)
# }
# <environment 0x6000005b8e28>
y <- function(...) x(...)
# function(...) x(...)
# <environment 0x6000005b8e28>
y(4, 3, 2, 1)
# [1] 7 6 5
This amounts to most of R's grammar parsing, basic primitives, scope management, and ellipsis argument passing.
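To make the ellipsis example concrete, here is a rough model in rust of how y(4, 3, 2, 1) arrives at 7 6 5. It is only a sketch under simplifying assumptions: arguments are plain positional numbers, named arguments are ignored, and the functions x and y are stand-ins written for illustration, not the interpreter's internals.

```rust
// Hypothetical model of the session above, not the interpreter's code.
// `x` binds its first positional argument to `a` and collects the
// remainder as `...`; `y` simply forwards everything it receives.

fn x(args: &[f64]) -> Vec<f64> {
    // `a = 1` is the default when no positional argument is supplied.
    let a = args.first().copied().unwrap_or(1.0);
    // Everything after the first argument plays the role of `...`.
    let rest = args.get(1..).unwrap_or(&[]);
    // `a + c(...)` is vectorized: add `a` to each collected element.
    rest.iter().map(|r| a + r).collect()
}

fn y(args: &[f64]) -> Vec<f64> {
    // y <- function(...) x(...)
    x(args)
}

fn main() {
    // prints [7.0, 6.0, 5.0]
    println!("{:?}", y(&[4.0, 3.0, 2.0, 1.0]));
}
```

The first value, 4, is consumed by a, and the remaining values 3, 2, 1 flow through both sets of ellipses untouched.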
This project is not just a rewrite of R, but a playground for features and reinterpretations. It is not meant to reimplement a compatible R layer, but to rethink some of R's assumptions.
To start, there are a few superficial changes:
# 'fn' keyword
f <- fn(a, b, c) {
a + b + c
}
# vector syntax
v <- [1, 2, 3, 4]
# list syntax
l <- (a = 1, b = 2, c = 3)
# lowercase keywords
kws <- (na, null, inf, true, false)
# destructuring assignment
(a, b) <- (1, 2)
There are plenty of more substantial changes being considered. If you enjoy mulling over the direction of syntax and features, feel free to join the conversation.
All experiments are feature-gated and enabled by running (or building) with
cargo run -- --experiments "<experiment>"
Please try them out and share your thoughts in the corresponding issues!
Current work is focused on ..args named ellipsis arguments and ..args unpacking in function calls. However, due to the experimental nature of this syntax, it is currently behind a feature gate.
f <- function(..args) {
args
}
f(1, 2, 3) # collect ellipsis args into a named variable
# (1, 2, 3)
args <- (a = 1, b = 2, c = 3)
f <- function(a, b, c) {
a + b + c
}
f(..args) # unpack lists into arguments
# [1] 6
more_args <- (c = 10)
f(..args, ..more_args) # duplicate names okay, last instance takes priority
# [1] 13
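The "last instance takes priority" rule in the final call can be sketched in rust as a merge over the unpacked lists. This is an illustration only: Arg, merge_args and the stand-in f are hypothetical names, and the real interpreter may resolve duplicates differently.

```rust
// Hypothetical sketch of merging unpacked argument lists; the names
// `Arg`, `merge_args` and `f` are illustrative, not the interpreter's API.

type Arg = (&'static str, f64);

fn merge_args(lists: &[Vec<Arg>]) -> Vec<Arg> {
    let mut merged: Vec<Arg> = Vec::new();
    for list in lists {
        for (name, value) in list {
            let existing = merged.iter().position(|(n, _)| n == name);
            match existing {
                // A later duplicate overwrites the earlier value in place.
                Some(i) => merged[i].1 = *value,
                None => merged.push((*name, *value)),
            }
        }
    }
    merged
}

// Stand-in for f <- function(a, b, c) { a + b + c }: look up each name.
fn f(args: &[Arg]) -> f64 {
    ["a", "b", "c"]
        .iter()
        .map(|name| {
            args.iter()
                .find(|(n, _)| n == name)
                .map(|(_, v)| *v)
                .expect("missing argument")
        })
        .sum()
}

fn main() {
    let args = vec![("a", 1.0), ("b", 2.0), ("c", 3.0)];
    let more_args = vec![("c", 10.0)];
    // Mirrors f(..args, ..more_args): c = 10 replaces c = 3.
    let merged = merge_args(&[args, more_args]);
    println!("{:?}", f(&merged));
}
```

Overwriting in place keeps each name's original position, so only the value of c changes and the sum becomes 1 + 2 + 10.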
Note: --experiments tail-calls (discussed in #60)
Tail-call optimization allows for arbitrarily deep recursion. More accurately, when a call is in tail position, the caller's frame is discarded from the call stack, allowing recursion without overflowing the call stack.
f <- function(n) if (n > 0) f(n - 1) else "done"
f(10000)
# [1] "done"
Achieving this requires the tail call's arguments to be evaluated eagerly instead of using R's typical lazy argument evaluation. This change can result in some unexpected behaviors that need discussion before the feature can be fully introduced.
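One common way to get this behavior is a trampoline, sketched below in rust. This is not necessarily how the interpreter implements it; Step, countdown and run are hypothetical names. The sketch also hints at why eager evaluation matters: the argument n - 1 has to be computed before the current frame goes away, since a lazily evaluated argument would still need that frame.

```rust
// Hypothetical sketch of tail-call elimination with a trampoline.
// `Step`, `countdown` and `run` are illustrative names only.

enum Step {
    // The computation finished with a value.
    Done(&'static str),
    // A tail call whose argument has already been evaluated, so nothing
    // from the calling frame needs to survive.
    TailCall(u64),
}

// One invocation of f <- function(n) if (n > 0) f(n - 1) else "done".
fn countdown(n: u64) -> Step {
    if n > 0 {
        // `n - 1` is computed here, before this frame is discarded.
        Step::TailCall(n - 1)
    } else {
        Step::Done("done")
    }
}

// Drive the calls in a loop so the native stack never grows.
fn run(mut n: u64) -> &'static str {
    loop {
        match countdown(n) {
            Step::Done(value) => return value,
            Step::TailCall(next) => n = next,
        }
    }
}

fn main() {
    // prints done
    println!("{}", run(10_000));
}
```

Each call returns to the loop before the next one starts, so the stack depth stays constant regardless of n.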
You might be thinking rust is fast, and therefore this project must be fast. Well, unfortunately you'd be wrong. That's probably more of a reflection on me than rust. To get the basic skeleton in place, my focus has been on getting things working, not on getting them working well. For now, expect this interpreter to be about 1000x slower than R.
I'm feeling good about the general structure of the internals, but there have been plenty of quick proofs of concept that involve excess copies, extra loops, panics and probably less-than-ideal data structures. If you're an optimization fiend and you want to help narrow the gap with R, your help would be very much appreciated!
This project is primarily a personal exploration into language design.
At the outset, many of the choices are researched one-by-one and are almost certainly naive implementations. My goal is to learn and explore, and in that way the project is already a success in my eyes. Beyond advancing my own understanding of language internals, I'd love to see the project garner enough interest to become self-sustaining.
If you see value in the project for anything beyond prototyping ideas, then pushing the project toward something practical is contingent on your support. Contributions, suggestions, feedback and testing are all appreciated.
Being primarily a one-person project, the values currently map closely to my own. Some things I want to aim for:
- A reasonably approachable language for R users (possibly with the ability to interpret R code).
- Improved R constructs for complex calls, including argument packing and unpacking, partial function calls, and destructuring assignment.
- Guardrails on non-standard evaluation, allowing for user-facing domain-specific languages while allowing a more rigid evaluation scheme internally.
- Lean into the things that rust does well, such as threading, arguably async evaluation, first-class data structures and algebraic error types.
- Learn from more general languages like TypeScript to better understand how static typing can be comfortably embedded in a high-level language.
If you also want to learn some rust or want to explore language design with me, I'm happy to have you along for the ride. There are plenty of ways to contribute. In order of increasing complexity, this might include:
- Documenting internals
- Improving documentation throughout
- Helping to improve the demo page hosted on GitHub pages
- Implementing new language concepts
- Providing feedback on internals
Any and all contributions are appreciated, and you'll earn yourself a mention in release notes!
I welcome other contributors, but I have not yet carefully chosen a long-term license. For now there's a CLA in place so that the license can be altered later on. I don't intend to keep it around forever. If you have suggestions or considerations for selecting an appropriate license, your feedback would be much appreciated.
My current preference is toward a copyleft license like GPL as opposed to a permissive license like MIT, as I believe that languages are a best-case candidate for such licenses and it fits well with the ethos of the R community as being scientific-community first. If you disagree strongly with that decision, now is your time to let me know.