Merge pull request #1816 from usethesource/rascal-module-parser-storage
a little compile for generating cached module parsers
jurgenvinju committed Jun 23, 2023
2 parents 329130b + b6c0b93 commit cb35d4b
Showing 4 changed files with 195 additions and 42 deletions.
25 changes: 1 addition & 24 deletions src/org/rascalmpl/interpreter/result/SourceLocationResult.java
@@ -509,31 +509,8 @@ else if (name.equals("extension")) {
if (!replType.isString()) {
throw new UnexpectedType(getTypeFactory().stringType(), replType, ctx.getCurrentAST());
}
String ext = newStringValue;

boolean endsWithSlash = path.endsWith(URIUtil.URI_PATH_SEPARATOR);
if (endsWithSlash) {
path = path.substring(0, path.length() - 1);
}

if (path.length() > 1) {
int slashIndex = path.lastIndexOf(URIUtil.URI_PATH_SEPARATOR);
int index = path.substring(slashIndex).lastIndexOf('.');

if (index == -1 && !ext.isEmpty()) {
path = path + (!ext.startsWith(".") ? "." : "") + ext;
}
else if (!ext.isEmpty()) {
path = path.substring(0, slashIndex + index) + (!ext.startsWith(".") ? "." : "") + ext;
}
else if (index != -1) {
path = path.substring(0, slashIndex + index);
}

if (endsWithSlash) {
path = path + URIUtil.URI_PATH_SEPARATOR;
}
}
path = URIUtil.changeExtension(loc, newStringValue).getPath();
uriPartChanged = true;
}
else if (name.equals("top")) {
@@ -0,0 +1,133 @@
@synopsis{Functionality for caching module parsers}
@description{
The Rascal interpreter can take a long time to load modules.
In deployed situations in particular (Eclipse and VS Code plugins), the
time it takes to load the parser generator and generate the parsers
required for analyzing concrete syntax fragments is prohibitive (20s).
This means that the first syntax highlighting may only appear
more than 20s after loading an extension (VS Code) or plugin (Eclipse).

This "compiler" takes any number of Rascal modules and extracts a grammar
for each of them. It then uses the ((storeParsers)) function of the
((Library::ParseTree)) module to store the parser for each module
in a `.parsers` file.

After that, the Rascal interpreter uses ((loadParsers)) while importing
a module if a cached `.parsers` file is present next to the
respective `.rsc` file.
}
@benefits{
* loading modules without first having to load and use a parser generator can be up to 1000 times faster.
}
@pitfalls{
:::warning
This caching feature is _static_. There is no automated cache clearance.
If your grammars change, any saved `.parsers` files do not change with them.
It is advised that you programmatically execute this compiler at deployment time
to store the `.parsers` files _only_ in deployed `jar` files. That way, you cannot
be bitten by a concrete syntax parser that is out of date at development time.
:::
}
@license{
Copyright (c) 2009-2023 NWO-I CWI
All rights reserved. This program and the accompanying materials
are made available under the terms of the Eclipse Public License v1.0
which accompanies this distribution, and is available at
http://www.eclipse.org/legal/epl-v10.html
}
@contributor{Jurgen J. Vinju - [email protected] - CWI}
@bootstrapParser
module lang::rascal::grammar::storage::ModuleParserStorage

import lang::rascal::grammar::definition::Modules;
import lang::rascal::\syntax::Rascal;
import util::Reflective;
import util::FileSystem;
import Location;
import ParseTree;
import Grammar;
import IO;

@synopsis{For each module in pcfg.srcs this produces a stored `.parsers` file capable of parsing the concrete syntax fragments in that module.}
@description{
Use ((loadParsers)) to retrieve the parsers stored by this function. In particular the
Rascal interpreter will use this instead of spinning up its own parser generator.
}
@benefits{
* the single pathConfig parameter makes it easy to wire this function into Maven scripts (see generate-sources maven plugin)
* time spent here generating parsers, once, does not have to be spent while running IDE plugins, many times.
}
@pitfalls{
* this compiler has very weak error reporting; it just crashes with stack traces in case of trouble.
* for large projects running this can take a few minutes; it is slower than importing the same modules in the interpreter.
* this compiler assumes the grammars are all correct and can be used to parse the concrete syntax fragments in each respective module.
* this compiler may differ slightly in semantics from the way the interpreter composes grammars for modules, since
it is implemented differently. However, no such issues are currently known.
}
@examples{
Typically you would call the generate-sources MOJO from the rascal-maven-plugin, in `pom.xml`, like so:

```xml
<plugin>
  <groupId>org.rascalmpl</groupId>
  <artifactId>rascal-maven-plugin</artifactId>
  <version>0.14.6</version>
  <configuration>
    <mainModule>YourMainModule</mainModule>
  </configuration>
  <executions>
    <execution>
      <id>it-compile</id>
      <phase>generate-test-sources</phase>
      <goals>
        <goal>generate-sources</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

And you'd write this module to make it work:

```rascal
module YourMainModule

import util::Reflective;
import lang::rascal::grammar::storage::ModuleParserStorage;

int main(list[str] args) {
  pcfg = getProjectPathConfig(|project://yourProject|);
  storeParsersForModules(pcfg);
  return 0; // main is declared to return int, so report success explicitly
}
```
}
void storeParsersForModules(PathConfig pcfg) {
  storeParsersForModules({*find(src, "rsc") | src <- pcfg.srcs, bprintln("Crawling <src>")}, pcfg);
}

void storeParsersForModules(set[loc] moduleFiles, PathConfig pcfg) {
  storeParsersForModules({parseModule(m) | m <- moduleFiles, bprintln("Loading <m>")}, pcfg);
}

void storeParsersForModules(set[Module] modules, PathConfig pcfg) {
  for (m <- modules) {
    storeParserForModule("<m.header.name>", m@\loc, modules, pcfg);
  }
}

void storeParserForModule(str main, loc file, set[Module] modules, PathConfig pcfg) {
  // this has to be done from scratch due to different ways combining layout definitions
  // with import and extend. Each main module has a different grammar because of this.
  def = modules2definition(main, modules);

  // here the layout semantics comes really into action
  gr = fuse(def);

  // find a file in the target folder to write to
  target = pcfg.bin + relativize(pcfg.srcs, file)[extension="parsers"].path;

  println("Generating parser for <main> at <target>");
  if (type[Tree] rt := type(sort("Tree"), gr.rules)) {
    storeParsers(rt, target);
  }
}
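
The target computation in `storeParserForModule` (make the module file relative to its source root, swap the extension for `parsers`, and place the result under the `bin` folder) can be illustrated outside Rascal. What follows is a minimal Java sketch on `java.nio.file.Path`, assuming that a source root is a plain path prefix of the module file. `ParserTargetSketch` and `targetFor` are hypothetical names used only for illustration; they are not part of this commit, and the real code works on Rascal `loc` values.

```java
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only; not part of the Rascal code base.
class ParserTargetSketch {

    // Returns the path under `bin` that mirrors `moduleFile` relative to the
    // first source root containing it, with the extension replaced by ".parsers".
    static Optional<Path> targetFor(Path bin, List<Path> srcs, Path moduleFile) {
        for (Path src : srcs) {
            if (moduleFile.startsWith(src)) {
                Path relative = src.relativize(moduleFile);
                String name = relative.getFileName().toString();
                int dot = name.lastIndexOf('.');
                String base = dot == -1 ? name : name.substring(0, dot);
                return Optional.of(bin.resolve(relative).resolveSibling(base + ".parsers"));
            }
        }
        // The module file is not below any of the given source roots.
        return Optional.empty();
    }

    public static void main(String[] args) {
        Path bin = Path.of("target", "classes");
        Path src = Path.of("src");
        Path module = Path.of("src", "lang", "X.rsc");
        System.out.println(targetFor(bin, List.of(src), module));
    }
}
```

The sketch returns an empty `Optional` when no source root matches; how the Rascal `relativize` function behaves in that case is not shown in this commit.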
51 changes: 33 additions & 18 deletions src/org/rascalmpl/semantics/dynamic/Import.java
@@ -86,6 +86,7 @@
import io.usethesource.vallang.IValue;
import io.usethesource.vallang.IValueFactory;
import io.usethesource.vallang.type.Type;
import io.usethesource.vallang.type.TypeFactory;

public abstract class Import {

@@ -394,7 +395,7 @@ private static void addImportToCurrentModule(ISourceLocation src, String name, I
current.setSyntaxDefined(current.definesSyntax() || module.definesSyntax());
}

public static ITree parseModuleAndFragments(char[] data, ISourceLocation location, IEvaluator<Result<IValue>> eval){
public static ITree parseModuleAndFragments(char[] data, ISourceLocation location, IEvaluator<Result<IValue>> eval) {
eval.__setInterrupt(false);
IActionExecutor<ITree> actions = new NoActionExecutor();

@@ -454,24 +455,38 @@ public static ITree parseModuleAndFragments(char[] data, ISourceLocation locatio

// parse the embedded concrete syntax fragments of the current module
ITree result = tree;
if (!eval.getHeap().isBootstrapper() && (needBootstrapParser(data) || (env.definesSyntax() && containsBackTick(data, 0)))) {
RascalFunctionValueFactory vf = eval.getFunctionValueFactory();
IFunction parsers = null;

if (env.getBootstrap()) {
parsers = vf.bootstrapParsers();
}
else {
IConstructor dummy = TreeAdapter.getType(tree); // I just need _any_ ok non-terminal
IMap syntaxDefinition = env.getSyntaxDefinition();
IMap grammar = (IMap) eval.getParserGenerator().getGrammarFromModules(eval.getMonitor(),env.getName(), syntaxDefinition).get("rules");
IConstructor reifiedType = vf.reifiedType(dummy, grammar);
parsers = vf.parsers(reifiedType, vf.bool(false), vf.bool(false), vf.bool(false), vf.set());
}

result = parseFragments(vf, eval.getMonitor(), parsers, tree, location, env);
}
try {
if (!eval.getHeap().isBootstrapper() && (needBootstrapParser(data) || (env.definesSyntax() && containsBackTick(data, 0)))) {
RascalFunctionValueFactory vf = eval.getFunctionValueFactory();
URIResolverRegistry reg = URIResolverRegistry.getInstance();
ISourceLocation parserCacheFile = URIUtil.changeExtension(env.getLocation(), "parsers");

IFunction parsers = null;

if (env.getBootstrap()) {
// no need to generate a parser for the Rascal language itself
parsers = vf.bootstrapParsers();
}
else if (reg.exists(parserCacheFile)) {
// if we cached a ModuleFile.parsers file, we will use the parser from that (typically after deployment time)
parsers = vf.loadParsers(parserCacheFile, vf.bool(false),vf.bool(false),vf.bool(false), vf.set());
}
else {
// otherwise we have to generate a fresh parser for this module now
IConstructor dummy = TreeAdapter.getType(tree); // I just need _any_ ok non-terminal
IMap syntaxDefinition = env.getSyntaxDefinition();
IMap grammar = (IMap) eval.getParserGenerator().getGrammarFromModules(eval.getMonitor(),env.getName(), syntaxDefinition).get("rules");
IConstructor reifiedType = vf.reifiedType(dummy, grammar);
parsers = vf.parsers(reifiedType, vf.bool(false), vf.bool(false), vf.bool(false), vf.set());
}

result = parseFragments(vf, eval.getMonitor(), parsers, tree, location, env);
}
}
catch (URISyntaxException | ClassNotFoundException | IOException e) {
eval.warning("reusing parsers failed during module import: " + e.getMessage(), env.getLocation());
}

return result;
}
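
The three-way choice made in the new `parseModuleAndFragments` code (bootstrap parser, cached `.parsers` file, freshly generated parser) can be sketched in isolation. This is a simplified illustration, not the interpreter's code: `ParserSourceSketch`, `ParserSource` and `choose` are hypothetical names, and the existence check is passed in as a predicate so the sketch does not depend on `URIResolverRegistry`.

```java
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical illustration only; not part of the Rascal code base.
class ParserSourceSketch {

    enum ParserSource { BOOTSTRAP, CACHED, GENERATED }

    // Mirrors the order of the branches in the diff:
    // 1. bootstrap modules reuse the built-in Rascal parser,
    // 2. otherwise a cached parser file is used if it exists,
    // 3. otherwise a parser has to be generated on the spot.
    static ParserSource choose(boolean bootstrap, String cacheFile, Predicate<String> exists) {
        if (bootstrap) {
            return ParserSource.BOOTSTRAP;
        }
        if (exists.test(cacheFile)) {
            return ParserSource.CACHED;
        }
        return ParserSource.GENERATED;
    }

    public static void main(String[] args) {
        Set<String> files = Set.of("/lib/A.parsers");
        System.out.println(choose(false, "/lib/A.parsers", files::contains)); // CACHED
        System.out.println(choose(false, "/lib/B.parsers", files::contains)); // GENERATED
    }
}
```

Because the bootstrap check comes first, a stale cache file next to a bootstrap module is never consulted in this sketch, which matches the branch order shown in the diff.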

28 changes: 28 additions & 0 deletions src/org/rascalmpl/uri/URIUtil.java
@@ -433,4 +433,32 @@ public static ISourceLocation removeOffset(ISourceLocation prev) {
public static ISourceLocation createFromURI(String value) throws URISyntaxException {
return vf.sourceLocation(createFromEncoded(value));
}
public static ISourceLocation changeExtension(ISourceLocation location, String ext) throws URISyntaxException {
String path = location.getPath();
boolean endsWithSlash = path.endsWith(URIUtil.URI_PATH_SEPARATOR);
if (endsWithSlash) {
path = path.substring(0, path.length() - 1);
}

if (path.length() > 1) {
int slashIndex = path.lastIndexOf(URIUtil.URI_PATH_SEPARATOR);
int index = path.substring(slashIndex).lastIndexOf('.');

if (index == -1 && !ext.isEmpty()) {
path = path + (!ext.startsWith(".") ? "." : "") + ext;
}
else if (!ext.isEmpty()) {
path = path.substring(0, slashIndex + index) + (!ext.startsWith(".") ? "." : "") + ext;
}
else if (index != -1) {
path = path.substring(0, slashIndex + index);
}

if (endsWithSlash) {
path = path + URIUtil.URI_PATH_SEPARATOR;
}
}

return URIUtil.changePath(location, path);
}
}
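
The extension-replacement logic of `changeExtension` can be tried on plain strings. The sketch below copies the branching from the diff but works on a `String` path instead of an `ISourceLocation`, so it needs no Rascal classes. `ChangeExtensionSketch` is a hypothetical name, and the `Math.max(0, ...)` guard for paths without any separator is an addition of this sketch that is not present in the committed code.

```java
// Hypothetical illustration only; mirrors the branching of URIUtil.changeExtension on strings.
class ChangeExtensionSketch {
    private static final String SEP = "/";

    static String changeExtension(String path, String ext) {
        // Temporarily drop a trailing separator so that "dir.d/" is treated like "dir.d".
        boolean endsWithSlash = path.endsWith(SEP);
        if (endsWithSlash) {
            path = path.substring(0, path.length() - 1);
        }

        if (path.length() > 1) {
            // Guard added in this sketch: treat a path without a separator as one segment.
            int slashIndex = Math.max(0, path.lastIndexOf(SEP));
            // Only look for a dot in the last segment, so dots in directories are ignored.
            int index = path.substring(slashIndex).lastIndexOf('.');
            String dotted = (ext.startsWith(".") ? "" : ".") + ext;

            if (index == -1 && !ext.isEmpty()) {
                path = path + dotted;                                   // no extension yet: append
            } else if (!ext.isEmpty()) {
                path = path.substring(0, slashIndex + index) + dotted;  // replace the extension
            } else if (index != -1) {
                path = path.substring(0, slashIndex + index);           // empty ext: strip it
            }

            if (endsWithSlash) {
                path = path + SEP;
            }
        }
        return path;
    }

    public static void main(String[] args) {
        System.out.println(changeExtension("/src/Main.rsc", "parsers")); // /src/Main.parsers
        System.out.println(changeExtension("/a.b/c", "txt"));            // /a.b/c.txt
        System.out.println(changeExtension("/src/Main.rsc", ""));        // /src/Main
    }
}
```

As in the committed code, the trailing separator is only restored when the remaining path is longer than one character.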
