JIT VM API DRAFT
This document describes the interaction between the J2ME compiler and the rest of the runtime system.
Java class inheritance is modeled using JavaScript prototype chains. This gives us virtual dispatch for free and also lets us benefit from many of the optimizations in the JavaScript engine that are tuned to common JavaScript code patterns.
Classes are mapped to JavaScript as constructor functions that initialize class instance fields to their default values. In the VM such a constructor is referred to as a Klass. All instance fields of a class, as well as the instance fields of its base classes, live on the object itself and are initialized by the Klass constructor function. The important thing to remember is that class members can never be undefined.
function $java_lang_Object() {
this._hashCode = 0;
}
The above code snippet is the constructor for Java's java/lang/Object class and is also the Klass object itself. To create an instance of this class one would write:
var o = new $java_lang_Object();
The java/lang/Object class has no instance members, so there is nothing to initialize. All objects, however, have a _hashCode property that defaults to 0. We could define this property lazily whenever it is needed, but we try to never change the shape of Java objects, to keep the JavaScript engine happy.
Klass constructors are shared across all runtimes and are sometimes referred to as template klasses. Klass objects are constructed (eval'ed / bound) or loaded dynamically when new classes are loaded. A runtime-specific version of a Klass object is called a RuntimeKlass. These objects hold static properties, and multiple copies may exist, one per runtime. To access a runtime class, look it up on the current runtime object:
$.$java_lang_Object
The $ global property always refers to the current runtime (so there is no need to thread it around). The runtime object itself is an instance of the Runtime class, which has as its prototype (__proto__) an instance of the RuntimeTemplate class. All symbolic references to classes are added to the RuntimeTemplate as memoizing getters. When $.$java_lang_Object is accessed for the first time, a getter in the RuntimeTemplate triggers class loading and patches $.$java_lang_Object with the RuntimeKlass that represents the java/lang/Object class.
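The memoizing getter described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `loadRuntimeKlass`, `defineKlassGetter` and the `loadCount` counter are hypothetical names, and the real class loading is far more involved.

```javascript
// Hypothetical sketch of a memoizing getter on the RuntimeTemplate.
// loadRuntimeKlass, defineKlassGetter and loadCount are illustrative
// assumptions, not the VM's actual API.
var loadCount = 0;

function loadRuntimeKlass(runtime, className) {
  loadCount++; // Stand-in for real class loading and initialization.
  return { className: className, runtime: runtime, statics: {} };
}

function defineKlassGetter(template, mangledName, className) {
  Object.defineProperty(template, mangledName, {
    configurable: true,
    get: function () {
      // `this` is the runtime on which the property was accessed.
      var runtimeKlass = loadRuntimeKlass(this, className);
      // Patch the runtime with an own data property that shadows the getter,
      // so later accesses skip class loading entirely.
      Object.defineProperty(this, mangledName, {
        value: runtimeKlass,
        writable: false,
        configurable: true,
        enumerable: true
      });
      return runtimeKlass;
    }
  });
}

function RuntimeTemplate() {}
var template = new RuntimeTemplate();
defineKlassGetter(template, "$java_lang_Object", "java/lang/Object");

function Runtime() {}
Runtime.prototype = template; // Every runtime inherits the getters.

var $ = new Runtime();
var first = $.$java_lang_Object;  // Triggers the getter and class loading.
var second = $.$java_lang_Object; // Reads the patched own property.
```

Because the patched property lives on the runtime instance rather than on the template, each runtime gets its own RuntimeKlass while the getter itself stays shared.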
The correct way to create an instance of the java/lang/Object class is:
$.$java_lang_Object;
var o = new $java_lang_Object();
This ensures that the java/lang/Object class is initialized before it is used.
Consider the following Java classes.
class A {
static int baz = 1;
static void foo(int x) {}
int car = 2;
void bar(int x) {}
}
class B extends A {
static int cat;
void bar(int x) {}
}
Static class methods are stored in the global object, using a mangled name that is derived from the class name, method name, and method signature. This guarantees no name collisions:
var A_foo_ov20z = function () { ... };
Method signatures can be quite lengthy, so they are mangled to shorter names. Here the signature is mangled to ov20z.
Instance functions are stored on the $A.prototype object.
$A.prototype.B_bar_ov20z = function () { ... };
Static fields are stored in the RuntimeKlass object $.$A.
Linking up B to derive from A is straightforward. The overall code, along with the constructors, is:
function $A() {
this._hashCode = 0;
this.$car = 0;
}
function A_dinite_FBIwCo() {
java_lang_Object_dinite_FBIwCo.call(this);
this.$car = 2;
}
$A.prototype.xxx = ...;
function $B() {
this._hashCode = 0;
this.$car = 0;
}
function B_dinite_FBIwCo() {
A_dinite_FBIwCo.call(this);
}
$B.prototype = Object.create($A.prototype);
$B.prototype.xxx = ...;
An instance of class B is constructed and initialized with:
var o = new $B(); // Create empty object.
B_dinite_FBIwCo.call(o); // Call constructor.
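To illustrate how the prototype chain provides virtual dispatch, here is a minimal self-contained sketch. The property name `bar_ov20z` and the string return values are assumptions made for the example; the compiler's actual mangled names may differ.

```javascript
// Sketch: Java virtual dispatch via JavaScript prototype chains.
// The method name bar_ov20z is a hypothetical mangled name.
function $A() {
  this._hashCode = 0;
  this.$car = 0; // Default value; the Java constructor assigns 2 later.
}
function A_init() {
  this.$car = 2;
}
$A.prototype.bar_ov20z = function (x) { return "A.bar"; };

function $B() {
  this._hashCode = 0;
  this.$car = 0; // Inherited instance field lives on the object itself.
}
function B_init() {
  A_init.call(this); // Chain to the super constructor.
}
$B.prototype = Object.create($A.prototype);
$B.prototype.bar_ov20z = function (x) { return "B.bar"; }; // Override.

var a = new $A();
A_init.call(a);
var b = new $B();
B_init.call(b);

// The same call site dispatches to different methods depending on the
// receiver's prototype chain.
var results = [a, b].map(function (o) { return o.bar_ov20z(1); });
```

Here `results` holds the value returned by each receiver's own `bar` implementation, and both objects end up with the same set of own properties.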
Common exceptions such as NullPointerException don't need to be checked for explicitly. Null checks are very common in Java, and performing them explicitly would be quite expensive. Instead, we omit the checks and let the JavaScript engine throw a TypeError:
var a = null;
a.foo();
Whenever this TypeError is caught, it is translated into a proper Java NullPointerException and handled or rethrown. Check casts, array bounds checks, etc. still need explicit checks.
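The translation step might look something like the sketch below. `JavaNullPointerException`, `translateException` and `callCompiled` are hypothetical stand-ins, and the sketch assumes every TypeError comes from a null dereference, which a real implementation would need to be more careful about.

```javascript
// Sketch: translating an engine TypeError into a Java-level exception.
// All names here are illustrative assumptions, not the VM's actual API.
function JavaNullPointerException(message) {
  this.className = "java/lang/NullPointerException";
  this.message = message;
}

function translateException(e) {
  // Assumption: any TypeError is treated as a null dereference.
  if (e instanceof TypeError) {
    return new JavaNullPointerException(e.message);
  }
  return e; // Everything else passes through unchanged.
}

function callCompiled(fn) {
  try {
    return fn();
  } catch (e) {
    throw translateException(e);
  }
}

function compiledMethod() {
  var a = null;
  return a.foo(); // No explicit null check; the engine throws a TypeError.
}

var caught = null;
try {
  callCompiled(compiledMethod);
} catch (e) {
  caught = e;
}
```

The cost of the null check is paid only on the exceptional path, which is the point of leaving the check out of the compiled code.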
Class type checks are emitted as calls to two runtime methods: checkCastKlass(o, C) and checkCastInterface(o, C). These are implemented efficiently using a display table. Each klass holds a table of its base klasses, indexed by depth, along with its own depth in the class hierarchy. This table is created during class loading. The subtype check becomes a lookup in this table:
export function isAssignableTo(from: Klass, to: Klass): boolean {
if (to.isInterfaceKlass) {
return from.interfaces.indexOf(to) >= 0;
} else if (to.isArrayKlass) {
if (!from.isArrayKlass) {
return false;
}
return isAssignableTo(from.elementKlass, to.elementKlass);
}
return from.display[to.depth] === to;
}
See "Fast Subtype Checking in the HotSpot JVM" for details.
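As a rough sketch of how such a display table could be built at class loading time, assuming single inheritance for non-interface klasses: the helper names `initializeDisplay` and `isSubKlass` are hypothetical, and only the `display` and `depth` properties come from the code above.

```javascript
// Sketch: building a display table and using it for subtype checks.
// initializeDisplay and isSubKlass are hypothetical helper names.
function initializeDisplay(klass, superKlass) {
  klass.superKlass = superKlass;
  // The root klass has depth 0; each subclass is one level deeper.
  klass.depth = superKlass ? superKlass.depth + 1 : 0;
  // Copy the ancestors from the super klass, then append the klass itself.
  klass.display = superKlass ? superKlass.display.slice() : [];
  klass.display[klass.depth] = klass;
}

function isSubKlass(from, to) {
  // A single indexed load and comparison, independent of hierarchy depth.
  return from.display[to.depth] === to;
}

var objectKlass = { name: "java/lang/Object" };
var aKlass = { name: "A" };
var bKlass = { name: "B" };
var cKlass = { name: "C" }; // Unrelated to A and B.

initializeDisplay(objectKlass, null);
initializeDisplay(aKlass, objectKlass);
initializeDisplay(bKlass, aKlass);
initializeDisplay(cKlass, objectKlass);
```

With this layout, `bKlass.display` is `[objectKlass, aKlass, bKlass]`, so checking whether B is assignable to A only inspects the entry at A's depth.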
The VM supports multi-threading. This means that a thread may need to yield to another thread. Suspending and resuming threads is easy for the interpreter since it has full control of the stack, but more difficult for the compiler.
A root set of methods that can potentially yield needs to be annotated as such. The compiler performs a call graph analysis to determine which call sites yield, and for each such call site it emits code that saves the state of the current frame following the call. Consider the following function, which calls a method y that yields.
int x() {
int a = 1;
int b = 2 + y();
int c = 3;
return a + b + c;
}
Method x is compiled as:
function x() {
var t = y();
if (U) { $.B(7, [1, 0, 0], [2]); return; }
return t + 6;
}
The state of the Java frame at the y call site (bytecode position 7) is locals = [1, 0, 0], stack = [2].
function y() {
if (...) {
U = true; // Yield
return;
}
}
If y yields by setting the global flag U to true, then the code following the call to y saves its frame by calling $.B and returns. This unwinds the stack of compiled methods all the way up to an interpreter frame, where a proper interpreter call stack can be built and saved. This is possible because the compiler keeps a mapping of live values at each yielding call site.
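The unwinding protocol can be exercised end to end with a small runnable sketch. Here `$.B` merely records the frame state in an array, and `savedFrames` and `shouldYield` are hypothetical names; the real runtime would rebuild interpreter frames from this information.

```javascript
// Sketch: yielding from compiled code by unwinding and saving frames.
// savedFrames and shouldYield are illustrative assumptions.
var U = false;        // Global yield flag.
var savedFrames = []; // Frames recorded while unwinding.
var $ = {
  // Stand-in for $.B: record bytecode position, locals and stack.
  B: function (bci, locals, stack) {
    savedFrames.push({ bci: bci, locals: locals, stack: stack });
  }
};

var shouldYield = true; // Hypothetical condition checked by y.

function y() {
  if (shouldYield) {
    U = true; // Yield.
    return;
  }
  return 10;
}

function x() {
  var t = y();
  if (U) { $.B(7, [1, 0, 0], [2]); return; }
  return t + 6;
}

var yielded = x(); // y yields, so x saves its frame and unwinds.
```

After the call, `savedFrames` holds one entry describing x at bytecode position 7. Once `U` and `shouldYield` are reset, calling `x()` again runs to completion.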
Yielding code is verbose, so it's important that we only insert it when we really need to. The call graph analysis is conservative, and more work should be done to make it precise. Yielding code also needs to be inserted whenever a monitor enter bytecode is executed, or right after a class initialization check, as the static constructor could trigger yielding.
The compiler uses a name mangling scheme to uniquely encode references to class names, fields and methods. Names are hashed to 32-bit numbers and then encoded using a variable-length encoding over the following 64 characters: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789$_, combinations of which happen to be valid identifier names.
Classes are mangled by escaping the package name and class name.
Methods are mangled by escaping the class name, method name and hashing the signature.
Field names are mangled by ensuring there are no naming conflicts in the class hierarchy.
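A minimal sketch of the hash-then-encode step follows. The document does not specify the hash function or the exact encoding, so both are assumptions here: the hash is a simple 31-multiplier string hash, and the encoding emits six bits per character, lowest bits first, stopping once the remaining value is zero.

```javascript
// Sketch: hashing a name to 32 bits and encoding it with 64 characters.
// The hash function and the bit order are assumptions for illustration.
var CHARS =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789$_";

function hash32(s) {
  var h = 0;
  for (var i = 0; i < s.length; i++) {
    // Math.imul keeps the multiplication within 32-bit integer range.
    h = (Math.imul(h, 31) + s.charCodeAt(i)) | 0;
  }
  return h >>> 0; // Interpret as an unsigned 32-bit number.
}

function encode(n) {
  var out = "";
  do {
    out += CHARS[n & 0x3f]; // Low six bits select one of 64 characters.
    n >>>= 6;
  } while (n !== 0);
  return out; // Between 1 and 6 characters for a 32-bit input.
}

function mangleSignature(signature) {
  return encode(hash32(signature));
}

var mangled = "A_foo_" + mangleSignature("(I)V");
```

Since an encoded hash may begin with a digit, it only forms a valid JavaScript identifier when it follows a prefix, as in the `A_foo_` example above.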