Mappings v2: More Efficient Encoding #155
Comments
Updated the gist and analysis with 2 new modifications to the bit flags design:
Omit Names

In the bit flags design, we're using 2 bits to represent 3 mapping lengths, leaving us with a valid 4th value. We can use that 4th value to signal when a 5-length mapping has a 0 delta in the `namesIndex`:

function readMapping() {
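  const data = readPosInt();
  // Bits 0 and 2 hold the length code: 1, 4, or 5 fields. The spare code 0
  // marks a 5-length mapping whose names index is written explicitly.
  const length = data & 0b0101; // 0, 1, 4, or 5
  const sourcesIndexPresent = data & 0b0010;
  const sourceLinePresent = data & 0b1000;
  const namesIndexPresent = length === 0;
  const genCol = data >>> 4;
  if (length === 1) return [genCol];
  // An absent field means a 0 delta, so the previous value carries over.
  const index = sourcesIndexPresent ? readInt() : lastIdx;
  const line = sourceLinePresent ? readInt() : lastLine;
  const col = readInt();
  if (length === 4) return [genCol, index, line, col];
  // Length code 5 means the name delta is 0, so the last name is reused.
  const name = namesIndexPresent ? readInt() : lastName;
  return [genCol, index, line, col, name];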
}

This saves an additional 1%.

Remove sourceless (1-length) mappings

The other option is to remove sourceless segments entirely. 99.999999999% of mappings have some original source location, but we're using 2 bits to encode the 3 mapping lengths. If we remove sourceless segments, then we only need 1 bit to represent 4/5 length mappings.

function readMapping() {
  const data = readPosInt();
  // Bit 0 distinguishes 4-length (0) from 5-length (1) mappings.
  const length = 4 | (data & 0b001);
  const sourcesIndexPresent = data & 0b010;
  const sourceLinePresent = data & 0b100;
  const genCol = data >>> 3;
  const index = sourcesIndexPresent ? readInt() : lastIdx;
  const line = sourceLinePresent ? readInt() : lastLine;
  const col = readInt();
  if (length === 4) return [genCol, index, line, col];
  return [genCol, index, line, col, readInt()];
}

This saves an additional 6% (in everything but the Closure source maps used by Google).
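For the writer side, the flag bits for both variants could be computed roughly as follows. This is only a sketch: `omitNamesFlags` and `noSourcelessFlags` are hypothetical helper names, and the arguments are assumed to be the deltas of each field relative to the previous mapping, so that a 0 delta means the field can be left out.

```javascript
// Flag bits for the "Omit Names" layout (assumed to mirror the reader):
//   bits 0 and 2: length code, bit 1: sourcesIndex present, bit 3: sourceLine present.
function omitNamesFlags(length, idxDelta, lineDelta, nameDelta) {
  if (length === 1) return 0b0001; // sourceless mapping, no other flags apply
  let lengthCode;
  if (length === 4) lengthCode = 0b0100;
  else if (nameDelta === 0) lengthCode = 0b0101; // 5 fields, name omitted
  else lengthCode = 0b0000; // 5 fields, name written explicitly
  return lengthCode
    | (idxDelta !== 0 ? 0b0010 : 0)
    | (lineDelta !== 0 ? 0b1000 : 0);
}

// Flag bits for the layout without sourceless mappings:
//   bit 0: has a name (5-length), bit 1: sourcesIndex present, bit 2: sourceLine present.
function noSourcelessFlags(length, idxDelta, lineDelta) {
  if (length !== 4 && length !== 5) {
    throw new RangeError(`unsupported mapping length ${length}`);
  }
  return (length & 1)
    | (idxDelta !== 0 ? 0b010 : 0)
    | (lineDelta !== 0 ? 0b100 : 0);
}
```

The writer would then emit `(genCol << 4) | flags` for the first layout and `(genCol << 3) | flags` for the second, followed by only the fields whose flag is set.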
In the last Scopes meeting, I mentioned that I had been working on a project that reduced Google's module graph encoding by ~30% by using a packed VLQ encoding, essentially removing any separators like `,` or `;`. Applying this to our mappings encoding, I think we can remove ~30% (or ~50% if we switch to an 8-bit VLQ and binary encoding).

Removing Separators
In order to remove separators, we first need to know exactly how many lines are present in the map, and how many mappings are present on each line. That should allow us to do a simple loop (ignore the relative deltas, this is just pseudocode):
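A minimal sketch of such a loop, under a few assumptions that are not part of the proposal itself: `values` is a hypothetical array of already-decoded VLQ integers, the line count and each line's mapping count are stored as plain integers ahead of the data, and (for this first step only) every mapping has exactly 4 fields. Deltas are left unresolved.

```javascript
// Decode a separator-free stream of integers into lines of mappings.
// Layout assumed here: lineCount, then for each line: mappingCount, then 4 fields per mapping.
function decodeFixed(values) {
  let pos = 0;
  const readInt = () => values[pos++];

  const lines = [];
  const lineCount = readInt();
  for (let i = 0; i < lineCount; i++) {
    const mappingCount = readInt();
    const line = [];
    for (let j = 0; j < mappingCount; j++) {
      // Simplification: every mapping is [genColumn, sourcesIndex, sourceLine, sourceColumn].
      line.push([readInt(), readInt(), readInt(), readInt()]);
    }
    lines.push(line);
  }
  return lines;
}
```

Because the counts are known up front, the decoder never has to look for `,` or `;` to know where a mapping or a line ends.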
The problem is that each mapping has a variable number of fields (either 1, 4, or 5). Without a `,` separator, we don't know when to stop reading the fields for the current mapping. So we also need to encode the length of each mapping. It's easy to do this with a length field before each mapping.

This alone is pretty good, but we can still do better.
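A sketch of the length-prefixed variant, assuming `values` is a hypothetical array of already-decoded VLQ integers laid out as line count, then per line a mapping count, then per mapping a length field followed by that many fields. Deltas are left unresolved.

```javascript
// Decode a separator-free stream where each mapping is preceded by its field count.
function decodeWithLengths(values) {
  let pos = 0;
  const readInt = () => values[pos++];

  const lines = [];
  const lineCount = readInt();
  for (let i = 0; i < lineCount; i++) {
    const mappingCount = readInt();
    const line = [];
    for (let j = 0; j < mappingCount; j++) {
      const length = readInt(); // expected to be 1, 4, or 5
      const mapping = [];
      for (let k = 0; k < length; k++) mapping.push(readInt());
      line.push(mapping);
    }
    lines.push(line);
  }
  return lines;
}
```

The cost is one extra integer per mapping, which is what the next step tries to avoid.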
`genColumn` is frequently very small, just a few bits of data. Instead of wasting 8 bits to encode the length of the mapping, we can use the low bits of `genColumn`.

We can still do better.
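One way to fold the length into the low bits of `genColumn` is sketched below. The 2-bit tag assignment (0, 1, 2 for lengths 1, 4, 5) and the helper names `packHeader` and `unpackHeader` are assumptions made for illustration; the proposal's actual bit layout may differ.

```javascript
// Hypothetical tag table: a 2-bit tag selects the mapping length. Tag 3 is unused.
const TAG_TO_LENGTH = [1, 4, 5];

// Pack a (possibly negative) genColumn delta and a length into one integer.
// JavaScript bitwise operators work on 32-bit integers, so genColumn must fit in 30 bits.
function packHeader(genColumn, length) {
  const tag = TAG_TO_LENGTH.indexOf(length);
  if (tag === -1) throw new RangeError(`unsupported mapping length ${length}`);
  return (genColumn << 2) | tag;
}

// Split the header back into its parts. The arithmetic shift (>>) keeps the sign.
function unpackHeader(header) {
  return { genColumn: header >> 2, length: TAG_TO_LENGTH[header & 0b11] };
}
```

The packed header is then written as a single VLQ, so the separate length field disappears.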
`genColumn` is never negative in practice. Instead of using zigzag encoding, we could just encode it as a positive int.

Just eliminating separators can save us ~10-15%.
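To see why dropping the sign helps, the sketch below compares the signed base64 VLQ form used by current source maps (the sign is stored in the lowest bit, which is what the text above refers to as zigzag) with a plain unsigned form. `encodeUnsigned` and `encodeSigned` are illustrative names.

```javascript
const BASE64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Encode a non-negative integer as base64 VLQ:
// each digit carries 5 data bits, and bit 5 is the continuation flag.
function encodeUnsigned(n) {
  let out = "";
  do {
    let digit = n & 0b11111;
    n >>>= 5;
    if (n > 0) digit |= 0b100000;
    out += BASE64[digit];
  } while (n > 0);
  return out;
}

// Signed form: shift the magnitude left by one and store the sign in bit 0.
function encodeSigned(n) {
  return encodeUnsigned(n < 0 ? (-n << 1) | 1 : n << 1);
}
```

The sign bit costs one of the five data bits in the first digit, so a value such as 16 needs two digits in the signed form (`gB`) but only one in the unsigned form (`Q`).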
Omitting `sourcesIndex` and `sourceLine`
The next thing that I've noticed is that `sourcesIndex` rarely changes between mappings, and the same goes for `sourceLine`. This makes a lot of sense: if we're transpiling, we'll be outputting a lot of mappings that are on the same line as the previous one.

If the `sourcesIndex` delta or the `sourceLine` delta is 0, we could omit it from the encoding. This just requires 2 more bits, bringing our total to 4 bits of data packing. We can still encode this pretty easily in `genColumn`.

This saves us ~25-35%.
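A round-trip sketch of the 4-bit packing follows. The layout is an assumption chosen to match the flags discussed in this thread: bits 0 and 2 carry the length (1, 4, or 5), bit 1 marks a present `sourcesIndex` delta, bit 3 marks a present `sourceLine` delta, and the remaining high bits hold a non-negative `genColumn` delta. Mappings are arrays of deltas, and the stream is a plain array of integers standing in for VLQs.

```javascript
// Append one mapping (an array of 1, 4, or 5 deltas) to the integer stream `out`.
function writeMapping(m, out) {
  const [genCol, srcIdx = 0, srcLine = 0, srcCol, name] = m;
  let flags = m.length; // 1, 4, and 5 already fit the 0b0101 mask
  if (srcIdx !== 0) flags |= 0b0010;
  if (srcLine !== 0) flags |= 0b1000;
  out.push((genCol << 4) | flags);
  if (m.length === 1) return;
  if (srcIdx !== 0) out.push(srcIdx); // 0 deltas are omitted entirely
  if (srcLine !== 0) out.push(srcLine);
  out.push(srcCol);
  if (m.length === 5) out.push(name);
}

// Read one mapping back. `input` is a hypothetical cursor: { values, pos }.
function readMapping(input) {
  const readInt = () => input.values[input.pos++];
  const data = readInt();
  const length = data & 0b0101;
  const genCol = data >>> 4;
  if (length === 1) return [genCol];
  const srcIdx = data & 0b0010 ? readInt() : 0; // absent means a 0 delta
  const srcLine = data & 0b1000 ? readInt() : 0;
  const srcCol = readInt();
  if (length === 4) return [genCol, srcIdx, srcLine, srcCol];
  return [genCol, srcIdx, srcLine, srcCol, readInt()];
}
```

For a mapping like `[0, 0, 0, 5]`, where both the source index and the source line are unchanged, the writer emits 2 integers instead of 4.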
Analysis sheet, code
This is a highlight from Google Search's internal source map: