Skip to content

Voodoo ZPAQL

Franco Corbelli edited this page Sep 19, 2023 · 1 revision

ZPAQL

ZPAQ supports arbitrary compression algorithms in addition to the built in levels. For example, method "x4.0c0.0.255.255i4" compression could alternatively be specified using the ZPAQL language description of the compression algorithm:

int args[9]={0} c.startBlock( "(min.cfg - equivalent to level 1) " "comp 1 2 0 0 2 (log array sizes hh,hm,ph,pm and number of components n) " " 0 icm 16 (order 2 indirect context model using 4 MB memory) " " 1 isse 19 0 (order 4 indirect secondary symbol estimator, 32 MB) " "hcomp (context computation, input is last modeled byte in A) " " *b=a a=0 (save in rotating buffer M pointed to by B) " " d=0 hash b-- hash *d=a (put order 2 context hash in H[0] pointed by D)" " d++ b-- hash b-- hash d=a (put order 4 context in H[1]) " " halt " "end " (no pre/post processing) ", args, // Arguments $1 through $9 to ZPAQL code (unused, can be NULL) &out); // Writer to write pcomp command (default is NULL)

The first argument is a description of the compression algorithm in the ZPAQL language. It is compiled into byte code and saved in the archive block header so that the decompressor knows how read the data. A ZPAQL program accepts up to 9 numeric arguments, which should be passed in array.

A decompression algorithm has two optional parts, a context mixing model and a postprocessor. The context model is identical for both the compressor and decompressor, so is used in both instances. The postprocessor, if present, is generally different, which presents the possibility that the user supplied code may not restore the original data exactly. It is assumed that the input has already been preprocessed by the application but that the hash supplied to endSegment() is of the original input before preprocessing. The following functions allow you to test the postprocesser during compression:

c.setVerify(true); // before c.compress(), may run slower c.getSize(); // return 64 bit size of postprocessed output c.getChecksum(); // after c.readSegmentEnd(), return hash, reset size to 0

This example has no postprocessor, but if it did, then setVerify(true) would cause compress() to run the preprocessed input through the postprocessor and into a SHA1 in parallel with compression. Then, getChecksum() would return the hash which could be compared with the hash of the input computed by the application. Also,

int64_t size; c.endSegmentChecksum(&size, true);

instead of c.endSegment() will automatically add the computed checksum if setVerify is true and return the checksum, whether or not there is a postprocessor. If &size is not NULL then the segment size is written to size. If the second argument is false then the computed checksum is not saved to output. Default is true. If setVerify is false, then no checksum is saved and the function returns 0 with size not written.

A context model consists of two parts, an array COMP of n components, and some code HCOMP that computes contexts for the components. The model compresses one bit at a time (MSB to LSB order) by computing a probability that the next bit will be a 1, and then arithmetic coding that bit. Better predictions compress smaller.

If n is 0 then the data is uncompressed. Otherwise, there is an array of n = 1..255 components each taking a context and possibly the predictions of previous components as input and outputting a new prediction. The output of the last prediction is used to encode the bit. After encoding, the state of each component is updated to reduce the prediction error when the same context occurs again. Components are as follows. Most arguments have range 0...255.

CONST c predict a 1 (c > 128) or 0 (c < 128). CM s t context model with 2^s contexts, learning rate 1/4t. ICM s indirect context model with 2^(s+6) contexts. MATCH s b match model with 2^s context hashes and 2^b history. AVG j k wt average components j and k with weight wt/256 for j. MIX2 s j k r x average j and k with 2^s contexts, rate r, mask x. MIX s j m r x average j..j+m-1 with 2^s contexts, rate r, mask x. ISSE s j adjust prediction j using 2^(s+6) indirect contexts. SSE s j t1 t2 adjust j using 2^s direct contexts, rate 1/t1..1/4t2.

A CONST predicts a 1 with probability 1/(1+exp((128-c)/16)), i.e numbers near 0 or 255 are the most confident.

A CM maps a context to a prediction and a count. It is updated by adjusting the prediction to reduce the error by 1/count and incrementing the count up to 4t.

A ICM maps a s+10 bit context hash to a bit history (8 bit state) representing a bounded count of zeros and ones previously seen in the context and which bit was last. The bit history is mapped to a prediction, which is updated by reducing the error by 1/1024. The initial prediction is estimated from the counts represented by each bit history.

A MATCH looks up a context hash and predicts whatever bit came next following the previous occurrence in the history buffer. The strength of the prediction depends on the match length.

AVG, MIX2, and MIX perform weighted averaging of predictions in the logistic domain (log(p/(1-p))). AVG uses a fixed weight. MIX2 and MIX adjust the weights (selected by context) to reduce prediction error by a rate that increases with r. The mask is AND-ed with the current partially coded byte to compute that context. Normally it is 255. A MIX takes a contiguous range of m components as input.

ISSE adjusts a prediction using a bit history (as with an ICM) to select a pair of weights for a 2 input MIX. It mixes the input prediction with a constant 1 in the logistic domain.

SSE adjusts a logistic prediction by quantizing it to 32 levels and selecting a new prediction from a table indexed by context, interpolating between the nearest two steps. The nearest prediction error is reduced by 1/count where count increments from t1 to 4*t2.

Contexts are computed and stored in an array H of 32 bit unsigned integers by the HCOMP program written in ZPAQL. The program is called after encoding a whole byte. To form a complete context, these values are combined with the previous 0 to 7 bits of the current parital byte. The method depends on the component type as follows:

CM: H[i] XOR hmap4(c). ICM, ISSE: hash table lookup of (H[i]*16+c) on nibble boundaries. MIX2, MIX: H[i] + (c AND x). SSE: H[i] + c.

where c is the previous bits with a leading 1 bit (1, 1x, 1xx, ..., 1xxxxxxx where x is a previously coded bit). hmap4(c) maps c to a 9 bit value to reduce cache misses. The first nibble is mapped as before and the second nibble with 1xxxx in the high 5 bits. For example, after 6 bits, where c = 1xxxxxx, hmap4(c) = 1xxxx01xx with the bits in the same order.

There are two ZPAQL virtual machines, HCOMP to compute contexts and PCOMP to post-process the decoded output. Each has the following state:

PC: 16 bit program counter. A, B, C, D, R0...R255: 32 bit unsigned registers. F: 1 bit condition register. H: array of 2^h 32 bit unsigned values (output for HCOMP). M: array of 2^m 8 bit unsigned values.

All values are initialized to 0 at the beginning of a block and retain their values between calls. There are two machines. HCOMP is called after coding each byte with the value of that byte in A. PCOMP, if present, is called once for each decoded byte with that byte in A, and once more at the end of each segment with 2^32 - 1 in A.

Normally, A is an accumulator. It is the destination of all binary operations except assignment. The low m bits of B and C index M. The low h bits of D indexes H. We write *B, *C, *D to refer to the elements they point to. The instruction set is as follows, where X is A, B, C, D, *B, *C, *D except as indicated. X may also be a constant 0...255, written with a leading space if it appears on the right side of an operator, e.g. "*B= 255". Instructions taking a numeric argument are 2 bytes, otherwise 1. Arithmetic is modulo 2^32.

X<>A Swap X with A (X cannot be A). X++ Add 1. X-- Subtract 1. X! Complement bits of X. X=0 Clear X (1 byte instruction). X=X Assignment to left hand side. A+=X Add to A A-=X Subtract from A A*=X Multipy A/=X Divide. If X is 0 then A=0. A%=X Mod. If X is 0 then A=0. A&=X Clear bits of A that are 0 in X. A&~X Clear bits of A that are 1 in X. A|=X Set bits of A that are 1 in X. A^=X Complement bits of A that are set in X. A<<=X Shift A left by (X mod 32) bits. A>>=X Shift right (zero fill) A by (X mod 32) bits. A==X Set F=1 if equal else F=0. AX Set F=1 if greater else F=0. X=R N Set A,B,C,D to RN (R0...R255). R=A N Set R0...R255 to A. JMP N Jump N=-128...127 bytes from next instruction. JT N Jump N=-128...127 if F is 1. JF N Jump N=-128...127 if F is 0. LJ N Long jump to location 0...65535 (only 3 byte instruction). OUT Output A (PCOMP only). HASH A=(A+*B+512)*773. HASHD *D=(*D+A+512)*773. HALT Return at end of program. ERROR Fail if executed.

Rather than using jump instructions, the following constructs are allowed and translated appropriately.

IF ... ENDIF Execute if F is 1. IFNOT ... ENDIF Execute if F is 0. IF ... ELSE ... ENDIF Execute first part if F is 1 else second part. IFNOT ... ELSE ... ENDIF Execute first part if F is 0 else second part. DO ... WHILE Loop while F is 1. DO ... UNTIL Loop while F is 0. DO ... FOREVER Loop unconditionally.

Forward jumps (IF, IFNOT, ELSE) will not compile if beyond 127 instructions. In that case, use the long form (IFL, IFNOTL, ELSEL). DO loops automatically use long jumps if needed. IF and DO loops may intersect. For example, DO ... IF ... FOREVER ENDIF is equivalent to a while-loop.

A config argument without a postprocessor has the following syntax:

COMP hh hm ph pm n i COMP args... HCOMP zpaql... END (or POST 0 END for backward compatibility)

With a postprocessor:

COMP hh hm ph pm n i COMP args... HCOMP zpaql... PCOMP command args... ; zpaql... END

In HCOMP, H and M have sizes 2^hh and 2^hm respectively. In PCOMP, H and M have sizes 2^ph and 2^pm respectively. There are n components, which must be numbered i = 0 to n-1. If a postprocessor is used, then "command args..." is written to the Writer* passed as the 4'th argument, but otherwise ignored. A typical use in a development environment might be to call an external program that will be passed two additional arguments on the command line, the input and output file names respectively.

You can pass up to 9 signed numeric arguments in args[]. In any place that a number "N" is allowed, you can write "$M" or "$M+N" (like "$1" or $9+25") and value args[M-1]+N will be substituted.

ZPAQL allows (nested) comments in parenthesis. It is not case sensitive. If there are input errors, then error() will report the error. If the string contains newlines, it will report the line number of the error.

ZPAQL is compiled internally into a byte code, and then to native x86 32 or 64 bit code (unless compiled with -DNOJIT, in which case the byte code is interpreted). You can also specify the algorithm directly in byte code, although this is less convenient because it requires two steps:

c.startBlock(hcomp); // COMP and HCOMP at start of block c.postProcess(pcomp, 0); // PCOMP right before compress() in first segment

This is necessary because the COMP and HCOMP sections are stored in the block header, but the PCOMP section is compressed in the first segment after the filename and comment but before any data.

To retrive compiled byte code in suitable format after startBlock():

c.hcomp(&out); // writes COMP and HCOMP sections c.pcomp(&out); // writes PCOMP section if any

Or during decompression:

d.hcomp(&out); // valid after findBlock() d.pcomp(&out); // valid after decompress(0) in first segment

Both versions of pcomp() write nothing and return false if there is no PCOMP section. The output of hcomp() and pcomp() may be passed to the input of startBlock() and postProcess(). These are strings in which the first 2 bytes encode the length of the rest of the string, least significant byte first. Alternatively, postProcess() allows the length to be omitted and passed separately as the second argument. In the case of decompression, the HCOMP and PCOMP strings are read from the archive. The preprocessor command (from "PCOMP cmd ;") is not saved in the compressed data.

ARRAY

The libzpaq::Array template class is convenient for creating arrays aligned on 64 byte addresses. It calls error("Out of memory") if needed. It is used as follows:

libzpaq::Array a(n); // array a[0]..a[n-1] of type T, zeroed a.resize(n); // change size and zero contents a[i] // i'th element a(i) // a[i%n], valid only if n is a power of 2 a.size() // n (as a size_t) a.isize() // n (as a signed int)

T should be a simple type without constructors or destructors. Arrays cannot be copied or assigned. You can also specify the size:

Array a(n, e); // n << e a.resize(n, e); // n << e

which is equivalent to n << e except that it calls error("Array too big") rather than overflow if n << e would require more than 32 bits. If compiled with -DDEBUG, then bounds are checked at run time.

Clone this wiki locally