Old 10 December 2021, 18:28   #68
ross
Defendit numerus
 
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,474
Quote:
Originally Posted by introspec
Can you document the specific format extension that you are using for your project? I.e., to have larger window size, you had to modify the end of the stream marker. How exactly did you do it? Did you make any other changes to the compressed data format?
Yes, the bitcode is slightly different and the end-of-stream marker is moved to the literal run path.
Everything has been designed to be as fast as possible on the 68k architecture (and it is much faster than the original decoder).
The Elias bits are now inserted in two different configurations, returning positive or negative numbers depending on usage;
since I use unrolled code this is not a problem at all, but you can no longer use a subroutine to extract bits.
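To give an idea of the "positive or negative" point, here is a minimal Python sketch, not the actual 68k code. It assumes the usual interlaced Elias-gamma layout (a 0 control bit means one more data bit follows, a 1 ends the number). The `BitReader` class and both function names are hypothetical and only there for illustration. The real bit layout of the modified format may differ.

```python
class BitReader:
    """Hypothetical helper: serves bits from a list, first element first."""

    def __init__(self, bits):
        self.bits = list(bits)
        self.pos = 0

    def read_bit(self):
        bit = self.bits[self.pos]
        self.pos += 1
        return bit


def read_gamma_positive(reader):
    """Interlaced Elias-gamma (assumed layout): control bit 0 = one more data bit follows."""
    value = 1
    while reader.read_bit() == 0:
        value = (value << 1) | reader.read_bit()
    return value


def read_gamma_negative(reader):
    """Same bits, but the caller gets the number already negated (e.g. for offsets)."""
    return -read_gamma_positive(reader)


print(read_gamma_positive(BitReader([0, 0, 0, 1, 1])))
print(read_gamma_negative(BitReader([0, 1, 1])))
```

In an unrolled decoder each call site would get its own copy of the variant it needs, which is why a shared bit-extraction subroutine no longer fits.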

The single bit 'unrelated' to the offset (the one used for the match length, which sits in the LSB position of the read byte) is moved into the main bitcode stream. In its place, but in the MSB position, there is now an (Elias) bit that tells me whether I have a short (7-bit) offset or a long one; a long offset is completed with the streamed Elias-gamma bits (to form an offset of up to 2^23, encoded as negative).
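A rough Python sketch of the offset byte idea, under loudly stated assumptions: the polarity of the MSB flag (set = long offset), the way the gamma value is combined with the low 7 bits, and the +1 bias are all guesses made for illustration. Only the "short 7-bit vs. long, up to 2^23, negative" part comes from the description above. `read_gamma` and `decode_offset` are hypothetical names.

```python
def read_gamma(bits):
    """Interlaced Elias-gamma from an iterator of bits (assumed layout, 0 = continue)."""
    value = 1
    while next(bits) == 0:
        value = (value << 1) | next(bits)
    return value


def decode_offset(first_byte, bits):
    """Return a negative offset from one byte plus optional streamed gamma bits."""
    low7 = first_byte & 0x7F
    if (first_byte & 0x80) == 0:
        # Assumed polarity: MSB clear means a short offset held entirely in 7 bits.
        return -(low7 + 1)
    # Assumed combination: the gamma value supplies the bits above the low 7.
    high = read_gamma(bits) - 1
    return -(((high << 7) | low7) + 1)


print(decode_offset(0x05, iter([])))
print(decode_offset(0x85, iter([0, 1, 1])))
```

With these assumptions a gamma value with 16 data bits on top of the 7 in the byte reaches a magnitude of 2^23, which matches the stated maximum.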

The exit is moved to the literal run for two reasons:
- it can potentially shorten the bitcode (and adds a free helper for in-place unpacking)
- it's the only way to handle >32k literals, which would otherwise overflow the bitcoding
It basically adds an escape code that copies up to 163835 bytes at a time (repeatable 'ad infinitum'); it requires exactly two bytes + 1 bit (to re-align to the bit stream).

If your encoding ends with a match (either rep or normal), add the specific 31-bit exit code, which ends decoding without further copying (direct exit).
If your encoding ends with literals, you can freely use a copy of up to 32764 bytes.
If your encoding ends with more than 32764 literals, you can use the escape code (literals overflow) until the queue is exhausted.
During normal code flow you can directly use up to 32769 bytes of literals.
This way there is also an escape code that can be used to facilitate in-place unpacking.
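The rules for the end of the stream can be sketched from the encoder side like this. It is only an illustration in Python: `plan_trailing_literals` is a hypothetical name, the escape chunk limit is passed in as a parameter (using the figure quoted above in the example call), and the sketch assumes the stream may end right after an escape chunk, which is not spelled out here.

```python
FINAL_MAX = 32764  # from the post: longest literal run that can end the stream directly


def plan_trailing_literals(count, escape_max):
    """Split a trailing literal run into escape chunks plus an optional final copy."""
    plan = []
    remaining = count
    while remaining > FINAL_MAX:
        # Too long for a direct final copy: peel off one escape chunk.
        chunk = min(remaining, escape_max)
        plan.append(("escape", chunk))
        remaining -= chunk
    if remaining > 0:
        plan.append(("final", remaining))
    return plan


print(plan_trailing_literals(100, 163835))
print(plan_trailing_literals(200000, 163835))
```

An encoder ending on a match would skip all of this and emit the direct-exit code instead.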

Well, yes, it can be confusing without examples, but the 68k handles it very well thanks to some implicit properties.
The source code (both encoder and decoder) is probably the best way to understand it correctly.