16 April 2017, 17:44 | #521 | |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,474
|
Quote:
Quite good and fast format (for the autoimposed limits) but sure you haven't missed http://eab.abime.net/showpost.php?p=...&postcount=471? Code:
1.024 bootblock.bin 877 bootblock.bin.lz32b 922 bootblock.bin.wlz.ref 68.708 copchunky.exe 49.147 copchunky.exe.lz32b 51.482 copchunky.exe.wlz.ref 449.132 simple.bin 15.893 simple.bin.lz32b 16.884 simple.bin.wlz.ref 164.344 twister.exe 64.063 twister.exe.lz32b 66.028 twister.exe.wlz.ref Mixed 8/16bit packed data and 32byte decoder: http://eab.abime.net/showpost.php?p=...&postcount=480 Bye! ross |
|
16 April 2017, 21:21 | #522 | ||
Registered User
Join Date: Feb 2017
Location: Denmark
Posts: 1,104
|
Quote:
FWIW my final byte-oriented coder performs as follows: Quote:
Code:
; a0 = dest ; a1 = src unpack: movem.l d2-d3/a2, -(a7) .refill: moveq #8, d0 ; make sure upper bits of d0 are clear .getcontrolbits: move.w d0, d1 ; d1 = number of control bits (8 normally, less on short run) move.b (a1)+, d0 ; d0 = control bits bne.s .mainloop ; special case? move.b (a1)+, d0 ; yes load another byte beq.s .mainloop ; the control bits just happened to be 0 cmp.b d0, d1 ; See if this is a short run/the end or a literal run beq.s .quit ; =8 -> quit bcc.s .getcontrolbits ; <8 -> the byte specified the number of control bits, d0 is copied to d1 and refilled (with a value guaranteed not to be 0) addq.w #6, d0 ; >8 -> literal run length - 7, add 6 (for dbf) .litcopy: move.b (a1)+, (a0)+ dbf d0, .litcopy bra.s .refill .mainloop: ; d0.b = control bits ; d1.b = num control bits subq.b #1, d1 ; we consume one control bit bmi.s .refill ; are we out of bits? add.b d0, d0 ; get control bit bcc.s .match ; clear -> match move.b (a1)+, (a0)+ ; copy literal bra.s .mainloop ; and continue loop .match: moveq #-1, d2 ; d2 = !!!!!!!!!!!!!!!! !!!!!!!!!!!!!!!! move.b (a1)+, d2 ; d2 = !!!!!!!!!!!!!!!! ObOaO9O8L3L2L1L0 move.b d2, d3 ; d3 = ???????????????? ObOaO9O8L3L2L1L0 lsl.w #4, d2 ; d2 = !!!!!!!!ObOaO9O8 L3L2L1L0........ move.b (a1)+, d2 ; d2 = !!!!!!!!ObOaO9O8 O7O6O5O4O3O2O1O0 lea (a0,d2.w), a2 ; a2 = &out[-1-offset] move.b (a2)+, (a0)+ ; copy first byte moveq #$f, d2 ; d2 = ................ ........!!!!!!!! and.w d2, d3 ; d3 = ................ ........L3L2L1L0 bne.s .copyloop move.b (a2)+, (a0)+ ; copy another byte move.b (a1)+, d3 ; get extra length add.w d2, d3 ; + $f (and with one extra iteration from dbf and the above copies we get +$12 (18)) .copyloop: move.b (a2)+, (a0)+ ; copy the rest of the match dbf d3, .copyloop bra.s .mainloop ; handle next lz value .quit: movem.l (a7)+, d2-d3/a2 ; restore registers rts ; and return |
||
17 April 2017, 15:12 | #523 | |||
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,474
|
Quote:
It's useful for two thing: bootblock packing (surprisingly good!) and code prepacking/deduplication. My real (unborn) packer, a sort of titanics replacer, is overbloated, so a prepacker was a necessity. I assure you that i've (mine) better compressors algorithms Quote:
Quote:
Bye! ross |
|||
02 May 2017, 09:24 | #524 |
Registered User
Join Date: Mar 2016
Location: Ozherele
Posts: 229
|
This message was misplaced. Please delete it. It was for the other thread.
Last edited by litwr; 02 May 2017 at 09:33. Reason: it was a wrong place for this message. |
22 September 2018, 20:43 | #525 | |
Registered User
Join Date: Dec 2007
Location: Aarhus / Denmark
Posts: 40
|
Wow, a long discussion about compression, and I wasn't there to take part in it!
Anyway, reading through the posts, I found my name mentioned by @ross: Quote:
Code:
Decrunch: ; Source in A0, destination in A1 move.b (a0)+,d0 .literal: move.b (a0)+,(a1)+ bsr.b GetBit bcc.b .literal bsr.b GetBit bcc.b .ref moveq.l #-1,d1 move.b (a0)+,d1 beq.b End .ref: move.b (a1,d1.w*2),(a1)+ bsr.b GetBit bcs.b .ref bra.b .literal GetBit: add.b d0,d0 bne.b .same move.b (a0)+,d0 addx.b d0,d0 .same: ; C = bit End: rts The first byte of the bit stream contains only 7 bits of data. The least significant bit is always 1. This is used as a marker bit by the bit stream decoder to know when it has decoded all bits of the current byte. For subsequent bytes, the marker bit is inserted by the decoder, so all 8 bits can be used. The first symbol is always a literal. After that, bits from the bit stream indicate what follows: 0: Literal. Copy one byte from the byte stream. 1 0: Match with repeated offset. 1 1: Match with new offset. Read a byte from the byte stream. Zero means end of data. Otherwise, multiply by two and one-extend to a word. Thus, all even offsets up to 510 can be represented (680x0 code has mostly even-offset matches). For both kinds of match, the length is unary coded in the bit stream: a match of length k is coded as k-1 one bits followed by a zero bit. A match is always followed by a literal, so the literal/match indicator bit is omitted in this case. To compare the cruncher to the lz32b algorithm discussed here, I ran some of my bootblocks through the lz32b compressor. The bootblocks of course contain a bit more overhead than just this code (12 byte bootblock header + some code to set up the pointers, flush the cache and call the decompressed code). So the actual payload (which is what is compressed here) compress to around 950-960 bytes using my algorithm. These are the sizes I got (all were compressed with 10 bit offsets, which was the best setting in all cases): Space Onion: 1364 -> 1046 BeelzeBub's Beat-Block: 1336 -> 1081 Swirl: 1280 -> 1042 Fluid Fire: 1388 -> 1076 Splatblob: 1186 -> 1055 As can be seen, my method beats lz32b by quite a significant margin (around 100 bytes), so it is well worth the slightly bigger decrunch code for a 1k target size. The comparison is of course not entirely fair, since these particular payloads were tuned to compress well using my cruncher. But I still think it would have an edge in most practical cases, as long as the important matches are within the 510 bytes maximum distance (which can usually be achieved with some tweaking). I have attached a zip containing the payload files in case anyone wants to experiment. Last edited by Blueberry; 22 September 2018 at 21:32. |
|
22 September 2018, 21:35 | #526 |
Registered User
Join Date: Dec 2007
Location: Aarhus / Denmark
Posts: 40
|
Another data point:
The bootblock.bin data file that @paraj posted crunches to 790 bytes. |
22 September 2018, 21:38 | #527 | |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,474
|
Quote:
But this cruncher is on another level: match with repeated offset and the even offset make too much difference. There is a thread somewhere about a modified aplib packer: it simply beat everything I thrown at it because use repeated offset (logically not counting for entropy encoder or cm but compared to other bare and fast lz, near 100kb/s on A500). I simply tried to write the smallest depacker possible with maximum compression for that size and 32 byte seemed to me a great value to reach Thanks for the code and the description! PS: I always try to read what you write about the algorithms you build. The last one in the checkboard challenge is simply awesome, I had to read it 3 times to understand it |
|
Currently Active Users Viewing This Thread: 1 (0 members and 1 guests) | |
Thread Tools | |
Similar Threads | ||||
Thread | Thread Starter | Forum | Replies | Last Post |
Starting ASM coding on A1200. Which Assembler? | Nosferax | Coders. Asm / Hardware | 68 | 27 November 2015 16:14 |
4th tutorial on ASM- and HW-coding | Vikke | Coders. Asm / Hardware | 11 | 10 April 2013 20:32 |
3rd tutorial on ASM- and HW-coding | Vikke | Coders. Asm / Hardware | 6 | 26 March 2013 15:57 |
First tutorial on ASM- and HW-coding | Vikke | Coders. Asm / Hardware | 46 | 18 March 2013 12:33 |
2nd tutorial on ASM- and HW-coding | Vikke | Coders. Asm / Hardware | 10 | 17 March 2013 11:49 |
|
|