30 March 2020, 10:39 | #1 |
Registered User
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,438
|
RNC Propack "in place" decompression does not always work
I've been using RNC Propack with its built-in support for "in place" decompression for a while now and never had any issues. But yesterday I came across a file that would not unpack properly if the source and destination were the same address. It unpacked properly when I used a different destination address. As I expected, merely changing one or two bytes in the source file also made it work again. I also tried increasing BUFSIZE in the source, but that didn't change anything.
Since others might also use this feature, I thought it a good idea to post a "heads-up" about this. It's a bit of a pity, as in-place decompression avoids the need for larger buffers or indirect loading. I have not tested whether pushing the crunched data to the end of the destination makes a difference (I did quickly glance at the RNC source code and, if I understood it correctly, the unpacking code moves the data to the end of the destination buffer almost immediately, so I guess that wouldn't help).
Apart from the heads-up I also have a small question: does anyone know a cross-platform (PC for crunching/Amiga for decrunching) cruncher/decruncher that does not need additional buffer space* while decompressing and that gets a similar crunching rate? It's OK if that cruncher requires the packed file to be positioned at the end of the destination buffer, but not if it requires extra leeway or similar at the end of the buffer.
For reference, I use this version of the RNC unpack source: http://aminet.net/package/util/pack/RNC_ProPack
And this version of the RNC pack source: https://github.com/lab313ru/rnc_propack_source
*) stack use is OK |
30 March 2020, 15:48 | #2 |
Moderator
Join Date: Nov 2001
Location: Germany
Posts: 876
|
If you used lab313 to compress the file, maybe it doesn't correctly set the byte at offset $10, which contains the required overlap distance?
In any case, I have found this compressor to be unreliable. |
30 March 2020, 16:31 | #3 |
Registered User
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,438
|
I suppose I could test it against the Amiga executable in the "official" RNC_ProPack.lha to see if this is done differently. That might be a good idea, thanks!
|
30 March 2020, 16:44 | #4 |
Moderator
Join Date: Nov 2001
Location: Germany
Posts: 876
|
You should also know that RNC is not really able to decompress over the compressed data. It saves on the stack the bytes that lie directly after the uncompressed length; how many is given by the byte at offset $10 of the header. It then moves the compressed data so that its end sits at the end of the uncompressed data plus that offset. After decompression it restores the saved bytes.
So it temporarily trashes data after the uncompressed length. Most of the time this does no harm, but you should be aware of it (e.g. don't use it in a multitasking environment). Besides that, performance isn't optimal because the compressed data has to be copied. Imploder, for example, can truly decompress in place, with the drawback that files containing incompressible parts often cannot be compressed at all. It also has a worse ratio. |
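To make the layout described above concrete, here is a minimal sketch in C that only computes the offsets involved. It is not taken from the RNC source: `InplaceLayout` and `rnc_inplace_layout` are hypothetical names, and whether the moved block includes the header bytes is not stated in this thread, so `packed_len` is simply "however many bytes get moved".

```c
#include <stddef.h>

/* Hypothetical helper, not part of any RNC source. All offsets are
   relative to the start of the destination buffer. */
typedef struct {
    size_t moved_start; /* where the relocated packed data begins */
    size_t moved_end;   /* one past its last byte: unpacked + overlap */
    size_t trash_start; /* first byte past the output that is overwritten */
    size_t trash_len;   /* bytes saved before and restored after unpacking */
} InplaceLayout;

/* Returns 0 on success, -1 if the arguments cannot describe a valid layout. */
int rnc_inplace_layout(size_t unpacked, size_t packed_len, size_t overlap,
                       InplaceLayout *out)
{
    size_t end;

    if (out == NULL)
        return -1;
    end = unpacked + overlap;
    if (end < unpacked)        /* size_t overflow */
        return -1;
    if (packed_len > end)      /* packed data would start before the buffer */
        return -1;

    out->moved_end   = end;
    out->moved_start = end - packed_len;
    out->trash_start = unpacked;   /* region [unpacked, unpacked + overlap) */
    out->trash_len   = overlap;
    return 0;
}
```

With unpacked = 1000, packed_len = 600 and overlap = 6, the packed data ends up at offsets 406 to 1005 and the 6 bytes at offsets 1000 to 1005 are the ones that get saved and restored. If the overlap byte in the header is too small for a given file, this is the place where things would go wrong.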
30 March 2020, 16:56 | #5 |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 54
Posts: 4,498
|
There is a nice compression testbed made by Antiriad_UK with good alternatives to RNC (also for in-place decompression)
|
30 March 2020, 16:58 | #6 |
move.l #$c0ff33,throat
Join Date: Dec 2005
Location: Berlin/Joymoney
Posts: 6,865
|
|
30 March 2020, 17:00 | #7 | ||
Registered User
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,438
|
Quote:
Thanks for the explanation by the way, it's very helpful to understand what is going on. Quote:
Last edited by roondar; 30 March 2020 at 17:02. Reason: Added missing quote |
||
30 March 2020, 17:02 | #8 |
Moderator
Join Date: Nov 2001
Location: Germany
Posts: 876
|
; RNC1.new fileformat:
;  0 BYTE   "RNC",$01
;  4 LONG   unpacked size
;  8 LONG   packed stream size
;  c WORD   unpacked crc
;  e WORD   packed crc
; 10 UBYTE  required offset if decrunching over itself
; 11 BYTE
; 12 STRUCT packed stream
; decrunching source++ destination++
|
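For checking the overlap byte of a packed file on the PC side, the listing above can be turned into a small header reader. This is only a sketch: `RncHeader`, `rnc_parse_header` and `RNC_HEADER_SIZE` are made-up names, the offsets are read as hexadecimal (so the overlap byte is at $10 = 16), and the multi-byte fields are assumed to be big-endian as is usual on the 68000. The byte at $11 has no description in the listing, so it is just passed through.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RNC_HEADER_SIZE 0x12u   /* packed stream starts at offset $12 */

typedef struct {
    uint32_t unpacked_size;  /* offset $04 */
    uint32_t packed_size;    /* offset $08 */
    uint16_t unpacked_crc;   /* offset $0c */
    uint16_t packed_crc;     /* offset $0e */
    uint8_t  overlap;        /* offset $10: offset needed to decrunch over itself */
    uint8_t  byte_11;        /* offset $11: meaning not given in the listing */
} RncHeader;

/* Big-endian readers (assumption: 68000 byte order). */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/* Returns 0 on success, -1 on a short buffer or a wrong "RNC",$01 tag. */
int rnc_parse_header(const uint8_t *buf, size_t len, RncHeader *out)
{
    static const uint8_t magic[4] = { 'R', 'N', 'C', 0x01 };

    if (buf == NULL || out == NULL || len < RNC_HEADER_SIZE)
        return -1;
    if (memcmp(buf, magic, sizeof magic) != 0)
        return -1;

    out->unpacked_size = be32(buf + 0x04);
    out->packed_size   = be32(buf + 0x08);
    out->unpacked_crc  = be16(buf + 0x0c);
    out->packed_crc    = be16(buf + 0x0e);
    out->overlap       = buf[0x10];
    out->byte_11       = buf[0x11];
    return 0;
}
```

Reading the first 18 bytes of two packed files and comparing `overlap` would be one way to see whether two packers fill in that byte differently.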
30 March 2020, 17:04 | #9 |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 54
Posts: 4,498
|
|
30 March 2020, 17:04 | #10 |
Registered User
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,438
|
Super, thanks
|
30 March 2020, 17:10 | #11 |
Going nowhere
Join Date: Oct 2001
Location: United Kingdom
Age: 50
Posts: 9,020
|
|
30 March 2020, 17:13 | #12 |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 54
Posts: 4,498
|
My two nrv2x decoder sources are publicly available.
They decompress in place from the bottom (r-version) or the top (s-version) of the memory buffer. Unlike RNC, they do not need to move data, so they are fast, and compression is very good. Try it |
30 March 2020, 17:17 | #13 |
Registered User
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,438
|
Sounds good, where can I find the source for cruncher/decruncher?
|
30 March 2020, 17:30 | #14 | |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 54
Posts: 4,498
|
Quote:
I can't find the nrv2r one, but I'm pretty sure I've posted it somewhere, so here is the generic one; extract the version you need:
Code:
; nrv2x decoder in pure 68k asm for Rygar AGA
;
; On entry:
;   a0  buffer start
;   d0  offset inside buffer for packed data
; On exit:
;   buffer filled with unpacked data
;   all registers preserved

nrv2x_decoder:
    movem.l d0-d5/a0-a4,-(sp)
    move.l  d0,d3
    lea     (a0),a1
    adda.l  d0,a0

; common setup
    moveq   #-$80,d0
    moveq   #-1,d2
    moveq   #2,d4
    moveq   #-2,d5

; select between s/r algorithms
    tst.l   d3
    beq.w   nrv2r_unpack

; nrv2s decompression in pure 68k asm
; by ross
;
; On entry:
;   a0  src packed data pointer
;   a1  dest pointer
;   (decompress from a0 to a1)
;
; On exit:
;   a0 = dest start
;   a1 = dest end
;
; Register usage:
;   a2  m_pos
;   a3  constant: -$d00
;   a4  2nd src pointer (in stack)
;
;   d0  bit buffer
;   d1  m_off
;   d2  m_len or -1
;
;   d3  last_m_off
;   d4  constant: 2
;   d5  reserved space on stack (max 256)
;
;
; Notes:
;   we have max_offset = 2^23, so we can use some word arithmetics on d1
;   we have max_match = 65535, so we can use word arithmetics on d2
;

nrv2s_unpack
;   movem.l d0-d5/a1-a4,-(sp)

    move.b  (a0)+,d1            ; ~stack usage
;   moveq   #-2,d5
    and.b   d1,d5
    lea     (sp),a4
    adda.l  d5,sp               ; reserve space
._stk
    move.b  (a0)+,-(a4)
    addq.b  #1,d1
    bne.b   ._stk

; ------------- setup constants -----------

;   moveq   #-$80,d0            ; d0.b = $80 (byte refill flag)
;   moveq   #-1,d2
    moveq   #-1,d3              ; last_off = -1
;   moveq   #2,d4
    movea.w #-$d00,a3

; ------------- DECOMPRESSION -------------

.decompr_literal
    move.b  (a0)+,(a1)+

.decompr_loop
    add.b   d0,d0
    bcc.b   .decompr_match
    bne.b   .decompr_literal
    move.b  (a0)+,d0
    addx.b  d0,d0
    bcs.b   .decompr_literal

.decompr_match
    moveq   #-2,d1
.decompr_gamma_1
    add.b   d0,d0
    bne.b   ._g_1
    move.b  (a0)+,d0
    addx.b  d0,d0
._g_1
    addx.w  d1,d1               ; max 2^23!
    add.b   d0,d0
    bcc.b   .decompr_gamma_1
    bne.b   .decompr_select
    move.b  (a0)+,d0
    addx.b  d0,d0
    bcc.b   .decompr_gamma_1

.decompr_select
    addq.w  #3,d1
    beq.b   .decompr_get_mlen   ; last m_off
    bpl.b   .decompr_exit_token
    lsl.l   #8,d1
    move.b  (a0)+,d1
    move.l  d1,d3               ; last_m_off = m_off

.decompr_get_mlen               ; implicit d2 = -1
    add.b   d0,d0
    bne.b   ._e_1
    move.b  (a0)+,d0
    addx.b  d0,d0
._e_1
    addx.w  d2,d2
    add.b   d0,d0
    bne.b   ._e_2
    move.b  (a0)+,d0
    addx.b  d0,d0
._e_2
    addx.w  d2,d2
    lea     (a1,d3.l),a2
    addq.w  #2,d2
    bgt.b   .decompr_gamma_2

.decompr_tiny_mlen
    move.l  d3,d1
    sub.l   a3,d1
    addx.w  d4,d2

.L_copy2
    move.b  (a2)+,(a1)+
.L_copy1
    move.b  (a2)+,(a1)+
    dbra    d2,.L_copy1
.L_rep
    bra.b   .decompr_loop

.decompr_gamma_2                ; implicit d2 = 1
    add.b   d0,d0
    bne.b   ._g_2
    move.b  (a0)+,d0
    addx.b  d0,d0
._g_2
    addx.w  d2,d2
    add.b   d0,d0
    bcc.b   .decompr_gamma_2
    bne.b   .decompr_large_mlen
    move.b  (a0)+,d0
    addx.b  d0,d0
    bcc.b   .decompr_gamma_2

.decompr_large_mlen
    move.b  (a2)+,(a1)+
    move.b  (a2)+,(a1)+
    cmp.l   a3,d3
    bcs.b   .L_copy2
    move.b  (a2)+,(a1)+
    dbra    d2,.L_copy1

.decompr_exit_token
    lea     (a4),a0
    bclr    d2,d2               ; ;)
    bne.b   .L_rep

    bra.w   _common_exit
;   suba.l  d5,sp
;   movem.l (sp)+,d0-d5/a0-a4
;   rts

; nrv2r decompression in 68000 assembly
; by ross
;
; On entry:
;   a0  src pointer
;   [a1 dest pointer]
;   (decompress also to a1=a0)
;
; On exit:
;   all preserved but
;   a1 = dest start
;
; Register usage:
;   a2  m_pos
;   a3  constant: $cff
;   a4  2nd src pointer (in stack)
;
;   d0  bit buffer
;   d1  m_off
;   d2  m_len or -1
;
;   d3  last_m_off
;   d4  constant: 2
;   d5  reserved space on stack
;
;
; Notes:
;   we have max_offset = 2^23, so we can use some word arithmetics on d1
;   we have max_match = 65535, so we can use word arithmetics on d2
;

nrv2r_unpack
;   movem.l d0-d5/a0/a2-a4,-(sp)

;   lea     (a0),a1             ; if (a1) lea (a0),a4
    adda.l  (a0),a0             ; end of packed data
    move.l  -(a0),(a1)          ; if (a1) move.l -(a0),(a4)
    adda.l  -(a0),a1            ; end of buffer

    move.b  -(a0),d1            ; ~stack usage
;   moveq   #-2,d5
    and.b   d1,d5
    adda.l  d5,sp               ; reserve space
    lea     (sp),a4
._stk
    move.b  -(a0),(a4)+
    addq.b  #1,d1
    bne.b   ._stk

; ------------- setup constants -----------

;   moveq   #-$80,d0            ; d0.b = $80 (byte refill flag)
;   moveq   #-1,d2
;   moveq   #0,d3               ; last_off = 0(1)
;   moveq   #2,d4
    movea.w #$cff,a3

; ------------- DECOMPRESSION -------------

.decompr_literal
    move.b  -(a0),-(a1)

.decompr_loop
    add.b   d0,d0
    bcc.b   .decompr_match
    bne.b   .decompr_literal
    move.b  -(a0),d0
    addx.b  d0,d0
    bcs.b   .decompr_literal

.decompr_match
    moveq   #1,d1
.decompr_gamma_1
    add.b   d0,d0
    bne.b   ._g_1
    move.b  -(a0),d0
    addx.b  d0,d0
._g_1
    addx.w  d1,d1               ; max 2^23!
    add.b   d0,d0
    bcc.b   .decompr_gamma_1
    bne.b   .decompr_select
    move.b  -(a0),d0
    addx.b  d0,d0
    bcc.b   .decompr_gamma_1

.decompr_select
    subq.w  #3,d1
    bcs.b   .decompr_get_mlen   ; last m_off
    bmi.b   .decompr_exit_token
    lsl.l   #8,d1
    move.b  -(a0),d1
    move.l  d1,d3               ; last_m_off = m_off

.decompr_get_mlen               ; implicit d2 = -1
    add.b   d0,d0
    bne.b   ._e_1
    move.b  -(a0),d0
    addx.b  d0,d0
._e_1
    addx.w  d2,d2
    add.b   d0,d0
    bne.b   ._e_2
    move.b  -(a0),d0
    addx.b  d0,d0
._e_2
    addx.w  d2,d2
    lea     1(a1,d3.l),a2
    addq.w  #2,d2
    bgt.b   .decompr_gamma_2

.decompr_tiny_mlen
    move.l  a3,d1
    sub.l   d3,d1
    addx.w  d4,d2

.L_copy2
    move.b  -(a2),-(a1)
.L_copy1
    move.b  -(a2),-(a1)
    dbra    d2,.L_copy1
.L_rep
    bra.b   .decompr_loop

.decompr_gamma_2                ; implicit d2 = 1
    add.b   d0,d0
    bne.b   ._g_2
    move.b  -(a0),d0
    addx.b  d0,d0
._g_2
    addx.w  d2,d2
    add.b   d0,d0
    bcc.b   .decompr_gamma_2
    bne.b   .decompr_large_mlen
    move.b  -(a0),d0
    addx.b  d0,d0
    bcc.b   .decompr_gamma_2

.decompr_large_mlen
    move.b  -(a2),-(a1)
    move.b  -(a2),-(a1)
    cmpa.l  d3,a3
    bcs.b   .L_copy2
    move.b  -(a2),-(a1)
    dbra    d2,.L_copy1

.decompr_exit_token
    lea     (a4),a0
    bclr    d2,d2               ; ;)
    bne.b   .L_rep

_common_exit
    suba.l  d5,sp
    movem.l (sp)+,d0-d5/a0-a4
    rts
|
|
30 March 2020, 17:56 | #15 |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 54
Posts: 4,498
|
And if you want to smile a little, here is the code I used inside the compressor to test that it was perfectly compatible with the 68k decompressor:
Code:
/***********************************************************************
// 68k real backwards decoder: a0=src pointer, a1=dest pointer
************************************************************************/

int32_t _68k_real_backwards_decoder(uint8_t *a0, uint8_t *a1, uint32_t len)
{
    uint8_t stack[256];
    int32_t flag;
    uint8_t *end=a1;                    /* protect from overflow */

    uint8_t *a2;                        /* m_pos */
    int32_t C;                          /* carry */
    int32_t d0=0x80;                    /* bit buffer */
    int32_t d1;                         /* m_off */
    int32_t d2=-1;                      /* m_len */
    int32_t d3=0;                       /* last_m_off */
/*  int32_t d4=2; */                    /* constant: 2 */
    int32_t a3=0xcff;                   /* constant: M2_MAX_OFFSET-1 */

    flag=(int32_t)a0;                   /* setup */
    a0+=__builtin_bswap32(*(uint32_t*)a0);
    *(uint32_t*)flag=*(uint32_t*)(a0-4);
    a1+=__builtin_bswap32(*(uint32_t*)(a0-8));
    a0-=8;

    flag=*--a0|0xffffff00;              /* move.b -(a0),d0, ~stack usage */
    for (;flag<0;flag++)
        stack[256+flag]=*--a0;          /* move.b -(a0),(a4)+ */
/*  flag=0; */                          /* first run */

decompr_literal:
    if (a1==end) return 0;
    *--a1=*--a0;                        /* move.b -(a0),-(a1) */

decompr_loop:
    d0+=d0;                             /* add.b d0,d0 */
    C=d0>>8; d0&=0xff;
    if (!C) {goto decompr_match;}       /* bcc.b decompr_match */
    if (d0) {goto decompr_literal;}     /* bne.b decompr_literal */
    d0=*--a0;                           /* move.b -(a0),d0 */
    d0+=(d0+C);                         /* addx.b d0,d0 */
    C=d0>>8; d0&=0xff;
    if (C) {goto decompr_literal;}      /* bcs.b decompr_literal */

decompr_match:
    d1=1;                               /* moveq #1,d1 */
decompr_gamma_1:
    d0+=d0;                             /* add.b d0,d0 */
    C=d0>>8; d0&=0xff;
    if (d0) {goto _g_1;}                /* bne.b _g_1 */
    d0=*--a0;                           /* move.b -(a0),d0 */
    d0+=(d0+C);                         /* addx.b d0,d0 */
    C=d0>>8; d0&=0xff;
_g_1:
    d1+=(d1+C);                         /* addx.l d1,d1 */
    d0+=d0;                             /* add.b d0,d0 */
    C=d0>>8; d0&=0xff;
    if (!C) {goto decompr_gamma_1;}     /* bcc.b decompr_gamma_1 */
    if (d0) {goto decompr_switch;}      /* bne.b decompr_switch */
    d0=*--a0;                           /* move.b -(a0),d0 */
    d0+=(d0+C);                         /* addx.b d0,d0 */
    C=d0>>8; d0&=0xff;
    if (!C) {goto decompr_gamma_1;}     /* bcc.b decompr_gamma_1 */

decompr_switch:
    d1-=3;                              /* subq.w #3,d1 */
    if (d1==-1)                         /* bcs.b decompr_get_mlen */
        {goto decompr_get_mlen;}        /* (last m_off) */
    if ((d1&0xffff)>=0x8000)
        {goto decompr_end;}             /* bmi.b decompr_end */
    d1<<=8;                             /* lsl.l #8,d1 */
    d1|=*--a0;                          /* move.b -(a0),d1 */
    d3=d1;                              /* move.l d1,d3 */

decompr_get_mlen:
    d0+=d0;                             /* add.b d0,d0 */
    C=d0>>8; d0&=0xff;
    if (d0) {goto _e_1;}                /* bne.b _e_1 */
    d0=*--a0;                           /* move.b -(a0),d0 */
    d0+=(d0+C);                         /* addx.b d0,d0 */
    C=d0>>8; d0&=0xff;
_e_1:
    d2+=(d2+C);                         /* addx.w d2,d2 */
    d0+=d0;                             /* add.b d0,d0 */
    C=d0>>8; d0&=0xff;
    if (d0) {goto _e_2;}                /* bne.b _e_2 */
    d0=*--a0;                           /* move.b -(a0),d0 */
    d0+=(d0+C);                         /* addx.b d0,d0 */
    C=d0>>8; d0&=0xff;
_e_2:
    d2+=(d2+C);                         /* addx.w d2,d2 */
    a2=a1+d3+1;                         /* lea 1(a1,d3.l),a2 */
    d2+=2;                              /* addq.w #2,d2 */
    if (d2>0) {goto decompr_gamma_2;}   /* bgt.b decompr_gamma_2 */

/* decompr_tiny_mlen: */
    d1=a3;                              /* move.l a3,d1 */
    d1-=d3;                             /* sub.l d3,d1 */
    C=(d1<0)?1:0;
    d2+=(C+2);                          /* addx.w d4,d2 */

L_copy2:
    if (a1==end) return 0;
    *--a1=*--a2;                        /* move.b -(a2),-(a1) */
L_copy1:
    if (a1==end) return 0;
    *--a1=*--a2;                        /* move.b -(a2),-(a1) */
    d2--;
    if (d2!=-1) {goto L_copy1;}         /* dbra d2,L_copy1 */
    goto decompr_loop;                  /* bra.b decompr_loop */

decompr_gamma_2:                        /* implicit d2 = 1 */
    d0+=d0;                             /* add.b d0,d0 */
    C=d0>>8; d0&=0xff;
    if (d0) {goto _g_2;}                /* bne.b _g_2 */
    d0=*--a0;                           /* move.b -(a0),d0 */
    d0+=(d0+C);                         /* addx.b d0,d0 */
    C=d0>>8; d0&=0xff;
_g_2:
    d2+=(d2+C);                         /* addx.w d2,d2 */
    d0+=d0;                             /* add.b d0,d0 */
    C=d0>>8; d0&=0xff;
    if (!C) {goto decompr_gamma_2;}     /* bcc.b decompr_gamma_2 */
    if (d0) {goto decompr_large_mlen;}  /* bne.b decompr_large_mlen */
    d0=*--a0;                           /* move.b -(a0),d0 */
    d0+=(d0+C);                         /* addx.b d0,d0 */
    C=d0>>8; d0&=0xff;
    if (!C) {goto decompr_gamma_2;}     /* bcc.b decompr_gamma_2 */

decompr_large_mlen:
    if (a1==end) return 0;
    *--a1=*--a2;                        /* move.b -(a2),-(a1) */
    if (a1==end) return 0;
    *--a1=*--a2;                        /* move.b -(a2),-(a1) */
                                        /* cmp.l a3,d3 */
    if (a3<d3) {goto L_copy2;}          /* bcs.b L_copy2 */
    if (a1==end) return 0;
    *--a1=*--a2;                        /* move.b -(a2),-(a1) */
    d2--; goto L_copy1;                 /* dbra d2,L_copy1 */

decompr_end:
    if (!flag++)
    {
        a0=&stack[256];
        goto decompr_loop;
    }

    return len;                         /* rts */
}
|
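For anyone puzzled by the recurring `add.b d0,d0` / `bne` / `move.b` / `addx.b d0,d0` pattern in both listings: as far as I can read it, d0 is an 8-bit bit buffer with a sentinel bit, started at $80. Below is a small stand-alone sketch of just that trick in C. `BitReader` and `bitreader_get` are made-up names, it reads forwards like the s-version, and it does no bounds checking on the source.

```c
#include <stdint.h>

/* Hypothetical stand-alone version of the d0 bit buffer. */
typedef struct {
    const uint8_t *src;  /* next packed byte (read forwards here) */
    unsigned       buf;  /* 8-bit bit buffer; start it at 0x80 */
} BitReader;

/* Returns the next bit of the stream, most significant bit first. */
int bitreader_get(BitReader *br)
{
    unsigned v = br->buf << 1;   /* add.b d0,d0 */
    unsigned c = v >> 8;         /* carry = bit shifted out */
    v &= 0xffu;
    if (v == 0) {                /* only the sentinel was left: refill */
        v = *br->src++;          /* move.b (a0)+,d0 */
        v = (v << 1) | c;        /* addx.b d0,d0: old carry (1) is the new sentinel */
        c = v >> 8;              /* carry is now the first bit of the new byte */
        v &= 0xffu;
    }
    br->buf = v;
    return (int)c;
}
```

The buffer can only become zero when the bit that just left was the sentinel itself, which is why the assembly gets away with a single `bne` to decide whether a refill is needed. Feeding it the byte $A5 with `buf` preset to 0x80 yields the bits 1,0,1,0,0,1,0,1 and leaves `buf` back at 0x80, ready to fetch the next byte.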
30 March 2020, 17:58 | #16 |
Registered User
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,438
|
@thread: I have now tested the "official" Amiga RNC packing executable (PPAMI.exe) and it creates exactly the same file as the lab313 version, so they act consistently.
@ross: thanks, an exe-only cruncher is fine - I only wanted the source so that I could have a Windows-compatible version, but the exe is for Windows so no problems there. I'll be trying this out for sure. Also... that does make me smile |
30 March 2020, 19:25 | #17 |
OCS forever!
Join Date: Mar 2019
Location: Birmingham, UK
Posts: 418
|
If anyone is interested I can set up a new git project that is a completely stripped-down framework, so it’s not all mingled with other source. I only included the nrv2s sources in that intro link. What I’d done for myself was to create a wrapper around each different packer so that each could be called with pretty much the same parameters: source in a0, dest in a1, etc. That way I could change packer and not worry about checking all the calls.
I’d actually left out RNC because I couldn’t find a Windows x64 compatible packer when I was looking for it. I’ll have a look at the lab313 one mentioned. |
30 March 2020, 19:55 | #18 | |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 54
Posts: 4,498
|
Quote:
|
|
31 March 2020, 00:09 | #19 |
OCS forever!
Join Date: Mar 2019
Location: Birmingham, UK
Posts: 418
|
Here is a first stab. I've tried to strip my framework down to the basics.
https://github.com/jonathanbennett73/amiga-depack
To test various packers on a test.bin file (a couple of uncompressed raw bitmaps):
1. Run _DevCmdPrompt.cmd to open a command prompt
2. Run ConvertAssets.cmd
This compacts Assets\test.bin into various files in AssetsConverted: NRV2S, PackFire, Shrinkler, LZ4, Cranker, RNC, Doynamite (I've also got Arj m7 but I'm not sure if I'm supposed to make the source public).
I've created a dummy program "main.s" that you can run to test, but it's blank at the moment. It needs a few dos library functions to do some timing and output of the time.
To build:
1. Run _DevCmdPrompt.cmd to open a command prompt
2. Run Build.cmd
3. The output file for the Amiga is in Build\DepackTest
All the depackers are in Framework\Depack and the library that calls them is in Framework\IntroLibrary.s. You enable and disable the various packers in IntroConfig.i.
I've not redone any tests, but the rough comparison last time I tested was:
Code:
;Packers available, lz4 is reference for speed/size comparison
;LZ4 (1x speed, 100% size)
;Cranker (1.3x, 93%)
;Doynamite68k (1.8x, 88%)
;nrv2s (2.0x, 85%)
;Arj m7 (3.4x, 79%)
;Shrinkler (23x, 71%)
;PackFire (46x, 70%)
Let me know if any tweaks are needed. I've not intentionally looked at in-place depacking as I've not needed it so far (RNC, Cranker, NRV), so the API might need tweaking slightly for that. If someone wants to edit main.s to add some timing and output then that would be good. I come out in hives if I do anything non system-killing... Last edited by Antiriad_UK; 31 March 2020 at 00:35. |
31 March 2020, 11:21 | #20 |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 54
Posts: 4,498
|
I always wanted to do such a thing, so it's great that you did it.
About RNC decoding speed: it's slower than NRV. Cheers! |
|