#41 | |
Going nowhere
Join Date: Oct 2001
Location: United Kingdom
Age: 50
Posts: 9,020
|
Quote:
So Disk 1 is part interrupt loading format and part standard MFM, with $1900 size tracks. Disk 2 is entirely interrupt loading format, and that format is like this.

The interrupt disk format uses two different sync marks, $4489 and $4522, and the data is structured not as one complete $1900 track but as two big "sectors" of $c80 bytes. So the first part of the track is $4489 and $c80 worth of data, then the second part of that track is $4522 and also $c80 worth of data.

The problem is that the programmer only loads in $c80 sizes when the interrupt loader is running, which means he has set a REALLY small track buffer to decode MFM data, neatly positioned in between code and onscreen graphics. When he wants to load the next half of the track, he just swaps the SYNC and it will load the next part. Of course an AmigaDOS $4489 style track cannot be decoded in such a tiny memory space, and there is literally NO room in 512k in which to do it. The disk format is also quite large, $FA000 per disk, so even if I could find the room, I wouldn't be able to fit all the data on the disks and keep it as two like the original.

Lots of sneaky stuff: if you try to bypass his interrupt loader entirely, it will miss stuff that is set up in memory for other routines, which will then crash; lots of self-modifying encryption (the weakest part of the protection); checksums (again, strangely weaker than the disk protection); and if you try to mess with the interrupt loader, it does stuff like seek to track 83, which is all kinds of fun the first time it happens!

So now I've had to change the interrupt loader system to a multi loader. I have to preserve chip memory to extra memory so I can load the different parts and then restore that used memory. It was quite the headache to figure out a system that wouldn't fall foul of the programmer's stuff trying to trip me up. Very competent protection, and a few schoolboy errors by me along the way didn't help either!
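The sizes quoted above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the 160-track figure (80 cylinders, 2 heads) and the standard AmigaDOS layout of 11 sectors of 512 bytes per track are assumptions brought in for comparison, and the function name is made up.

```python
# Sizes taken from the post.
TRACK_BYTES = 0x1900        # one custom-format track
HALF_TRACK_BYTES = 0xC80    # one big "sector", loaded per sync mark
DISK_BYTES = 0xFA000        # total data per disk
SYNC_MARKS = (0x4489, 0x4522)

# Assumption for comparison: standard AmigaDOS track = 11 sectors * 512 bytes,
# 160 tracks per disk (80 cylinders * 2 heads).
STANDARD_TRACK_BYTES = 11 * 512
STANDARD_DISK_BYTES = 160 * STANDARD_TRACK_BYTES


def disk_layout():
    """Derive the track structure implied by the quoted sizes."""
    return {
        "halves_per_track": TRACK_BYTES // HALF_TRACK_BYTES,
        "tracks_per_disk": DISK_BYTES // TRACK_BYTES,
        "extra_bytes_vs_standard": DISK_BYTES - STANDARD_DISK_BYTES,
    }


if __name__ == "__main__":
    for key, value in disk_layout().items():
        print(f"{key}: {value} (${value:X})")
```

Under those assumptions each track splits into exactly one half per sync mark, and every disk holds $1E000 bytes more than a standard disk would, which is why the data cannot simply be moved to an AmigaDOS layout on the same number of disks.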
|
#42 | ||
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 54
Posts: 4,501
|
Very detailed description, thanks!
Quote:
but fortunately a big MFM buffer, the same sync and no tight IRQ loader (so one track load per rotation).
Quote:
||
#43 | |
Registered User
Join Date: May 2004
Location: Somewhere secret
Age: 50
Posts: 366
|
Quote:
I can confirm this. Been busy with other stuff (C64, playing some CTFs), but you can never leave Amigaaaaa!
|
#44 |
Registered User
Join Date: Sep 2016
Location: Deventer - Netherlands
Posts: 599
|
I love reading this kind of story, and it's perfectly clear to me now why we were never able to crack this back in the day. I recognize some things about the disk format and the buffer between the code and the graphics, but I'm sure we couldn't have cracked this kind of impressive code anyway. I like seeing that this kind of thing is possible after so many years. A big, big cheers to all involved in this project!
|
#45 |
Inviyya Dude!
Join Date: Sep 2016
Location: Amiga Island
Posts: 2,798
|
I'd need something that encodes on Mac OS X (I build all my stuff in a shell script there) and decodes on the Amiga (of course).
Seems LZ4 would be the way to go, but I'd also love to have a good compression rate (which LZ4 doesn't seem to have?)
#46 |
Registered User
Join Date: Apr 2013
Location: paris
Posts: 133
|
If anyone is interested in a fast LZ4 68k depacker, you can use one of my three versions here (tiny, normal and fast):
https://github.com/arnaud-carre/lz4-68k

If you need an extreme packing ratio and you don't care about decompression time, use Shrinkler.
If you need a very good packing ratio with average depacking speed (about 15KiB/s, same speed as floppy loading), use ARJ mode 7.
If you need almost the same packing ratio as ARJ mode 7 and about 2 times faster depacking, use UPX (nrv2b).
If you need extremely fast depacking speed without a ridiculous packing ratio, use LZ4.

I did an Atari depacking benchmark; you can look at the results here: https://ibb.co/JKtQFVt
The first column is the packed binary file size (smaller is better). Then comes the name of the file (you have lz77, pft=packfire tiny, lz, am7=arj mode 7, shk=shrinkler); the number in brackets () is the decompressor code size. The last column is the decompressor speed (number of 50Hz ticks to depack). Smaller is faster.

Last edited by leonard; 24 August 2019 at 16:26.
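To illustrate why LZ4 depacks so quickly, here is a minimal sketch of a decoder for the raw LZ4 block format (not the frame format, and not taken from the lz4-68k repository). Everything is byte-aligned: a token byte, optional length extension bytes, the literals, then a 2-byte little-endian offset. The function name `lz4_block_decompress` and the hand-built sample block are illustrative only.

```python
def lz4_block_decompress(src: bytes) -> bytes:
    """Decode one raw LZ4 block (no frame header, no checksum)."""
    out = bytearray()
    i = 0
    n = len(src)
    while i < n:
        token = src[i]
        i += 1

        # High nibble: literal count, extended by 255-valued bytes when it is 15.
        lit_len = token >> 4
        if lit_len == 15:
            while True:
                extra = src[i]
                i += 1
                lit_len += extra
                if extra != 255:
                    break
        out += src[i:i + lit_len]
        i += lit_len

        if i >= n:
            break  # the last sequence carries literals only

        # Match offset: 2 bytes, little-endian, counted back from the output end.
        offset = src[i] | (src[i + 1] << 8)
        i += 2
        if offset == 0 or offset > len(out):
            raise ValueError("invalid match offset")

        # Low nibble: match length minus 4, extended the same way as literals.
        match_len = token & 0x0F
        if match_len == 15:
            while True:
                extra = src[i]
                i += 1
                match_len += extra
                if extra != 255:
                    break
        match_len += 4

        # Copy byte by byte so overlapping matches repeat correctly.
        start = len(out) - offset
        for k in range(match_len):
            out.append(out[start + k])
    return bytes(out)


# Hand-built block: 3 literals "abc", a 9-byte match at offset 3, then 5 final literals.
SAMPLE_BLOCK = bytes([0x35]) + b"abc" + bytes([0x03, 0x00, 0x50]) + b"XYZ12"
```

The decoder never extracts individual bits, which is the main reason a 68000 implementation can run close to memory-copy speed; the price is the weaker packing ratio mentioned above.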
#47 | |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 54
Posts: 4,501
|
Quote:
nrv2r is a fork of nrv2b with improvements, made for in-place decompression. Depacker code:
Code:
```
; nrv2r decompression in 68000 assembly
; by ross
;
; On entry:
;       a0      src pointer
;       [a1     dest pointer]
;       (decompress also to a1=a0)
;
; On exit:
;       all preserved but
;       a1 = dest start
;
; Register usage:
;       a2      m_pos
;       a3      constant: $cff
;       a4      2nd src pointer (in stack)
;
;       d0      bit buffer
;       d1      m_off
;       d2      m_len or -1
;
;       d3      last_m_off
;       d4      constant: 2
;       d5      reserved space on stack
;
;
; Notes:
;       we have max_offset = 2^23, so we can use some word arithmetics on d1
;       we have max_match = 65535, so we can use word arithmetics on d2
;

nrv2r_ross_unpack
        movem.l d0-d5/a0/a2-a4,-(sp)

        lea     (a0),a1         ; if (a1) lea (a0),a4
        adda.l  (a0),a0         ; end of packed data
        move.l  -(a0),(a1)      ; if (a1) move.l -(a0),(a4)
        adda.l  -(a0),a1        ; end of buffer

        move.b  -(a0),d0        ; ~stack usage
        moveq   #-2,d5
        and.b   d0,d5
        adda.l  d5,sp           ; reserve space
        lea     (sp),a4
_stk    move.b  -(a0),(a4)+
        addq.b  #1,d0
        bne.b   _stk

; ------------- setup constants -----------

        moveq   #-$80,d0        ; d0.b = $80 (byte refill flag)
        moveq   #-1,d2
        moveq   #0,d3           ; last_off = 0(1)
        moveq   #2,d4
        movea.w #$cff,a3

; ------------- DECOMPRESSION -------------

decompr_literal
        move.b  -(a0),-(a1)

decompr_loop
        add.b   d0,d0
        bcc.b   decompr_match
        bne.b   decompr_literal
        move.b  -(a0),d0
        addx.b  d0,d0
        bcs.b   decompr_literal

decompr_match
        moveq   #1,d1
decompr_gamma_1
        add.b   d0,d0
        bne.b   _g_1
        move.b  -(a0),d0
        addx.b  d0,d0
_g_1    addx.w  d1,d1           ; max 2^23!
        add.b   d0,d0
        bcc.b   decompr_gamma_1
        bne.b   decompr_select
        move.b  -(a0),d0
        addx.b  d0,d0
        bcc.b   decompr_gamma_1

decompr_select
        subq.w  #3,d1
        bcs.b   decompr_get_mlen        ; last m_off
        bmi.b   decompr_exit_token
        lsl.l   #8,d1
        move.b  -(a0),d1
        move.l  d1,d3           ; last_m_off = m_off

decompr_get_mlen                ; implicit d2 = -1
        add.b   d0,d0
        bne.b   _e_1
        move.b  -(a0),d0
        addx.b  d0,d0
_e_1    addx.w  d2,d2
        add.b   d0,d0
        bne.b   _e_2
        move.b  -(a0),d0
        addx.b  d0,d0
_e_2    addx.w  d2,d2
        lea     1(a1,d3.l),a2
        addq.w  #2,d2
        bgt.b   decompr_gamma_2

decompr_tiny_mlen
        move.l  a3,d1
        sub.l   d3,d1
        addx.w  d4,d2
L_copy2 move.b  -(a2),-(a1)
L_copy1 move.b  -(a2),-(a1)
        dbra    d2,L_copy1
L_rep   bra.b   decompr_loop

decompr_gamma_2                 ; implicit d2 = 1
        add.b   d0,d0
        bne.b   _g_2
        move.b  -(a0),d0
        addx.b  d0,d0
_g_2    addx.w  d2,d2
        add.b   d0,d0
        bcc.b   decompr_gamma_2
        bne.b   decompr_large_mlen
        move.b  -(a0),d0
        addx.b  d0,d0
        bcc.b   decompr_gamma_2

decompr_large_mlen
        move.b  -(a2),-(a1)
        move.b  -(a2),-(a1)
        cmpa.l  d3,a3
        bcs.b   L_copy2
        move.b  -(a2),-(a1)
        dbra    d2,L_copy1

decompr_exit_token
        lea     (a4),a0
        bclr    d2,d2           ; ;)
        bne.b   L_rep

        suba.l  d5,sp
        movem.l (sp)+,d0-d5/a0/a2-a4
        rts
```
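The recurring `add.b d0,d0` / `bne` / `move.b -(a0),d0` / `addx.b d0,d0` pattern in the listing is a bit reader that keeps a marker bit inside the 8-bit buffer, so no separate bit counter is needed. The Python sketch below models only that bit reader and the interleaved gamma loop (`decompr_gamma_1`); it is a reading aid under that interpretation, not a port of the depacker. The class and function names are made up, and the literal and offset bytes that the real stream interleaves with the bits are left out.

```python
class SentinelBitReader:
    """8-bit buffer with a marker bit, read from the end of the data backwards."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = len(data)   # the assembly fetches with -(a0), i.e. backwards
        self.buf = 0x80        # marker only: the first request triggers a refill

    def get_bit(self) -> int:
        carry = self.buf >> 7                 # add.b d0,d0: carry = old bit 7
        self.buf = (self.buf << 1) & 0xFF
        if self.buf == 0:                     # only the marker was left (bne not taken)
            self.pos -= 1
            byte = self.data[self.pos]        # move.b -(a0),d0
            carry = byte >> 7                 # addx.b d0,d0: carry = bit 7 of new byte
            self.buf = ((byte << 1) | 1) & 0xFF   # X flag (the old marker) enters bit 0
        return carry


def read_gamma(reader: SentinelBitReader, start: int = 1) -> int:
    """Interleaved gamma code: a value bit, then a flag bit; a set flag ends it."""
    value = start
    while True:
        value = (value << 1) | reader.get_bit()   # addx.w d1,d1
        if reader.get_bit():                      # bcc loops while the flag is clear
            return value
```

For example, the single byte `0x30` (bits 0, 0, 1, 1, then padding) decodes to the value 5: the value bits are 0 and 1 on top of the initial 1, and the second flag bit ends the number.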
|
#48 |
Registered User
Join Date: Apr 2013
Location: paris
Posts: 133
|
ross: interesting! Do you have some numbers to share about the performance difference between the nrv2b and nrv2r you mentioned (both in terms of packing ratio and compression speed)?
|
#49 | |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 54
Posts: 4,501
|
Quote:
http://eab.abime.net/showpost.php?p=...3&postcount=24
or the whole thread: http://eab.abime.net/showthread.php?t=89467
Not exhaustive at all; I wanted to do something complete but I never found the time.
|
#50 |
OCS forever!
Join Date: Mar 2019
Location: Birmingham, UK
Posts: 418
|
I've been adding some depacker support to my framework. I'm doing a one-filer targeting the A500 512+512 config, so I'm shunting data from fast mem to chip as needed. Instead of just copying, I've been playing around with some of the packers in this thread. Got packfire/shrinkler/lz4 working nicely. I want to try UPX nrv2b next but I'm getting stuck.
First off, how exactly do you get a data file compressed? UPX doesn't seem to do it. I saw a reference to a modified UPX exe on an ST forum but haven't been able to track it down. What are you guys using?
Decompression-wise, is this the routine people are using? https://github.com/upx/upx/blob/mast...8k/nrv2b_d.ash
#51 | |
Registered User
Join Date: Oct 2017
Location: Sunderland, England
Posts: 2,702
|
Quote:
|
|
#52 |
OCS forever!
Join Date: Mar 2019
Location: Birmingham, UK
Posts: 418
|
|
#53 |
Registered User
Join Date: May 2004
Location: Somewhere secret
Age: 50
Posts: 366
|
You could also take a look at DoynaxLZ, which was used in the Oxyron demo Planet Rocklobster.
It's a port of a C64 packer to 68000; he released the tools in the demo source.
#54 | |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 54
Posts: 4,501
|
Quote:
I use it when I'm not concerned about space constraints.
|
#55 |
Natteravn
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,553
|
Yes, I used it for Trap Runner and Solid Gold. The portable packer is extremely slow, though. Takes hours on real 68k hardware.
|
#56 |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 54
Posts: 4,501
|
A problem with Doynax 68k is that the way its encoding is built makes in-place decompression impossible.
But the double stream is conversely a strong point, because it allows it to reach speeds that are impossible for decoders that fetch token bytes.
#57 |
Registered User
Join Date: May 2004
Location: Somewhere secret
Age: 50
Posts: 366
|
|
#58 |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 54
Posts: 4,501
|
|
#59 |
OCS forever!
Join Date: Mar 2019
Location: Birmingham, UK
Posts: 418
|
Not a very scientific report here, but in one routine I've got a lev3 interrupt running using 99% of CPU, and outside the interrupt I'm doing a depack of an image from fast to chip (just in time for it to be displayed after about 10 seconds).
So the CPU is maxed and the depack is very much slowed, but the relative results are interesting:

Original image: 40KB
LZ4, 9247 bytes, 1-2 seconds
Doynamite, 8125 bytes, 4 seconds
nrv2s, 7840 bytes, 5 seconds
shrinkler, 6528 bytes, 20 seconds

For all the images I've compressed I'm seeing similar sorts of packing ratios. I'm spoiled for choice really now. I'm thinking lz4 for anything that needs ludicrous speed and doynamite/nrv2s for pretty much anything else (nrv2s/r if in-place is needed, ofc). Maybe shrinkler where you have loads of free time and need the best ratio.

Last edited by Antiriad_UK; 02 November 2019 at 13:52. Reason: Added shrinkler
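For comparing the figures above at a glance, the packed sizes can be turned into a percentage of the original. This is just a worked calculation on the numbers in the post; it assumes that "40KB" means 40 * 1024 = 40960 bytes, which the post does not state, and the helper name is made up.

```python
ORIGINAL_BYTES = 40 * 1024  # assumption: "40KB" taken as 40960 bytes

# Packed sizes in bytes, as reported in the post.
RESULTS = {
    "LZ4": 9247,
    "Doynamite": 8125,
    "nrv2s": 7840,
    "shrinkler": 6528,
}


def packed_percent(packed_bytes: int, original_bytes: int = ORIGINAL_BYTES) -> float:
    """Packed size as a percentage of the original (smaller is better)."""
    return 100.0 * packed_bytes / original_bytes


if __name__ == "__main__":
    for name, size in RESULTS.items():
        print(f"{name:10s} {size:5d} bytes  {packed_percent(size):5.1f}% of original")
```

Under that assumption the four packers span roughly 16% to 23% of the original size, while the reported depack times span a factor of ten or more, which matches the conclusion that speed rather than ratio is the deciding factor here.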
#60 |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 54
Posts: 4,501
|
Thanks for testing.
This is somewhat interesting. While the difference in ratio is expected considering the algorithms, I expected a greater speed difference between doynamite and nrv2s. This makes me think that maybe a dual-stream version of nrv2x might make sense.