English Amiga Board


Old 02 November 2019, 14:51   #61
Antiriad_UK
OCS forever!
 
Join Date: Mar 2019
Location: Birmingham, UK
Posts: 418
I like this comment in the depack source, ross.

Code:
bclr	d2,d2		; ;)
The mind boggles
Old 02 November 2019, 16:35   #62
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,479
Quote:
Originally Posted by Antiriad_UK View Post
The mind boggles
Magic!
Old 02 November 2019, 16:41   #63
hooverphonique
ex. demoscener "Bigmama"
 
Join Date: Jun 2012
Location: Fyn / Denmark
Posts: 1,629
Interesting thread, this!

I'm looking for a fast depacker (comparable to LZ4) for iteratively depacking, from a compressed stream, a fixed output amount of data (0.5-1kB) into the same buffer each frame, similar to depacking a media stream, except it's not media data.

Do any of you have suggestions for such a thing?
Old 02 November 2019, 17:00   #64
a/b
Registered User
 
Join Date: Jun 2016
Location: europe
Posts: 1,043
LZ4 can depack in-place, but you have to do several extra steps. I don't have working m68k code (it's on my todo list, since I didn't need it "right now"), but a prototype in C works fine. You'd have to:
- reverse the source data (last byte comes first)
- compress with regular lz4
- reverse the contents of the compressed data blocks (and if there's more than one, reverse the order of the blocks as well); for better performance you can leave the 16-bit match offsets in little-endian format (faster conversion from unaligned 16-bit to aligned big-endian)
- modify the decompressor to work from end to start (change postinc to predec for the most part)
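
The steps above can be sketched in Python, with a toy LZ format standing in for LZ4. Everything here is an assumption made for illustration: the token layout, the names `toy_compress`, `depack_backwards` and `roundtrip_in_place`, and the size of the safety margin are hypothetical. This is neither a/b's C prototype nor the real LZ4 block format; it only shows the reverse, pack, reverse, decode-backwards idea.

```python
def toy_compress(data: bytes) -> bytes:
    """Greedy toy LZ (NOT LZ4). Token < 0x80: literal run of token+1 bytes.
    Token >= 0x80: match of (token & 0x7F) + 3 bytes, then a 1-byte offset."""
    out, lits = bytearray(), bytearray()

    def flush_literals():
        for pos in range(0, len(lits), 128):
            run = lits[pos:pos + 128]
            out.append(len(run) - 1)
            out.extend(run)
        lits.clear()

    i, n = 0, len(data)
    while i < n:
        best_len = best_off = 0
        for off in range(1, min(i, 255) + 1):
            length = 0
            while (length < 130 and i + length < n
                   and data[i + length - off] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, off
        if best_len >= 3:
            flush_literals()
            out += bytes([0x80 | (best_len - 3), best_off])
            i += best_len
        else:
            lits.append(data[i])
            i += 1
    flush_literals()
    return bytes(out)


def depack_backwards(buf: bytearray, packed_len: int) -> None:
    """Decode from end to start: each forward post-increment becomes a pre-decrement."""
    src, dst = packed_len, len(buf)
    while src > 0:
        src -= 1
        token = buf[src]
        if token < 0x80:                       # literal run
            for _ in range(token + 1):
                src -= 1
                dst -= 1
                buf[dst] = buf[src]
        else:                                  # match: the source now lies above dst
            src -= 1
            offset = buf[src]
            for _ in range((token & 0x7F) + 3):
                dst -= 1
                buf[dst] = buf[dst + offset]
        assert dst >= src, "output overran unread packed data; margin too small"


def roundtrip_in_place(data: bytes) -> bytes:
    packed = toy_compress(data[::-1])[::-1]    # reverse, pack, reverse the packed bytes
    margin = len(data) // 128 + 1              # assumed slack for this toy format only
    buf = bytearray(margin + len(data))
    buf[:len(packed)] = packed                 # packed data sits at the low end
    depack_backwards(buf, len(packed))         # output grows down from the top
    return bytes(buf[margin:])


print(roundtrip_in_place(b"abcabcabcabc"))
```

The packed bytes and the output share one buffer; the small margin is there because literal runs cost one token byte more than they produce, so the write pointer could otherwise catch up with unread input.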
Old 02 November 2019, 17:17   #65
hooverphonique
ex. demoscener "Bigmama"
 
Join Date: Jun 2012
Location: Fyn / Denmark
Posts: 1,629
Thanks a/b. I'm not looking for in-place decompression per se, but for being able to call the decompressor iteratively to get a fixed-size chunk of decompressed bytes out of the same compressed stream.

But maybe it'll be easier to just compress each chunk (or several together, for a better ratio?) separately instead of the whole stream of chunks... Hmm...
Old 02 November 2019, 17:21   #66
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,479
If the blocks are that small I wouldn't even worry about in-place decompression.
I would put the data block directly on the stack and unpack forward from there.

But one important thing is not clear: should the LZ window refer to already decompressed data, or should every block refer exclusively to its own vocabulary?
Because it would make a huge difference in compression efficiency.

EDIT: posted before seeing the previous message
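
To get a feel for how much the window question matters, here is a minimal sketch that uses Python's `zlib` as a stand-in for an LZ4-style packer (an assumption; no LZ4 code is involved). The test data from `make_chunks` is hypothetical: noisy 512-byte frames that differ only slightly from one frame to the next. `pack_independent` gives every block its own vocabulary, while `pack_shared_window` keeps one stream and flushes it after each chunk so the depacker can still be called once per frame.

```python
import zlib


def make_chunks(count=16, size=512):
    """Hypothetical test data: noisy frames that change only slightly each frame."""
    state, frame = 12345, bytearray()
    for _ in range(size):
        state = (state * 1103515245 + 12345) & 0x7FFFFFFF
        frame.append((state >> 16) & 0xFF)
    chunks = []
    for i in range(count):
        frame[i] ^= 0xFF                       # small change per frame
        chunks.append(bytes(frame))
    return chunks


def pack_independent(chunks):
    """Every block refers exclusively to its own vocabulary."""
    return [zlib.compress(c, 9) for c in chunks]


def pack_shared_window(chunks):
    """One stream: the window keeps referring to already depacked data."""
    packer = zlib.compressobj(9)
    return [packer.compress(c) + packer.flush(zlib.Z_SYNC_FLUSH) for c in chunks]


def depack_shared_window(pieces):
    """Called once per frame: each piece yields exactly one chunk."""
    depacker = zlib.decompressobj()
    return [depacker.decompress(p) for p in pieces]


chunks = make_chunks()
independent = pack_independent(chunks)
shared = pack_shared_window(chunks)
print(sum(map(len, independent)), "bytes packed independently")
print(sum(map(len, shared)), "bytes packed with a shared window")
```

On data like this the shared window wins by a wide margin, because each frame is mostly a copy of the previous one. The price is that the previously depacked data has to stay available to the depacker, which is awkward when every frame is depacked into the same buffer.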
Old 03 November 2019, 00:07   #67
Galahad/FLT
Going nowhere
 
Join Date: Oct 2001
Location: United Kingdom
Age: 50
Posts: 9,014
Just so we don't get any unnecessary mod action: I'm happy for the thread to go in whatever direction it goes in, it's all helpful and people are being helped
Old 03 November 2019, 10:29   #68
Antiriad_UK
OCS forever!
 
Join Date: Mar 2019
Location: Birmingham, UK
Posts: 418
Couldn't sleep this morning so added Cranker and PackFire (large model) to my framework to give them all a go.

Remember the CPU is 99% busy in the lev3 interrupt and depacking only happens outside it, so this is interrupt hell, but I'm mostly interested in relative performance anyway. I depacked 4 times to make it easier to time with a stopwatch (so scientific).

The original file is a raw image, depacked 4 times: 4 x 40960 bytes.
The last column is just the rough speed vs LZ4. I've seen someone quote Shrinkler as about 22 times slower than lz, so it looks about right.
Code:
LZ4		9247 bytes	4.7s	(1x)
Cranker		8556 bytes	6.2s 	(1.3x)
Doynamite68k	8125 bytes	8.5s	(1.8x)
nrv2s		7840 bytes	9.7s	(2.0x)
Shrinkler	6528 bytes	110s	(23x)
PackFire	6432 bytes	219s	(46x)
Just to be clear, this table is not trying to make Shrinkler/PackFire look slow. PackFire does this test in 23s with interrupts turned off. It gets hard to measure the others then, though. Definitely a case of choosing the right tool for the job.

I was looking to use Cranker to compress my final exe as it's so fast but it looks pretty good for data too (and does in-place). Choice paralysis!

Edit: Table fixed.

Last edited by Antiriad_UK; 03 November 2019 at 11:06. Reason: Fixed due to bad test
Old 03 November 2019, 12:44   #69
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 2,002
Quote:
Originally Posted by Antiriad_UK View Post
Couldn't sleep this morning so added Cranker and PackFire (large model) to my framework to give them all a go.

You can add Arj mode 7 and xpk_SHR3 for comparison.
Old 03 November 2019, 13:32   #70
hooverphonique
ex. demoscener "Bigmama"
 
Join Date: Jun 2012
Location: Fyn / Denmark
Posts: 1,629
Quote:
Originally Posted by ross View Post
But one important thing is not clear: should the LZ window refer to already decompressed data, or should every block refer exclusively to its own vocabulary?
Because it would make a huge difference in compression efficiency.
Good question, Ross. The answer is probably that the best compression is achieved with a vocabulary that updates "slowly", i.e. not for each block, but also not static over the whole stream.
Old 03 November 2019, 13:42   #71
Antiriad_UK
OCS forever!
 
Join Date: Mar 2019
Location: Birmingham, UK
Posts: 418
Quote:
Originally Posted by Don_Adan View Post
You can add Arj mode 7 and xpk_SHR3 for comparison.
I haven't managed to get that arj depacker code to work yet. I think the util that was suggested for Windows was arjbeta.exe. The depacker code barfs at its output, but I think that's because the resulting file is an archive, so you have to work out how to skip to the right offset in the file first.
Old 03 November 2019, 13:50   #72
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 2,002
Quote:
Originally Posted by Antiriad_UK View Post
I haven't managed to get that arj depacker code to work yet. I think the util that was suggested for Windows was arjbeta.exe. The depacker code barfs at its output, but I think that's because the resulting file is an archive, so you have to work out how to skip to the right offset in the file first.
You must skip some bytes at the beginning of a file packed with arjbeta.exe. You can check Turrican 2 (WT version).
Old 03 November 2019, 14:05   #73
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,479
Quote:
Originally Posted by Don_Adan View Post
You can add Arj mode 7 and xpk_SHR3 for comparison.
Yes, ARJ7m absolutely deserves to enter the test list.
Never checked xpk_SHR3, I should do it.

Quote:
Originally Posted by hooverphonique View Post
Good question, Ross. The answer is probably that the best compression is achieved with a vocabulary that updates "slowly", i.e. not for each block, but also not static over the whole stream.
Ok, so that's what I had in mind to do but I never did


Quote:
Originally Posted by Antiriad_UK View Post
Not managed to get that arj depacker code to work yet.
Now you can
Old 03 November 2019, 18:40   #74
Antiriad_UK
OCS forever!
 
Join Date: Mar 2019
Location: Birmingham, UK
Posts: 418
Quote:
Originally Posted by ross View Post
Yes, ARJ7m absolutely deserves to enter the test list.

Now you can
Thanks for the arj file. Added to the toolbox
Code:
LZ4		9247 bytes	4.7s	(1x)
Cranker		8556 bytes	6.2s 	(1.3x)
Doynamite68k	8125 bytes	8.5s	(1.8x)
nrv2s		7840 bytes	9.7s	(2.0x)
Arj m7		7276 bytes	16.4s	(3.4x)
Shrinkler	6528 bytes	110s	(23x)
PackFire	6432 bytes	219s	(46x)
I can see why Dan said they used lz4+arj as required for De Profundis - they sit at opposite ends of the "reasonable" speed vs compression scale it seems.
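
The derived figures in the table can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: the packed sizes are taken to refer to a single 40960-byte image, the throughput is for the heavily loaded test setup described earlier (so it says little about an idle CPU), and the names `RESULTS` and `summarise` are made up for the example.

```python
ORIGINAL_SIZE = 40960            # one raw image, as stated in the earlier post
RUNS = 4                         # the image was depacked four times per timing

# (name, packed bytes, seconds for RUNS depacks), copied from the table above
RESULTS = [
    ("LZ4", 9247, 4.7),
    ("Cranker", 8556, 6.2),
    ("Doynamite68k", 8125, 8.5),
    ("nrv2s", 7840, 9.7),
    ("Arj m7", 7276, 16.4),
    ("Shrinkler", 6528, 110.0),
    ("PackFire", 6432, 219.0),
]


def summarise(results, baseline="LZ4"):
    """Packed size in percent, slowdown vs the baseline, and KiB/s depacked."""
    base_time = next(t for name, _, t in results if name == baseline)
    rows = []
    for name, packed, seconds in results:
        rows.append({
            "name": name,
            "packed_pct": round(100 * packed / ORIGINAL_SIZE, 1),
            "slowdown": seconds / base_time,
            "kib_per_s": round(RUNS * ORIGINAL_SIZE / seconds / 1024, 1),
        })
    return rows


for row in summarise(RESULTS):
    print(f"{row['name']:<13}{row['packed_pct']:>5}%"
          f"  {row['slowdown']:>5.1f}x  {row['kib_per_s']:>6} KiB/s")
```

The slowdown column of the table appears to be truncated rather than rounded (16.4 / 4.7 is about 3.49, listed as 3.4x), so the printed values can differ from it in the last digit.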
Old 03 November 2019, 23:14   #75
DanScott
Lemon. / Core Design
 
Join Date: Mar 2016
Location: Tier 5
Posts: 1,212
LZ4 was essential for the infinite roto-zoomer... had 128 frames to decompress the next image to be used, with about 30-45% of a frame free for the background decompression (actually less, as there was a LOT of copper action happening too)

ARJ was mainly used for the compression of each demo part (to be loaded and decompressed in the background)
Old 16 December 2019, 15:36   #76
VladR
Registered User
 
Join Date: Dec 2019
Location: North Dakota
Posts: 741
Hi guys. Just joined the board. Happy to be here !

I'm working on a 4 KB demo for the Atari Jaguar and am in need of a serious packer. I don't mind abysmal depacking times, as it can't possibly take that long with just a couple of KB.

Is Shrinkler achieving best compression ratio ?

Obviously, there is going to be a threshold where depacker, regardless of its compression ratio, is simply too big for a 4 KB executable.

Are there any known 4 KB-specific depackers with the best ratio? I don't mind implementing one myself; I just need to avoid the scenario where I implement something and a week later find out I should have implemented a different depacker.

The target data payload is:
68000 code
GPU RISC code
DSP RISC code
gfx data
audio data
Old 16 December 2019, 15:48   #77
Galahad/FLT
Going nowhere
 
Join Date: Oct 2001
Location: United Kingdom
Age: 50
Posts: 9,014
Quote:
Originally Posted by VladR View Post
Hi guys. Just joined the board. Happy to be here !

I'm working on a 4 KB demo for the Atari Jaguar and am in need of a serious packer. I don't mind abysmal depacking times, as it can't possibly take that long with just a couple of KB.

I'm sure someone can give you good advice on this, as there are quite a few demo programmers who are adept on different retro computing formats.
Old 16 December 2019, 17:16   #78
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,479
Quote:
Originally Posted by VladR View Post
Is Shrinkler achieving best compression ratio ?

Obviously, there is going to be a threshold where depacker, regardless of its compression ratio, is simply too big for a 4 KB executable.
Yes, Shrinkler is perfect for a 4KB demo/intro; it normally has the best compression ratio compared to all other compressors.
The depacker code is also very tiny (but be prepared for long depacking, even for 4KB of data).
Old 17 December 2019, 10:25   #79
VladR
Registered User
 
Join Date: Dec 2019
Location: North Dakota
Posts: 741
Quote:
Originally Posted by ross View Post
Yes, Shrinkler is perfect for a 4KB demo/intro; it normally has the best compression ratio compared to all other compressors.
The depacker code is also very tiny (but be prepared for long depacking, even for 4KB of data).
Thanks!


I just tried using it on my current build (without audio yet) with option -d and it compressed 3,904 bytes into 2,080, i.e. about 53.3 % of the original size.
But I do have 1,024 bytes of heightmap terrain data in there (which will shrink to 256 soon), so I'm not sure how that affects the ratio.


Nice !


Looks like there is even a 68000 depacker source already and it's just 3 pages of code, so that shouldn't be an issue at all (as I worried initially).




Just curious: can it even take more than a second to depack 8 KB on a 13.3 MHz 68000? I will eventually find out; I'm just wondering if some kind of visual loading progress is even needed for such a minuscule size.


Anything around 2-3 seconds is not a big deal when the compo is running. If it was 10 seconds, then yeah - I would need some loading screen.


Worst case scenario, I could implement it on GPU, though such RISC code might easily take double to triple the size of 68000, so I'd rather not go that route...


I think I saw some timing benchmarks around in this thread, so I'll reread from first page...
Old 17 December 2019, 11:00   #80
VladR
Registered User
 
Join Date: Dec 2019
Location: North Dakota
Posts: 741
Quote:
Originally Posted by leonard View Post
if anyone is interested in a fast LZ4 68k depacker, you can use one of my three versions here (tiny, normal and fast):

https://github.com/arnaud-carre/lz4-68k

If you need extreme packing ratio and you don't care about decompression time, then use Shrinkler.
If you need very good packing ratio with average depacking speed (about 15KiB/s, same speed as floppy loading), use ARJ mode 7.
If you need almost as good a packing ratio as ARJ mode 7 and about 2 times faster depacking, use UPX (nrv2b).
If you need extremely fast depacking speed without a ridiculous packing ratio, use LZ4.

I did an ATARI depacking benchmark, you can look at the results here: https://ibb.co/JKtQFVt

The first column is the packed binary file size (smaller is better). Then the name of the file (you have lz77, pft=packfire tiny, lz, am7=arj mode 7, shk=shrinkler).

The number between brackets () is the decompressor code size.

Then the last column is the decompressor speed (number of 50hz ticks to depack). Smaller is faster.
That is an awesome benchmark. Thanks for the work involved!


So, if I'm reading it right, Shrinkler takes more than an order of magnitude longer than the fastest one? Wow, I guess I won't be using it for my 4 MB demos. It also appears that ~3KB were decompressed in almost exactly 1 second (49 vbls out of 50), which is just about my target size.
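
As a rough worked example, the tick counts from that benchmark can be turned into seconds and scaled to another size. This is a back-of-the-envelope sketch only: it assumes depack time scales linearly with output size, takes the "~3KB in 49 ticks" reading at face value, and ignores any clock difference between the benchmark machine and a 13.3 MHz 68000. The helper names are made up.

```python
TICK_HZ = 50                     # one tick per vertical blank at 50 Hz


def ticks_to_seconds(ticks, hz=TICK_HZ):
    return ticks / hz


def estimate_seconds(size_bytes, ref_bytes, ref_ticks, hz=TICK_HZ):
    """Scale a measured depack time linearly to another output size."""
    return size_bytes * ticks_to_seconds(ref_ticks, hz) / ref_bytes


print(ticks_to_seconds(49))                                  # 0.98
print(round(estimate_seconds(8 * 1024, 3 * 1024, 49), 2))    # 2.61
```

Under those assumptions, 8 KB would land in the 2-3 second range on the benchmark machine.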