English Amiga Board


Old 15 September 2021, 16:17   #21
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
Quote:
Originally Posted by bebbo View Post
hm...
Very nice report.

However the purpose is not clear to me
(there is none of the compressors we talked about)

Is it to show compression speeds that are actually acceptable?
Old 15 September 2021, 19:53   #22
Photon
Moderator
 
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,602
Nice thread, getting the itch as always, a few thoughts.

Leonard: Floppy speed is a 'decent' reference but also relative. I've apparently used 28936 b/s as the definition (I don't remember the calculation, but it includes MFM decoding). However, if performance is desired, it's not good to settle for floppy speed, because then you have little time for 'action' (Tai-Pan/Phalanx definition). For such needs I would put 'good enough' at twice floppy speed at least, or about 58K/s.

Sometimes performance isn't a big deal (example: onefiler-on-floppy) and then any decompression speed is good, slower than floppy speed could even save buffers if you risk it. So this is why I think floppy speed is a decent reference but not necessarily the goal of a competitive decruncher.
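
To put rough numbers on the reference above, here is a minimal sketch in C. Only the 28936 b/s figure and the "twice floppy speed" target come from the post; the 50 Hz frame rate and all function names are assumptions for illustration.

```c
#include <assert.h>

/* Floppy throughput quoted in the post, in bytes per second. */
#define FLOPPY_BPS 28936u
/* Assumption for illustration only: 50 frames per second (PAL). */
#define PAL_FPS 50u

/* "Good enough" target from the post: at least twice floppy speed. */
static unsigned target_bps(void) {
    return 2u * FLOPPY_BPS;
}

/* Convert bytes per second to whole bytes per frame (rounded down). */
static unsigned bytes_per_frame(unsigned bps, unsigned fps) {
    assert(fps > 0);
    return bps / fps;
}

/* Nonzero if a decruncher's measured output rate meets the target. */
static int is_good_enough(unsigned measured_bps) {
    return measured_bps >= target_bps();
}
```

Under that 50 Hz assumption, floppy speed works out to 578 bytes per frame and the doubled target to 1157 bytes per frame, which is the kind of budget a loader running alongside 'action' would have to fit in.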

This was the reasoning behind creating Nibbler (new algorithm).

a/b: Like the initiative but Shrinkler is at 0 bytes/s? If you could check the axis, feel free to place Nibbler somewhere. My chart has quite few data points and was measured before all these legacy algorithms were ported and explored. (Though old, they can reach great ratios if run exhaustively, and the same is true if some features are removed to improve decompression speed, so the fastest versions of them should not be discounted but run a million times to make the most of them with modern tech 35y later.)

It would be better with more datapoints and categorized by type of content (sorry, was too lazy to add all at the time), because algorithm and setting can affect ratio depending on it. Often I see this "ratio!" with no concern for the type of content. There is nothing that says you shouldn't use multiple crunchers in a single release, or over separate releases, but there's some desire there to just "make it smaller and never change my tools". It hasn't been possible yet, and maybe there's a lesson there to keep exploring

I have a burning desire to finish my improvements to Nibbler, but the stats-running+analysis is very time-consuming, and I must finish previous obligations first.

Last edited by Photon; 15 September 2021 at 20:03.
Old 15 September 2021, 20:18   #23
a/b
Registered User
 
Join Date: Jun 2016
Location: europe
Posts: 1,039
It's not my image, I just found it elsewhere and decided to post it because it looked interesting and relevant, showing zx0's position compared to some of the other known algorithms.
BTW, horizontal axis is inverted: right side is copy speed (ldir=1), and left side is 25x+ copy speed, which is where Shrinkler resides.
Old 15 September 2021, 21:02   #24
NorthWay
Registered User
 
Join Date: May 2013
Location: Grimstad / Norway
Posts: 839
Quote:
Originally Posted by ross View Post
...
Thanks.


I have always thought of LZ as using [offset,length] pairs - typically adding up to 16 bits - but I guess I should think of it as [length[,offset]]
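
The two ways of looking at it can be sketched in C. This is only an illustration: the 12-bit/4-bit split, the minimum match length of 3 and every name here are assumptions, not the layout of any particular packer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical classic 16-bit pair: 12-bit offset, 4-bit length. */
#define OFFSET_BITS 12
#define LENGTH_BITS 4
#define MIN_MATCH   3   /* assumed minimum match length */

/* Pack offset (1..4096) and length (3..18) into one 16-bit word. */
static uint16_t pack_pair(unsigned offset, unsigned length) {
    assert(offset >= 1 && offset <= (1u << OFFSET_BITS));
    assert(length >= MIN_MATCH && length < MIN_MATCH + (1u << LENGTH_BITS));
    return (uint16_t)(((offset - 1) << LENGTH_BITS) | (length - MIN_MATCH));
}

static unsigned pair_offset(uint16_t pair) {
    return (unsigned)(pair >> LENGTH_BITS) + 1u;
}

static unsigned pair_length(uint16_t pair) {
    return (pair & ((1u << LENGTH_BITS) - 1u)) + MIN_MATCH;
}

/* The [length[,offset]] view: every token has a length, but only some
 * carry an offset (a literal run has none, and a match may reuse the
 * previous offset). */
typedef enum {
    TOK_LITERALS,
    TOK_MATCH_LAST_OFFSET,
    TOK_MATCH_NEW_OFFSET
} token_kind;

typedef struct {
    token_kind kind;
    unsigned   length;  /* always present */
    unsigned   offset;  /* meaningful only for TOK_MATCH_NEW_OFFSET */
} lz_token;

static int token_has_offset(const lz_token *t) {
    return t->kind == TOK_MATCH_NEW_OFFSET;
}
```

In the second view the length is the part every token shares, and the offset becomes an optional extra field.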
Old 15 September 2021, 22:39   #25
leonard
Registered User
 
Join Date: Apr 2013
Location: paris
Posts: 133
zx0 has really nice properties. I took the time to pack my AmigAtari demo, as this is the most challenging data to fit on a single floppy. Although zx0 did a very good job, it doesn't manage to make AmigAtari fit on the disk. Here is the original arjm7 AmigAtari version:

Code:
            boot.bin    310    310 (100%) [---] Off:$000000 (00/0/01:$000) (user arg=0)
       dirkernel.tmp   9080   6720 ( 74%) [AR4] Off:$000136 (00/0/01:$136) (user arg=0)
       logo_fade.bin 131160  27284 ( 20%) [AR7] Off:$001b76 (00/1/03:$176) (user arg=0)(C:128KiB F:  1KiB)
            main.bin 331380 122194 ( 36%) [AR7] Off:$00860a (03/0/02:$00a) (user arg=0)(C:422KiB F:234KiB)
        ym7Pack0.bin  18672  12402 ( 66%) [AR7] Off:$02635c (13/1/09:$15c) (user arg=0)
        ym7Pack1.bin 202162 113136 ( 55%) [AR7] Off:$0293ce (14/1/11:$1ce) (user arg=0)
        ym7Pack2.bin 200448 112806 ( 56%) [AR7] Off:$044dbe (25/0/01:$1be) (user arg=0)
        ym7Pack3.bin 199646 107798 ( 53%) [AR7] Off:$060664 (35/0/02:$064) (user arg=0)
        ym7Pack4.bin 203764 106350 ( 52%) [AR7] Off:$07ab7a (44/1/03:$17a) (user arg=0)
        ym7Pack5.bin 174760  91400 ( 52%) [AR7] Off:$094ae8 (54/0/02:$0e8) (user arg=0)
        ym7Pack6.bin 128800  58362 ( 45%) [AR7] Off:$0aaff0 (62/0/04:$1f0) (user arg=0)
     CosoPackLz4.bin 218912 142048 ( 64%) [AR7] Off:$0b93ea (67/0/08:$1ea) (user arg=0)
----------------------------------------------------------------
Saving AmigAtari.adf:
Disk contains 12 files, packing ratio: 49%
1777KiB packed to 880KiB ( 1819094 to 900810 bytes )
1KiB left ( 310 bytes )
and now the result using zx0

Code:
            boot.bin    310    310 (100%) [---] Off:$000000 (00/0/01:$000) (user arg=0)
       dirkernel.tmp   9180   6800 ( 74%) [AR4] Off:$000136 (00/0/01:$136) (user arg=0)
       logo_fade.bin 131160  27364 ( 20%) [AR7] Off:$001bc6 (00/1/03:$1c6) (user arg=0)(C:128KiB F:  1KiB)
            main.bin 331380 123368 ( 37%) [AR7] Off:$0086aa (03/0/02:$0aa) (user arg=0)(C:422KiB F:234KiB)
        ym7Pack0.bin  18672  14052 ( 75%) [AR7] Off:$026892 (14/0/01:$092) (user arg=0)
        ym7Pack1.bin 202162 129864 ( 64%) [AR7] Off:$029f76 (15/0/06:$176) (user arg=0)
        ym7Pack2.bin 200448 125772 ( 62%) [AR7] Off:$049abe (26/1/07:$0be) (user arg=0)
        ym7Pack3.bin 199646 130882 ( 65%) [AR7] Off:$06860a (37/1/11:$00a) (user arg=0)
        ym7Pack4.bin 203764 125114 ( 61%) [AR7] Off:$08854c (49/1/02:$14c) (user arg=0)
        ym7Pack5.bin 174760 108342 ( 61%) [AR7] Off:$0a6e06 (60/1/05:$006) (user arg=0)
        ym7Pack6.bin 128800  66602 ( 51%) [AR7] Off:$0c153c (70/0/07:$13c) (user arg=0)
     CosoPackLz4.bin 218912 160446 ( 73%) [AR7] Off:$0d1966 (76/0/05:$166) (user arg=0)
ERROR: Don't fit on the disk.
The first 3 files have pretty much the same ratio, but the others are a bit larger with zx0, especially the last one, "CosoPackLz4.bin". These files are special: they are already LZ4 packed. This is the key feature of AmigAtari: LZ4 so the package could fit in memory, and LZ4 + Arj packing so it fits on the disk (LZ4 output can be packed again because it's a byte stream format).
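
As a cross-check of the two listings, here is a small C sketch that adds up the packed sizes and compares them with the disk capacity. The sizes are copied from the listings; the 880 KiB capacity is taken from the "packed to 880KiB" line, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Capacity implied by the first listing: 880 KiB. */
#define DISK_BYTES (880L * 1024L)

/* Packed sizes copied from the first (arjm7) listing. */
static const long arj_packed[] = {
    310, 6720, 27284, 122194, 12402, 113136,
    112806, 107798, 106350, 91400, 58362, 142048
};

/* Packed sizes copied from the second (zx0) listing. */
static const long zx0_packed[] = {
    310, 6800, 27364, 123368, 14052, 129864,
    125772, 130882, 125114, 108342, 66602, 160446
};

/* Sum of n packed sizes. */
static long sum(const long *v, size_t n) {
    assert(v != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Bytes left on the disk; negative means the set does not fit. */
static long bytes_left(const long *v, size_t n) {
    return DISK_BYTES - sum(v, n);
}
```

With these figures the arjm7 set totals 900810 bytes and leaves 310 bytes, matching the listing, while the zx0 set totals 1018916 bytes and is 117796 bytes (about 115 KiB) over.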

I'm still looking for another packer that could fit AmigAtari on a floppy....
Old 15 September 2021, 22:48   #26
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
Quote:
Originally Posted by leonard View Post
I'm still looking for another packer that could fit AmigAtari on a floppy....
Add an entropy coding stage to ZX0
Old 15 September 2021, 23:12   #27
leonard
Registered User
 
Join Date: Apr 2013
Location: paris
Posts: 133
Quote:
Originally Posted by ross View Post
Add an entropy coding stage to ZX0
That's not that simple... One of the key features of zx0 is that there is no entropy coding, so the packer can brute force all combinations of literals + offset/length pairs (and it also supports strings of literals instead of 9 bits per literal as in standard LZxx).

If you add entropy coding, it becomes extremely hard to brute force the search space.

Zx0 is a really powerful packer for small files and small platforms. All the energy is spent at the compression stage. Such a good packing ratio for such a simple depacker is beautiful
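
To illustrate the point about literal strings, here is a rough cost model in C. It is a simplification and not the exact ZX0 bit layout: it assumes a run costs one token bit, an Elias gamma coded length and 8 bits per byte, against 9 bits per literal for a flag-per-literal LZ. All names are made up for the example.

```c
#include <assert.h>

/* Bits in the Elias gamma code of value (value >= 1). */
static int elias_gamma_bits(int value) {
    assert(value >= 1);
    int bits = 1;
    while (value > 1) {
        bits += 2;
        value >>= 1;
    }
    return bits;
}

/* Classic flag-per-literal LZ: 1 flag bit + 8 data bits per literal. */
static long cost_flag_per_literal(int n) {
    return 9L * n;
}

/* Simplified literal-run model (NOT the exact ZX0 layout):
 * 1 token bit + Elias gamma run length + 8 data bits per literal. */
static long cost_literal_run(int n) {
    return 1L + elias_gamma_bits(n) + 8L * n;
}
```

In this model the two break even at 8 literals (72 bits each), and a run of 16 costs 138 bits instead of 144. The more relevant property for the brute force search is that every cost is a fixed, known number of bits, which stops being true once an entropy coder sits behind the tokens.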
Old 15 September 2021, 23:33   #28
Photon
Moderator
 
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,602
Quote:
Originally Posted by a/b View Post
It's not my image, I just found it elsewhere and decided to post it because it looked interesting and relevant, showing zx0's position compared to some of the other known algorithms.
BTW, horizontal axis is inverted: right side is copy speed (ldir=1), and left side is 25x+ copy speed, which is where Shrinkler resides.
Copy speed is indeed another reference point on the axis, one which a decompressor normally can't exceed, but you can get close.

The image is bad because I see an axis starting at 0 bytes per "frame" and ending at 2500 bytes per "frame". And 0 bytes per anything is 0 bytes per second. Please don't put Nibbler on this image, and I've already stressed the importance of the type of content before strictly committing to a single cruncher, if ever.

Quote:
Originally Posted by leonard View Post
I'm still looking for another packer that could fit AmigAtari on a floppy....
You have posted this here as a challenge even though it fits, but swap out one or two files (such as the already compressed files, which might fare great with a basic RLE...), and arjm7 might not be the one to beat.

Again as per my "loading scheme" paragraphs much can be done for presentation by staging the decompression and using the right tool for each stage.
Old 15 September 2021, 23:41   #29
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
Quote:
Originally Posted by leonard View Post
that's not that simple...
Yep, mine was basically a joke, I'm absolutely aware of how complicated such a thing would be.
Much more feasible to add support for distant offsets (maybe using -q compression, otherwise the compression time would be..)
Old 15 September 2021, 23:55   #30
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,960
Easy, use LZMA only for the CosoPack file. Add a nice jingle while this file is depacked for a few minutes, and you can use another packer for the rest of the files. Or use LZMA only for the packed main (big) file (like in BC Kid 1 disk) and add a jingle with nice music and the text "Please wait loading and depacking." People can/must wait at the beginning only. The rest of the files will be depacked quickly.
Old 16 September 2021, 08:00   #31
meynaf
son of 68k
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
Quote:
Originally Posted by leonard View Post
I'm still looking for another packer that could fit AmigAtari on a floppy....
Personally I would reconsider using your own encoded soundchip output rather than the original music code and data. For example, COSO/TFMX songs are quite small and you could reuse player code where appropriate.
Old 16 September 2021, 10:43   #32
a/b
Registered User
 
Join Date: Jun 2016
Location: europe
Posts: 1,039
@Thcm optimize can be a bit faster with this:
Code:
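#if defined _MSC_VER
#include <intrin.h>     // _BitScanReverse
#endif

// Number of bits in the Elias gamma code of value (value must be >= 1).
int elias_gamma_bits(int value) {
#if defined __GNUC__
    // written this way to cancel out a xor in __builtin_clz()
    // (__builtin_clz(0) is undefined, hence value >= 1)
    return 1 + ((__builtin_clz(value)^(sizeof(int)*8-1))<<1);
#elif defined _MSC_VER
    unsigned long bits;     // _BitScanReverse takes an unsigned long index
    _BitScanReverse(&bits,(unsigned long)value);
    return  1 + ((int)bits<<1);
#else
    int bits = 1;
    while (value > 1) {
        bits += 2;
        value >>= 1;
    }
    return bits;
#endif
}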
When compiled with optimizations, this will end up using a single bsr instruction on satan-cpu to find the highest 1. I don't use other compilers so only those two cases are covered.
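
For anyone who wants to convince themselves that the bit-scan branches match the fallback loop, here is a compiler-independent sketch. Both branches compute 1 + 2*floor(log2(value)); in the GCC branch the xor works because x ^ 31 equals 31 - x for any x in 0..31. The function names below are made up for this check and are not part of the optimizer source.

```c
#include <assert.h>

/* Portable reference: same loop as the fallback branch above. */
static int gamma_bits_loop(int value) {
    assert(value >= 1);
    int bits = 1;
    while (value > 1) {
        bits += 2;
        value >>= 1;
    }
    return bits;
}

/* Index of the highest set bit, i.e. floor(log2(value)) for value >= 1. */
static int highest_bit(unsigned value) {
    int index = 0;
    while ((value >>= 1) != 0)
        index++;
    return index;
}

/* Closed form that the bit-scan branches compute: 2*floor(log2(v)) + 1. */
static int gamma_bits_closed(int value) {
    assert(value >= 1);
    return 1 + (highest_bit((unsigned)value) << 1);
}

/* Returns 1 if both versions agree for every value in [1, limit]. */
static int gamma_bits_agree(int limit) {
    for (int v = 1; v <= limit; v++)
        if (gamma_bits_loop(v) != gamma_bits_closed(v))
            return 0;
    return 1;
}
```

A call such as gamma_bits_agree(100000) is enough to cover the value ranges this kind of packer deals with.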
Old 16 September 2021, 11:26   #33
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,960
Or you can depack the LZ4 packed files and use zx0 directly on these files. Some packers don't like packing already packed files, and LZ4 seems to be an average packer to me. A second option is to split big files into smaller parts, like 30KB or 60KB, and check if zx0 can pack these better. Anyway, double-packed files are never a good option for me. A good packer must always pack the original (unpacked) file better than one already packed with another packer.
Old 16 September 2021, 14:02   #34
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
I see 3 possible improvements/optimizations that can be applied to the ZX0 68k depacker (apart from the obvious ones concerning micro-optimization), of course modifying the bitcoding structure.

One that I used in nrv2x, one that I used in aplibx and one that concerns the structure of the raw stream itself (it has a really cool feature, if I'm not wrong it's an 'even' encoding!*).

Also the 'end token' would be better moved into the literal run stream decoding, for two reasons: it probably guarantees a less invasive check (at the cost of a few more end bits, but who cares) and it can solve the >64k literal run case, which is plain wrong in the current decoder.
I think we can gain a decent decompression speed compared to the available code (which, to tell the truth, I haven't tried yet).
Tonight I'll work on it a bit.

*probably in the 68k case an 'odd' encoding is better; this means that the 'startup' byte should be different (and a single bit gained)
Old 16 September 2021, 14:03   #35
leonard
Registered User
 
Join Date: Apr 2013
Location: paris
Posts: 133
Quote:
Originally Posted by meynaf View Post
Personally I would reconsider using your own encoded soundchip output rather than the original music code and data. For example, COSO/TFMX songs are quite small and you could reuse player code where appropriate.
Oh, of course it would make the data a lot smaller! But this is a totally different issue. I just used the AmigAtari demo as a size benchmark. I could use the "De Profundis" demo too, which contains a lot of Amiga demo data (but packing 2 disks with zx0 could take hours).
Old 16 September 2021, 15:37   #36
a/b
Registered User
 
Join Date: Jun 2016
Location: europe
Posts: 1,039
Quote:
Originally Posted by ross View Post
... One that I used in nrv2x...
What I changed in nrv2b... inverted all match offset bits, and the rest was minor patching of bits and bytes at the end to reduce overrun and size a tiny bit. However, I didn't have any interest in in-place depacking, and I was happy with depacker size and speed so I didn't mess with the bitcode any further.
I see a not.b d0 (offset hi-byte) in depacker, plus lsl.w #8 so there is definitely a potential there :P.
Old 16 September 2021, 16:04   #37
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
Quote:
Originally Posted by a/b View Post
However, I didn't have any interest in in-place depacking
Well, actually there is a big interest in in-place depacking (backward or forward ones).
For one simple reason: many coders don't want to know anything about offsets, safety bytes, pre-set buffers and arcane requirements.
They want to allocate the original memory and tell the unpacker to do the job like a blackbox.
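
As a rough idea of the bookkeeping such a black box hides, here is a C sketch of the usual forward in-place layout: the packed data is loaded flush with the end of the destination buffer and unpacked from the start. The struct, the field names and the idea that the packer reports the safety margin are assumptions for illustration, not the interface of any specific depacker.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one packed file; safety_bytes is the
 * margin a packer would report for in-place use (assumed here). */
typedef struct {
    size_t unpacked_size;
    size_t packed_size;
    size_t safety_bytes;
} packed_info;

/* Total buffer the caller must allocate for forward in-place depacking. */
static size_t inplace_buffer_size(const packed_info *p) {
    return p->unpacked_size + p->safety_bytes;
}

/* Offset at which to load the packed data: flush with the buffer end,
 * so the write pointer (starting at offset 0) stays behind the reads. */
static size_t inplace_load_offset(const packed_info *p) {
    assert(p->packed_size <= inplace_buffer_size(p));
    return inplace_buffer_size(p) - p->packed_size;
}
```

For a 6912 byte file packed to 2294 bytes with a made-up margin of 3 bytes, that gives a 6915 byte buffer with the packed data loaded at offset 4621. A friendly unpacker computes this itself, so the coder only sees "allocate and call".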

Quote:
and I was happy with depacker size and speed so I didn't mess with the bitcode any further.
I thought ZX0 would stimulate the optimizer-man inside of you
(and I'm not talking about micro-optimizations, which everyone can do).

Quote:
I see a not.b d0 (offset hi-byte) in depacker, plus lsl.w #8 so there is definitely a potential there :P.
Yep, the one I've used in aplibx that uses a similar approach for offsets
ross is offline  
Old 16 September 2021, 20:44   #38
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
Ok, new provisional bitcode defined (EDIT: AZX0? Name to be defined).

The unpacker code is 120 bytes with no subroutines (it is unrolled as much as possible) and contains some accelerators; it fully supports >16-bit offsets and has protection for literal runs >64k.
It does not yet contain the outline code for in-place decompression and some details.
So far it works only in my head, because I haven't even encoded a synthetic stream to try it on (so for sure it will have to be fixed here and there).
Fortunately the stream is really simple so it didn't take me that long; I think I used all the tricks I could (not many, because the code is too small..).

If I haven't made huge conceptual errors (which could well be), I'll try to generate the new bitcode shortly.
I expect good speed.

Last edited by ross; 16 September 2021 at 20:53.
Old 17 September 2021, 17:47   #39
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
More code unrolled, more accelerated paths, now I'm near 150 bytes.
Maybe time to stop, I'm at the short branches limit.

And I haven't tried with a synthetic stream yet, so nothing might work.
I now use much more of the original bitcode so the conversion should be simplified.
Old 22 September 2021, 17:27   #40
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
Ok, now I've a working packer and unpacker for the 'new' ZX0 stream, optimized and friendly to 68k.

I've tested it with "cobra.scr", the file mentioned on the github (a little file of 6912 bytes, packed to 2294).

Impressive results (they went beyond my wildest expectations):
- original stream and original 68k decoder: 387600 cycles
- 'new' stream and my optimized decoder: 275700 cycles

This is a whopping 40% speed increase!
Those who know about decompressors know it is a remarkable achievement.
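
For reference, the arithmetic behind that figure can be sketched in C. The cycle counts and the 6912 byte size are the ones quoted above; the function names are made up for the example.

```c
#include <assert.h>

/* Cycle counts and unpacked size quoted in the post. */
#define CYCLES_ORIGINAL 387600L
#define CYCLES_NEW      275700L
#define OUTPUT_BYTES    6912L

/* Throughput gain in percent: (old / new - 1) * 100. */
static double speedup_percent(long old_cycles, long new_cycles) {
    assert(new_cycles > 0);
    return ((double)old_cycles / (double)new_cycles - 1.0) * 100.0;
}

/* Time saved in percent: (1 - new / old) * 100. */
static double cycles_saved_percent(long old_cycles, long new_cycles) {
    assert(old_cycles > 0);
    return (1.0 - (double)new_cycles / (double)old_cycles) * 100.0;
}

/* Average cycles spent per unpacked byte. */
static double cycles_per_byte(long cycles, long bytes) {
    assert(bytes > 0);
    return (double)cycles / (double)bytes;
}
```

So the 40% is the gain in throughput (about 40.6%), which corresponds to roughly 28.9% fewer cycles, or about 56.1 versus 39.9 cycles per unpacked byte on this file.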

I would also have done an absolute speed comparison with other (de)packers, but as long as there is no support for large offsets it makes no sense.
 

