English Amiga Board


Old 19 February 2019, 01:36   #41
AMIGASYSTEM
Registered User
 
Join Date: Aug 2014
Location: Brindisi (Italy)
Age: 70
Posts: 8,248
Quote:
Originally Posted by chip View Post
Ok, test finished

This is the list of unsupported formats found

StoneCracker 3.0 (exe)
MasterCruncher 3.0 Addr (exe)
ByteKiller 1.3 (exe)
PowerPacker 4.0 (exe)
StoneCracker4.04 Data (data)
Imploder (exe)

In the zone one file for each of the above

These files are all easily unpackable with a click with LZ

Old 19 February 2019, 08:20   #42
chip
Registered User
 
Join Date: Oct 2012
Location: Italy
Age: 49
Posts: 2,942
I didn't know about this "LZ".

Is it an Amiga application or a Windows program?
Old 20 February 2019, 21:27   #43
temisu
Registered User
 
Join Date: Mar 2017
Location: Tallinn / Estonia
Posts: 74
Quote:
Originally Posted by chip View Post
So, summarizing, main goal of ancient is to decompress crunched data, not crunched exe ?
Actually, this is a rather good description. I'm seriously thinking of adding this line to the Readme. I'm never saying never though, but for now at least my focus is more on the data than executables...

Quote:
Originally Posted by jbl007 View Post
Zoned uncompressed encrypted dms test image.
Excellent. Thank you very much.

I found a brainfart in my design by using this file. It was doing header parsing again and again on every try slowing it down. Re-arranged code a bit and got time from 2 seconds to 0.04 seconds on my machine...

If you have more of these encrypted DMS images, I'm interested. I have so few of them that it is no surprise there aren't more things like this.

Quote:
Originally Posted by malko View Post
please, yes You can add it in the archive.
Added both 32-bit and 64-bit executables to the zip file (on the Zone). Included the DMS fix there as well.

Quote:
Originally Posted by AMIGASYSTEM View Post
These files are all easily unpackable with a click with LZ
Indeed, they can be uncompressed using proper tools (on Amiga at least). However, the point of ancient is to be a portable decompressor (for data). As such, it will have some overlap with already existing tools.

What I hope ancient will replace are xdms/undms, the Dungeon Keeper utilities (for RNC), the defunct Unix port of XPK, and amigadepacker (as soon as I get MMCMP and StoneCracker).

Another thing I can only dream of at this point is that the different mod players, which all have bad implementations of xpk-sqsh, will get rid of those and use the ancient library, since that can handle the XPK format properly. (Think big)
Old 24 February 2019, 13:44   #44
jbl007
Registered User
 
Join Date: Mar 2013
Location: Leipzig/Germany
Posts: 466
Quote:
Originally Posted by temisu View Post
Re-arranged code a bit and got time from 2 seconds to 0.04 seconds on my machine...
It's lightning fast now.
I found another problem. First there's a typo in the identify usage string (indentify -> identify). Second it fails to identify PP20 files, always returns "Unknown or invalid compression format". Decrunching works fine.

Tiny base64 encoded example file:
UFAyMAkKCwtOrna2pnSmHqZ8AAAAAAkR

Should decrunch to ASCII text "runme.exe"
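For anyone who wants to look at the sample without ancient at hand, here is a minimal Python sketch that decodes the base64 string and reads the fields an identifier would check. It assumes the usual PowerPacker data layout ("PP20" magic, four efficiency bytes, and a trailer whose last four bytes hold a 24-bit unpacked length plus a skip-bit count). The function name `inspect_pp20` is made up for illustration and is not part of ancient.

```python
import base64

SAMPLE_B64 = "UFAyMAkKCwtOrna2pnSmHqZ8AAAAAAkR"  # from the post above


def inspect_pp20(blob: bytes) -> dict:
    """Read the PP20 magic, efficiency table and trailer (assumed layout)."""
    if len(blob) < 12 or blob[:4] != b"PP20":
        raise ValueError("not a PP20 data file")
    trailer = blob[-4:]
    return {
        "magic": blob[:4],
        "efficiency": list(blob[4:8]),                     # offset bit widths
        "unpacked_len": int.from_bytes(trailer[:3], "big"),  # 24-bit length
        "skip_bits": trailer[3],                           # unused bits in stream
    }


info = inspect_pp20(base64.b64decode(SAMPLE_B64))
print(info)
```

The unpacked length read from the trailer should match the length of "runme.exe".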
Old 24 February 2019, 14:01   #45
Galahad/FLT
Going nowhere
 
Join Date: Oct 2001
Location: United Kingdom
Age: 50
Posts: 8,986
Quote:
Originally Posted by temisu View Post
Actually, this is a rather good description. I'm seriously thinking of adding this line to the Readme. I'm never saying never though, but for now at least my focus is more on the data than executables...
Just to point out that some of these packed executables are nothing of the sort, and your code with a little modification can handle some of these packed executables easily.

For instance, take RNC Propack, whose crunched data files you already support: if you simply scan the packed executable generated by Propack, you will find the packed data file within the executable with the same header information (packed size/unpacked size), so you can easily have native support to depack those Propack executables.

Some of the others work on the same principle as well, simply putting an executable header on the packed data file.
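A rough Python sketch of that scanning idea, in case it helps. It assumes the commonly described 18-byte RNC header ("RNC", method byte 1 or 2, big-endian unpacked size, big-endian packed size, two 16-bit CRCs, leeway byte, chunk count); that layout and the name `find_rnc_streams` are assumptions for illustration, not taken from ancient.

```python
import struct

RNC_HEADER_SIZE = 18  # assumed: 4 ID + 4 unpacked + 4 packed + 2 + 2 + 1 + 1


def find_rnc_streams(blob: bytes):
    """Return (offset, method, unpacked_size, packed_size) for plausible hits."""
    hits = []
    pos = blob.find(b"RNC")
    while pos != -1:
        if pos + RNC_HEADER_SIZE <= len(blob) and blob[pos + 3] in (1, 2):
            unpacked, packed = struct.unpack_from(">II", blob, pos + 4)
            # Plausibility check: the packed data must fit inside the file.
            fits = pos + RNC_HEADER_SIZE + packed <= len(blob)
            if unpacked > 0 and packed > 0 and fits:
                hits.append((pos, blob[pos + 3], unpacked, packed))
        pos = blob.find(b"RNC", pos + 1)
    return hits


# Fake "executable": hunk magic, some filler, then an embedded RNC1 stream.
fake_exe = (bytes.fromhex("000003f3") + b"\0" * 20
            + b"RNC\x01" + struct.pack(">II", 100, 4) + b"\0" * 6 + b"abcd")
print(find_rnc_streams(fake_exe))
```

A real scanner would also verify the CRCs before trusting a hit, since the three letters "RNC" can appear by chance.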
Old 24 February 2019, 21:32   #46
temisu
Registered User
 
Join Date: Mar 2017
Location: Tallinn / Estonia
Posts: 74
Quote:
Originally Posted by jbl007 View Post
I found another problem. First there's a typo in the identify usage string (indentify -> identify).
Second it fails to identify PP20 files, always returns "Unknown or invalid compression format".
Both were really stupid things, thanks for reporting. Fixed.

Quote:
Originally Posted by Galahad/FLT View Post
Just to point out that some of these packed executables are nothing of the sort,
and your code with a little modification can handle some of these packed executables easily.

For instance, RNC Propack which you have support for crunched files,
if you simply scan the packed executable generated by Propack,
you will find the packed data file within the executable with the same header information,
packed size/unpacked size, so you can easily have native support to depack those Propack executables.

Some of the others work on the same principle as well, simply putting an executable header on the packed data file.
Yes, RNC ProPack can be handled, as well as a few others. There is already an option to scan files for embedded streams, so that is covered...

There are 2 problems with scanning for streams (in a more generic fashion):
  1. Some formats stream backwards and do not encode their size, so guessing what to decompress is hard(er)
  2. Some executable compressors remove the header used with data, so you can't identify them the same way

I have written code to parse Amiga hunk executables, but making that work with decompressors feels like a deep rabbit hole.
Old 15 September 2020, 22:05   #47
rygar
Registered User
 
Join Date: Nov 2007
Location: Poland
Posts: 1,298
temisu, where can I find binary files?
Old 16 September 2020, 09:24   #48
temisu
Registered User
 
Join Date: Mar 2017
Location: Tallinn / Estonia
Posts: 74
Quote:
Originally Posted by rygar View Post
temisu, where can I find binary files?
I uploaded them to the zone, some time ago, but they are already expired.

Also, the code has improved a lot since then. I need to make new builds...
Old 16 September 2020, 11:00   #49
chip
Registered User
 
Join Date: Oct 2012
Location: Italy
Age: 49
Posts: 2,942
For rygar

Old binaries are here

http://grandis.nu/eabsearch/search.p...xclude=&limit=

Old 16 September 2020, 18:50   #50
rygar
Registered User
 
Join Date: Nov 2007
Location: Poland
Posts: 1,298
Quote:
Originally Posted by chip
Thank you


Quote:
Originally Posted by temisu View Post
I uploaded them to the zone, some time ago, but they are already expired.

Also, the code has improved a lot since then. I need to make a new builds...

Thank you for the fast answer. I'll wait for a new version from you.
Old 18 November 2020, 22:26   #51
temisu
Registered User
 
Join Date: Mar 2017
Location: Tallinn / Estonia
Posts: 74
Well, this was not fast but there is a release now:

https://github.com/temisu/ancient/releases/tag/v1.0

I needed to resurrect an old Windows installation, since somehow my new laptop delivery did not happen after waiting for more than a month...

Currently I'm only providing a 64-bit executable. Sagamusix has been doing an awesome job of fuzzing the codebase and I have fixed the issues he found, but only on 64-bit builds. I still need to think about how to introduce the fixes in a 32-bit compatible way...
Old 18 November 2020, 22:40   #52
Radertified
Registered User
 
Join Date: Jan 2011
Location: -
Posts: 728
I never got around to building it myself but I've always been interested in trying Ancient so thanks for the binary!
Old 25 March 2022, 18:54   #53
rez
coder at Melon
 
Join Date: Nov 2021
Location: Paris / France
Age: 48
Posts: 3
Is it possible to get a binary working on MacOS? :'(
Old 27 March 2022, 23:20   #54
temisu
Registered User
 
Join Date: Mar 2017
Location: Tallinn / Estonia
Posts: 74
Quote:
Originally Posted by rez View Post
Is it possible to get a binary working on MacOS? :'(
I'll include that in the next release.
Old 08 June 2023, 15:22   #55
nocash
Registered User
 
Join Date: Feb 2016
Location: Homeless
Posts: 62
Hi temisu, many thanks for the Ancient source code and test files!!!

I've implemented most of those compression methods in 80x86 asm, and your stuff has been very helpful for the various XPK methods (and a dozen other methods, and some corner cases in methods that I had already implemented).

A few questions...

XPK-CBR0
Okay, that appears to be a very simple RLE compressor... but I am unable to implement it, for lack of a test file : /
The CBR0 file in your test folder is completely uncompressed (maybe the compressor is doing stupid things like trying to RLE-compress the "ll" in "Hello!", which would make the compression ratio worse than leaving the file uncompressed?).
Is there another test file somewhere, that is actually containing compressed data?
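To illustrate why a short string like "Hello!" may end up stored rather than packed, here is a small Python sketch using a PackBits-style RLE (control byte 00h..7Fh = copy n+1 literals, 81h..FFh = repeat the next byte 257-n times). This scheme is only a stand-in chosen for the example; nothing here is claimed to be the actual CBR0 format.

```python
def packbits_decode(data: bytes) -> bytes:
    """Decode a PackBits-style RLE stream (illustrative, not CBR0 itself)."""
    out = bytearray()
    i = 0
    while i < len(data):
        n = data[i]
        i += 1
        if n < 0x80:                       # literal run of n+1 bytes
            if i + n + 1 > len(data):
                raise ValueError("truncated literal run")
            out += data[i:i + n + 1]
            i += n + 1
        elif n > 0x80:                     # repeat next byte 257-n times
            if i >= len(data):
                raise ValueError("truncated repeat run")
            out += bytes([data[i]]) * (257 - n)
            i += 1
        # n == 0x80 is a no-op in PackBits
    return bytes(out)


# One literal run: 1 control byte + 6 data bytes = 7 bytes.
LITERAL_ONLY = bytes([5]) + b"Hello!"
# Naively RLE-packing the "ll": literal "He", run of 'l' x2, literal "o!".
NAIVE_RUNS = bytes([1]) + b"He" + bytes([0xFF]) + b"l" + bytes([1]) + b"o!"

for stream in (LITERAL_ONLY, NAIVE_RUNS):
    print(len(stream), packbits_decode(stream))
```

Both encodings are longer than the 6-byte input, and the one that packs the "ll" is the longest, so an XPK wrapper that falls back to storing the chunk unpacked would be the sensible outcome.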

XPK-CBR1
What is that??? The XPK webpage says "equals CBR0 byte for byte". But it doesn't tell what equals what.
If the library is equal: That would imply that the CBR1 library contained the exact same NAME, Date, Version strings as CBR0 (despite being said to be a newer version with a different name).
If the compressed data is equal: That could happen for several reasons (eg. if the file is uncompressed, or if it differs only on run lengths bigger than 80h bytes or the like).

XPK-LIN1,LIN2,LIN3,LIN4 - Lino Packer
Just curious: Where is that from, and who is Lino?

RNC1DecompressOld
Where did you find such files? And are you sure that they are real official RNC files?
Asking because your "RNC1DecompressOld" function does very much look like standard "Pack-Ice" decompression... I would guess that somebody had just changed the ID in the fileheader from "ICE!" to "RNC",01h.
Without having such files, I couldn't tell if the remaining file header was also edited.
For Pack-Ice specs, see here: https://eab.abime.net/showpost.php?p...09&postcount=7

End codes
There are about a dozen methods that seem to contain End codes, but it looks as if you have implemented them as "if <value=special> then Error" instead of "if <value=special> then DecompressionDone".
Is that... because you deeply dislike the concept of using End codes?
I mean, of course it is an error if the End code does occur unexpectedly in the middle of the bitstream, but it would be neat to add comments saying "unexpected end code" in such cases. And perhaps also implement an error check for missing end codes.
For example, the LZW formats (ZENO and BLZW) definitely contain standard LZW-style End codes. And most of the LZSS formats (LZW2, LZW3, LZW4, LZW4, SLZ3, LZBS) also seem to have end codes.
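What I have in mind could be sketched like this in Python. It is a toy token decoder, not any of the real formats: the END value, the token list and the names are all hypothetical, and only the three outcomes matter (clean end, unexpected end code, missing end code).

```python
END = 0x100  # hypothetical end-code value for this toy stream


class DecodeError(ValueError):
    pass


def decode(tokens, expected_len: int) -> bytes:
    """Copy literal tokens until END; treat a misplaced or absent END as an error."""
    out = bytearray()
    for tok in tokens:
        if tok == END:
            if len(out) != expected_len:
                raise DecodeError("unexpected end code")
            return bytes(out)              # DecompressionDone
        if len(out) >= expected_len:
            raise DecodeError("output overrun before end code")
        out.append(tok)
    raise DecodeError("missing end code")


print(decode([0x48, 0x69, END], 2))
```

The point is only that the end code is a normal exit when it lands exactly at the expected size, and a descriptive error otherwise.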

Last edited by nocash; 08 June 2023 at 23:14.
Old 08 June 2023, 22:14   #56
Photon
Moderator
 
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,602
"Ancient" in the original post vs. "what is this LZ??", funny read, thx for the thread gents.

Towards the goal of the project: Sadly, archivers (and many packers) used ancient algorithms from the 1950s and 1970s, and there has been little improvement to them since, let alone novel compression algorithms.

The best modern compression format you can think of uses a combination of these, plus the simpler RLE encoding and delta conversion. The best one on top of that still uses the same algorithms, but seeds assumptions and exhaustively runs all of them until the penalty from the assumptions yields the smallest file (on very fast modern computers; not suitable for the size of modern applications).

The good news is that this means that if not for the fact that choices affect the format, and compression allows you to invent a custom format much more than a novel algorithm, you could understand the re-used algorithms and decrunch them all with a generic enough choice- and format-detecting decompression source. The code might look like a spaghetti tree that describes the routes the algorithms took when passed around, but that is what it is.

The bad news though is that the formats are custom, and come in many different versions as well, sometimes compatible between versions.

The Amiga offered a great playground and a great need for custom formats as well as packing, both for games (separate loader) and packdisks (included in the exe). This is why the many formats, and there may be a few novel algorithms that are beyond the above "choice and custom", unlike before and since the Amiga. Mr. Spiv, Blueberry, and Bifat can give answers. I've made 5 so far, 2 simple quick ones, Nibbler truly novel and competitive, as are 2 more targeted for 1-4K executables. Well, come to think of it I also made 2 especially for graphics and sound.

There is absolute gold to mine, if there were again the same need for compressing a disk image on Amiga. The prime use of DMS was mostly to fit some disk images onto the smaller 720K PC disks and for the BBS era, which I hear others tell of. The Amiga utility was neat, and it had checksumming, even though compressed data breaks so easily that a successful decompression already acts as an implicit checksum.

For most early Amiga cruncher versions, decompression sources are usually included, especially for executable crunchers. Later, there was more of a focus on archivers and installers, which put the decompression in a separate binary not included in the compression size, once the format was deemed established. This means later versions of the same format might not include them, and that you must disassemble those binaries.

I hope this information inspires you to continue the project and that, by working on ancient crunchers, it is as if you are working on modern crunchers, mostly, for most of them.
Old 15 June 2023, 16:13   #57
nocash
Registered User
 
Join Date: Feb 2016
Location: Homeless
Posts: 62
Before I forget, some comments on the Ancient source code...

Pack.z
Code:
uint8_t levelCounts[24];
levelCounts[maxLevel-1]+=2;
That looks buggy. The +2 is there because the last entry can be in range 2..100h, which exceeds your uint8 array. Situations where 8bit overflows could occur:
1) 256 codes with 8bit codesize (ie. end code and 255 different bytes).
2) 1 code with 1bit codesize, plus 256 codes with 9bit codesize.
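A tiny Python illustration of the wraparound, assuming the stored byte for the last level can be as large as FEh before the +2 is applied (the helper name is made up, and the masking just imitates a fixed-width C counter):

```python
def last_level_count(stored: int, counter_bits: int) -> int:
    """Apply the +2 adjustment in a counter that is `counter_bits` wide."""
    return (stored + 2) & ((1 << counter_bits) - 1)


# Case 1 above: 256 codes on the last level, stored as FEh.
print(last_level_count(0xFE, 8))   # what a uint8_t counter ends up holding
print(last_level_count(0xFE, 16))  # what a wider counter would hold
```

With an 8-bit counter the value 100h wraps to 0, so a wider type for `levelCounts` (or at least for the last entry) would avoid the problem.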

TDCS and LZW5
Those are somewhat duplicated; they could just as well be merged into a single file. The only differences are:
1) adding +2 or +3 (in case 1 and 2)
2) end code is case=1/disp=0000h in LZW5, and case=3/disp=0000h in TDCS
3) whether or not using XOR for negating the disp values makes no difference at all

LOB
Code:
MSP (something lz)
There is a name for that something lz: It's ByteKiller. Although with several minor changes:
1) with leading 2 uncompressed bytes
2) with disp+1, instead of disp+0
3) with forwards streams, instead of backwards
4) with readbyte, instead of readbits(8)
5) with 8bit lzlen+4, instead of lzlen+1

Code:
MSS (lzss style packer)
That's almost same as TurboPacker (but with inverted flag bits for compressed/uncompressed bytes). Anyways, that might be coincidence since they are both standard LZSS variants.

ByteKiller
In case you want to also support that format, here's what I've found out about it.
Code:
ByteKiller is an old compression tool with Question-and-Answer based User
Interface (the ANC Cruncher variant is commandline based, and crashes when not
specifying src and dst filenames).
ByteKiller was made by Lord Blitter (various hacks were released by other
people, some of them based on other hacks from yet other people).
The exact same ByteKiller compressed bitstream format & checksum calculation is
used in various games and demos (often with customized headers/footers).

Standard ByteKiller Format
This format is used by the original ByteKiller tool (v1.2plus, v1.3, v2.0,
v2.03, v2.05, v3.0, v3.0win, v3.0b are all using the format; except, Pro 1.0 is
appending an extra ID word).
  000h 4    Compressed Size (Filesize-0Ch) (or Pro 1.0: Filesize-10h)    ;\
  004h 4    Uncompressed Size                                            ; Head
  008h 4    Compressed Checksum (all compressed 32bit words XORed)       ;/
  00Ch ..   Compressed Data (32bit words, processed backwards)           ;-Data
  ...  (-)  Nothing here  ;<-- normal versions with [000h]=Filesize-0Ch  ;\Foot
  ...  (4)  ID ("data")   ;<-- Pro 1.0 version with [000h]=Filesize-10h  ;/

Alternate ACE! Variant
  000h 4    ID ("ACE!")                                                  ;\
  004h 4    Compressed Size+4 (Filesize-0Ch)                             ; Head
  008h 4    Uncompressed Size                                            ;
  00Ch 4    Compressed Checksum (all compressed 32bit words XORed)       ;/
  010h ..   Compressed Data (32bit words, processed backwards)           ;-Data

Alternate FVL0/MD10/MD11/PVC! Variants
  000h 4    ID ("FVL0", "MD10", "MD11", "PVC!")                          ;\
  004h (4)  Compressed Size+8 (Filesize-8)  ;-ID="FVL0" (ANC Cruncher)   ;
  004h (-)  Nothing here                    ;-ID="MD10" (McDisk mag 1-2) ; Head
  004h (2)  Uncompressed Size XOR "MD"      ;\ID="MD11" (McDisk mag 3)   ;
  006h (2)  Uncompressed Size XOR "11"      ;/                           ;
  004h (4)  Uncompressed Size, as in footer ;\ID="PVC!" (VideoTracker)   ;
  008h (4)  Compressed Size+14h (Fsize-0..2);/                           ;/
  ...  ..   Compressed Data (32bit words, processed backwards)           ;-Data
  ...  4    Compressed Checksum (all compressed 32bit words XORed)       ;\
  ...  4    Uncompressed Size                                            ; Foot
  ...  (-)  Nothing here                    ;-normal case                ;
  ...  (..) Appended stuff (see below)      ;-ID="PVC!" (VideoTracker)   ;/
VideoTracker "PVC!" files may have following data appended:
  Player\*      --> appends 1-byte
  Routine\*.rot --> appends 2-bytes, plus another PVC! file, also with 2-bytes
  Video\*       --> appends nothing

DecompressByteKiller(src,dst,src_len,dst_len,chksum):
 ;This is for the plain Data part (after stripping the headers/footers)
  src=src_start+src_len, dst=dst_start+dst_len
  src=src-4, collected=BigEndian32bit[src], chksum=chksum XOR collected
  if collected=0 then error  ;first word contains 0..31 bits, plus end flag
 @@decompress_lop:
  if dst<=dst_start then goto @@decompress_done
  if GetBits(1)=0 then
    if GetBits(1)=0 then rawlen=GetBits(3)+1, goto @@copy_raw       ;00
    else                 lzlen=2, disp=GetBits(8)                   ;01
  else
    if GetBits(1)=0 then
      if GetBits(1)=0 then lzlen=3, disp=GetBits(9)                 ;100
      else                 lzlen=4, disp=GetBits(10)                ;101
    else
      if GetBits(1)=0 then lzlen=GetBits(8)+1, disp=GetBits(12)     ;110
      else                 rawlen=GetBits(8)+9, goto @@copy_raw     ;111
  if disp=0 then error
  for i=1 to lzlen, dst=dst-1, [dst]=[dst+disp]                     ;copy lz
  goto @@decompress_lop
 ;---
 @@copy_raw:
  for i=1 to rawlen, dst=dst-1, [dst]=GetBits(8)                    ;copy raw
  goto @@decompress_lop
 ;---
 @@decompress_done:
  if dst<>dst_start or src<>src_start or collected<>00000001h then error
  if chksum<>0 then error
  ret
 ;---
 GetBits(n):
  val=0, for i=1 to n, val=val*2+GetBit      ;<-- LEFT shift (!)
  return val
 ;---
 GetBit:
  collected=collected SHR 1                  ;<-- RIGHT shift (!)
  if collected=00000000h then
    src=src-4, x=BigEndian32bit[src], chksum=chksum XOR x
    collected=(x+100000000h) SHR 1
  return carry                  ;carry-out from above SHR shift's
Decompression supports 12bit disp aka 1000h byte Dictionary size (although the
compressors may use less for faster compression; ByteKiller v1.3, v2.0 and Pro
v1.0 are only allowing to use max 800h bytes).

IDs for ByteKiller based formats
  ID            Offset Used in
  Size-0Ch      0      Packer: ByteKiller 1.2+, 1.3, 2.0, 2.0x, 3.0
  "ACE!"        0      Mag: Resident #1
  "ARP3"        0      Tool: Action Replay III freeze file, Game demo: RedZone
  "ARPF"        0      Tool: Action Replay II freeze file, Game demo: WarZone
  "CRND"        0      X-Out, Killing Game Show
  "CRUN"        0      Dominium
  "DAVE"        0      Myth - History in the Making
  "FVL0"        0      Packer: ANC Cruncher (reportedly=FLV0, actually=FVL0)
  "FUCK";"MARC" 0;EOF  Sensible Soccer V1.1, Alien World, Yogi's Big Clean Up
  "GR20"        0      Games created with GRAC V2.00
  "MD10"        0      Mag: McDisk #1, McDisk #2
  "MD11"        0      Mag: McDisk #3
  "PVC!"        0      Tool: VideoTracker
  "xVdg"        0      AMOS: Used in compiled AMOS programs
  "xVgd"        0      AMOS: AMOS Updater V1.36, EasyAMOS installation disks
  "FLA0"        EOF    X-it
  "JEK!"        EOF    AtariST: Jek Packer V1.2(d) - V1.3, JamPacker V1
  "JEK!"        EOF    The seven gates of Jambala (other hdr than Jek Packer)
  "MARC"        EOF    Mercenary III, Flimbo's Quest
  "OND<"        EOF    Shadow of the Beast ">DATA COMPACTION (C)1989 A.R.BOND<"
  "TXIC"        EOF    Chariots Of Wrath, Striker Number 9, Fighter Command
  "data"        EOF    Packer: ByteKiller Pro 1.0
Wishlist: I need more test files with hacked IDs.
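For those who prefer something runnable, here is a Python transcription of the DecompressByteKiller pseudocode above. It is a sketch that follows the pseudocode step by step (backwards stream, marker bit in the first word, XOR checksum) and adds bounds checks of my own; the function and exception names are made up, and the three tiny streams at the end were hand-assembled for this example rather than produced by a real ByteKiller.

```python
import struct


class ByteKillerError(ValueError):
    pass


def decompress_bytekiller(data: bytes, dst_len: int, chksum: int) -> bytes:
    """Decode the plain data part (headers/footers already stripped)."""
    if len(data) < 4 or len(data) % 4:
        raise ByteKillerError("data must be a non-empty multiple of 4 bytes")
    out = bytearray(dst_len)
    src = len(data) - 4
    collected = struct.unpack_from(">I", data, src)[0]
    chksum ^= collected
    if collected == 0:
        raise ByteKillerError("first word lacks the end-flag bit")

    def get_bit() -> int:
        nonlocal src, collected, chksum
        carry = collected & 1              # bit shifted out by SHR 1
        collected >>= 1
        if collected == 0:                 # marker consumed: fetch next word
            if src < 4:
                raise ByteKillerError("ran out of compressed data")
            src -= 4
            x = struct.unpack_from(">I", data, src)[0]
            chksum ^= x
            carry = x & 1
            collected = (x | 0x1_0000_0000) >> 1
        return carry

    def get_bits(n: int) -> int:
        val = 0
        for _ in range(n):
            val = (val << 1) | get_bit()   # LEFT shift, as in the pseudocode
        return val

    dst = dst_len
    while dst > 0:
        raw_len = lz_len = disp = 0
        if get_bit() == 0:
            if get_bit() == 0:
                raw_len = get_bits(3) + 1                   # 00
            else:
                lz_len, disp = 2, get_bits(8)               # 01
        elif get_bit() == 0:
            if get_bit() == 0:
                lz_len, disp = 3, get_bits(9)               # 100
            else:
                lz_len, disp = 4, get_bits(10)              # 101
        elif get_bit() == 0:
            lz_len = get_bits(8) + 1                        # 110
            disp = get_bits(12)
        else:
            raw_len = get_bits(8) + 9                       # 111
        if raw_len:
            if raw_len > dst:
                raise ByteKillerError("raw run overshoots output start")
            for _ in range(raw_len):
                dst -= 1
                out[dst] = get_bits(8)
        else:
            if disp == 0 or lz_len > dst or dst - 1 + disp >= dst_len:
                raise ByteKillerError("bad LZ length or displacement")
            for _ in range(lz_len):
                dst -= 1
                out[dst] = out[dst + disp]
    if src != 0 or collected != 1:
        raise ByteKillerError("stream not fully consumed")
    if chksum != 0:
        raise ByteKillerError("checksum mismatch")
    return bytes(out)


# Hand-assembled streams: (data, unpacked size, checksum).
print(decompress_bytekiller(bytes.fromhex("00003040"), 1, 0x00003040))
print(decompress_bytekiller(bytes.fromhex("03003040"), 4, 0x03003040))
print(decompress_bytekiller(bytes.fromhex("8242C22200000038"), 4, 0x8242C21A))
```

The first stream is a single raw byte, the second a raw byte followed by a 3-byte LZ copy with disp=1, and the third a 4-byte raw run that crosses a word boundary and so exercises the reload in GetBit.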

LZX
Code:
There is no good spec on the LZX header content -> lots of unknowns here
The format is more or less well documented (at the end of the "unlzx.c" source code).
Here is what I've extracted from it:
Code:
LZX Archive Format (1995-1997)
LZX archives can contain single files (separately compressed) and merged files
(compressed together). For example,
  Archive Header
  File List entry for 1st file, with Compressed Size=0  ;\
  File List entry for 2nd file, with Compressed Size=0  ; Merged (Flags=1)
  File List entry for 3rd file, with Compressed Size<>0 ;
  Compressed data for 1st-3rd file                      ;/
  File List entry for 4th file, with Compressed Size=0  ;\
  File List entry for 5th file, with Compressed Size<>0 ; Merged (Flags=1)
  Compressed data for 4th-5th file                      ;/
  File List entry for 6th file, with Compressed Size<>0 ;\Single (Flags=0)
  Compressed data for 6th file                          ;/
  File List entry for 7th file, with Compressed Size<>0 ;\Single (Flags=0)
  Compressed data for 7th file                          ;/
Archive Header:
  000h 3    ID      ("LZX")
  003h 1    Flags   (00h) (bit0=DamageProtect, bit1=Locked)
  004h 1    Unknown (00h) (or 0Ch)
  005h 1    Unknown (00h)
  006h 1    Unknown (0Ah)           ;maybe version? or hdrsize? or Amiga?
  007h 1    Unknown (00h) (or 04h)
  008h 1    Unknown (00h)
  009h 1    Unknown (00h)
File List entries:
  000h 1    Attr (0Fh) (bit0-7=Read,Write,Delete,Exec,Archive,Hold,Script,Pure)
  001h 1    Unused (00h)
  002h 4    Uncompressed Size                 ;LittleEndian...
  006h 4    Compressed Size (or 0 for non-last merged files)
  00Ah 1    Machine Type (0=MSDOS, 1=Windows, 2=OS2, 0Ah=Amiga, 14h=Unix)
  00Bh 1    Method (02h) (or 00h) (0=Stored, 2=LzxCompressed, 20h=EOF=???)
  00Ch 1    Flags  (00h) (or 01h) (0=Single, 1=Merged)
  00Dh 1    Unused (00h)
  00Eh 1    Comment Length (00h) (for Amiga: 0..79)
  00Fh 1    Version needed to extract (0Ah)
  010h 1    Unused (00h)
  011h 1    Unused (00h)
  012h 4    Timestamp (similar to MSDOS format, but 6bit year and 6bit second)
  016h 4    CRC32 on Uncompressed Data
  01Ah 4    CRC32 on File List entry & name/comment (with this field zeroed)
  01Eh 1    Filename Length
  01Fh ..   Filename ("path/filename")
  ...  ..   Comment  (if any)
  ...  ..   Compressed Data (if any)
 Unknown if there are any fields for UID/GID.
 Unknown what Method=20h=EOF means (it's never used, neither 20h nor N+20h).

LZX Bug
The 68020 and 68040 versions are occasionally creating corrupted LZX archives.
As workaround, always use the 68000 version for compression. In particular, LZX
seems to store garbage in the last 2-3 bytes of some files:
http://telparia.com/fileFormatSamples/archive/lzx/CXHandler3.8.LZX - corrupt
http://aminet.net/package/util/cdity/CXHandlerV38 - intact copy of above files
Apart from the Unknown/Unused fields, the biggest unknown is that Method=20h=EOF thing.
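The File List entry table above maps quite directly onto a fixed 31-byte record, so here is a Python sketch of reading one entry. It assumes all multi-byte fields are little-endian (the table only states this for the size fields), treats the "Unused" bytes as padding, and uses names of my own choosing.

```python
import struct

# 000h attr, 001h pad, 002h unpacked, 006h packed, 00Ah machine, 00Bh method,
# 00Ch flags, 00Dh pad, 00Eh comment len, 00Fh version, 010h-011h pad,
# 012h timestamp, 016h data CRC, 01Ah header CRC, 01Eh filename len
ENTRY = struct.Struct("<BxIIBBBxBBxxIIIB")


def parse_lzx_entry(buf: bytes, offset: int = 0) -> dict:
    """Parse one File List entry; returns fields plus where the data starts."""
    (attr, unpacked, packed, machine, method, flags, comment_len,
     version, timestamp, data_crc, header_crc, name_len) = ENTRY.unpack_from(buf, offset)
    name_start = offset + ENTRY.size
    comment_start = name_start + name_len
    data_start = comment_start + comment_len
    if data_start > len(buf):
        raise ValueError("truncated entry")
    return {
        "attr": attr, "unpacked": unpacked, "packed": packed,
        "machine": machine, "method": method, "merged": bool(flags & 1),
        "version": version, "timestamp": timestamp,
        "data_crc": data_crc, "header_crc": header_crc,
        "name": buf[name_start:comment_start].decode("latin-1"),
        "comment": buf[comment_start:data_start].decode("latin-1"),
        "data_offset": data_start,         # compressed data, if packed != 0
    }


# Synthetic entry built only for this example (CRCs and timestamp left at 0).
sample = ENTRY.pack(0x0F, 9, 5, 0x0A, 2, 0, 0, 0x0A, 0, 0, 0, 5) + b"a/b.c"
print(parse_lzx_entry(sample))
```

For merged groups, a reader would keep collecting entries with packed size 0 until it hits the one with a non-zero size, as in the layout example above.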

LHLB (lh.library)
Code:
different distance/count logic
Yeah, but not so different.
The length/count logic is exactly same. Except that the datalen tree contains two extra length codes with len=1 and len=2 (these are probably unused dummy codes, for avoiding to need to add +2 to the length values) (or maybe len=2 is actually used, but len=1 is certainly useless).
The different distance logic is just giving the exact same results as in LZHUF, you could as well use the normal LZHUF logic, instead of the huge distanceHighBits[256] table.
Old 15 June 2023, 23:25   #58
temisu
Registered User
 
Join Date: Mar 2017
Location: Tallinn / Estonia
Posts: 74
Quote:
Originally Posted by nocash View Post

A few questions...

XPK-CBR0
The CBR0 file in your test folder is completely uncompressed (maybe the compressor is doing stupid things like trying to RLE-compress the "ll" in "Hello!", which would make the compression ratio worse than leaving the file uncompressed?).
Is there another test file somewhere, that is actually containing compressed data?
Yup, that is a bad test file. It is still correct, as in correctly formatted, but does not compress at all. I'll create another one.

Quote:
Originally Posted by nocash View Post
XPK-CBR1
What is that??? The XPK webpage says "equals CBR0 byte for byte". But it doesn't tell what equals what.
I'm assuming that it is the same as CBR0. But frankly I don't know for sure and won't be able to confirm unless I get the compressor...

I have been considering deleting this codepath completely.

Quote:
Originally Posted by nocash View Post
XPK-LIN1,LIN2,LIN3,LIN4 - Lino Packer
Just curious: Where is that from, and who is Lino?
The compression library for LIN1 says
Author: Gerhard Tuenkler
Long Name: Lino1 V1.1
The other versions are similar except for the numbers. I do not know where the name comes from. These libraries are part of Cholok's fixed XPK collection of libraries.

Quote:
Originally Posted by nocash View Post
RNC1DecompressOld
Where did you find such files? And are you sure that they are real official RNC files?
Asking because your "RNC1DecompressOld" function does very much look like standard "Pack-Ice" decompression... I would guess that somebody had just changed the ID in the fileheader from "ICE!" to "RNC",01h.
Without having such files, I couldn't tell if the remaining file header was also edited.
For Pack-Ice specs, see here: https://eab.abime.net/showpost.php?p...09&postcount=7
Yes, that is a real thing. Since I have not yet implemented ICE, I don't know if they are exactly the same and who copied whom. The file I committed to the test files was created by me, since the only other source for these files (to my knowledge) is game files.

Quote:
Originally Posted by nocash View Post
End codes
There are about a dozen methods that seem to contain End codes, but it looks as if you have implemented them as "if <value=special> then Error" instead of "if <value=special> then DecompressionDone".
Is that... because you deeply dislike the concept of using End codes?
I mean, of course it is an error if the End code does occur unexpectedly in the middle of the bitstream, but it would be neat to add comments saying "unexpected end code" in such cases. And perhaps also implement an error check for missing end codes.
For example, the LZW formats (ZENO and BLZW) definitely contain standard LZW-style End codes. And most of the LZSS formats (LZW2, LZW3, LZW4, LZW4, SLZ3, LZBS) also seem to have end codes.
I have no opinion about end codes one way or another. Many compressors define an end code but do not use it. In those cases I have just treated it as an error condition.

Quote:
Originally Posted by nocash View Post
Pack.z
Code:
uint8_t levelCounts[24];
levelCounts[maxLevel-1]+=2;
That looks buggy. The +2 is there because the last entry can be in range 2..100h, which exceeds your uint8 array.
That is indeed a bug. Will fix.

Quote:
Originally Posted by nocash View Post
TDCS and LZW5
Those are somewhat duplicated, they could be as well merged into a single file. The only differences are:
1) adding +2 or +3 (in case 1 and 2)
2) end code is case=1/disp=0000h in LZW5, and case=3/disp=0000h in TDCS
3) whether or not using XOR for negating the disp values makes no difference at all
There is indeed a bit of duplication. I prefer duplication over creating lots of ifs in the code. I have combined some compressors that are 100% the same. It is not an exact science.

Quote:
Originally Posted by nocash View Post
LOB
Code:
MSP (something lz)
There is a name for that something lz: It's ByteKiller. Although with several minor changes:
1) with leading 2 uncompressed bytes
2) with disp+1, instead of disp+0
3) with forwards streams, instead of backwards
4) with readbyte, instead of readbits(8)
5) with 8bit lzlen+4, instead of lzlen+1
I haven't done ByteKiller, so I do not (yet) know how similar it is. Everyone copied everybody else, it seems.

Quote:
Originally Posted by nocash View Post
Code:
MSS (lzss style packer)
That's almost same as TurboPacker (but with inverted flag bits for compressed/uncompressed bytes). Anyways, that might be coincidence since they are both standard LZSS variants.
Yup.

Quote:
Originally Posted by nocash View Post
ByteKiller
In case you want to also support that format, here's what I've found out about it.
Identifying the ByteKiller bitstream is what is holding me back here. Those that have IDs are actually trivial, but the original format is not.

Quote:
Originally Posted by nocash View Post
The format is more or less well documented (at the end of the "unlzx.c" source code).
There are still some open questions about flags, methods etc. Also, if we consider non-Amiga LZX, this gets weird very quickly.

Quote:
Originally Posted by nocash View Post
LHLB (lh.library)
Code:
different distance/count logic
Yeah, but not so different.
The length/count logic is exactly same. Except that the datalen tree contains two extra length codes with len=1 and len=2 (these are probably unused dummy codes, for avoiding to need to add +2 to the length values) (or maybe len=2 is actually used, but len=1 is certainly useless).
The different distance logic is just giving the exact same results as in LZHUF, you could as well use the normal LZHUF logic, instead of the huge distanceHighBits[256] table.
Again, it is not an exact science how it works. I'm thinking maybe I should have a generic LZ decompressor and some way to feed params to it. Also, in many cases there are tables where simpler code could be used. Some improvements could be made, but I'm not a fan of refactoring working code.
Old 16 June 2023, 03:31   #59
nocash
Registered User
 
Join Date: Feb 2016
Location: Homeless
Posts: 62
Quote:
Originally Posted by temisu View Post
Yup, that is a bad test file. It is still correct, as in correctly formatted, but does not compress at all. I'll create another one.
Second test file would be cool. The current test file is also interesting because it's apparently worse than the other RLE compressors (RLEN and FRLE).

Quote:
Originally Posted by temisu View Post
The compression library for LIN1 says
Author: Gerhard Tuenkler
Long Name: Lino1 V1.1
The other versions are similar except for the numbers. I do not know where the name comes from. These libraries are part of Cholok's fixed XPK collection of libraries.
Thanks, I wasn't aware of that Cholok package, but I've now found it here http://wt.exotica.org.uk/test.html and it even includes source code (or reworked disassemblies), which is nice to have.

Quote:
Originally Posted by temisu View Post
Yes, that is a real thing... The file I committed into test files is created by me
No, not the "test_C1.rnc1" and "test_C1.rnc2" files, those are the two standard RNC1 and RNC2 methods.
I meant the "third" method called "RNC1Old" in your source code. That looks a lot as if somebody thought that it would be funny to change the Pack-Ice ID into a RNC1 ID to confuse people. Like all those files listed as "Original ID: <xxxx>" on this webpage: http://www.amiga-stuff.com/crunchers-id.html
Or do you know some official-looking Rob Northen software that is actually using that "RNC1Old" format?

Quote:
Originally Posted by temisu View Post
Many compressors define end code but do not use it. Then I have just added it as an error condition.
Ah, okay. Ah, but wait, now that you have mentioned the Choloks package... peeking at the LZW2 code:
Code:
beq.s   .end            ;tagend
That does very much look as if the decompressor did use the end code : )

Quote:
Originally Posted by temisu View Post
Identifying bytekiller bitstream is something that is holding me back here.
Yup, detecting formats without an ID isn't so nice, but don't underestimate the first 4 bytes being equal to filesize-0Ch. That's quite a "strong" indicator (at least stronger than things like gzip or zlib headers). Of course it only works if the filesize isn't padded to sectorsize or the like.
Speaking of sectorsizes, I am always loading the first 200h bytes for detection (I guess that won't be much slower than loading only the first 4 bytes on many operating systems).
And, for compressed files without an ID, it can be useful to "detect if the file looks compressed" (checking if the first 200h bytes contain a lot of different non-continuous bytes in range 00h..FFh).
If you are 99% sure that it could be bytekiller, then it might be worth risking a mis-detection, or loading the whole file and verifying the checksum (the file doesn't need to be decompressed for that).
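The two checks above could be sketched roughly like this in Python. This is only an illustration, not code from ancient: the function names and the thresholds (min_distinct, max_repeat_ratio) are made up, the size field is assumed to be a big-endian longword, and "non-continuous" is read here as "few identical neighbouring bytes". The checksum verification is left out.

```python
import struct

HEADER_SIZE = 0x0C   # from the post: first longword == filesize - 0Ch
PROBE_SIZE = 0x200   # from the post: look at the first 200h bytes


def size_field_matches(data: bytes) -> bool:
    """True if the first longword (assumed big-endian) equals len(data) - 0Ch."""
    if len(data) < HEADER_SIZE:
        return False
    (packed_size,) = struct.unpack_from(">I", data, 0)
    return packed_size == len(data) - HEADER_SIZE


def looks_compressed(data: bytes,
                     min_distinct: int = 128,
                     max_repeat_ratio: float = 0.1) -> bool:
    """Crude 'looks compressed' test on the first 200h bytes.

    Both thresholds are hypothetical and would need tuning on real files.
    """
    probe = data[:PROBE_SIZE]
    if len(probe) < 2:
        return False
    distinct = len(set(probe))
    # Count neighbouring bytes that are identical (runs suggest raw data).
    repeats = sum(1 for a, b in zip(probe, probe[1:]) if a == b)
    return (distinct >= min_distinct
            and repeats / (len(probe) - 1) <= max_repeat_ratio)


def maybe_bytekiller(data: bytes) -> bool:
    """Combine both indicators; a hit is still only a guess."""
    return size_field_matches(data) and looks_compressed(data)
```

A file that is padded to sector size fails size_field_matches, as noted above, and very short files never pass the min_distinct threshold.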

Quote:
Originally Posted by temisu View Post
Those that have IDs are actually trivial, but the original format is not.
Yeah, except that they do all use different custom IDs and custom headers. I've looked into some of those used in magazines and demos, but I didn't check those used in games.
I guess it would be immensely difficult to find somebody who could share test files for all those bytekiller variants. Somebody, prove me wrong!

Quote:
Originally Posted by temisu View Post
I'm thinking maybe I should have a generic LZ-decompressor and some way to feed params to it.
That might make it more difficult to understand the source code. I think altogether it's fine as is, although there are some things that could be simplified a bit.
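For illustration, "feeding params" to a generic LZ core could look something like the Python sketch below. Everything in it is hypothetical (LZParams, lz_decode and the pre-parsed token list are invented for the example, and the bitstream/Huffman parsing that differs per format is not shown); it only demonstrates how a per-format length and distance bias, such as the +2 on lengths mentioned earlier, could become a parameter instead of separate code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LZParams:
    min_length: int     # added to every decoded length code (e.g. 2)
    min_distance: int   # added to every decoded distance code (e.g. 1)


def lz_decode(tokens, params: LZParams) -> bytes:
    """Expand already-parsed tokens.

    A token is either an int (one literal byte) or a
    (distance_code, length_code) pair describing a back-reference.
    """
    out = bytearray()
    for token in tokens:
        if isinstance(token, int):
            out.append(token)
            continue
        distance_code, length_code = token
        distance = distance_code + params.min_distance
        length = length_code + params.min_length
        if distance < 1 or distance > len(out):
            raise ValueError("match points outside the data decoded so far")
        # Copy byte by byte so overlapping matches (distance < length) work.
        for _ in range(length):
            out.append(out[-distance])
    return bytes(out)
```

With LZParams(min_length=2, min_distance=1), the tokens [0x61, 0x62, (1, 2)] expand to b"ababab". The downside mentioned above still applies: each format then reads as a parameter set plus a parser rather than as one self-contained routine.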

Quote:
Originally Posted by temisu View Post
Some improvements could be made, but I'm not a fan of refactoring working code.
I thought so. But maybe add a comment that the LHLB table is a bit overcomplicated, and it could be easier (or smaller) to use the normal LZHUF table. That's tested and it's working fine for me.

Quote:
Originally Posted by temisu View Post
There is indeed a bit of duplication. I prefer duplication instead of creating lots of ifs in the code. I have combined some compressors that are 100% the same. It is not an exact science.
Same there, maybe add a comment in LZW5 and TDCS, mentioning that they are nearly identical... that might help when navigating through the different methods... anyways, that's not really important.

DHUF
I've found the rare DHUF library here https://archive.org/download/Commodo...9%28XPK%29.zip
and here https://aminet.net/package/misc/os/aros-orca-m68k (huge download, 282Mbyte).
Might be interesting to see how it differs from HUFF and HFMN. I can help with disassembling code in the library.
But I have no idea how to set up a working boot disc with all required XPK libraries for creating a DHUF test file : /

Last edited by nocash; 16 June 2023 at 03:53.
nocash is offline  
Old 16 June 2023, 09:41   #60
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,960
Quote:
No, not the "test_C1.rnc1" and "test_C1.rnc2" files, those are the two standard RNC1 and RNC2 methods.
I meant the "third" method called "RNC1Old" in your source code. That looks a lot as if somebody thought that it would be funny to change the Pack-Ice ID into a RNC1 ID to confuse people. Like all those files listed as "Original ID: <xxxx>" on this webpage: http://www.amiga-stuff.com/crunchers-id.html
Or do you know some official-looking Rob Northen software that is actually using that "RNC1Old" format?
RNC1Old is the official packer used in an old/budget (?) version of the Rob Northen package. It seems that Rob Northen was using one of the Ice Packer methods.
If I remember right, all old Amiga games from Mutation Software use this packer. For the newest ones, I don't know. I don't see packed data in the new games from Adrian Cummings. You can ask him about the name and version of this Rob Northen package.
Heimdall uses RNC1Old too.
Don_Adan is offline  
 

