English Amiga Board


Old 17 July 2020, 16:28   #1
kipper2k
Registered User
 
Join Date: Sep 2006
Location: Thunder Bay, Canada
Posts: 4,323
Loading WHDLoad games by decompressing on the fly

Hi all, not sure if this is doable. I am thinking about some of the games in the WHDLoad games library that can have hundreds of small files. The obvious annoyance is the time it takes to copy these files onto a CF card, HDD, etc., even in WinUAE.

Question is, could WHDLoad have a user option to run a game from a compressed file, decompressing on the fly? I understand that this would probably need an expanded/upgraded Amiga, but the benefits would include reduced HDD space and file copying time.
Old 17 July 2020, 16:50   #2
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
Quote:
Originally Posted by kipper2k View Post
Hi all, not sure if this is doable. I am thinking about some of the games in the WHDLoad games library that can have hundreds of small files. The obvious annoyance is the time it takes to copy these files onto a CF card, HDD, etc., even in WinUAE.

Question is, could WHDLoad have a user option to run a game from a compressed file, decompressing on the fly? I understand that this would probably need an expanded/upgraded Amiga, but the benefits would include reduced HDD space and file copying time.
Well, it's not the same thing, but you can unpack the archive to RAM: and launch the game from there.
Old 17 July 2020, 17:18   #3
Wepl
Moderator
 
Join Date: Nov 2001
Location: Germany
Posts: 866
Supporting some kind of archive directly is on my ToDo list.
But it probably won't happen in the near future.
Old 18 July 2020, 00:16   #4
coldacid
WinUAE 4000/40, V4SA
 
Join Date: Apr 2020
Location: East of Oshawa
Posts: 538
Maybe in version 19?
Old 20 July 2020, 13:27   #5
Wepl
Moderator
 
Join Date: Nov 2001
Location: Germany
Posts: 866
Maybe someone wants to contribute?
I need an archive format which:
- preserves all Amiga filesystem metadata (protection bits, date, file comment)
- stores directories and normal files
- can store files either uncompressed or compressed
- allows random access within the stored files, so compressed data should be split into chunks (like in XPK)
- I think a stream format like lha is best suited
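
To make the first requirement concrete, here is a minimal sketch of how the AmigaDOS protection bits could be rendered once an archive has preserved them. The function name `protectionString` is hypothetical. The bit layout assumed here is the usual AmigaDOS one: bits 0 to 3 (delete, execute, write, read) are active-low, bits 4 to 7 (archive, pure, script, hold) are active-high.

```cpp
#include <cstdint>
#include <string>

// Render AmigaDOS protection bits in the familiar "hsparwed" form.
// Hypothetical helper for illustration; not part of WHDLoad.
std::string protectionString(uint32_t protection) {
    static const char letters[] = "hsparwed";
    std::string out(8, '-');
    for (int i = 0; i < 8; ++i) {
        int bit = 7 - i;                          // 'h' is bit 7, 'd' is bit 0
        bool set = ((protection >> bit) & 1U) != 0;
        // The low four bits are inverted: a set bit means "denied".
        bool shown = (bit < 4) ? !set : set;
        if (shown) out[i] = letters[i];
    }
    return out;
}
```

A freshly created file with protection value 0 reads as `----rwed`, which is why an archiver that silently drops the field tends to go unnoticed until a game depends on a non-default bit.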
Old 20 July 2020, 14:27   #6
daxb
Registered User
 
Join Date: Oct 2009
Location: Germany
Posts: 3,303
Doesn't LHA or LZX work? I ask because emulators have LHA support but I don't know how they handle that. I guess the problem is write access for changing files (highscore, savestate, icons, ...)? If the user wants to change tooltypes, how would that work?
Old 20 July 2020, 14:58   #7
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,502
lha probably is the best option. Supports Amiga flags, comments. Open source (Original unix lha).

lzx also works but there is only reverse-engineered source available and lzx (if I remember correctly) can't be seeked freely, at least in some compression modes.
Old 20 July 2020, 15:08   #8
Wepl
Moderator
 
Join Date: Nov 2001
Location: Germany
Posts: 866
Existing archivers like lha/lzx don't allow random access and are also too slow.
The archives will be read-only for WHDLoad. An archive will act like an additional read-only data directory. Writing will only go to normal files, and only if there is also a real data directory (e.g. SavePath).
Old 20 July 2020, 15:26   #9
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,502
Store without compression or use a simpler compression method? That is, if the main point was to reduce the number of tiny files.

I thought you meant seeking to any file, which is possible with lha (but not necessarily with lzx or any other archiver using a "solid" method, where all previous files in the archive must be decompressed first). AFAIK no normal archiver supports random access to an arbitrary position inside a file without (at least partially) decompressing the file first.

In my opinion some kind of caching (decompress a file on the fly when needed and then keep the decompressed data in memory) would fix the slowdown problem. Files are generally tiny (unless the slave uses a disk image), so decompression is always fast.
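
The caching idea can be sketched roughly as follows. This is only an illustration under assumptions: the class name `FileCache` and the `Loader` callback (standing in for whatever actually decompresses a file from the archive) are hypothetical, and nothing here reflects how WinUAE or WHDLoad implement it.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Decompress a file on first access and keep the result in memory.
class FileCache {
public:
    // The loader is assumed to return the fully decompressed file.
    using Loader = std::function<std::vector<uint8_t>(const std::string&)>;

    explicit FileCache(Loader loader) : loader_(std::move(loader)) {}

    // Returns the decompressed contents, invoking the loader only on a miss.
    const std::vector<uint8_t>& get(const std::string& name) {
        auto it = cache_.find(name);
        if (it == cache_.end()) {
            ++misses_;
            it = cache_.emplace(name, loader_(name)).first;
        }
        return it->second;
    }

    std::size_t misses() const { return misses_; }

private:
    Loader loader_;
    std::unordered_map<std::string, std::vector<uint8_t>> cache_;
    std::size_t misses_ = 0;
};
```

The trade-off is memory: every file touched stays resident, which suits tiny files but not disk images on a low-memory machine.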
Old 20 July 2020, 16:37   #10
Wepl
Moderator
 
Join Date: Nov 2001
Location: Germany
Posts: 866
Starting without compression is probably the best. But without compression it will be a step back for people who currently use XPK.

Seeking inside files is probably not something an archiver is designed for. But it should also work without preloaded files in low-memory configs. If the files are compressed in chunks that are not too large, it should be possible. A caching buffer of one chunk should be sufficient to avoid a slowdown when many small IOs are performed.
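
A rough sketch of the one-chunk buffer described above, under assumptions: `ChunkedReader` and its `Decoder` callback are hypothetical names, the file is assumed to be split into fixed-size chunks that can each be decompressed independently, and `chunkSize` must be non-zero.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Serves byte ranges from a chunked file while holding only one
// decompressed chunk in memory at a time.
class ChunkedReader {
public:
    // decodeChunk(i) is assumed to return the decompressed bytes of chunk i.
    using Decoder = std::function<std::vector<uint8_t>(std::size_t)>;

    ChunkedReader(std::size_t chunkSize, std::size_t fileSize, Decoder decodeChunk)
        : chunkSize_(chunkSize), fileSize_(fileSize), decode_(std::move(decodeChunk)) {}

    // Copies up to `length` bytes starting at `offset`; returns bytes copied.
    std::size_t read(std::size_t offset, std::size_t length, uint8_t* dest) {
        std::size_t done = 0;
        while (done < length && offset < fileSize_) {
            std::size_t index = offset / chunkSize_;
            if (!valid_ || index != cachedIndex_) {  // miss: decode and replace
                cached_ = decode_(index);
                cachedIndex_ = index;
                valid_ = true;
                ++decodes_;
            }
            std::size_t within = offset % chunkSize_;
            if (within >= cached_.size()) break;     // short or corrupt chunk
            std::size_t n = std::min(length - done, cached_.size() - within);
            std::copy_n(cached_.begin() + within, n, dest + done);
            done += n;
            offset += n;
        }
        return done;
    }

    std::size_t decodes() const { return decodes_; }

private:
    std::size_t chunkSize_;
    std::size_t fileSize_;
    Decoder decode_;
    std::vector<uint8_t> cached_;
    std::size_t cachedIndex_ = 0;
    bool valid_ = false;
    std::size_t decodes_ = 0;
};
```

Many small sequential reads inside the same chunk then cost a single decompression, while memory use stays bounded by one chunk.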
Old 06 September 2020, 20:54   #11
temisu
Registered User
 
Join Date: Mar 2017
Location: Tallinn / Estonia
Posts: 74
I know nobody asked my opinion (and this is an old thread) but I've been experimenting with the idea of having a filesystem-like view for archives (lha, lzx, zip) and I've learned a thing or two in the process. I have not yet shared any code for that since it is maybe not in the best of shape, although I have open decompressor algorithms made for lzx, most of lha and some of zip. (This is not in the Amiga context itself, but the same lessons apply.)

My view on this topic would be simply "please no no no another container format". There are enough already. Features like this always live or die by how easy the tooling is to use.
Let's take a look at the options:

* tar.(gz,bz2,xz...) - the whole container is compressed as one stream. Completely unsuitable.
* lzx - can merge multiple files into one compressed stream, so individual files are not randomly accessible.
* lha - linear structure. Not ideal for random access.
* zip - has a central directory. Files are randomly accessible.
* rar - proprietary; free solutions can only unpack (legally, that is).
* 7z - open format, but current tooling does not look friendly to classic Amigas.
* adf + xpk files - Well, it is an option. Not necessarily a good one though...

So frankly, we are left with zip. Fortunately zip supports storing Amiga protection bits (actually in 2 different ways) and file comments, and supports files being stored either compressed or uncompressed, although again tool support for these features varies.

The best part of zip (assuming we are not interested in fancier features like zip64, multi-file archives, encryption etc.) is that it is a rather low-overhead format. I can see that the central directory probably needs to be read into memory, but the rest of the file is read on demand only. Also, if we stay with deflate compression only, there are hundreds of implementations. I'm pretty sure there is a speedy m68k variant as well...
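
To illustrate how little has to be read to find the central directory, here is a minimal sketch that locates the end-of-central-directory record in the tail of an archive. The struct and function names are hypothetical. The record layout (signature 0x06054b50, 22 fixed bytes, optional trailing archive comment) follows the published zip format; zip64 and multi-disk archives are deliberately ignored.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct EndOfCentralDirectory {
    uint16_t totalEntries;
    uint32_t directorySize;
    uint32_t directoryOffset;  // from the start of the archive
};

static uint16_t readLE16(const uint8_t* p) { return uint16_t(p[0] | (p[1] << 8)); }
static uint32_t readLE32(const uint8_t* p) {
    return uint32_t(p[0]) | (uint32_t(p[1]) << 8) |
           (uint32_t(p[2]) << 16) | (uint32_t(p[3]) << 24);
}

// `tail` holds the last bytes of the archive file. Scans backwards for the
// record and fills `out`; returns false if no plausible record is found.
bool findEndOfCentralDirectory(const std::vector<uint8_t>& tail,
                               EndOfCentralDirectory& out) {
    const std::size_t kRecordSize = 22;
    if (tail.size() < kRecordSize) return false;
    for (std::size_t pos = tail.size() - kRecordSize + 1; pos-- > 0;) {
        const uint8_t* p = tail.data() + pos;
        if (readLE32(p) != 0x06054b50U) continue;
        // The comment length must account for exactly the remaining bytes.
        if (pos + kRecordSize + readLE16(p + 20) != tail.size()) continue;
        out.totalEntries = readLE16(p + 10);
        out.directorySize = readLE32(p + 12);
        out.directoryOffset = readLE32(p + 16);
        return true;
    }
    return false;
}
```

With the offset and size in hand, a loader can read the central directory in one IO and leave the file data untouched until a game asks for it.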

Then comes the tricky part: random access inside files. Even many XPK compressors do not implement this properly. Also, nothing hurts compression performance more than splitting the data into smaller blocks. However, not many people know that deflate can also split the bitstream into smaller chunks (a few rare compressors use this feature). So it is easy to emulate a block structure with deflate. The only problem is where to store this information. Here the zlib and zip container formats come in handy: we can append the offsets in some sane fashion to the end of the bitstream and still be compliant. Other zip extractors do not see anything special, but an implementation that knows about the method can speed up seek() quite considerably. The best part is that this does not have to be static: some files could, for example, use a block size of one track whereas others could use one sector. This only depends on how the tooling is made...
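
The lookup side of such an offset table might look like the sketch below. Everything here is an assumption made for illustration: the post does not specify a table layout, so `SeekPoint`, `SeekPlan` and `planSeek` are hypothetical. It also assumes each listed compressed offset starts a block that can be decoded without earlier history (zlib's `Z_FULL_FLUSH` produces such restart points, for instance).

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct SeekPoint {
    uint32_t uncompressedOffset;  // position in the decompressed file
    uint32_t compressedOffset;    // where an independently decodable block starts
};

struct SeekPlan {
    uint32_t compressedOffset;  // start decoding here
    uint32_t skip;              // discard this many decoded bytes first
};

// `table` must be sorted by uncompressedOffset and begin with offset 0.
SeekPlan planSeek(const std::vector<SeekPoint>& table, uint32_t target) {
    auto it = std::upper_bound(
        table.begin(), table.end(), target,
        [](uint32_t value, const SeekPoint& p) { return value < p.uncompressedOffset; });
    if (it == table.begin()) return SeekPlan{0, target};  // no usable entry
    --it;  // last seek point at or before the target
    return SeekPlan{it->compressedOffset, target - it->uncompressedOffset};
}
```

With one entry per track (5632 bytes), a seek never has to decode and discard more than 5631 bytes. An extractor that ignores the table still sees an ordinary deflate stream.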

This brings me to the last point. This would be a really fun project for me, and I could make a reference implementation (C) for reading the zip archives and tooling (C++) for creating "enhanced" zip files under an open license (like BSD), assuming this is something that would be picked up by WHDLoad. I don't even know what the technical requirements for the code are. I'm too old for assembly, even though I have written "enough" of it earlier...
Old 06 September 2020, 21:22   #12
jotd
This cat is no more
 
Join Date: Dec 2004
Location: FRANCE
Age: 52
Posts: 8,160
Would make sense for big games with a lot of files. Currently, starting WHDLoad on such games (even without PRELOAD) takes forever. I don't really know why WHDLoad scans the contents of the "data" directory when the files aren't loaded.
Old 07 September 2020, 19:29   #13
rare_j
Zone Friend
 
Join Date: Apr 2005
Location: London
Posts: 1,176
Apparently, zip is missing support for a particular Amiga file attribute, which breaks some WHDLoad game files once they have been compressed and then decompressed with it.

This is why the preferred compressor for WHDLoad game files is lha.

Sorry that I cannot remember the details of the problem with zip. It was identified in a thread on eab recently which I can't find. But this is relevant because you'd need to use a customised version of zip to fix this for WHDLoad compression, and so you would effectively have another container format.
Old 07 September 2020, 19:48   #14
Radertified
Registered User
 
Join Date: Jan 2011
Location: -
Posts: 728
Quote:
Originally Posted by rare_j View Post
It was identified in a thread on eab recently
You're probably referring to this one where StingRay schooled me: http://eab.abime.net/showthread.php?t=31450

Zip cannot store Amiga filesystem specific features, such as comments, so it's out of the question.
Old 07 September 2020, 22:14   #15
rare_j
Zone Friend
 
Join Date: Apr 2005
Location: London
Posts: 1,176
Yes it was that thread, but now I look again I don't think StingRay was talking about zip.
He was talking about extracting archives onto a non-AmigaDOS filesystem.
Additionally, temisu says above that zip on the Amiga does support comments.

However, there is a problem with Info-ZIP on the Amiga. I have reproduced the issue myself. There is a game which, if you zip up all its contents on the Amiga and then unzip them again, no longer works. I apologise that I don't remember the game, and (as far as I know) it was never identified what the issue is.
Perhaps Retroplay remembers which game it is.
Old 07 September 2020, 22:19   #16
temisu
Registered User
 
Join Date: Mar 2017
Location: Tallinn / Estonia
Posts: 74
Quote:
Originally Posted by Radertified View Post
Zip cannot store Amiga filesystem specific features, such as comments, so it's out of the question.
Hi, I did not come here to argue (or to start a flame war) about archivers. I think we all know the state of affairs with lha / zip on the Amiga, the problems they have and why people prefer lha...

The following we know for certain about zip:
  • Amiga attributes are easily broken if the file is ever decompressed on any other operating system
  • To include file comments in zip files you need to add a special flag both when compressing and decompressing
  • There can be character set conversions that break the filenames

However, if we have to choose between a new container format and fixing bugs in an existing one (and extending it in a backwards-compatible fashion), fixing the existing one would in my mind be the better option. There is nothing wrong with the zip file format itself: it is well documented (PKWARE's APPNOTE; the deflate stream itself in RFC 1951) and has a working extension system. It is always about the implementation...

I just wanted to point out that the zip file format is better suited for random access to files. Obviously there is a risk that if zip is used as a container, there will be people abusing it. But it only takes a simple check of the creator operating system to make sure we have a proper image. So there is always a trade-off.

Now, in order to steer the discussion to the more technical side, I made some comparisons. I tested with a known ADF file (3DDemo1). Let's compare the effects of making the file seekable with track accuracy (5632 bytes):
  • ADF - 901120 bytes
  • ADZ - 597108 bytes
  • DMS - 617348 bytes
  • Split deflate - 628245 bytes

So, obviously there is a price to pay if you make the compressed file seekable. But to me it looks like there would be a benefit as long as it is something that is dynamically tunable per file.
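
Putting the figures above in relative terms: ADZ comes to roughly 66.3% of the ADF size, DMS to roughly 68.5% and split deflate to roughly 69.7%. Split deflate is 31137 bytes larger than ADZ, which over the 160 tracks of the disk is a little under 195 bytes per track. The small helpers below (names are hypothetical) reproduce that arithmetic.

```cpp
#include <cstdint>

const uint32_t kAdfSize = 901120;       // 80 cylinders x 2 heads x 11 sectors x 512 bytes
const uint32_t kTrackSize = 11 * 512;   // 5632 bytes, the seek granularity tested
const uint32_t kTracks = kAdfSize / kTrackSize;

// Compressed size as a fraction of the raw ADF.
double ratio(uint32_t compressed) {
    return double(compressed) / kAdfSize;
}

// Average extra bytes per track paid for seekability, relative to a
// non-seekable baseline (assumes seekable >= baseline).
double extraBytesPerTrack(uint32_t seekable, uint32_t baseline) {
    return double(seekable - baseline) / kTracks;
}
```

These numbers describe a single test image only, so they indicate the order of magnitude rather than a general result.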
Old 07 September 2020, 22:39   #17
Radertified
Registered User
 
Join Date: Jan 2011
Location: -
Posts: 728
Quote:
Originally Posted by temisu View Post
Hi, I did not come here to argue (or to start flame war) about archivers.
I'm definitely not arguing about it. I'm sorry if my reply came off as hostile.

If zip works, fantastic. Let's go with that
Old 07 September 2020, 22:41   #18
rare_j
Zone Friend
 
Join Date: Apr 2005
Location: London
Posts: 1,176
Quote:
Originally Posted by temisu View Post
To include file comments in zip files you need to add a special flag both when compressing and decompressing
It's possible that this feature has been causing trouble in the past.
Is there a way to tell if an archive has been generated preserving file comments?
Old 07 September 2020, 22:57   #19
temisu
Registered User
 
Join Date: Mar 2017
Location: Tallinn / Estonia
Posts: 74
Quote:
Originally Posted by Radertified View Post
I'm definitely not arguing about it. I'm sorry if my reply came off as hostile.

If zip works, fantastic. Let's go with that
No worries. Sometimes in order to make progress we need to be able to talk about crazy ideas openly.

I do not know whether using zip is a good idea. But I'm sure we can think and talk about it.

Quote:
Originally Posted by rare_j View Post
It's possible that this feature has been causing trouble in the past.
Is there a way to tell if an archive has been generated preserving file comments?
Well, if there is content in the comment attribute you can be certain that the archive was created properly. If not, you have no safe way of knowing whether there really were no comments or whether they were left out by accident.

On the decompression side, they will just get dropped if the special flag is not used for unzip.

Looking at an implementation I wrote earlier, I have the following (horrific) logic for reading zip files. (isAmiga comes from the creator-OS field of the zip file.)

Code:
std::string fileNote;
if (isAmiga && commentLength>1)
{
        // on Amiga-created archives this field is not a comment, it is the filenote
        size_t commentOffset=dirEntOffset+centralFileHeader.size()+nameLength+extraLength;
        // the terminating null is included in the file, but we do not have to store it...
        size_t noteLength=std::min(size_t(commentLength),size_t(79U))-1;
        const char *note=reinterpret_cast<const char*>(centralDirectory->data())+commentOffset;
        // stored as ISO 8859-1; convert before keeping it
        fileNote=UTF8::convertFromISO88591(std::string(note,noteLength),true);
}
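
For readers without the surrounding code, here is a self-contained sketch of the same idea: parsing one central directory entry, checking the creator system and pulling out the comment field. The names `CentralEntry` and `parseCentralEntry` are hypothetical and this is not temisu's implementation. The field offsets follow the published zip format, where the high byte of "version made by" identifies the host system and the value 1 denotes Amiga.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct CentralEntry {
    bool createdOnAmiga = false;
    std::string name;
    std::string comment;  // raw bytes as stored, no charset conversion
};

// Parse one central directory file header starting at `offset`.
// Returns false if the signature is wrong or the record is truncated.
bool parseCentralEntry(const std::vector<uint8_t>& dir, std::size_t offset,
                       CentralEntry& out) {
    const std::size_t kFixedSize = 46;
    if (offset > dir.size() || dir.size() - offset < kFixedSize) return false;
    const uint8_t* p = dir.data() + offset;
    if (p[0] != 0x50 || p[1] != 0x4b || p[2] != 0x01 || p[3] != 0x02) return false;
    // High byte of "version made by": host system, 1 means Amiga.
    out.createdOnAmiga = (p[5] == 1);
    std::size_t nameLen = std::size_t(p[28]) | (std::size_t(p[29]) << 8);
    std::size_t extraLen = std::size_t(p[30]) | (std::size_t(p[31]) << 8);
    std::size_t commentLen = std::size_t(p[32]) | (std::size_t(p[33]) << 8);
    if (dir.size() - offset - kFixedSize < nameLen + extraLen + commentLen) return false;
    const char* text = reinterpret_cast<const char*>(p + kFixedSize);
    out.name.assign(text, nameLen);
    out.comment.assign(text + nameLen + extraLen, commentLen);
    return true;
}
```

A loader could refuse archives where `createdOnAmiga` is false, which is the "simple check" on the creator operating system mentioned earlier in the thread.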
Old 08 September 2020, 00:55   #20
Wepl
Moderator
 
Join Date: Nov 2001
Location: Germany
Posts: 866
Quote:
Originally Posted by temisu View Post
My view on this topic would be simply "please no no no another container format". There are enough already. Features like this always live or die by how easy the tooling is to use.
I fully agree with this. If there are formats which match the requirements it will be better to use existing ones.
Quote:
Originally Posted by temisu View Post
Let's take a look at the options:
* lha - linear structure. not ideal for random access.
Yes, if there are many files and the archive is not preloaded accessing a file may require many seeks and reads.
An advantage would be that file comments etc. are widely used with that format. The archive format itself with different header and compression formats is not very nice I think.
Quote:
Originally Posted by temisu View Post
* zip - has central directory. Files randomly accessible.

So frankly, we are left with zip. Fortunately zip supports storing Amiga protection bits (actually in 2 different ways) and file comments, and supports files being stored either compressed or uncompressed, although again tool support for these features varies.
Which zip implementation supports protection bits and file comments?
I did not know about that.
Quote:
Originally Posted by temisu View Post
The best part of zip (assuming we are not interested in fancier features like zip64, multi-file archives, encryption etc.) is that it is a rather low-overhead format. I can see that the central directory probably needs to be read into memory, but the rest of the file is read on demand only.
If the directory is held in memory then this could also be done for lha. Currently this is not done by WHDLoad (except for PreLoad and Examine).
Quote:
Originally Posted by temisu View Post
Also, if we stay with deflate compression only, there are hundreds of implementations. I'm pretty sure there is a speedy m68k variant as well...
Speed is important, but as long as uncompressed storage is supported the user has the choice. Archivers are normally optimized to save space and not for fast decompression. This is why I'm unsure whether reusing existing archivers is a good idea.
Quote:
Originally Posted by temisu View Post
Then comes the tricky part: random access inside files. Even many XPK compressors do not implement this properly. Also, nothing hurts compression performance more than splitting the data into smaller blocks. However, not many people know that deflate can also split the bitstream into smaller chunks (a few rare compressors use this feature). So it is easy to emulate a block structure with deflate. The only problem is where to store this information. Here the zlib and zip container formats come in handy: we can append the offsets in some sane fashion to the end of the bitstream and still be compliant. Other zip extractors do not see anything special, but an implementation that knows about the method can speed up seek() quite considerably. The best part is that this does not have to be static: some files could, for example, use a block size of one track whereas others could use one sector. This only depends on how the tooling is made...
Great if this is possible.
Quote:
Originally Posted by temisu View Post
This brings me to the last point. This would be a really fun project for me, and I could make a reference implementation (C) for reading the zip archives and tooling (C++) for creating "enhanced" zip files under an open license (like BSD), assuming this is something that would be picked up by WHDLoad. I don't even know what the technical requirements for the code are. I'm too old for assembly, even though I have written "enough" of it earlier...
I really would like to add this. C would be fine; there is already a small part compiled using vbcc. I could probably also define a small ABI which would allow this to be a separate BLOB.
Basically, what is needed:
- a function which iterates over all file names (PreLoad)
- a function which returns the size of a file (GetFileSize)
- a function which reads part of a file (or the complete file)
- a function which iterates over all file names in a subdirectory (ListFiles)
- a function which iterates over all files/dirs and delivers all filesystem metadata (Examine)
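
The five functions listed above could be pictured as an interface along the following lines. This is purely illustrative: all type and method names are hypothetical, the real ABI would presumably be plain C or register-based rather than a C++ class, and `MemoryArchive` is only a toy backend (files only, no directory entries, no dates) so that the interface can be exercised.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

struct FileInfo {
    std::string path;
    uint32_t size = 0;
    uint32_t protection = 0;
    std::string comment;
    bool isDirectory = false;
};

class ArchiveReader {
public:
    using NameVisitor = std::function<void(const std::string&)>;
    using InfoVisitor = std::function<void(const FileInfo&)>;
    virtual ~ArchiveReader() {}
    virtual void forEachFile(const NameVisitor& visit) const = 0;              // PreLoad
    virtual bool fileSize(const std::string& path, uint32_t& size) const = 0;  // GetFileSize
    virtual std::size_t read(const std::string& path, std::size_t offset,
                             std::size_t length, uint8_t* dest) const = 0;     // partial or full read
    virtual void listDirectory(const std::string& dir,
                               const NameVisitor& visit) const = 0;            // ListFiles
    virtual void examineAll(const InfoVisitor& visit) const = 0;               // Examine
};

// Toy in-memory backend used only to demonstrate the interface.
class MemoryArchive : public ArchiveReader {
public:
    void add(const std::string& path, const std::vector<uint8_t>& data) { files_[path] = data; }
    void forEachFile(const NameVisitor& visit) const override {
        for (const auto& f : files_) visit(f.first);
    }
    bool fileSize(const std::string& path, uint32_t& size) const override {
        auto it = files_.find(path);
        if (it == files_.end()) return false;
        size = uint32_t(it->second.size());
        return true;
    }
    std::size_t read(const std::string& path, std::size_t offset,
                     std::size_t length, uint8_t* dest) const override {
        auto it = files_.find(path);
        if (it == files_.end() || offset >= it->second.size()) return 0;
        std::size_t n = std::min(length, it->second.size() - offset);
        std::copy_n(it->second.begin() + offset, n, dest);
        return n;
    }
    void listDirectory(const std::string& dir, const NameVisitor& visit) const override {
        const std::string prefix = dir.empty() ? std::string() : dir + "/";
        for (const auto& f : files_) {
            // Direct children only: path starts with the prefix and has no further '/'.
            if (f.first.compare(0, prefix.size(), prefix) == 0 &&
                f.first.find('/', prefix.size()) == std::string::npos) visit(f.first);
        }
    }
    void examineAll(const InfoVisitor& visit) const override {
        for (const auto& f : files_) {
            FileInfo info;
            info.path = f.first;
            info.size = uint32_t(f.second.size());
            visit(info);
        }
    }
private:
    std::map<std::string, std::vector<uint8_t>> files_;
};
```

Keeping the interface this narrow means the archive backend (lha, zip or something else) can change without touching the code that serves the Slave's requests.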

Quote:
Originally Posted by jotd View Post
Would make sense for big games with a lot of files. Currently, starting WHDLoad on such games (even without PRELOAD) takes forever. I don't really know why WHDLoad scans the contents of the "data" directory when the files aren't loaded.
WHDLoad collects all filesystem metadata at startup only if the Slave has the Examine flag set. This data is needed for the Examine/ExNext calls and is collected beforehand to avoid many OS switches. So this happens for most kickemu games.
 

