English Amiga Board


Old 12 March 2024, 23:33   #1
Hemiyoda
Registered User
 
Join Date: Mar 2012
Location: Ramnäs / Sweden
Posts: 29
My audio compressor (delta with lookup)

Hello,

For many years I have had an idea that builds on the classic Fibonacci-delta technique, and I have finally realized it.

Instead of one lookup table I use 256 LUTs, and I divide the sound file into 16-byte frames (i.e. the lookup table can change every 16 bytes).

The lookup tables total 4 kB (256 x 16 bytes). Since the LUT data is 4 kB and there is some overhead from the LUT switching, the compression ratio on its own will not quite reach 50%. But used together with additional LZW-style compression, the final ratio is close to 50% (and better than LZW alone).

There is of course some degradation in sound quality, but I think it beats ADPCM, and it is miles ahead of the traditional Fibonacci-delta variant.
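Decoding stays very cheap. As a rough illustration (a simplified sketch; the stream layout and names here are my shorthand, the exact format is in the repo):

Code:
/* Simplified decode sketch for one frame, assuming a layout of one
 * LUT-index byte followed by 8 bytes holding sixteen 4-bit codes.
 * The real file format in the repo may differ. */
#include <stdint.h>

int8_t decode_frame(const uint8_t *src, int8_t sample,
                    const int8_t luts[256][16], int8_t *dst)
{
    const int8_t *lut = luts[*src++];    /* pick this frame's LUT    */
    for (int i = 0; i < 8; i++) {
        uint8_t b = src[i];
        sample += lut[b >> 4];           /* high nibble first        */
        *dst++  = sample;
        sample += lut[b & 0x0F];         /* then the low nibble      */
        *dst++  = sample;
    }
    return sample;                       /* carries into next frame  */
}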

Demo video: [ Show youtube player ]
Github: https://github.com/Hemiyoda/deladaenc
Hemiyoda is offline  
Old 13 March 2024, 00:01   #2
VladR
Registered User
 
Join Date: Dec 2019
Location: North Dakota
Posts: 741
What kind of analysis do you do on the data before you create the 256 LUTs?

It should be very straightforward to measure the total error by simply adding all the deltas together (New - Old), no?
VladR is offline  
Old 13 March 2024, 07:13   #3
Hemiyoda
Registered User
 
Join Date: Mar 2012
Location: Ramnäs / Sweden
Posts: 29
Quote:
Originally Posted by VladR View Post
What kind of analysis do you do on the data before you create the 256 LUTs?

It should be very straightforward to measure the total error by simply adding all the deltas together (New - Old), no?
The short answer is yes!

The encoder gradually builds up the LUTs, checking the total error for each frame. Almost every frame will have some minor error, so I work with a threshold. The encoder makes multiple passes and discards unused sets (it may find a better set, and every new frame depends on the result of the previous frame).
There is also optimization of every LUT: it discards duplicate delta values, tries to remove values that are too close to each other, and extends the search if all the slots don't get filled in the current frame.

All this adds up, so a 200 kB mod takes minutes to process on a ten-year-old i3 laptop. It would probably take a day on an A500.
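For each candidate LUT the per-frame check is basically this (an illustrative sketch, not the exact code from the repo):

Code:
/* Total absolute reconstruction error of one 16-sample frame for a
 * candidate LUT (sketch). For each input sample, greedily pick the
 * table entry whose decoded result lands closest. */
#include <stdlib.h>

long frame_error(const signed char *in, int start,
                 const signed char lut[16])
{
    long total = 0;
    int  cur   = start;                  /* decoder state             */
    for (int i = 0; i < 16; i++) {
        int best = abs(cur + lut[0] - in[i]), pick = 0;
        for (int j = 1; j < 16; j++) {
            int e = abs(cur + lut[j] - in[i]);
            if (e < best) { best = e; pick = j; }
        }
        cur   += lut[pick];              /* result feeds next sample  */
        total += best;
    }
    return total;                        /* compared to the threshold */
}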
Hemiyoda is offline  
Old 13 March 2024, 10:24   #4
Karlos
Alien Bleed
 
Karlos's Avatar
 
Join Date: Aug 2022
Location: UK
Posts: 4,165
How long compression takes shouldn't matter much. Ideally what you want is decompression that's so fast you can mix multiple compressed streams as you go. Years ago I wrote something I called xdac, a variable-bitrate delta compressor designed so that I could, like you, decode frames (each frame had a bitrate and an embedded lookup, so the compression rate wasn't the best unless I used longer frames or had silences). However, the mixing was built into the decoding, so you'd be producing the accumulated output frame directly. It was designed for 16-bit audio reconstruction and generally got around 3:1 compression for an average 4-bit delta. I never ended up using it, but it was fun to work on.
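The core trick was just that the decode loop added straight into an accumulator instead of writing a buffer per stream, something like this (from memory, greatly simplified):

Code:
/* Decode-and-mix sketch (from memory, greatly simplified): each
 * stream's deltas are decoded directly into a shared 32-bit
 * accumulation buffer, so there is no separate mixing pass. */
#include <stdint.h>

void decode_mix(const int16_t *deltas, int n,
                int16_t *state, int32_t *accum)
{
    int16_t s = *state;
    for (int i = 0; i < n; i++) {
        s        += deltas[i];   /* reconstruct this stream's sample  */
        accum[i] += s;           /* mix into the shared output frame  */
    }
    *state = s;                  /* carry decoder state across frames */
}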
Karlos is offline  
Old 13 March 2024, 10:51   #5
Hemiyoda
Registered User
 
Join Date: Mar 2012
Location: Ramnäs / Sweden
Posts: 29
I want to clarify that the intended usage of my compressor is to reduce storage space, not runtime memory usage.
(Although decoding would absolutely be fast enough, because it's only shifts/masking, LUT lookups and additions.)
Hemiyoda is offline  
Old 13 March 2024, 10:59   #6
Karlos
Alien Bleed
 
Karlos's Avatar
 
Join Date: Aug 2022
Location: UK
Posts: 4,165
Quote:
Originally Posted by Hemiyoda View Post
I want to clarify that the intended usage of my compressor is to reduce storage space, not runtime memory usage.
(Although decoding would absolutely be fast enough, because it's only shifts/masking, LUT lookups and additions.)
Do you use a predictor? One of the mistakes in my old code was that it only looked at the difference between the currently reconstructed sample (while encoding) and the next real input, and tried to find the delta value from the set that, when added to the current value, would be closest to the next input sample. That works, but it's not the best. You can instead use a predictor function that first estimates the next sample from the previous N values, so you only need to encode the difference between the predicted value and the actual value, which typically gives smaller values to encode for the same number of bits. There are lots of potential predictor functions you can use (and plenty of documentation), some of which are not too expensive.

TLDR - the differences between the predicted next sample and the actual next sample are usually smaller than the differences between adjacent samples.
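For example (just a sketch of the principle, not from any particular codec):

Code:
/* Second-order linear predictor sketch: extrapolate from the two
 * previous samples, then encode only the prediction error. */
#include <stdint.h>

static int16_t predict(int16_t prev1, int16_t prev2)
{
    int32_t p = 2 * (int32_t)prev1 - prev2;   /* linear extrapolation */
    if (p >  32767) p =  32767;               /* clamp to 16-bit      */
    if (p < -32768) p = -32768;
    return (int16_t)p;
}

/* Encoder side: for smooth waveforms the residual is usually much
 * smaller than the plain sample-to-sample difference. */
int32_t residual(int16_t actual, int16_t prev1, int16_t prev2)
{
    return (int32_t)actual - predict(prev1, prev2);
}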

Last edited by Karlos; 13 March 2024 at 12:31.
Karlos is offline  
Old 13 March 2024, 11:49   #7
ross
Defendit numerus
 
ross's Avatar
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,474
Interesting, nice work.
It reminds me somewhat of this article, where a similar method is tried with different quantization tables per frame for DPCM:
https://www.bitsnbites.eu/hiqh-quality-dpcm-attempts/

Yours is surely of 'lower' quality, but certainly much faster (8-bit vs 16-bit final PCM).
ross is offline  
Old 13 March 2024, 12:30   #8
Hemiyoda
Registered User
 
Join Date: Mar 2012
Location: Ramnäs / Sweden
Posts: 29
Quote:
Originally Posted by Karlos View Post
Do you use a predictor? One of the mistakes in my old code was that it only looked at the difference between the currently reconstructed sample (while encoding) and the next real input, and tried to find the delta value from the set that, when added to the current value, would be closest to the next input sample. That works, but it's not the best. You can instead use a predictor function that first estimates the next sample from the previous N values, so you only need to encode the difference between the predicted value and the actual value, which typically gives smaller values to encode for the same number of bits. There are lots of potential predictor functions you can use (and plenty of documentation), some of which are not too expensive.
I think I understand how your method works now. I have also thought of using a 'correction stream', which in principle, depending on the implementation, could make the compression lossless.

What I have observed of the predictor in ADPCM is that it is bad at transients (but it still sounds pretty good). Diff files on ADPCM sample data show big errors. ADPCM would do better at higher sample rates (mods are mostly 8000-16000 Hz).

No predictor in my encoder. The car equivalent of my encoder would be a Lada, i.e. low complexity, and most of the time it takes you where you want to go, in reasonable condition.
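The correction-stream idea boils down to this (just a sketch of the principle, not something implemented in the encoder):

Code:
/* Correction-stream sketch: store the residuals of the lossy pass in
 * a side stream; adding them back makes the output bit-exact again.
 * Assumes the lossy error is small enough to fit in 8 bits (in
 * general the difference of two 8-bit samples needs 9 bits). */
void make_corrections(const signed char *orig, const signed char *decoded,
                      signed char *corr, int n)
{
    for (int i = 0; i < n; i++)
        corr[i] = orig[i] - decoded[i];   /* small values compress well */
}

void apply_corrections(signed char *decoded,
                       const signed char *corr, int n)
{
    for (int i = 0; i < n; i++)
        decoded[i] += corr[i];            /* lossless reconstruction    */
}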
Hemiyoda is offline  
Old 13 March 2024, 12:41   #9
Hemiyoda
Registered User
 
Join Date: Mar 2012
Location: Ramnäs / Sweden
Posts: 29
Quote:
Originally Posted by ross View Post
Interesting, nice work.
It reminds me somewhat of this article, where a similar method is tried with different quantization tables per frame for DPCM:
https://www.bitsnbites.eu/hiqh-quality-dpcm-attempts/

Yours is surely of 'lower' quality, but certainly much faster (8-bit vs 16-bit final PCM).
Thanks! There is very seldom something completely new under the sun.
I had some inspiration from the old TV sound standard NICAM.

It would have been cool if they had dared to design Paula with non-linear 8-bit DACs. But designing a sampler would have been harder, and we probably couldn't have done the '14-bit' trick.
Hemiyoda is offline  
Old 13 March 2024, 12:59   #10
Karlos
Alien Bleed
 
Karlos's Avatar
 
Join Date: Aug 2022
Location: UK
Posts: 4,165
Quote:
Originally Posted by Hemiyoda View Post
It would have been cool if they had dared to design Paula with non-linear 8-bit DACs. But designing a sampler would have been harder, and we probably couldn't have done the '14-bit' trick.
I think I know what you mean (companded encoding), but they are still pretty non-linear, TBH. This is why you need to use the CyberSound 14-bit calibration tool to get the best results for 14-bit playback.

The volume controls on the channels are PWM-based (basically turning the DAC on and off at a higher-than-audio frequency). The DAC is on while a cyclic counter (range 0-63, updated at ~3.5 MHz) is below the channel volume value. This gives a very linear volume response, but the output level for a given sample value at full channel volume is not very linear.
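In other words, per counter tick it behaves something like this (a software model of the behaviour described above, not the actual silicon):

Code:
/* Software model of Paula's volume PWM (not the actual silicon):
 * the DAC is gated on while the free-running 0-63 counter is below
 * the channel volume (0-64), giving a duty cycle of volume/64 and
 * hence a near-linear volume response. */
int paula_volume_tick(int counter, int volume, int dac_level)
{
    return (counter < volume) ? dac_level : 0;
}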
Karlos is offline  
 

