My audio compressor (delta with lookup)

Hemiyoda · 12 March 2024, 23:33

Hello,

I have had this idea for many years that builds on the classic delta fibonacci technique. I have finally implemented it.

Instead of one lookup table I have used 256 LUTs, and I divide the sound file into 16-byte frames (i.e. the lookup table can change every 16 bytes).

The lookup tables take 4 kB in total (256 x 16). Because of that 4 kB and some overhead from the LUT switching, the compression ratio will not reach 50%. Used together with additional LZW-style compression, however, the final ratio is close to 50% (and better than LZW alone).
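As a rough illustration of where the overhead comes from, here is a back-of-the-envelope size estimate in Python. The frame layout is an assumption made for this sketch (16 samples become 16 four-bit codes plus one byte selecting the LUT); the actual deladaenc format may differ, and `estimated_size` is a hypothetical helper.

```python
# Size estimate before any LZW-style pass.
# ASSUMPTION: each 16-sample frame is stored as 1 LUT-index byte + 8 bytes of 4-bit codes.
LUT_COUNT = 256
LUT_ENTRIES = 16
FRAME_SAMPLES = 16


def estimated_size(num_samples: int) -> int:
    """Estimated encoded size in bytes for 8-bit input samples."""
    frames = -(-num_samples // FRAME_SAMPLES)        # ceiling division
    table_bytes = LUT_COUNT * LUT_ENTRIES            # 4096 bytes, one byte per delta
    frame_bytes = frames * (1 + FRAME_SAMPLES // 2)  # LUT index + packed codes
    return table_bytes + frame_bytes


original = 200 * 1024
print(estimated_size(original))   # 119296
```

Under these assumptions a 200 kB sample lands at roughly 58% of its original size, which is why a further general-purpose compression pass is needed to approach 50%.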

There is of course some degradation in sound quality but I think it beats ADPCM and it is miles ahead of the traditional delta fibonacci variant.

Demo video: [YouTube video]
Github: https://github.com/Hemiyoda/deladaenc

VladR · 13 March 2024, 00:01

What kind of analysis do you do on the data before you create the 256 LUTs ?

It should be very straightforward to measure the total error by simply adding all the deltas together (New - Old), no ?

Hemiyoda · 13 March 2024, 07:13

Quote:

Originally Posted by VladR

What kind of analysis do you do on the data before you create the 256 LUTs ?

It should be very straightforward to measure the total error by simply adding all the deltas together (New - Old), no ?

The short answer is yes!

The encoder gradually builds up the LUTs, checking the total error for each frame. Almost every frame will have some minor error, so I work with a threshold. The encoder works in multiple passes and discards unused sets. (It may find a better set, and every new frame depends on the result of the previous frame.)
Every LUT is also optimized. The encoder discards duplicate delta values, tries to remove values that are too close to each other, and extends the search if not all slots get filled in the current frame.

All of this means that a 200 kB mod takes minutes to process on a 10-year-old i3 laptop. It would probably take a day on an A500.
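The per-frame error check described in this post can be sketched roughly as follows. This is not the actual deladaenc code: the function names, the greedy per-sample choice, the 8-bit wraparound and the use of summed absolute error are all assumptions for illustration, and the multi-pass LUT building, the threshold and the pruning steps are left out.

```python
def wrap8(x):
    """Wrap to signed 8-bit, as a byte-wide add would (assumed behaviour)."""
    return ((x + 128) & 0xFF) - 128


def encode_frame(frame, start, lut):
    """Greedily pick, per sample, the LUT entry whose delta lands closest.

    Returns (codes, total_abs_error, last_reconstructed_value).
    """
    value = start
    codes, error = [], 0
    for target in frame:
        best = min(range(len(lut)),
                   key=lambda i: abs(wrap8(value + lut[i]) - target))
        value = wrap8(value + lut[best])   # track what the decoder will see
        codes.append(best)
        error += abs(value - target)
    return codes, error, value


def best_lut(frame, start, luts):
    """Try every LUT on the frame and keep the one with the smallest total error."""
    results = [(encode_frame(frame, start, lut), n) for n, lut in enumerate(luts)]
    (codes, error, last), n = min(results, key=lambda r: r[0][1])
    return n, codes, error, last


luts = [[0] * 16, list(range(-8, 8))]
print(best_lut([3, 5, 2, 2], 0, luts))   # (1, [11, 10, 5, 8], 0, 2)
```

The encoder follows the reconstructed value rather than the original sample, which is why each frame depends on the result of the previous one.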

Karlos · 13 March 2024, 10:24

How long compression takes shouldn't matter much. Ideally what you want is decompression that's so fast you can mix multiple compressed streams as you go. Years ago I wrote something I called xdac, which was a variable bitrate delta compressor designed so that I could, like you, decode frames (each frame had a bitrate and an embedded lookup, so it wasn't the best compression rate unless I used longer frames or had silences). However, the mixing was built into the decoding, so you'd be producing the accumulated output frame. It was designed for 16-bit audio reconstruction and generally got around 3:1 compression for an average 4-bit delta. Never ended up using it in the end, but it was fun to work on.

Hemiyoda · 13 March 2024, 10:51

I want to clarify that the intended usage of my compressor is to reduce storage space. Not reducing runtime memory usage.
(although decoding would absolutely be fast enough because it's only shifts/masking, LUT lookup & addition)
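For illustration, a decoder of that kind might look like this in Python. The frame layout (one LUT index plus packed 4-bit codes, high nibble first), the 8-bit wraparound and the names are assumptions for this sketch, not the actual deladaenc format.

```python
def wrap8(x):
    """Wrap to signed 8-bit, as a byte-wide add would (assumed behaviour)."""
    return ((x + 128) & 0xFF) - 128


def decode(frames, luts, start=0):
    """Decode (lut_index, packed_codes) frames into a list of 8-bit samples."""
    value = start
    out = []
    for lut_index, packed in frames:
        lut = luts[lut_index]                       # LUT can change per frame
        for byte in packed:
            for code in (byte >> 4, byte & 0x0F):   # shift / mask
                value = wrap8(value + lut[code])    # lookup + addition
                out.append(value)
    return out


luts = [list(range(-8, 8))]
print(decode([(0, bytes([0xBA, 0x58]))], luts))   # [3, 5, 2, 2]
```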

Karlos · 13 March 2024, 10:59

Quote:

Originally Posted by Hemiyoda

I want to clarify that the intended usage of my compressor is to reduce storage space. Not reducing runtime memory usage.
(although decoding would absolutely be fast enough because it's only shifts/masking, LUT lookup & addition)

Do you use a predictor? One of the mistakes of my old code was that it only really looked at the difference between the currently reconstructed sample (while encoding) versus the next real input and tried to find the delta value from the set that, when added to the current value, would always be closest to the next input sample value. That works but it's not the best. You can use a predictor function that first estimates the next sample based on the previous N values and then you only need the difference between the predicted value and the actual value and to encode that, which typically results in less data to encode for the same amount of bits. There are lots of potential functions you can use (and documentation) some of which are not too expensive.

TLDR - the differences between predicted next sample and actual next sample are usually smaller than the differences between adjacent samples.
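A small Python experiment makes the point concrete. It compares plain delta residuals against a simple second-order linear predictor on a smooth test tone. The predictor choice and the test signal are assumptions picked for illustration; this is not xdac's method, and a real lossy encoder would predict from reconstructed samples rather than from the original input as done here.

```python
import math


def delta_residuals(samples):
    """Plain delta: predict that the next sample equals the previous one."""
    return [samples[n] - samples[n - 1] for n in range(2, len(samples))]


def linear_residuals(samples):
    """Second-order predictor: extrapolate the line through the last two samples."""
    return [samples[n] - (2 * samples[n - 1] - samples[n - 2])
            for n in range(2, len(samples))]


# A slow sine wave in signed 8-bit range, standing in for a smooth low note.
samples = [round(100 * math.sin(2 * math.pi * n / 64)) for n in range(256)]

print(max(abs(r) for r in delta_residuals(samples)))    # largest plain delta
print(max(abs(r) for r in linear_residuals(samples)))   # largest prediction error
```

On a smooth signal like this the prediction errors stay much smaller than the plain deltas. On sharp transients the same predictor can overshoot, so the gain depends on the material.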

ross · 13 March 2024, 11:49

interesting, nice work.
It reminds me somehow of this article where a similar method is tried with different quantization tables per frame for DPCM:
https://www.bitsnbites.eu/hiqh-quality-dpcm-attempts/

Surely yours is of 'lower' quality, but certainly much faster (8-bit vs 16bit final PCM).

Hemiyoda · 13 March 2024, 12:30

Quote:

Originally Posted by Karlos

Do you use a predictor? One of the mistakes of my old code was that it only really looked at the difference between the currently reconstructed sample (while encoding) versus the next real input and tried to find the delta value from the set that, when added to the current value, would always be closest to the next input sample value. That works but it's not the best. You can use a predictor function that first estimates the next sample based on the previous N values and then you only need the difference between the predicted value and the actual value and to encode that, which typically results in less data to encode for the same amount of bits. There are lots of potential functions you can use (and documentation) some of which are not too expensive.

I think I understand how your method works now. I have also thought of using a 'correction stream', which in principle, depending on implementation could make the compression lossless.
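The correction-stream idea can be sketched in a few lines of Python. The function names are hypothetical, and how the correction values would be stored compactly is left open here.

```python
def correction_stream(original, decoded):
    """Per-sample difference between the original and the lossy decode."""
    return [o - d for o, d in zip(original, decoded)]


def apply_correction(decoded, correction):
    """Adding the correction back restores the original samples exactly."""
    return [d + c for d, c in zip(decoded, correction)]


original = [10, -3, 7]
decoded = [9, -1, 7]
correction = correction_stream(original, decoded)
print(correction)                              # [1, -2, 0]
print(apply_correction(decoded, correction))   # [10, -3, 7]
```

Whether this pays off depends on how small the correction values are after the lossy pass.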

What I have observed of the predictor in ADPCM is that it is bad at transients. (But it still sounds pretty good.) Diffing ADPCM sample data against the original shows big errors. ADPCM would do better with higher sample rates. (Mods mostly use 8000-16000 Hz.)

No predictor in my encoder. The car equivalent of my encoder would be a Lada, i.e. low complexity, and most of the time it takes you where you want to go, in reasonable condition.

Hemiyoda · 13 March 2024, 12:41

Quote:

Originally Posted by ross

interesting, nice work.
It reminds me somehow of this article where a similar method is tried with different quantization tables per frame for DPCM:
https://www.bitsnbites.eu/hiqh-quality-dpcm-attempts/

Surely yours is of 'lower' quality, but certainly much faster (8-bit vs 16bit final PCM).

Thanks! There is very seldom anything completely new under the sun.
I took some inspiration from the old TV sound standard, NICAM.

It would have been cool if they had dared to design the Paula with non-linear 8-bit DACs. But designing a sampler would have been harder and we could probably not do the '14-bit' trick.

Karlos · 13 March 2024, 12:59

Quote:

Originally Posted by Hemiyoda

It would have been cool if they had dared to design the Paula with non-linear 8-bit DACs. But designing a sampler would have been harder and we could probably not do the '14-bit' trick.

I think I know what you mean (companded encoding) but they are still pretty non-linear TBH. This is why you need to use the CyberSound 14-bit calibration tool to get the best results on 14-bit playback.

The volume controls on the channels are PWM based (basically turning the DAC on and off at higher than audio frequency). The DAC is on when a counter (cyclic, range 0-63, updated at ~3.5MHz) is below the channel volume value. This gives a very linear volume response, but the output level for a given sample value at full channel volume is not very linear.
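The counter comparison described above can be modelled in a few lines of Python. This is only a toy model of the duty cycle, with a hypothetical function name; it says nothing about the analogue behaviour of the DAC itself.

```python
def pwm_duty(volume: int, period: int = 64) -> float:
    """Fraction of counter ticks during which the DAC is switched on.

    The counter cycles through 0..period-1 and the DAC is on while
    counter < volume, so volume 0 is silent and volume 64 is always on.
    """
    on_ticks = sum(1 for counter in range(period) if counter < volume)
    return on_ticks / period


for volume in (0, 16, 32, 64):
    print(volume, pwm_duty(volume))
```

The duty cycle is exactly volume / 64, which is why the volume response is linear even when the sample-value response is not.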



