English Amiga Board

English Amiga Board (https://eab.abime.net/index.php)
-   Coders. General (https://eab.abime.net/forumdisplay.php?f=37)
-   -   My audio compressor (delta with lookup) (https://eab.abime.net/showthread.php?t=117144)

Hemiyoda 12 March 2024 23:33

My audio compressor (delta with lookup)
 
Hello,

For many years I have had an idea that builds on the classic Fibonacci-delta technique, and I have finally realized it.

Instead of one lookup table I have used 256 LUTs, and I divide the sound file into 16-byte frames (i.e. the lookup table can change every 16 bytes).

The lookup table data is 4 kB (256 LUTs x 16 bytes). Since the LUT is 4 kB and there is some overhead from the LUT switching, the compression ratio will not be 50%. But used together with additional LZW-style compression, the final ratio is close to 50% (and better than LZW alone).
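
A minimal decode sketch in C of the scheme as described. The exact bitstream assumed here (one LUT-index byte followed by eight bytes of packed 4-bit codes per 16 decoded samples, high nibble first, 8-bit wraparound on the running value) is an illustration, not necessarily the actual deladaenc layout.

Code:

#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame layout: 1 LUT-index byte + 8 bytes of packed 4-bit
   codes -> 16 decoded 8-bit samples. lut[256][16] holds the signed deltas. */
size_t decode_frames(const uint8_t *in, size_t nframes,
                     const int8_t lut[256][16], int8_t *out)
{
    int8_t sample = 0;                      /* running reconstructed value */
    size_t written = 0;

    for (size_t f = 0; f < nframes; f++) {
        const int8_t *deltas = lut[*in++];  /* this frame's 16-entry LUT   */
        for (int i = 0; i < 8; i++) {
            uint8_t packed = *in++;
            sample += deltas[packed >> 4];  /* high nibble first (assumed) */
            out[written++] = sample;
            sample += deltas[packed & 0x0F];
            out[written++] = sample;
        }
    }
    return written;                         /* nframes * 16 samples        */
}

With that assumed framing, each frame costs 9 bytes for 16 output samples, i.e. roughly 56% of the original size before the 4 kB table and any LZW-style pass are counted, which is consistent with the "not quite 50%" figure above.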

There is of course some degradation in sound quality, but I think it beats ADPCM and it is miles ahead of the traditional Fibonacci-delta variant.

Demo video: https://www.youtube.com/watch?v=bamlZ7v3OyY
Github: https://github.com/Hemiyoda/deladaenc

VladR 13 March 2024 00:01

What kind of analysis do you do on the data before you create the 256 LUTs?

It should be very straightforward to measure the total error by simply adding all the deltas together (New - Old), no?

Hemiyoda 13 March 2024 07:13

Quote:

Originally Posted by VladR (Post 1673778)
What kind of analysis do you do on the data before you create the 256 LUTs?

It should be very straightforward to measure the total error by simply adding all the deltas together (New - Old), no?

The short answer is yes! :laughing

The encoder gradually builds up the LUT, checking the total error for each frame. Almost every frame will have some minor error, so I work with a threshold. The encoder makes multiple passes and discards unused sets. (It may find a better set, and every new frame depends on the result of the previous frame.)
There is also optimization of every LUT: it discards duplicate delta values, tries to remove values that are too close to each other, and extends the search if all the slots don't get filled in the current frame.

All this adds up to a 200 kB mod taking minutes to process on a 10-year-old i3 laptop. It would probably take a day on an A500. :D
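
A rough sketch in C (not the actual encoder) of the per-frame error check described above: starting from the previous frame's last reconstructed sample, greedily pick the nearest delta from one candidate 16-entry LUT for each target sample and sum the absolute error; the encoder would keep a set only if that total stays under its threshold.

Code:

#include <stdint.h>
#include <stdlib.h>

/* Score one 16-sample frame against one candidate LUT (illustration only). */
int score_frame(const int8_t *target, int8_t prev,
                const int8_t lut[16], int8_t *last_out)
{
    int cur = prev;          /* frames chain off the previous frame's result */
    int total_err = 0;

    for (int i = 0; i < 16; i++) {
        int best_val = cur, best_err = 1 << 30;
        for (int k = 0; k < 16; k++) {
            int cand = (int8_t)(cur + lut[k]);   /* 8-bit wrap, as on decode */
            int err  = abs(cand - target[i]);
            if (err < best_err) { best_err = err; best_val = cand; }
        }
        cur = best_val;
        total_err += best_err;
    }
    *last_out = (int8_t)cur;     /* the next frame starts from this value    */
    return total_err;            /* compare against the acceptance threshold */
}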

Karlos 13 March 2024 10:24

How long compression takes shouldn't matter much. Ideally what you want is decompression that's so fast you can mix multiple compressed streams as you go. Years ago I wrote something I called xdac, which was a variable-bitrate delta compressor designed so that, like you, I could decode frames (each frame had a bitrate and an embedded lookup, so it wasn't the best compression rate unless I used longer frames or had silences). However, the mixing was built into the decoding, so you'd be producing the accumulated output frame. It was designed for 16-bit audio reconstruction and generally got around 3:1 compression for an average 4-bit delta. Never ended up using it in the end, but it was fun to work on.
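
Not xdac itself, but a C sketch of the general "mix while decoding" idea: decoded samples are added straight into a shared, wider accumulator instead of being written to their own buffer. The 4-bit packing and per-stream LUT here are placeholders, not the xdac format.

Code:

#include <stdint.h>

typedef struct {
    const uint8_t *codes;   /* packed 4-bit delta codes for this stream */
    const int8_t  *lut;     /* 16 signed deltas                         */
    int8_t         value;   /* current reconstructed sample             */
} stream_t;

/* Decode nsamples (even) from one stream, accumulating into the mix buffer. */
void decode_and_mix(stream_t *s, int16_t *mixbuf, int nsamples)
{
    for (int i = 0; i < nsamples; i += 2) {
        uint8_t packed = *s->codes++;
        s->value += s->lut[packed >> 4];
        mixbuf[i] += s->value;              /* accumulate instead of store */
        s->value += s->lut[packed & 0x0F];
        mixbuf[i + 1] += s->value;
    }
}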

Hemiyoda 13 March 2024 10:51

I want to clarify that the intended usage of my compressor is to reduce storage space, not to reduce runtime memory usage.
(Although decoding would absolutely be fast enough, because it's only shifts/masking, LUT lookups & additions.)

Karlos 13 March 2024 10:59

Quote:

Originally Posted by Hemiyoda (Post 1673833)
I want to clarify that the intended usage of my compressor is to reduce storage space, not to reduce runtime memory usage.
(Although decoding would absolutely be fast enough, because it's only shifts/masking, LUT lookups & additions.)

Do you use a predictor? One of the mistakes of my old code was that it only really looked at the difference between the currently reconstructed sample (while encoding) and the next real input, and tried to find the delta value from the set that, when added to the current value, would be closest to the next input sample. That works, but it's not the best. You can use a predictor function that first estimates the next sample based on the previous N values; then you only need to encode the difference between the predicted value and the actual value, which typically results in smaller values to encode for the same number of bits. There are lots of potential functions you can use (and plenty of documentation), some of which are not too expensive.

TLDR - the differences between predicted next sample and actual next sample are usually smaller than the differences between adjacent samples.
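
A hypothetical illustration in C (not from either codec) of the predictor idea: a second-order linear predictor extrapolates the next sample from the two previous ones, and only the residual between prediction and reality has to be quantised; for most material those residuals are smaller than plain sample-to-sample deltas.

Code:

#include <stdint.h>

/* Straight-line extrapolation from the two previous samples. */
static int predict(int prev1, int prev2)
{
    return 2 * prev1 - prev2;
}

/* Fill in both what a plain delta coder and a predictive coder would have
   to encode, so the two can be compared over the same input. */
void residuals(const int8_t *x, int n, int *plain_delta, int *pred_residual)
{
    for (int i = 2; i < n; i++) {
        plain_delta[i]   = x[i] - x[i - 1];
        pred_residual[i] = x[i] - predict(x[i - 1], x[i - 2]);
    }
}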

ross 13 March 2024 11:49

Interesting, nice work.
It reminds me somehow of this article where a similar method is tried with different quantization tables per frame for DPCM:
https://www.bitsnbites.eu/hiqh-quality-dpcm-attempts/

Surely yours is of 'lower' quality, but certainly much faster (8-bit vs 16-bit final PCM).

Hemiyoda 13 March 2024 12:30

Quote:

Originally Posted by Karlos (Post 1673836)
Do you use a predictor? One of the mistakes of my old code was that it only really looked at the difference between the currently reconstructed sample (while encoding) and the next real input, and tried to find the delta value from the set that, when added to the current value, would be closest to the next input sample. That works, but it's not the best. You can use a predictor function that first estimates the next sample based on the previous N values; then you only need to encode the difference between the predicted value and the actual value, which typically results in smaller values to encode for the same number of bits. There are lots of potential functions you can use (and plenty of documentation), some of which are not too expensive.

I think I understand how your method works now. I have also thought of using a 'correction stream', which, in principle and depending on the implementation, could make the compression lossless.
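
A small sketch in C of what such a correction stream could look like (an illustration, not a committed design): store the per-sample difference between the original and the lossily decoded signal, and add it back after decoding to restore the input exactly; the mostly small correction values should pack well in a later LZW-style pass.

Code:

#include <stddef.h>
#include <stdint.h>

/* Per-sample difference between original and lossily decoded signal. */
void make_corrections(const int8_t *orig, const int8_t *decoded,
                      int8_t *corr, size_t n)
{
    for (size_t i = 0; i < n; i++)
        corr[i] = (int8_t)(orig[i] - decoded[i]);   /* wraps mod 256 */
}

/* Applying the corrections after decoding restores the input bit-exactly. */
void apply_corrections(int8_t *decoded, const int8_t *corr, size_t n)
{
    for (size_t i = 0; i < n; i++)
        decoded[i] = (int8_t)(decoded[i] + corr[i]);
}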

What I have observed of the predictor in ADPCM is that it is bad at transients (but it still sounds pretty good). Diff files made from ADPCM sample data show big errors. ADPCM would do better with higher sample rates (mods are mostly 8000-16000 Hz).

No predictor in my encoder. The car equivalent of my encoder would be a Lada, i.e. low complexity, and most of the time it takes you where you want to go & in reasonable condition. :D

Hemiyoda 13 March 2024 12:41

Quote:

Originally Posted by ross (Post 1673847)
Interesting, nice work.
It reminds me somehow of this article where a similar method is tried with different quantization tables per frame for DPCM:
https://www.bitsnbites.eu/hiqh-quality-dpcm-attempts/

Surely yours is of 'lower' quality, but certainly much faster (8-bit vs 16-bit final PCM).

Thanks! Very seldom is there something completely new under the sun.
I had some inspiration from the old TV sound standard, NICAM :)

It would have been cool if they had dared to design the Paula with non-linear 8-bit DACs. But designing a sampler would have been harder, and we probably could not have done the '14-bit' trick.

Karlos 13 March 2024 12:59

Quote:

Originally Posted by Hemiyoda (Post 1673861)
It would have been cool if they had dared to design the Paula with non-linear 8-bit DACs. But designing a sampler would have been harder, and we probably could not have done the '14-bit' trick.

I think I know what you mean (companded encoding) but they are still pretty non-linear TBH. This is why you need to use the CyberSound 14-bit calibration tool to get the best results on 14-bit playback.

The volume controls on the channels are PWM-based (basically turning the DAC on and off at a higher-than-audio frequency). The DAC is on when a counter (cyclic, range 0-63, updated at ~3.5 MHz) is below the channel volume value. This gives a very linear volume response, but the output level for a given sample value at full channel volume is not very linear.
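
A toy model in C of that volume PWM, purely as an illustration of the description above rather than the actual hardware logic: the DAC is gated on while a 0-63 counter stays below the channel volume, so the averaged output over one period is roughly the DAC level scaled by volume/64.

Code:

/* Average output over one 64-step PWM period for a given DAC level
   and channel volume (0..64). Volume 32 -> about half amplitude. */
double paula_volume_average(double dac_level, int volume)
{
    double acc = 0.0;
    for (int counter = 0; counter < 64; counter++)
        acc += (counter < volume) ? dac_level : 0.0;
    return acc / 64.0;
}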

