English Amiga Board - View Single Post

Thorham · 24 December 2007, 15:56

Quote:

Originally Posted by meynaf

And you can't do that on today's machines.

Maybe you can with palette-based screen modes.

Quote:

Originally Posted by meynaf

There are more computations than that: the code also performs on-the-fly requantization.
According to libmad's layer3.c :

Code:

 * The Layer III formula for requantization and scaling is defined by
 * section 2.4.3.4.7.1 of ISO/IEC 11172-3, as follows:
 *
 *   long blocks:
 *   xr[i] = sign(is[i]) * abs(is[i])^(4/3) *
 *           2^((1/4) * (global_gain - 210)) *
 *           2^-(scalefac_multiplier *
 *               (scalefac_l[sfb] + preflag * pretab[sfb]))
 *
 *   short blocks:
 *   xr[i] = sign(is[i]) * abs(is[i])^(4/3) *
 *           2^((1/4) * (global_gain - 210 - 8 * subblock_gain[w])) *
 *           2^-(scalefac_multiplier * scalefac_s[sfb][w])
 *
 *   where:
 *   scalefac_multiplier = (scalefac_scale + 1) / 2

Not simple, really

So this does both in one go, eh? Doesn't that still mean the Huffman decoding simply has to be written to output the variable-length data, after which the scaling and requantization are handled?

Maybe I just don't understand enough of it yet.

Quote:

Originally Posted by meynaf

That part relates to the decoding proper the way HAM rendering relates to JPEG decoding, so it's not useless to check.

Remember that we can't play that 16-bit 44.1 kHz data directly; we have to downsample it first and prepare it for 14-bit output. My code does this in 5:3 instead of the usual 2:1, leading to 26460 Hz instead of 22050 Hz (better quality). But, of course, this takes some time.
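As an illustration of what a 5:3 step involves (44100 × 3/5 = 26460), here is a minimal sketch using plain linear interpolation. This is an assumption about the general idea only, not the quoted code: the function name is hypothetical, and there is no low-pass filter, so a real routine would have to deal with aliasing.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 5:3 downsampler: every 5 input samples yield 3 outputs.
 * `out` must have room for roughly in_len * 3 / 5 + 1 samples.
 * Returns the number of samples written. */
static size_t downsample_5_to_3(const int16_t *in, size_t in_len, int16_t *out)
{
    size_t out_len = 0;

    /* pos is the input position counted in thirds of a sample;
     * each output step advances it by 5/3 of an input sample. */
    for (size_t pos = 0; pos / 3 + 1 < in_len; pos += 5) {
        size_t i = pos / 3;           /* whole input index          */
        int frac = (int)(pos % 3);    /* 0, 1 or 2 thirds past in[i] */
        int32_t a = in[i];
        int32_t b = in[i + 1];

        /* Linear interpolation between the two neighbours.
         * Note the divide by 3 for every output sample. */
        out[out_len++] = (int16_t)((a * (3 - frac) + b * frac) / 3);
    }
    return out_len;
}
```

Even this simple version needs a divide per output sample, which on a 68k would normally be replaced by a multiply-and-shift or a table.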

When I use mpega I'm often at 95% CPU use (when there aren't gaps in the replay!), so it's worth removing whatever we can.

This code must write to chip memory, and there are nasty divides in it. You sure know these things aren't fast.

95% is pretty steep. I suppose optimizing the 14-bit routine really should be done, then, although I still believe most of the gain will come from finding optimizations in the really heavy parts of the code.
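For reference, the basic uncalibrated 14-bit trick, as far as I know it, splits each 16-bit sample over two 8-bit channels: the top 8 bits go to a channel at volume 64 and the next 6 bits to a channel at volume 1. The sketch below shows only that split; the function name is made up, and it is not the routine discussed here, which apparently does more work (the divides).

```c
#include <stdint.h>

/* Hypothetical helper: split one 16-bit sample for 14-bit playback.
 * hi is played at volume 64, lo at volume 1, so the hardware sums
 * hi * 64 + lo, which equals the sample with its two lowest bits dropped.
 * Assumes >> on a negative value is an arithmetic shift, as it is on
 * common compilers (the C standard leaves this implementation-defined). */
static void split_14bit(int16_t sample, int8_t *hi, uint8_t *lo)
{
    *hi = (int8_t)(sample >> 8);            /* top 8 bits, signed   */
    *lo = (uint8_t)((sample >> 2) & 0x3F);  /* next 6 bits, 0..63   */
}
```

Two shifts and a mask per sample are cheap, so if the real routine is slow, the cost is more likely in the resampling and the chip memory writes than in the split itself.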

It's a big shame the audio DMA rate can only be doubled by doubling the screen scan rate. Otherwise the downsampling wouldn't be needed and one could just chop off two bits, which would be faster and sound better.

By the way, have you ever thought of a 15-bit routine, by any chance? I know I should probably not be bringing this up (it will slow things down), but I just couldn't resist.