Old 24 December 2007, 15:56   #20
Thorham
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,799
Quote:
Originally Posted by meynaf
And you can't do that on today's machines.
Maybe you can with palette-based screen modes.
Quote:
Originally Posted by meynaf
There are more computations than that: the code also performs on-the-fly requantization.
According to libmad's layer3.c:
Code:
 * The Layer III formula for requantization and scaling is defined by
 * section 2.4.3.4.7.1 of ISO/IEC 11172-3, as follows:
 *
 *   long blocks:
 *   xr[i] = sign(is[i]) * abs(is[i])^(4/3) *
 *           2^((1/4) * (global_gain - 210)) *
 *           2^-(scalefac_multiplier *
 *               (scalefac_l[sfb] + preflag * pretab[sfb]))
 *
 *   short blocks:
 *   xr[i] = sign(is[i]) * abs(is[i])^(4/3) *
 *           2^((1/4) * (global_gain - 210 - 8 * subblock_gain[w])) *
 *           2^-(scalefac_multiplier * scalefac_s[sfb][w])
 *
 *   where:
 *   scalefac_multiplier = (scalefac_scale + 1) / 2
Not simple, really.
So this does both in one go, eh? Doesn't that still mean the Huffman decoding simply has to be written to output the variable-length data, after which the scaling and requantization are handled? Maybe I just don't understand enough of it yet.
Quote:
Originally Posted by meynaf
That part of the code is similar to the HAM rendering as compared to the JPEG decoding proper, so it's not useless to check.

Remember that we can't play that 16-bit 44.1 kHz data directly; we have to downsample it first and prepare it for 14-bit output. My code does this in 5:3 instead of the usual 2:1, leading to 26460 Hz instead of 22050 Hz (better quality). But, of course, this takes some time.

When I use mpega I'm often at 95% CPU use (when there aren't gaps in the playback!), so it's worth removing whatever we can.

This code must write to chip memory, and there are nasty divides in it. You surely know these things aren't fast.
95% is pretty steep. I suppose optimizing the 14-bit routine really should be done then, although I still believe most of the gain will come from finding optimizations in the really heavy parts of the code.
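To show what the 14-bit preparation amounts to per sample, here is a rough sketch. It assumes the usual Amiga approach of pairing two 8-bit channels, one at volume 64 carrying the top 8 bits and one at volume 1 carrying the next 6 bits. The `Split14` type and the `split14` function are hypothetical names, and this is not the actual mpega routine.

```c
#include <stdint.h>

/* Hypothetical pair of 8-bit samples for the two paired channels. */
typedef struct {
    int8_t hi;  /* top 8 bits, signed, for the channel at volume 64 */
    int8_t lo;  /* next 6 bits, 0..63, for the channel at volume 1  */
} Split14;

/* Split one signed 16-bit sample into its 14-bit hi/lo parts.
 * The lowest 2 bits of the input are discarded. */
Split14 split14(int16_t sample)
{
    uint16_t u = (uint16_t)sample;      /* work on the raw bit pattern */
    int hi = u >> 8;                    /* 0..255 */
    Split14 out;

    out.hi = (int8_t)(hi >= 128 ? hi - 256 : hi);  /* back to signed */
    out.lo = (int8_t)((u >> 2) & 0x3F);            /* 6 bits, 0..63  */
    return out;
}
```

With these values, `hi * 64 + lo` equals the original sample with its two lowest bits dropped, which is what the two volumes add up to on output. No divides are needed for this uncalibrated split, so the expensive part is presumably elsewhere.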

It's a real shame that the audio DMA rate can only be doubled by doubling the screen scan rate. Otherwise the downsampling wouldn't be needed and one could just chop off two bits, which would be faster and sound better.
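For comparison, here is a minimal sketch of what a 5:3 downsampling step involves: 3 output samples for every 5 input samples, so 44100 * 3 / 5 = 26460 Hz. Plain linear interpolation is used as a stand-in chosen for brevity, not necessarily what the real code does, and `downsample_5_3` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 5:3 downsampler using linear interpolation.
 * Writes the resampled data to out and returns the number of
 * samples written. out needs room for about in_len * 3 / 5 samples. */
size_t downsample_5_3(const int16_t *in, size_t in_len, int16_t *out)
{
    size_t n = 0;

    for (;;) {
        size_t pos3 = n * 5;          /* input position in thirds of a sample */
        size_t idx  = pos3 / 3;       /* whole input sample index */
        int    frac = (int)(pos3 % 3);/* 0, 1 or 2 thirds past idx */

        if (idx + 1 >= in_len)        /* need in[idx] and in[idx + 1] */
            break;

        int32_t a = in[idx];
        int32_t b = in[idx + 1];

        /* weighted average of the two neighbours; note the divide */
        out[n++] = (int16_t)((a * (3 - frac) + b * frac) / 3);
    }
    return n;
}
```

Even this simple version needs a divide (or a table) per output sample, which fits with the point about divides being costly on this hardware.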

By the way, have you ever thought about a 15-bit routine? I know I probably shouldn't bring this up (it will slow things down), but I just couldn't resist.