English Amiga Board - View Single Post

paraj · 24 April 2017, 12:43

Quote:

Originally Posted by ross

Hi paraj, long time ago I came across the same question...
My results? Use mulu/muls, but with care.

On a plain 68000, MULU/MULS timing is predictable: the algorithm requires 38+2n+address_calculation clocks, where n is defined as:
MULU: n = the number of ones in the source
MULS: n = concatenate the source with a zero as the LSB; n is the number of 10 patterns plus the number of 01 patterns in the resulting 17-bit value.
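
The two rules above can be sketched as a small cycle estimator. This is only an illustration of the formula as quoted: the function names mulu_cycles and muls_cycles are made up for this example, and the effective-address calculation time is left out.

```python
def mulu_cycles(src: int) -> int:
    """Estimate 68000 MULU clocks: 38 + 2n, n = number of 1 bits in the source.

    Effective-address calculation time is NOT included.
    """
    return 38 + 2 * bin(src & 0xFFFF).count("1")


def muls_cycles(src: int) -> int:
    """Estimate 68000 MULS clocks: 38 + 2n, n = number of 01/10 patterns
    in the 17-bit value formed by appending a zero LSB to the source.

    Effective-address calculation time is NOT included.
    """
    ext = (src & 0xFFFF) << 1        # 17-bit value, zero appended as LSB
    differs = ext ^ (ext >> 1)       # bit i is set where bits i and i+1 of ext differ
    return 38 + 2 * bin(differs & 0xFFFF).count("1")  # keep the 16 adjacent pairs


if __name__ == "__main__":
    for value in (0x0000, 0x0100, 0xFF00, 0x5500, 0x5555, 0xFFFF):
        print(f"${value:04X}  MULU {mulu_cycles(value)}  MULS {muls_cycles(value)}")
```

Note that the two instructions rank sources differently: $FFFF is the slowest source for MULU (16 ones) but one of the fastest for MULS (a single 10 pattern).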

So: put the operand that needs the full bit resolution (16) in the destination; the source can be the lower-precision one.

Suppose a 16x16 multiply where 8 bits of precision suffice for the source.
With MULx dx,dy where dx=xxxxxxxx00000000 and dy=xxxxxxxxxxxxxxxx, the minimum is 38 cycles and the maximum is 54!
[worst case MULU dx=#$FF00->1111111100000000->38+8*2, worst case MULS dx=#$5500->0101010100000000->38+4*2+4*2]
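
The 38/54 bounds can be checked by brute force over all 256 sources of the form xxxxxxxx00000000. This is a quick sketch using the timing formula as quoted above (no effective-address time); the variable names are made up for the example.

```python
def ones(x: int) -> int:
    """Number of 1 bits in a non-negative integer."""
    return bin(x).count("1")


# Every 16-bit source whose low byte is zero (xxxxxxxx00000000).
sources = [hi << 8 for hi in range(256)]

# MULU: n = number of ones in the source.
mulu = [38 + 2 * ones(s) for s in sources]

# MULS: append a zero LSB (s << 1), then count adjacent bit pairs that differ.
# (s << 1) ^ s marks those pairs; the mask keeps the 16 pairs of the 17-bit value.
muls = [38 + 2 * ones(((s << 1) ^ s) & 0xFFFF) for s in sources]

worst_mulu_source = sources[mulu.index(max(mulu))]
worst_muls_source = sources[muls.index(max(muls))]

print(f"MULU: min {min(mulu)}, max {max(mulu)} at ${worst_mulu_source:04X}")
print(f"MULS: min {min(muls)}, max {max(muls)} at ${worst_muls_source:04X}")
```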

Cheers,
ross

Ah yeah, that's a good point. For non-scaling transformations the matrix elements will usually have a magnitude of <= 1 ($100 in 8.8 fixed point), while the vectors probably also have a known range. So it should be easy to decide the best operand order.
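
As a rough illustration of deciding the order, the sketch below compares the worst-case MULS time when the source operand is a matrix element versus an arbitrary vector component. The ranges are assumptions made for this example only: signed 8.8 matrix elements in [-$100, $100] and vector components spanning the full signed 16-bit range. The timing is the 38+2n formula quoted above, without effective-address time.

```python
def muls_cycles(src: int) -> int:
    """Estimate 68000 MULS clocks (38 + 2n) for a 16-bit source operand."""
    ext = (src & 0xFFFF) << 1                     # append a zero LSB
    return 38 + 2 * bin((ext ^ (ext >> 1)) & 0xFFFF).count("1")


# Assumed operand ranges (hypothetical, for illustration only).
matrix_range = range(-0x100, 0x100 + 1)           # 8.8 fixed point, |m| <= 1.0
vector_range = range(-0x8000, 0x8000)             # any signed 16-bit value

worst_matrix_as_source = max(muls_cycles(m) for m in matrix_range)
worst_vector_as_source = max(muls_cycles(v) for v in vector_range)

print("worst case, matrix element as source:", worst_matrix_as_source)
print("worst case, vector component as source:", worst_vector_as_source)
```

Under these assumptions the matrix element is the better choice for the source operand: its worst case is 56 clocks, against 70 for an unrestricted 16-bit source.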

Too bad there isn't an easy way to speed up multiplication like there is with division.

At least it's not too slow...