Sound overhaul for TKG - Page 3

Karlos · 17 June 2024, 13:09

That's what I was thinking. There is the notion of a master volume level on the mixer that affects how the tables look, but we can just store a straight up amplification factor in a table of 15 entries. That factor then applies to all 16 of the samples in the loop. So, the loop becomes read the sample byte, sign extend, multiply by the factor and accumulate. If that multiplication factor is kept in the range 0-255, then the resulting lower word of the product is all we need.
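For anyone following along, here is a minimal C sketch of the loop described above (read the sample byte, sign extend, multiply by the factor, accumulate). It is an illustration only, not the mixer's actual code, which is presumably 68k assembly; `mix_line` and `LINE_SAMPLES` are hypothetical names, and the accumulator is assumed to have enough headroom for the channels being summed.

```c
#include <stddef.h>
#include <stdint.h>

#define LINE_SAMPLES 16 /* assumed: one line of 16 8-bit samples */

/* Hypothetical helper: scale one line of signed 8-bit samples by a
 * 0-255 factor and add the result to a 16-bit accumulation buffer. */
static void mix_line(const int8_t *src, int16_t *acc, uint8_t factor)
{
    for (size_t i = 0; i < LINE_SAMPLES; ++i) {
        /* Sign extend, then multiply. The extremes are
         * -128 * 255 = -32640 and 127 * 255 = 32385, so the
         * product always fits in one signed 16-bit word. */
        int32_t product = (int32_t)src[i] * (int32_t)factor;

        /* Accumulate. This sketch assumes the caller has left enough
         * headroom; overflow handling is not modelled here. */
        acc[i] = (int16_t)(acc[i] + product);
    }
}
```

With a factor of 255, a sample of -128 contributes -32640 and a sample of 127 contributes 32385, which is why the lower word of the product is sufficient.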

Karlos · 19 June 2024, 19:50

After a bit of a false start, the normalisation code is working well.

I added an entry point where I can just normalise the accumulation buffer(s) and streamed some 16-bit audio through it, dumping the 8-bit output. Zoomed out, it looks awful. Totally "sausaged" to throw a bit of sound mixer slang on it. And it sounds like a distorted mess (though still identifiable).

However, zooming in, you begin to see the individually normalised frames. What matters is that each frame is normalised to play back at a given Paula volume. The discontinuities in the data are where there are significant changes in the required playback volume.

It may not look like much, but this is progress.
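To illustrate why the raw 8-bit dump looks "sausaged", here is a rough C sketch of per-frame normalisation. It is a simplification under stated assumptions, not the code in the repo: it uses shifts only, a frame of 16 samples, and a Paula volume derived as `64 >> (8 - shift)`. The names `NormFrame` and `normalise_frame` are made up for this example.

```c
#include <stdint.h>

#define FRAME_SAMPLES 16
#define PAULA_MAX_VOLUME 64

/* Hypothetical result of normalising one frame. */
typedef struct {
    int8_t  samples[FRAME_SAMPLES]; /* frame stretched to use 8 bits */
    uint8_t volume;                 /* Paula volume, 0-64 */
} NormFrame;

static NormFrame normalise_frame(const int16_t *acc)
{
    NormFrame out;
    int32_t peak = 0;

    /* Peak detection: largest magnitude in the frame. */
    for (int i = 0; i < FRAME_SAMPLES; ++i) {
        int32_t mag = acc[i] < 0 ? -(int32_t)acc[i] : (int32_t)acc[i];
        if (mag > peak) peak = mag;
    }
    if (peak > 32767) peak = 32767; /* treat -32768 like +32767 */

    /* Smallest shift that brings the peak into the 8-bit range. */
    int shift = 0;
    while ((peak >> shift) > 127) ++shift;

    /* Scale the samples down; division truncates toward zero. */
    int32_t divisor = (int32_t)1 << shift;
    for (int i = 0; i < FRAME_SAMPLES; ++i) {
        out.samples[i] = (int8_t)(acc[i] / divisor);
    }

    /* Assumed mapping: full-scale frames play at volume 64 and each
     * halving of the peak halves the playback volume. */
    out.volume = (uint8_t)(PAULA_MAX_VOLUME >> (8 - shift));
    return out;
}
```

Every frame ends up using most of the 8-bit range regardless of how loud it really is, so the sample data alone looks flattened; the loudness lives in the volume value written alongside it.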

Karlos · 19 June 2024, 21:35

@paraj (and anyone else interested)

Do you think you could try this on your 060? 040 also welcome, but I haven't properly optimised it yet for either (beyond using shift normalisation where it can be). It loads up 3 (low quality) 8-bit sample files and mixes them. The mixing currently goes to chip RAM buffers: two 8-bit sample buffers and two 16-bit volume buffers. These aren't wired up to Paula yet, but we capture the data and write it to a couple of files (raw 8-bit signed left and right sample buffers and 16-bit left and right volume buffers). The sounds are particularly bad because they were 7-bit 8kHz and I crudely converted them to 8-bit 16kHz in Audacity, which seems to have added a lot of grit.

https://github.com/0xABADCAFE/tkg-mixer (binary and sound files are needed)

The mixer is set up with a notional mixing rate of 16 kHz and an update rate of 50Hz, which all results in a packet size of 320 samples per update.

The time taken to call Aud_MixPacket() (which does all the mixing, peak detection, normalisation and push to the 4 chip ram buffers for a packet) is summed until all the input channels are empty. This results in 96 packets being mixed. This would be equivalent to about 1.92 seconds.

At the end, the total number of eclock ticks required is reported. I am interested to know what this looks like on real hardware as it will inform how much we can get away with. The results I get in UAE are meaningless.
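The packet figures quoted above follow directly from the two rates. A small C sketch of that arithmetic, with hypothetical names (`PacketGeometry`, `packet_geometry`) and the 16-sample line size taken from the console output:

```c
#include <stdint.h>

#define LINE_SAMPLES 16u

/* Hypothetical summary of one packet's size and duration. */
typedef struct {
    uint32_t samples;     /* samples per packet */
    uint32_t lines;       /* 16-sample lines per packet */
    uint32_t duration_us; /* playback time of one packet */
} PacketGeometry;

static PacketGeometry packet_geometry(uint32_t mix_rate_hz,
                                      uint32_t update_rate_hz)
{
    PacketGeometry g;
    g.samples = mix_rate_hz / update_rate_hz; /* 16000 / 50 = 320 */
    g.lines = g.samples / LINE_SAMPLES;       /* 320 / 16 = 20 */
    g.duration_us = 1000000u / update_rate_hz; /* 20000 us = 20 ms */
    return g;
}
```

At 16000 Hz and 50 Hz this gives 320 samples, 20 lines and 20ms per packet, so 96 packets cover 96 x 20ms = 1.92 seconds.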

paraj · 19 June 2024, 22:14

Zone a zip/lha that has everything needed and I'll report the numbers.

Karlos · 19 June 2024, 22:18

Quote:

Originally Posted by paraj

Zone a zip/lha that has everything needed and I'll report the numbers.

You can download a zip straight from the repo home page, just click the Code button and choose Download as Zip.

paraj · 19 June 2024, 22:24

Quote:

Originally Posted by Karlos

You can download a zip straight from the repo home page, just click the Code button and choose Download as Zip.

Ah, missed "mixer".
Output:

Code:

Got Timer, frequency is 709379 Hz
Loaded sounds/Teleport.raw [24000 bytes] at 0x68a71ba0
Loaded sounds/RumbleWind.raw [30600 bytes] at 0x68b488c0
Loaded sounds/Collect_weapon.raw [9000 bytes] at 0x68a77970
Aud_Mixer allocated at 0x68a6fc70
    Mix Rate      16000 Hz
    Update Rate   50 Hz
    Packet Length 320 samples [20 lines]
    Left Sample Packet at  0x14160
    Left Volume Packet at  0x142a0
    Right Sample Packet at 0x142f0
    Right Volume Packet at 0x14430
    Volume Tables at 0x68a6fd80
    AbsMaxL 0 [Norm Index 0]
    AbsMaxR 0 [Norm Index 0]
    Norm Table at 0x6800cbfc
    Channel  0: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel  1: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel  2: SamplePtr: 0x68a71ba0 [Remaining: 24000 LVol:15 RVol: 5]
    Channel  3: SamplePtr: 0x68b488c0 [Remaining: 30608 LVol: 1 RVol: 1]
    Channel  4: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel  5: SamplePtr: 0x68a77970 [Remaining: 9008 LVol: 0 RVol: 8]
    Channel  6: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel  7: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel  8: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel  9: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel 10: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel 11: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel 12: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel 13: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel 14: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel 15: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
Sample Fetch Buffer [ +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0, ]
Left Mix Buffer  [   +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0, ]
Right Mix Buffer [   +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0, ]
Aud_Mixer allocated at 0x68a6fc70
    Mix Rate      16000 Hz
    Update Rate   50 Hz
    Packet Length 320 samples [20 lines]
    Left Sample Packet at  0x142a0
    Left Volume Packet at  0x142c8
    Right Sample Packet at 0x14430
    Right Volume Packet at 0x14458
    Volume Tables at 0x68a6fd80
    AbsMaxL 0 [Norm Index 0]
    AbsMaxR 0 [Norm Index 0]
    Norm Table at 0x6800cbfc
    Channel  0: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel  1: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel  2: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel  3: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel  4: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel  5: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel  6: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel  7: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel  8: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel  9: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel 10: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel 11: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel 12: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel 13: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel 14: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
    Channel 15: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
Sample Fetch Buffer [ +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0, ]
Left Mix Buffer  [   +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0, ]
Right Mix Buffer [   +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0, ]
Mixed 96 Packets in 39851 EClockVal ticks (709379/s)

Karlos · 19 June 2024, 22:38

That looks reasonable. 39851/709379 is 56.18ms for 96 packets, or 0.585ms per packet.

Each packet would be 20ms of audio under the given configuration.
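The conversion from EClock ticks to a per-packet cost can be sketched in C as below. The function names are hypothetical and integer maths is used throughout, so results are truncated to whole microseconds and whole per-mille.

```c
#include <stdint.h>

/* Hypothetical helper: average cost of one packet in microseconds,
 * given the summed EClock ticks, the EClock frequency and the
 * number of packets mixed. */
static uint32_t us_per_packet(uint64_t ticks, uint32_t eclock_hz,
                              uint32_t packets)
{
    return (uint32_t)((ticks * 1000000u) / ((uint64_t)eclock_hz * packets));
}

/* Hypothetical helper: share of each update period spent mixing,
 * in parts per thousand. */
static uint32_t cpu_load_permille(uint32_t packet_cost_us,
                                  uint32_t update_rate_hz)
{
    uint32_t period_us = 1000000u / update_rate_hz; /* 20000 at 50 Hz */
    return (packet_cost_us * 1000u) / period_us;
}
```

For the run above, `us_per_packet(39851, 709379, 96)` gives 585 microseconds, and `cpu_load_permille(585, 50)` gives 29, i.e. a little under 3% of each 20ms update period.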

Karlos · 19 June 2024, 22:43

I'll have to push an update that mixes all 16 channels now and measure that.

paraj · 19 June 2024, 22:50

Quote:

Originally Posted by Karlos

I'll have to push an update that mixes all 16 channels now and measure that.

Would probably be a good idea if you could make a small test suite of representative cases, whatever that means (you're the boss). Easy to run tests is how we get numbers

Karlos · 19 June 2024, 22:51

The other thing I wanted to do was to mix in an audio stream for music, since this will prevent the old mod player from working (all 4 channels are used).

There's no reason the stream can't be better than 8 bit depth but it is an interesting problem from a size point of view. Perhaps something delta encoded. It could pull the next 16 bytes into the existing fetch buffer with move16 and do some decoding directly into the left and right accumulation buffer, before we start mixing any active sound effects channels over it.

I also need to add your volmod output implementation on the thing and make it make actual sound.

Karlos · 19 June 2024, 22:52

Quote:

Originally Posted by paraj

Would probably be a good idea if you could make a small test suite of representative cases, whatever that means (you're the boss). Easy to run tests is how we get numbers

Yeah, this is just getting started

Karlos · 19 June 2024, 22:55

It's probably sensible to set a time budget. How much time do we want to allocate to mixing? My assumption is I'm going to need to do this on an interrupt. I think the existing game is doing it in the same one all the game logic runs in (but not the rendering/C2P)

paraj · 20 June 2024, 19:08

A good starting point would be to know how much time the existing code takes. Otherwise, let's say the game runs at 10fps (fullscreen), then the packet time translates to around 3% of CPU time. Seems reasonable for good sound.

Karlos · 20 June 2024, 20:57

Can you download a fresh zip? I have pushed a change to test all 16 channels with a spread of volumes and offsets to make sure we have something approximating a worst case scenario.

As for the existing sound mixer, I doubt it takes even a tiny fraction of what this will, since it mixes 7-bit samples at 8kHz in a pretty tight loop.

My target is at least 16kHz, 16 8-bit input channels with 16 independent left right volume settings per input channel, 16-bit mixing, full stereo volume-normalised (i.e. volume channel modulated) output.

Stretch goals are:

More channels (if needed)
Better input LR volume resolution (e.g. 32 steps)
Music stream - something that decodes into the accumulation buffer directly, with ideally better than 8-bit resolution.

paraj · 20 June 2024, 21:12

Result:

Code:

Got Timer, frequency is 709379 Hz
Loaded sounds/airstrike.raw [60460 bytes] at 0x68a7b1c0
Aud_Mixer allocated at 0x68a792a0
	Mix Rate      16000 Hz
	Update Rate   50 Hz
	Packet Length 320 samples [20 lines]
	Left Sample Packet at  0x14160
	Left Volume Packet at  0x142a0
	Right Sample Packet at 0x142f0
	Right Volume Packet at 0x14430
	Volume Tables at 0x68a793b0
	AbsMaxL 0 [Norm Index 0]
	AbsMaxR 0 [Norm Index 0]
	Norm Table at 0x6800cdac
	Channel  0: SamplePtr: 0x68af6060 [Remaining: 60464 LVol: 0 RVol:15]
	Channel  1: SamplePtr: 0x68a7b1e0 [Remaining: 60432 LVol: 1 RVol:14]
	Channel  2: SamplePtr: 0x68af60a0 [Remaining: 60400 LVol: 2 RVol:13]
	Channel  3: SamplePtr: 0x68a7b220 [Remaining: 60368 LVol: 3 RVol:12]
	Channel  4: SamplePtr: 0x68af60e0 [Remaining: 60336 LVol: 4 RVol:11]
	Channel  5: SamplePtr: 0x68a7b260 [Remaining: 60304 LVol: 5 RVol:10]
	Channel  6: SamplePtr: 0x68af6120 [Remaining: 60272 LVol: 6 RVol: 9]
	Channel  7: SamplePtr: 0x68a7b2a0 [Remaining: 60240 LVol: 7 RVol: 8]
	Channel  8: SamplePtr: 0x68af6160 [Remaining: 60208 LVol: 8 RVol: 7]
	Channel  9: SamplePtr: 0x68a7b2e0 [Remaining: 60176 LVol: 9 RVol: 6]
	Channel 10: SamplePtr: 0x68af61a0 [Remaining: 60144 LVol:10 RVol: 5]
	Channel 11: SamplePtr: 0x68a7b320 [Remaining: 60112 LVol:11 RVol: 4]
	Channel 12: SamplePtr: 0x68af61e0 [Remaining: 60080 LVol:12 RVol: 3]
	Channel 13: SamplePtr: 0x68a7b360 [Remaining: 60048 LVol:13 RVol: 2]
	Channel 14: SamplePtr: 0x68af6220 [Remaining: 60016 LVol:14 RVol: 1]
	Channel 15: SamplePtr: 0x68a7b3a0 [Remaining: 59984 LVol:15 RVol: 0]
Sample Fetch Buffer [ +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0, ]
Left Mix Buffer  [   +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0, ]
Right Mix Buffer [   +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0, ]
Aud_Mixer allocated at 0x68a792a0
	Mix Rate      16000 Hz
	Update Rate   50 Hz
	Packet Length 320 samples [20 lines]
	Left Sample Packet at  0x142a0
	Left Volume Packet at  0x142c8
	Right Sample Packet at 0x14430
	Right Volume Packet at 0x14458
	Volume Tables at 0x68a793b0
	AbsMaxL 0 [Norm Index 0]
	AbsMaxR 0 [Norm Index 0]
	Norm Table at 0x6800cdac
	Channel  0: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
	Channel  1: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
	Channel  2: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
	Channel  3: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
	Channel  4: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
	Channel  5: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
	Channel  6: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
	Channel  7: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
	Channel  8: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
	Channel  9: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
	Channel 10: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
	Channel 11: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
	Channel 12: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
	Channel 13: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
	Channel 14: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
	Channel 15: SamplePtr:          0 [Remaining:    0 LVol: 0 RVol: 0]
Sample Fetch Buffer [ +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0,  +0, ]
Left Mix Buffer  [   +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0, ]
Right Mix Buffer [   +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0,    +0, ]
Mixed 189 Packets in 327178 EClockVal ticks (709379/s)

Karlos · 20 June 2024, 21:41

That's 327178 / 709379 / 189 = 0.00244 s, i.e. roughly 2.4ms per packet. That's probably higher than I would like, but it's also a worst case with all 16 channels going.

A lot of the time, there would be no audio at all, unless there's a stream going into it.

This is where I am hoping to go with it:

Karlos · 20 June 2024, 21:45

I'll have to add the change that replaces the volume lookup with the multiplication for the 060 and see what the result is under the same conditions.

Karlos · 20 June 2024, 21:56

I wonder if processing everything a whole cache line's worth at a time is just too inefficient. It didn't seem too bad on paper. Perhaps it would be better if the fetch and mix buffers were a little longer. We can still do the normalisation a line at a time and have the same output of 16 samples per volume change.

Karlos · 21 June 2024, 14:25

Quote:

Originally Posted by paraj

If you just need multiplication (and even a little extra calculation, assuming it's not something like clipping) then straight calculation is probably faster even without cache effects.

I have been thinking about this a little from the untermensch (sub 060) perspective. I think a table lookup is unavoidable there. However, suppose we directly looked up only the first 8-bit input of the 16. Now we have the first 16-bit sample which we will use as a current value also.

Then, for the remaining 15 8-bit samples in the line, we first convert to a delta and look that up instead. The resulting 16-bit lookup is added to the current value to construct the next sample and this is written to the accumulation buffer.

The thinking here is that you will probably stall on the first two (unless they live in the same cache line) but the resulting delta based lookups should hopefully cluster closely around a small set of table entries that are close together.

Thoughts?

You could even preconvert the sound samples in memory since we always do things in frames of 16. This would remove the subtraction to construct the delta.
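A C sketch of the delta lookup idea, to make the arithmetic concrete. Several things here are assumptions of the sketch rather than anything stated in the thread: the table is one plain multiplication factor per volume level, the same table serves the first (direct) lookup and the delta lookups, and all names are hypothetical. One detail that falls out when writing it down: the delta between two signed 8-bit samples spans -255 to +255, so the table needs 511 entries rather than 256, and the running value has to be kept modulo 65536 because an individual delta times the factor can exceed a signed word even though the reconstructed sample never does.

```c
#include <stdint.h>

#define LINE_SAMPLES 16
#define DELTA_RANGE 511 /* deltas of 8-bit samples span -255..+255 */

/* table[d + 255] holds (d * factor) modulo 65536. */
static void build_delta_table(uint16_t table[DELTA_RANGE], uint8_t factor)
{
    for (int d = -255; d <= 255; ++d) {
        table[d + 255] = (uint16_t)(d * (int)factor);
    }
}

/* Portable reinterpretation of a 16-bit pattern as a signed value. */
static int16_t to_signed(uint16_t u)
{
    return (int16_t)((int32_t)(u ^ 0x8000u) - 0x8000);
}

/* Mix one line: direct lookup for the first sample, delta lookups
 * for the remaining 15. Assumes acc has enough headroom. */
static void mix_line_delta(const int8_t *src, int16_t *acc,
                           const uint16_t table[DELTA_RANGE])
{
    /* A sample is just a delta from zero, so the same table works. */
    uint16_t current = table[src[0] + 255];
    acc[0] = (int16_t)(acc[0] + to_signed(current));

    for (int i = 1; i < LINE_SAMPLES; ++i) {
        int delta = src[i] - src[i - 1];
        /* Wrapping add reconstructs sample * factor exactly. */
        current = (uint16_t)(current + table[delta + 255]);
        acc[i] = (int16_t)(acc[i] + to_signed(current));
    }
}
```

The result is identical to multiplying each sample by the factor directly. Whether the smaller spread of table indices outweighs the larger table (1022 bytes per level in this sketch) is exactly the kind of thing that would need measuring on real hardware. Pre-converting the samples to deltas, as suggested above, would remove the subtraction but would also need 9-bit deltas to be stored somehow.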

paraj · 21 June 2024, 18:14

Just a thought, but have you looked (heard?) into how much fidelity is lost if you restrict yourself to volume levels that only require shifts? My gut feeling is that you'll want to do that for <060, but I don't have much optimization experience with those targets (and don't have hardware to test it apart from base A1200).

And of course, if you do decide to have two paths, you'll want to make it configurable and not just based on AF_68060 since in particular pistorm does not self identify as such
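To put a rough number on the shift-only question, here is a C sketch that rounds a 0-255 gain to the nearest power of two and applies it as a shift. This is only one possible way to pick the shift, and `nearest_shift` and `apply_shift` are hypothetical names, not anything from the mixer.

```c
#include <stdint.h>

/* Hypothetical: round a 0-255 gain to the nearest power of two and
 * return the shift count, or -1 for silence. */
static int nearest_shift(uint8_t factor)
{
    if (factor == 0) return -1;

    /* floor(log2(factor)) */
    int shift = 0;
    while ((1 << (shift + 1)) <= factor) ++shift;

    /* Round up when the next power of two is strictly closer. */
    int lo = 1 << shift;
    int hi = lo << 1;
    if (hi - factor < factor - lo) ++shift;
    return shift;
}

/* Apply the gain. Written as a multiply by a power of two so that
 * negative samples stay well defined in C; a compiler (or hand
 * written assembly) would use a shift. */
static int16_t apply_shift(int8_t sample, int shift)
{
    return shift < 0 ? 0 : (int16_t)(sample * (1 << shift));
}
```

Under this rounding there are only nine distinct non-zero levels (shifts 0 to 8), and the worst case is a gain halfway between two powers of two, e.g. 96 played as 64, which is a ratio of 1.5 or about 3.5 dB off. Whether that is audible for sound effects is a listening test rather than something the arithmetic can settle.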
