17 June 2024, 13:09 | #41 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
That's what I was thinking. There is the notion of a master volume level on the mixer that affects how the tables look, but we can just store a straight up amplification factor in a table of 15 entries. That factor then applies to all 16 of the samples in the loop. So, the loop becomes read the sample byte, sign extend, multiply by the factor and accumulate. If that multiplication factor is kept in the range 0-255, then the resulting lower word of the product is all we need.
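To make the loop concrete, here is a minimal sketch in Python rather than the 68k assembly the mixer is written in. The names `sign_extend_8` and `mix_line` are hypothetical, and the sketch only illustrates the arithmetic described above, not the actual implementation.

```python
def sign_extend_8(byte):
    """Interpret an unsigned byte (0..255) as a signed 8-bit sample."""
    return byte - 256 if byte & 0x80 else byte


def mix_line(sample_bytes, factor, accum):
    """Scale each sample byte by factor and add it into accum in place."""
    if not 0 <= factor <= 255:
        raise ValueError("factor must be in the range 0-255")
    for i, byte in enumerate(sample_bytes):
        product = sign_extend_8(byte) * factor
        # An 8-bit signed sample times a factor of at most 255 always fits
        # in a signed 16-bit word, so the lower word of the product suffices.
        assert -32768 <= product <= 32767
        accum[i] += product
    return accum
```

The extremes are 127 * 255 = 32385 and -128 * 255 = -32640, both inside the signed 16-bit range. The sketch does not address how the sum of many channels is kept in range.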
|
19 June 2024, 19:50 | #42 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
After a bit of a false start, the normalisation code is working well.
I added an entry point where I can just normalise the accumulation buffer(s) and streamed some 16-bit audio through it, dumping the 8-bit output. Zoomed out, it looks awful: totally "sausaged", to throw a bit of sound mixer slang at it. It also sounds like a distorted mess (though still identifiable). However, zooming in, you begin to see the individually normalised frames. What matters is that each frame is normalised to play back at a given Paula volume. The discontinuities in the data are where there are significant changes in the required playback volume. It may not look like much, but this is progress. |
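For anyone following along, the idea of per-frame normalisation can be sketched as below. This is Python and not the mixer's code: the function name `normalise_frame` and the rounding and clamping choices are assumptions, and the real code uses a norm table and shift normalisation. It relies only on Paula playing 8-bit samples at a volume of 0 to 64, so that the output level is proportional to sample * volume / 64.

```python
def normalise_frame(frame):
    """Turn one frame of 16-bit mixed samples into 8-bit samples plus a volume.

    The goal is sample8 * volume / 64 ~= sample16 / 256, so quiet frames
    keep more of their resolution by using a lower Paula volume.
    """
    peak = max(abs(s) for s in frame)
    # ceil(peak / 512), clamped to 1..64 (assumed policy, not the real table)
    volume = min(64, max(1, -(-peak // 512)))
    divisor = 4 * volume
    samples8 = [max(-128, min(127, int(round(s / divisor)))) for s in frame]
    return samples8, volume
```

A frame peaking at 1000 gets volume 2 and samples up to 125, where a fixed full-volume conversion would only reach about 4. The volume changing from frame to frame is what shows up as discontinuities in the raw 8-bit data.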
19 June 2024, 21:35 | #43 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
@paraj (and anyone else interested)
Do you think you could try this on your 060? 040 results are also welcome, but I haven't properly optimised it for either yet (beyond using shift normalisation where it can be). It loads 3 (low quality) 8-bit sample files and mixes them. The mixing currently goes to chip RAM buffers: two 8-bit sample buffers and two 16-bit volume buffers. These aren't wired up to Paula yet, but we capture the data and write it to a couple of files (raw 8-bit signed left and right sample buffers and 16-bit left and right volume buffers). The sounds are particularly bad because they were 7-bit 8kHz and I crudely converted them to 8-bit 16kHz in Audacity, which seems to have added a lot of grit.
https://github.com/0xABADCAFE/tkg-mixer (binary and sound files are needed)
The mixer is set up with a notional mixing rate of 16kHz and an update rate of 50Hz, which results in a packet size of 320 samples per update. The time taken to call Aud_MixPacket() (which does all the mixing, peak detection, normalisation and push to the 4 chip RAM buffers for a packet) is summed until all the input channels are empty. This results in 96 packets being mixed, equivalent to about 1.92 seconds of audio. At the end, the total number of EClock ticks required is reported. I am interested to know what this looks like on real hardware, as it will inform how much we can get away with. The results I get in UAE are meaningless. |
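The packet arithmetic in the post above can be checked with a small sketch. It is Python, `packet_geometry` is a hypothetical helper, and the 16-sample line length is an assumption based on the 16-sample frames discussed earlier in the thread.

```python
def packet_geometry(mix_rate_hz, update_rate_hz, line_length=16):
    """Return (samples per packet, lines per packet, packet duration in ms)."""
    samples, remainder = divmod(mix_rate_hz, update_rate_hz)
    if remainder:
        raise ValueError("mix rate must be a multiple of the update rate")
    lines = samples // line_length
    packet_ms = 1000.0 / update_rate_hz
    return samples, lines, packet_ms


samples, lines, packet_ms = packet_geometry(16000, 50)
total_seconds = 96 * packet_ms / 1000.0  # 96 packets, as in the test run
```

With 16kHz and 50Hz this gives 320 samples in 20 lines per packet, 20ms per packet, and 1.92 seconds for 96 packets.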
19 June 2024, 22:14 | #44 |
Registered User
Join Date: Feb 2017
Location: Denmark
Posts: 1,239
|
Zone a zip/lha that has everything needed and I'll report the numbers.
|
19 June 2024, 22:18 | #45 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
|
19 June 2024, 22:24 | #46 | |
Registered User
Join Date: Feb 2017
Location: Denmark
Posts: 1,239
|
Quote:
Output: Code:
Got Timer, frequency is 709379 Hz
Loaded sounds/Teleport.raw [24000 bytes] at 0x68a71ba0
Loaded sounds/RumbleWind.raw [30600 bytes] at 0x68b488c0
Loaded sounds/Collect_weapon.raw [9000 bytes] at 0x68a77970
Aud_Mixer allocated at 0x68a6fc70
Mix Rate 16000 Hz
Update Rate 50 Hz
Packet Length 320 samples [20 lines]
Left Sample Packet at 0x14160
Left Volume Packet at 0x142a0
Right Sample Packet at 0x142f0
Right Volume Packet at 0x14430
Volume Tables at 0x68a6fd80
AbsMaxL 0 [Norm Index 0]
AbsMaxR 0 [Norm Index 0]
Norm Table at 0x6800cbfc
Channel 0: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 1: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 2: SamplePtr: 0x68a71ba0 [Remaining: 24000 LVol:15 RVol: 5]
Channel 3: SamplePtr: 0x68b488c0 [Remaining: 30608 LVol: 1 RVol: 1]
Channel 4: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 5: SamplePtr: 0x68a77970 [Remaining: 9008 LVol: 0 RVol: 8]
Channel 6: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 7: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 8: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 9: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 10: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 11: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 12: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 13: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 14: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 15: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Sample Fetch Buffer [ +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, ]
Left Mix Buffer [ +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, ]
Right Mix Buffer [ +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, ]
Aud_Mixer allocated at 0x68a6fc70
Mix Rate 16000 Hz
Update Rate 50 Hz
Packet Length 320 samples [20 lines]
Left Sample Packet at 0x142a0
Left Volume Packet at 0x142c8
Right Sample Packet at 0x14430
Right Volume Packet at 0x14458
Volume Tables at 0x68a6fd80
AbsMaxL 0 [Norm Index 0]
AbsMaxR 0 [Norm Index 0]
Norm Table at 0x6800cbfc
Channel 0: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 1: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 2: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 3: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 4: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 5: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 6: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 7: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 8: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 9: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 10: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 11: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 12: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 13: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 14: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 15: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Sample Fetch Buffer [ +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, ]
Left Mix Buffer [ +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, ]
Right Mix Buffer [ +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, ]
Mixed 96 Packets in 39851 EClockVal ticks (709379/s) |
|
19 June 2024, 22:38 | #47 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
That looks reasonable. 39851/709379 is 56.18ms for 96 packets, or 0.585ms per packet.
Each packet would be 20ms of audio under the given configuration. |
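The conversion from EClock ticks to time per packet is simple enough to script, which helps when comparing runs. This is a Python sketch and `per_packet_ms` is a hypothetical helper name; the tick count, packet count and timer frequency are the ones reported in the output above.

```python
ECLOCK_HZ = 709379  # timer frequency reported by the test program


def per_packet_ms(total_ticks, packets, clock_hz=ECLOCK_HZ):
    """Average mixing time per packet in milliseconds."""
    return total_ticks / clock_hz / packets * 1000.0


mix_ms = per_packet_ms(39851, 96)
# Each packet holds 20ms of audio, so this is how many times faster than
# real time the mixing ran in this test.
headroom = 20.0 / mix_ms
```

For this run the mixing works out to roughly 0.585ms per packet, a little over 34 times faster than real time.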
19 June 2024, 22:43 | #48 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
I'll have to push an update that mixes all 16 channels now and measure that.
|
19 June 2024, 22:50 | #49 |
Registered User
Join Date: Feb 2017
Location: Denmark
Posts: 1,239
|
|
19 June 2024, 22:51 | #50 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
The other thing I wanted to do was to mix in an audio stream for music, since this will prevent the old mod player from working (all 4 channels are used).
There's no reason the stream can't be better than 8-bit depth, but it is an interesting problem from a size point of view. Perhaps something delta encoded. It could pull the next 16 bytes into the existing fetch buffer with move16 and do some decoding directly into the left and right accumulation buffers, before we start mixing any active sound effects channels over it. I also need to add your volmod output implementation to the thing and make it produce actual sound. |
19 June 2024, 22:52 | #51 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
|
19 June 2024, 22:55 | #52 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
It's probably sensible to set a time budget. How much time do we want to allocate to mixing? My assumption is I'm going to need to do this on an interrupt. I think the existing game is doing it in the same one all the game logic runs in (but not the rendering/C2P)
|
20 June 2024, 19:08 | #53 |
Registered User
Join Date: Feb 2017
Location: Denmark
Posts: 1,239
|
A good starting point would be to know how much time the existing code takes. Otherwise, let's say the game runs at 10fps (fullscreen), then the packet time translates to around 3% of CPU time. Seems reasonable for good sound.
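The 3% estimate can be reproduced with a short sketch. It is Python, `mixing_share_per_frame` is a hypothetical helper, and the inputs are the roughly 0.585ms per packet measured earlier, the 50Hz update rate and the assumed 10fps.

```python
def mixing_share_per_frame(ms_per_packet, update_rate_hz, fps):
    """Fraction of each rendered frame's time spent mixing audio packets."""
    packets_per_frame = update_rate_hz / fps  # 5 packets per frame at 10fps
    frame_ms = 1000.0 / fps                   # 100ms per frame at 10fps
    return packets_per_frame * ms_per_packet / frame_ms


share = mixing_share_per_frame(0.585, 50, 10)
```

Because the packets are produced at a fixed 50Hz, the frame rate cancels out: the share is just the per-packet time multiplied by the update rate, about 2.9% here.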
|
20 June 2024, 20:57 | #54 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
Can you download a fresh zip? I have pushed a change to test all 16 channels with a spread of volumes and offsets to make sure we have something approximating a worst case scenario.
As for the existing sound mixer, I doubt it takes even a tiny fraction of what this will, since it mixes 7-bit samples at 8kHz in a pretty tight loop. My target is at least 16kHz, 16 8-bit input channels with 16 independent left/right volume settings per input channel, 16-bit mixing, and full stereo volume-normalised (i.e. volume channel modulated) output.
Stretch goals are:
More channels (if needed)
Better input LR volume resolution (e.g. 32 steps)
Music stream: something that decodes into the accumulation buffer directly, with ideally better than 8-bit resolution. |
20 June 2024, 21:12 | #55 |
Registered User
Join Date: Feb 2017
Location: Denmark
Posts: 1,239
|
Result:
Code:
Got Timer, frequency is 709379 Hz
Loaded sounds/airstrike.raw [60460 bytes] at 0x68a7b1c0
Aud_Mixer allocated at 0x68a792a0
Mix Rate 16000 Hz
Update Rate 50 Hz
Packet Length 320 samples [20 lines]
Left Sample Packet at 0x14160
Left Volume Packet at 0x142a0
Right Sample Packet at 0x142f0
Right Volume Packet at 0x14430
Volume Tables at 0x68a793b0
AbsMaxL 0 [Norm Index 0]
AbsMaxR 0 [Norm Index 0]
Norm Table at 0x6800cdac
Channel 0: SamplePtr: 0x68af6060 [Remaining: 60464 LVol: 0 RVol:15]
Channel 1: SamplePtr: 0x68a7b1e0 [Remaining: 60432 LVol: 1 RVol:14]
Channel 2: SamplePtr: 0x68af60a0 [Remaining: 60400 LVol: 2 RVol:13]
Channel 3: SamplePtr: 0x68a7b220 [Remaining: 60368 LVol: 3 RVol:12]
Channel 4: SamplePtr: 0x68af60e0 [Remaining: 60336 LVol: 4 RVol:11]
Channel 5: SamplePtr: 0x68a7b260 [Remaining: 60304 LVol: 5 RVol:10]
Channel 6: SamplePtr: 0x68af6120 [Remaining: 60272 LVol: 6 RVol: 9]
Channel 7: SamplePtr: 0x68a7b2a0 [Remaining: 60240 LVol: 7 RVol: 8]
Channel 8: SamplePtr: 0x68af6160 [Remaining: 60208 LVol: 8 RVol: 7]
Channel 9: SamplePtr: 0x68a7b2e0 [Remaining: 60176 LVol: 9 RVol: 6]
Channel 10: SamplePtr: 0x68af61a0 [Remaining: 60144 LVol:10 RVol: 5]
Channel 11: SamplePtr: 0x68a7b320 [Remaining: 60112 LVol:11 RVol: 4]
Channel 12: SamplePtr: 0x68af61e0 [Remaining: 60080 LVol:12 RVol: 3]
Channel 13: SamplePtr: 0x68a7b360 [Remaining: 60048 LVol:13 RVol: 2]
Channel 14: SamplePtr: 0x68af6220 [Remaining: 60016 LVol:14 RVol: 1]
Channel 15: SamplePtr: 0x68a7b3a0 [Remaining: 59984 LVol:15 RVol: 0]
Sample Fetch Buffer [ +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, ]
Left Mix Buffer [ +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, ]
Right Mix Buffer [ +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, ]
Aud_Mixer allocated at 0x68a792a0
Mix Rate 16000 Hz
Update Rate 50 Hz
Packet Length 320 samples [20 lines]
Left Sample Packet at 0x142a0
Left Volume Packet at 0x142c8
Right Sample Packet at 0x14430
Right Volume Packet at 0x14458
Volume Tables at 0x68a793b0
AbsMaxL 0 [Norm Index 0]
AbsMaxR 0 [Norm Index 0]
Norm Table at 0x6800cdac
Channel 0: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 1: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 2: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 3: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 4: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 5: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 6: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 7: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 8: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 9: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 10: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 11: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 12: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 13: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 14: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Channel 15: SamplePtr: 0 [Remaining: 0 LVol: 0 RVol: 0]
Sample Fetch Buffer [ +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, ]
Left Mix Buffer [ +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, ]
Right Mix Buffer [ +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, +0, ]
Mixed 189 Packets in 327178 EClockVal ticks (709379/s) |
20 June 2024, 21:41 | #56 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
That's 327178 / 709379 / 189 = 0.00244s, i.e. about 2.44ms per packet. That's probably higher than I would like, but it's also a worst case with all 16 channels going.
A lot of the time, there would be no audio at all, unless there's a stream going into it. This is where I am hoping to go with it: |
20 June 2024, 21:45 | #57 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
I'll have to add a path that replaces the volume lookup with the multiplication for the 060 and see what the result is under the same conditions.
|
20 June 2024, 21:56 | #58 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
I wonder if processing a whole cache line's worth of everything at a time is just too inefficient. It didn't seem too bad on paper. Perhaps it would be better if the fetch and mix buffers were a little longer. We can still do the normalisation a line at a time and have the same output of 16 samples per volume change.
|
21 June 2024, 14:25 | #59 | |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
Quote:
Then, for the remaining 15 8-bit samples in the line, we first convert to a delta and look that up instead. The resulting 16-bit lookup is added to the current value to construct the next sample and this is written to the accumulation buffer. The thinking here is that you will probably stall on the first two (unless they live in the same cache line) but the resulting delta based lookups should hopefully cluster closely around a small set of table entries that are close together. Thoughts?
You could even preconvert the sound samples in memory since we always do things in frames of 16. This would remove the subtraction to construct the delta.
Last edited by Karlos; 21 June 2024 at 15:35. |
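A rough Python sketch of the delta lookup idea follows. The start of the description is missing from this post, so it assumes the first sample of each line is looked up directly; the helper names and the table layout are hypothetical. A delta between two 8-bit samples can span -255 to 255, so the sketch uses a 511-entry table; how the real code would handle that range is not stated.

```python
def build_volume_table(factor):
    """Hypothetical table of value * factor for every value in -255..255."""
    return [value * factor for value in range(-255, 256)]


def lookup(table, value):
    """Fetch the scaled entry for a sample or delta in -255..255."""
    return table[value + 255]


def scale_line_by_deltas(samples, table):
    """Scale a line: direct lookup for the first sample, deltas for the rest."""
    current = lookup(table, samples[0])
    scaled = [current]
    for prev, nxt in zip(samples, samples[1:]):
        current += lookup(table, nxt - prev)  # scaled delta added to current
        scaled.append(current)
    return scaled


def preconvert_line(samples):
    """Store the first sample followed by deltas, removing the subtraction."""
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]


def scale_preconverted(line, table):
    """Scale a preconverted line with lookups and a running sum only."""
    total, scaled = 0, []
    for value in line:
        total += lookup(table, value)
        scaled.append(total)
    return scaled
```

Because the table is linear in its index, the running sum of scaled deltas reproduces the directly scaled samples exactly, and smooth audio keeps most lookups near the centre of the table.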
|
21 June 2024, 18:14 | #60 |
Registered User
Join Date: Feb 2017
Location: Denmark
Posts: 1,239
|
Just a thought, but have you looked (heard?) into how much fidelity is lost if you restrict yourself to volume levels that only require shifts? My gut feeling is that you'll want to do that for <060, but I don't have much optimization experience with those targets (and don't have hardware to test it on, apart from a base A1200).
And of course, if you do decide to have two paths, you'll want to make it configurable and not just based on AF_68060, since PiStorm in particular does not identify itself as such. |
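One way to put a number on the fidelity question is to compare each volume level with the nearest power-of-two gain. The sketch below is Python and assumes, purely for illustration, that the 15 non-zero volume levels map linearly to gains of v/15; the real volume tables may be laid out differently, and the function names are hypothetical.

```python
import math


def nearest_shift(gain):
    """Right-shift count k whose gain 2**-k is closest to gain on a log scale."""
    if not 0.0 < gain <= 1.0:
        raise ValueError("gain must be in the range (0, 1]")
    return int(round(-math.log2(gain)))


def shift_error_db(gain):
    """Level error in dB when gain is replaced by the nearest shift."""
    k = nearest_shift(gain)
    return 20.0 * math.log10((2.0 ** -k) / gain)


# Assumed linear mapping of volume levels 1..15 to gains.
errors = {v: shift_error_db(v / 15.0) for v in range(1, 16)}
```

Rounding to the nearest shift can never be off by more than half an octave, about 3dB, so the open question is whether steps of 6dB between available levels are too coarse for panning.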