05 July 2022, 19:33 | #1 |
Newbie Amiga programmer
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
|
8chn audio mixing: 5 to 1 (+3) vs. 4x2 to 4x1
Out of curiosity: which is supposed to be the faster approach: mixing 5 channels into one (and using 3 real channels alongside the mixed one), or mixing 4x2 channels into 4x1?
|
05 July 2022, 23:06 | #2 |
Registered User
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,959
|
I don't think 5 to 1 mixing exists; I only know of 4 to 1 mixing plus 3 real channels. For very easy mixing (same period, same volume) 2 to 1 is fastest, which is useful for SFX. For music 4 to 1 is fastest. Anyway, this also depends on the coder's knowledge and the code quality.
|
06 July 2022, 08:57 | #3 |
Newbie Amiga programmer
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
|
4 channels mixed to 1, plus 3 real channels, are only 7 channels. There are 8-channel tracker programs, like OctaMED. If 4 to 1 is faster for music, then they must use 5 to 1, right? Why is 4 to 1 faster, if volume or period differences can occur?
|
06 July 2022, 08:58 | #4 |
Registered User
Join Date: May 2015
Location: Kirkland, Washington, USA
Posts: 56
|
I think 4-to-1 mixing + 3 real channels is faster than 3x 2-to-1 + 1 real, because the CPU only does 4 reads and 1 write, as opposed to 6 reads and 3 writes.
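The read/write argument can be put into numbers. What follows is only a rough sketch in Python (not 68000 code); the helper name `accesses_per_output_sample` and the way the layouts are described are made up for illustration, and it counts sample fetches and buffer stores only, ignoring any pitch or volume handling.

```python
def accesses_per_output_sample(mixed_groups):
    """mixed_groups holds one entry per software-mixed output channel:
    the number of source voices folded into that channel."""
    reads = sum(mixed_groups)    # one fetch per source voice
    writes = len(mixed_groups)   # one store per mixed output channel
    return reads, writes


layouts = {
    "4-to-1 + 3 real": [4],
    "3 x 2-to-1 + 1 real": [2, 2, 2],
    "4 x 2-to-1": [2, 2, 2, 2],
}

for name, groups in layouts.items():
    reads, writes = accesses_per_output_sample(groups)
    print(f"{name}: {reads} reads, {writes} writes")
```

Real channels cost nothing here because Paula fetches them by DMA, so only the software-mixed groups appear in each list.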
I never tried the 3x2-to-1 mixer, but I made a 4-channel sfx mixer (limited to same period and full or half volume per sample) for fun a while back. I got some really fast inner loops by using codegen for all permutations of how many active samples were at half or full volume level. I used the 3 regular channels for music, and the other 4 could then either be sfx drums or sfx. Since they are software mixed, those samples can stay in fastmem, and I only needed just over 1k chip mem for a double buffer that was updated every frame. It feels fast enough to use in games even on an A500. If I had thought about that back in the day, I could probably have used it on Banshee to have music+sfx together.
|
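The "just over 1k" figure for the double buffer can be checked against the constants in the source posted further down the thread (`cColorClocksPerSecondPAL`, `cMinColorClocksPerSamplePAL`, an 8-byte segment). A small Python sketch, assuming the assembler evaluates the expression with integer division and that the buffer is refilled 50 times per second:

```python
COLOR_CLOCKS_PER_SECOND_PAL = 3546895   # cColorClocksPerSecondPAL
MIN_COLOR_CLOCKS_PER_SAMPLE = 124       # cMinColorClocksPerSamplePAL
SEGMENT_SIZE = 1 << 3                   # cMixingBufferSegmentSize
FRAMES_PER_SECOND = 50                  # PAL vertical blank rate


def frame_buffer_size():
    """Samples per frame, rounded to a whole number of 8-byte segments."""
    per_frame = COLOR_CLOCKS_PER_SECOND_PAL // (
        MIN_COLOR_CLOCKS_PER_SAMPLE * FRAMES_PER_SECOND
    )
    return (per_frame + SEGMENT_SIZE) & -SEGMENT_SIZE


print(frame_buffer_size())       # 576 bytes per buffer
print(2 * frame_buffer_size())   # 1152 bytes for the double buffer
```

Under those assumptions two buffers come to 1152 bytes, which matches the description of slightly more than 1 KB of chip memory.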
06 July 2022, 09:15 | #5 |
Newbie Amiga programmer
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
|
That makes sense. And 8 reads and 4 writes would take even more time. I should have thought of that.
I'm trying to make something similar, although only for music right now. You wrote Banshee? That game was awesome.
|
06 July 2022, 11:58 | #6 |
Registered User
Join Date: Aug 2018
Location: Untergrund/Germany
Posts: 408
|
|
06 July 2022, 12:45 | #7 | ||
Registered User
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,410
|
Quote:
Plus, a 4-to-1 mixer can abuse the fact that reading/adding/writing long values is quite a bit quicker than reading/adding/writing bytes or words, while not leading to a noticeable difference in sound quality. Quote:
|
||
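To illustrate why adding whole longs costs only a little quality, here is a hedged Python sketch. It is not taken from any particular mixer: the helper names are invented, and it assumes the samples were attenuated beforehand so that the true per-byte sums still fit in a signed byte. One 32-bit add then handles four samples at once, and the only damage is a carry of at most 1 leaking from each byte into the byte above it.

```python
import struct


def pack4(samples):
    """Pack four signed 8-bit samples into one big-endian 32-bit value."""
    return struct.unpack(">I", struct.pack(">4b", *samples))[0]


def unpack4(value):
    """Split a 32-bit value back into four signed 8-bit samples."""
    return list(struct.unpack(">4b", struct.pack(">I", value & 0xFFFFFFFF)))


def add_as_long(a, b):
    """A single 32-bit add: four samples per operation, carries included."""
    return (a + b) & 0xFFFFFFFF


voice_a = [10, -3, 20, -5]
voice_b = [5, 4, -7, 6]

exact = [x + y for x, y in zip(voice_a, voice_b)]
fast = unpack4(add_as_long(pack4(voice_a), pack4(voice_b)))

print(exact)  # [15, 1, 13, 1]
print(fast)   # [16, 2, 14, 1]
```

Each extra voice added this way can contribute one more least-significant-bit error per byte, which is small next to the precision already given up by attenuating the samples.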
06 July 2022, 13:39 | #8 | |
Registered User
Join Date: Feb 2018
Location: Poland
Posts: 352
|
Looking forward to hearing that!
Once, on some Atari forum, I found this information about the Face The Music Amiga tracker. If that's correct, I wonder whether it might bring some gains and how significantly it would affect the quality. FTM wasn't that bad quality-wise. Quote:
|
|
06 July 2022, 14:21 | #9 |
Registered User
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,959
|
Face The Music's mixing idea/replay was used in Delitracker 2 (for the 8 Voices NotePlayer). It is good enough, but it can often freeze a 68000 (7 MHz). You can then test the quality of this mixer with every 4-8 channel sound format that has a replay in a "_note" version.
|
06 July 2022, 15:55 | #10 |
Registered User
Join Date: Feb 2018
Location: Poland
Posts: 352
|
@Don_Adan thanks. I did actually listen to this in Face The Music itself. It was OK, with mild occasional distortions. But for some reason I don't trust that the implementation of this algorithm in FTM was optimized to the last bit. Maybe it could be implemented better, or there are other limitations in this approach that I'm not aware of.
|
06 July 2022, 21:09 | #11 |
Registered User
Join Date: May 2015
Location: Kirkland, Washington, USA
Posts: 56
|
In case anyone is interested, here are my inner loops from the mixer for fixed-rate half/full volume mixing. You need samples eor'ed with $80808080, as that was faster for the mixing code. It is loop-unrolled to do 8 samples per loop, so samples must be 8-byte aligned.
Quality loss: There isn't any quality loss for a single full-volume 8-bit sample, and only a very slight quality loss for multiple samples if they are loud, as the 8-bit values are summed and then clamped to the range -128 to +127. Half-volume samples lose 1 bit of precision. The main reason I stayed away from variable pitch is that you lose a lot of quality, even if you invest in linear interpolation and use a 28 kHz playback buffer. Plus, it's slow as hell to mix :-)
Other uses: Having a looping sample buffer like this also makes it easy (or at least, easier) to stream samples from disk. I didn't try it, but I also think you can use the buffers for basic echo/reverb, which combined with the low-pass filter could add a little more realism and variety to different regions of a game.
Playback Buffers: I didn't include the code that manages the double buffering, which was pretty tricky (and that code isn't clean enough to share). For me it runs in the vblank interrupt and balances back and forth between the 2 nearest buffer sizes. You could also do this mixing in an audio interrupt, but then you can't control where during the vblank the mixing happens - or you could triple-buffer and on some frames just skip the third buffer.
Code Format: Sorry about all the weird alignment - my autoformatter assumes tab=4 spaces.
Code:
cColorClocksPerSecondPAL:       equ 3546895
cMinColorClocksPerSamplePAL:    equ 124 ; approx 28603hz samples - 121-124 for channel 1 to 4
cMixingBufferSegmentSizeBits:   equ 3
cMixingBufferSegmentSize:       equ (1<<cMixingBufferSegmentSizeBits)
cMixingFrameBufferSize:         equ ((cColorClocksPerSecondPAL/(cMinColorClocksPerSamplePAL*50))+cMixingBufferSegmentSize)&(-cMixingBufferSegmentSize)

.mixerInnerLoopTable:
    dc.w .innerLoop_0_0-.mixerInnerLoopTable
    dc.w .innerLoop_0_1-.mixerInnerLoopTable
    dc.w .innerLoop_0_2-.mixerInnerLoopTable
    dc.w .innerLoop_0_3-.mixerInnerLoopTable
    dc.w .innerLoop_0_4-.mixerInnerLoopTable
    dcb.w 3, 0
    dc.w .innerLoop_1_0-.mixerInnerLoopTable
    dc.w .innerLoop_1_1-.mixerInnerLoopTable
    dc.w .innerLoop_1_2-.mixerInnerLoopTable
    dc.w .innerLoop_1_3-.mixerInnerLoopTable
    dcb.w 4, 0
    dc.w .innerLoop_2_0-.mixerInnerLoopTable
    dc.w .innerLoop_2_1-.mixerInnerLoopTable
    dc.w .innerLoop_2_2-.mixerInnerLoopTable
    dcb.w 5, 0
    dc.w .innerLoop_3_0-.mixerInnerLoopTable
    dc.w .innerLoop_3_1-.mixerInnerLoopTable
    dcb.w 6, 0
    dc.w .innerLoop_4_0-.mixerInnerLoopTable

; registers used when entering mixer inner loop
; a5=destination (frame B)
; d0=samples needed this inner loop-1

;mixer inner loop registers
; a1-a4=samples (UNSigned 8-bit: 0-255)
; a0=clamptable
; d1=accumulator for sample 1 and 3
; d2=accumulator for sample 0 and 2
; d3/d4=work registers
; d5=mask for sample 1 and 3 ($00ff00ff)
; d6=mask for sample 0 and 2 ($ff00ff00)
; d7=0

ReadFirstVoice: macro
    ; Private : used for inner loop
    move.l (a1)+, d1
    move.l d1, d2
    and.l d5, d1
    and.l d6, d2
    endm

AddVoice: macro
    ; Private : used for inner loop
    move.l (\1)+, d3
    move.l d3, d4
    and.l d5, d3
    and.l d6, d4
    add.l d3, d1
    add.l d4, d2
    addx.l d7, d2
    endm

ShiftHalfVolume: macro
    ; Private : used for inner loop
    lsr.l #1, d1
    ror.l #1, d2
    and.l #$0fff0fff, d1
    and.l #$ff0fff0f, d2
    endm

StoreVoice: macro
    ; Private : used for inner loop
    ror.l #8, d2
    move.b 0(a0, d2.w), 2(a5)
    move.b 0(a0, d1.w), 3(a5)
    swap d2
    swap d1
    move.b 0(a0, d2.w), 0(a5)
    move.b 0(a0, d1.w), 1(a5)
    add.w #4, a5
    endm

StoreVoiceHalfsOnly: macro
    ; Private : used for inner loop
    ShiftHalfVolume
    StoreVoice
    endm

DefineInnerLoop: macro
    ; Private : Generates optimized mixer inner loop
    ; Inputs : Number of half volume samples, number of full volume samples
.innerLoop_\1_\2:
    ; read active voice data pointers into a1-a4 into registers
    if (\1+\2)=0
        moveq #0, d5
    endif
    if (\1+\2)=1
        move.l MixerData_SortedVoicePtrs(a4), a1
        move.l MixerVoice_Ptr(a1), a1
    endif
    if (\1+\2)=2
        movem.l MixerData_SortedVoicePtrs(a4), a1-a2
        move.l MixerVoice_Ptr(a1), a1
        move.l MixerVoice_Ptr(a2), a2
    endif
    if (\1+\2)=3
        movem.l MixerData_SortedVoicePtrs(a4), a1-a3
        move.l MixerVoice_Ptr(a1), a1
        move.l MixerVoice_Ptr(a2), a2
        move.l MixerVoice_Ptr(a3), a3
    endif
    if (\1+\2)=4
        movem.l MixerData_SortedVoicePtrs(a4), a1-a4
        move.l MixerVoice_Ptr(a1), a1
        move.l MixerVoice_Ptr(a2), a2
        move.l MixerVoice_Ptr(a3), a3
        move.l MixerVoice_Ptr(a4), a4
    endif
    if (\1+\2)>1
        lea ClampTable-(\1*$40+\2*$80)(pc), a0
        move.l #$00ff00ff, d5
        if MixerVersion=1
            move.l #$ff00ff00, d6
        endif
        if MixerVersion=2
            move.w #$0fff, d6
        endif
        moveq #0, d7
    else
        if (\1=1) & (\2=0)
            move.l #$80808080, d4
            move.l #$fefefefe, d5
        endif
        if (\1=0) & (\2=1)
            move.l #$80808080, d4
        endif
    endif
.loop_\1_\2:
    ; mixes one segment from a1-a4 into a5
    if (\1+\2)=0 ; no channels
        rept cMixingBufferSegmentSize/4
            move.l d5, (a5)+
        endr
    endif
    if (\1=1) & (\2=0) ; single channel half volume
        rept cMixingBufferSegmentSize/4
            move.l (a1)+, d2
            if 1 ; correct
                eor.l d4, d2
                and.l d5, d2
                move.l d2, d3
                ror.l #1, d2
                and.l d4, d3
                add.l d3, d2
            else ; fast
                and.l #$fefefefe, d2
                ror.l #1, d2
                sub.l #$40404040, d2
            endif
            move.l d2, (a5)+
        endr
    endif
    if (\1=0) & (\2=1) ; single channel full volume
        rept cMixingBufferSegmentSize/4
            move.l (a1)+, d2
            eor.l d4, d2
            move.l d2, (a5)+
        endr
    endif
    if (\1+\2)>1
        rept cMixingBufferSegmentSize/4
            ; available registers:
            ; d2-d7
            ; d2-d6=work registers
            if \1>=1
                ReadFirstVoice
            endif
            if \1>=2
                AddVoice a2
            endif
            if \1>=3
                AddVoice a3
            endif
            if \1>=4
                AddVoice a4
            endif
            if \2=0
                StoreVoiceHalfsOnly
            else
                if ((\1+\2)>=1) & (\1<1)
                    ; there are only full volume voices
                    ReadFirstVoice
                else
                    ; there is a mix between half and full voices, so shift the accumulated values before adding fu
                    ShiftHalfVolume
                endif
                ; if there are at least 2 voices in total, but less than 2 are half volume, add voice 2 at full volume
                if ((\1+\2)>=2) & (\1<2)
                    AddVoice a2
                endif
                if ((\1+\2)>=3) & (\1<3)
                    AddVoice a3
                endif
                if ((\1+\2)>=4) & (\1<4)
                    AddVoice a4
                endif
                StoreVoice
            endif
        endr
    endif
    dbra d0, .loop_\1_\2
    if (\1+\2)>1
        lea MixerData(pc), a4
    endif
    bra .innerLoopComplete
    endm

    DefineInnerLoop 0, 0
    DefineInnerLoop 0, 1
    DefineInnerLoop 0, 2
    DefineInnerLoop 0, 3
    DefineInnerLoop 0, 4
    DefineInnerLoop 1, 0
    DefineInnerLoop 1, 1
    DefineInnerLoop 1, 2
    DefineInnerLoop 1, 3
    DefineInnerLoop 2, 0
    DefineInnerLoop 2, 1
    DefineInnerLoop 2, 2
    DefineInnerLoop 3, 0
    DefineInnerLoop 3, 1
    DefineInnerLoop 4, 0

    dcb.b cMixerVoiceCount*$80-$80, $80
.temp: set $80
    rept 128
        dc.b .temp
.temp: set .temp+1
    endr
ClampTable:
.temp: set $0
    rept 128
        dc.b .temp
.temp: set .temp+1
    endr
    dcb.b cMixerVoiceCount*$80-$80, $7f
|
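For readers who do not speak 68000, the core idea of the full-volume path can be sketched in Python. This is a simplified model, not a translation: it assumes `cMixerVoiceCount` is 4, it shifts the even bytes down by 8 bits instead of keeping them in place with the `addx`/`ror` carry trick, and it ignores half-volume voices. The names `build_clamp_table`, `to_long` and `mix_longs` are made up for the example.

```python
VOICE_COUNT = 4  # assumed value of cMixerVoiceCount


def build_clamp_table(voices=VOICE_COUNT):
    """Return (table, origin); table[origin + s] is the sum s clamped to
    -128..127 and stored as a two's complement byte."""
    pad = voices * 0x80 - 0x80
    below = [0x80] * pad + list(range(0x80, 0x100))  # sums below zero
    above = list(range(0x00, 0x80)) + [0x7F] * pad   # sums from zero upwards
    return below + above, len(below)                 # origin = ClampTable label


def to_long(signed_samples):
    """Pack four signed samples as unsigned bytes (sample eor $80), big-endian."""
    return int.from_bytes(bytes((s + 128) & 0xFF for s in signed_samples), "big")


def mix_longs(longs):
    """Mix one long (four unsigned 8-bit samples) from each full-volume voice."""
    table, origin = build_clamp_table()
    odd = even = 0
    for value in longs:
        odd += value & 0x00FF00FF           # bytes 1 and 3, one per 16-bit lane
        even += (value & 0xFF00FF00) >> 8   # bytes 0 and 2, moved into 16-bit lanes
    bias = 0x80 * len(longs)                # each unsigned voice carries +128
    lanes = [even >> 16, odd >> 16, even & 0xFFFF, odd & 0xFFFF]
    return [table[origin + lane - bias] for lane in lanes]


a = to_long([100, -100, 5, -128])
b = to_long([100, -100, -6, -128])
print([hex(v) for v in mix_longs([a, b])])  # ['0x7f', '0x80', '0xff', '0x80']
```

Because each byte gets a 16-bit lane, up to four voices can be summed without one sample spilling into its neighbour, and the table lookup removes the bias and clamps in a single step, which is what the `lea ClampTable-(...)` offset appears to do in the original.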
07 July 2022, 00:18 | #12 |
Lemon. / Core Design
Join Date: Mar 2016
Location: Tier 5
Posts: 1,212
|
there might be a quicker way of doing the clamping, to avoid those indexed memory reads from the clamp table
|
07 July 2022, 22:50 | #13 |
Newbie Amiga programmer
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
|
I'm only experimenting so far, so it will not be released soon. Thank you for sharing it; I'm sure I can learn tricks from it.
Last edited by TCH; 07 July 2022 at 22:50. Reason: Half of the post was on another tab... |
08 July 2022, 02:53 | #14 |
Registered User
Join Date: May 2015
Location: Kirkland, Washington, USA
Posts: 56
|
Dan, yeah I am not too happy about having to do clamping like that, but I couldn’t think of another way without losing sample precision and volume (storing as 6-bit only)
What are you thinking? |
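The trade-off mentioned here, clamping versus storing samples as 6-bit, can be shown with a few numbers. This Python sketch is only an illustration under assumed conditions (4 voices, signed 8-bit input); the function names are invented and neither function is the poster's code.

```python
def clamp8(value):
    """Limit a sum to the signed 8-bit range."""
    return max(-128, min(127, value))


def mix_clamped(samples):
    """Sum at full precision, then clamp (what the table lookup achieves)."""
    return clamp8(sum(samples))


def mix_prescaled(samples):
    """Reduce each of 4 voices to 6 bits first, so the sum can never overflow."""
    return sum(s >> 2 for s in samples)


quiet = [101, 0, 0, 0]       # one voice playing, three silent
loud = [100, 100, 100, 100]  # four loud voices at once

print(mix_clamped(quiet), mix_prescaled(quiet))  # 101 25
print(mix_clamped(loud), mix_prescaled(loud))    # 127 100
```

Prescaling needs no clamp at all, but a lone voice plays at a quarter of its volume with two bits of precision gone, while the clamped mix only distorts when several loud voices coincide.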
08 July 2022, 19:10 | #15 | |
Lemon. / Core Design
Join Date: Mar 2016
Location: Tier 5
Posts: 1,212
|
Quote:
The way for unsigned after adding to d0:
    subx.b d1,d1
    or.b d1,d0
EDIT: I actually started a thread about this some time ago: https://eab.abime.net/showthread.php?t=106727
So maybe lookup is the best way for signed....
Last edited by DanScott; 08 July 2022 at 19:16. Reason: Updated |
|
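The two-instruction unsigned trick can be modelled in Python to see why it saturates. This sketch assumes the preceding instruction is an `add.b` into d0, which leaves the carry in the X flag; `subx.b d1,d1` then computes d1 - d1 - X, giving $00 or $ff, and `or.b` forces an overflowed result to $ff. The function name is hypothetical.

```python
def add_unsigned_saturating(d0, sample):
    """Model of: add.b sample,d0 / subx.b d1,d1 / or.b d1,d0 on 8-bit values."""
    total = d0 + sample
    x_flag = total >> 8        # carry out of the 8-bit add (0 or 1)
    d0 = total & 0xFF          # what add.b leaves in the low byte of d0
    d1 = (0 - x_flag) & 0xFF   # subx.b d1,d1: $00 without carry, $ff with carry
    return d0 | d1             # or.b d1,d0


print(add_unsigned_saturating(100, 100))  # 200
print(add_unsigned_saturating(200, 100))  # 255
```

The same mask cannot simply be reused for signed samples, since signed overflow shows up in the V flag rather than in the carry, which fits the remark that a lookup may still be the best option there.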
08 July 2022, 19:58 | #16 |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
|
|
08 July 2022, 20:10 | #17 |
J.M.D - Bedroom Musician
Join Date: Apr 2014
Location: los angeles,ca
Posts: 3,519
|
I think that is what we used on Powder to add some sound effects, and I remember the coder telling me to keep one of the channels (0? not sure now) as empty as possible.
|