English Amiga Board


Old 24 October 2020, 12:09   #61
8bitbubsy
Registered User
 
 
Join Date: Sep 2009
Location: Norway
Posts: 1,712
Ok, good to know.

Regarding having two mixer sets, well... I'd have to make sure that the integer sampling position wouldn't change inside the mixing loop (which is not guaranteed if delta is < 1), which requires some calculations before entering the loop. I want to keep the outer mix loop simple, don't want too much overhead in there. I already have a 32-bit division in there to figure out the max amount of samples to mix before eventually having to handle the sample end/loop end.
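For illustration, the calculation mentioned above (how many output samples can be mixed before the sampling position reaches the sample end or loop end) can be sketched in Python. This is only a model, not the actual 68k code: the 16.16 fixed-point layout and the names `samples_until_end`, `to_fp`, `pos_fp`, `end_pos` and `delta_fp` are assumptions made for the example.

```python
FRAC_BITS = 16  # assumed 16.16 fixed point; the real mixer may split the bits differently


def to_fp(value):
    """Convert a float to the assumed 16.16 fixed-point format (hypothetical helper)."""
    return int(value * (1 << FRAC_BITS))


def samples_until_end(pos_fp, end_pos, delta_fp):
    """Count the output samples that can be mixed before pos_fp reaches end_pos.

    pos_fp   -- current sampling position, fixed point
    end_pos  -- sample end or loop end, as an integer sample index
    delta_fp -- sampling position increment per output sample, fixed point
    """
    if delta_fp <= 0:
        raise ValueError("delta_fp must be positive")
    remaining = (end_pos << FRAC_BITS) - pos_fp
    if remaining <= 0:
        return 0
    # Ceiling division: the single division done once per outer-loop pass.
    return (remaining + delta_fp - 1) // delta_fp
```

The outer loop would then mix the smaller of this count and the number of output samples still wanted, and handle the sample end/loop end afterwards.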
Old 24 October 2020, 12:48   #62
chb
Registered User
 
Join Date: Dec 2014
Location: germany
Posts: 439
Quote:
Originally Posted by 8bitbubsy
Ok, good to know.

Regarding having two mixer sets, well... I'd have to make sure that the integer sampling position wouldn't change inside the mixing loop (which is not guaranteed if delta is < 1), which requires some calculations before entering the loop. I want to keep the outer mix loop simple, don't want too much overhead in there. I already have a 32-bit division in there to figure out the max amount of samples to mix before eventually having to handle the sample end/loop end.
I was rather thinking of something more straightforward: a simple test in the inner loop to check whether you need to load and process a new sample or can continue with the old ones, without assuming that the integer sampling position stays constant (that's unlikely). So one routine with that test and one without (which is probably exactly the one you have now), and you decide which one to use before entering the mixing loop, based on the sample read delta and a simple threshold that's probably around 0.7 or so. Of course, that only makes sense if you often have samples at low sample rates.

EDIT: something like this, which is slower when delta > 1, but probably faster when samples are repeated:
Code:
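; d0.w = bytes to mix

MIXSF MACRO
    move.w (a3,d2.l),d3   ; d3.w = 2x signed 8-bit samples
    move.b d3,d4
    ext.w  d4             ; d4.w = sample2, sign-extended
    asr.w  #8,d3          ; d3.w = sample1, sign-extended
    sub.w  d3,d4          ; d4.w = sample2-sample1
    lsl.l  #7,d4          ; make room for 7 fraction bits below the delta
nc:
    move.l d7,d5
    rol.l  #7,d5          ; top 7 bits of the sampling fraction -> low bits of d5
    or.w   d4,d5          ; combine with (delta << 7) to index the LUT at a6

    add.w  (a6,d5.w*2),d3 ; d3.w += LUT value (d3 no longer holds sample1 afterwards)
    move.w (a1,d3.w*2),d5 ; d5.w = left output sample (from volume LUT)
    swap   d5
    move.w (a4,d3.w*2),d5 ; d5.l = (leftSample << 16) | rightSample
    add.l  d5,(a5)+
    add.l  d6,d7          ; increase sampling position
    bcc    nc             ; branch if carry clear = integer sampling position unchanged
    addx.l d1,d2
    ENDM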
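As a rough illustration of the idea behind the macro (reload the sample pair only when the integer sampling position changes), here is a small Python model. It is a sketch under assumptions, not a translation of the macro: the function name `mix_channel` is made up, the interpolation is computed directly from the top 7 fraction bits instead of going through a LUT, volume scaling is left out, and the loop is driven by the output count.

```python
MASK32 = 0xFFFFFFFF  # the sampling fraction is modelled as a 32-bit register, like d7


def mix_channel(samples, n_out, delta_int, delta_frac):
    """Return (output samples, number of sample-pair loads).

    samples    -- signed 8-bit sample values as Python ints
    n_out      -- number of output samples to produce
    delta_int  -- integer part of the sample read delta (d1 in the macro)
    delta_frac -- 32-bit fractional part of the delta (d6 in the macro)
    """
    pos = 0        # integer sampling position (d2)
    frac = 0       # fractional sampling position (d7)
    out = []
    loads = 0
    reload = True
    for _ in range(n_out):
        if reload:
            # Only touch sample memory when the integer position moved.
            s1, s2 = samples[pos], samples[pos + 1]
            diff = s2 - s1
            loads += 1
        weight = frac >> 25                      # top 7 fraction bits, 0..127
        out.append(s1 + ((diff * weight) >> 7))  # linear interpolation
        total = frac + delta_frac
        frac = total & MASK32
        step = delta_int + (total >> 32)         # add the carry, like addx.l d1,d2
        pos += step
        reload = step != 0
    return out, loads
```

With a delta of 0.25 (`delta_frac = 1 << 30`), eight output samples need only two sample-pair loads in this model, which is where the saving at low sample rates would come from.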

Last edited by chb; 24 October 2020 at 14:00.
Old 24 October 2020, 14:44   #63
8bitbubsy
Registered User
 
 
Join Date: Sep 2009
Location: Norway
Posts: 1,712
Yeah, that could maybe help.
I think most songs will have an average sampling frequency above ~20 kHz (mix rate 28603.99 Hz × 0.7 ≈ 20023 Hz), but it might still help for some songs.
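The threshold arithmetic can be written out as a short Python sketch. The mix rate is the one quoted in the post and 0.7 is only the rough guess from the thread; the names `read_delta` and `pick_mixer` and the returned labels are hypothetical.

```python
MIX_RATE_HZ = 28603.99  # mix rate quoted in the post
THRESHOLD = 0.7         # rough guess from the thread, not a measured value


def read_delta(sample_rate_hz, mix_rate_hz=MIX_RATE_HZ):
    """Sample read delta: source samples consumed per output sample."""
    return sample_rate_hz / mix_rate_hz


def pick_mixer(sample_rate_hz, threshold=THRESHOLD):
    """Choose a mixing routine once, before entering the mixing loop."""
    if read_delta(sample_rate_hz) < threshold:
        return "with_reload_test"  # samples repeat often, so the extra test may pay off
    return "plain"                 # the existing routine without the test
```

For example, a sample played at 8287 Hz gives a delta of about 0.29 and would select the routine with the test, while one played at 22050 Hz (delta about 0.77) would stay on the plain routine.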

I'll try this when I get the time, on Monday or so.
Old 24 October 2020, 15:20   #64
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 56
Posts: 2,024
Quote:
Originally Posted by 8bitbubsy
Oh no, I totally forgot that this has a possible word access misalignment! If only one could use addx on an address register, then I could add the sampling position to a3 before the loop, then read two bytes, then addx on a3. And when I leave the loop, I subtract the sample base from a3 to get the new sampling position, before I handle sample end/loop end.

I could still do that method by using d2 as the relative sampling position (d2 = a3+d2 before loop). Then I do "move.l d3,a3" as the first instruction in the loop.
That's one move instruction extra, so probably slower in the end?
This can be faster, because the asr.w #8,d3 instruction (4 cycles) is dropped, and perhaps one ext.w (4 cycles) too. I don't remember the exact 68020/68030 timings.

You can simply try this version as well:
move.l d2,a3
move.b (a3),d3
move.b 1(a3),d7
if you want to reach maximum speed. The same applies to the version without the swap instruction.
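To make the two loading strategies concrete, the following Python sketch checks that one big-endian word read followed by a split yields the same pair of signed 8-bit samples as two separate byte reads, at both even and odd offsets. It models only the values, not the timing; the function names `load_as_word` and `load_as_bytes` are made up for the example.

```python
import struct


def load_as_word(data, pos):
    """Model of move.w + asr.w #8 + move.b/ext.w: one word read, then split."""
    (word,) = struct.unpack_from(">h", data, pos)  # signed big-endian 16-bit read
    first = word >> 8       # arithmetic shift keeps the sign of the first sample
    second = word & 0xFF    # low byte holds the second sample
    if second >= 0x80:
        second -= 0x100     # sign-extend, like ext.w
    return first, second


def load_as_bytes(data, pos):
    """Model of two move.b reads, each sign-extended."""
    return struct.unpack_from(">bb", data, pos)
```

Both functions return the same values, so the choice between them comes down to cycle counts and the penalty of a word read at an odd address on the target CPU.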

Last edited by Don_Adan; 24 October 2020 at 15:26.
Old 24 October 2020, 15:21   #65
8bitbubsy
Registered User
 
 
Join Date: Sep 2009
Location: Norway
Posts: 1,712
Yes, that's what I was thinking of. Though you still have to ext.w both of them to calculate the delta sample (-256..254).
Old 24 October 2020, 15:23   #66
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 56
Posts: 2,024
Quote:
Originally Posted by chb
I was rather thinking of something more straightforward: a simple test in the inner loop to check whether you need to load and process a new sample or can continue with the old ones, without assuming that the integer sampling position stays constant (that's unlikely). So one routine with that test and one without (which is probably exactly the one you have now), and you decide which one to use before entering the mixing loop, based on the sample read delta and a simple threshold that's probably around 0.7 or so. Of course, that only makes sense if you often have samples at low sample rates.

Good idea, but I don't think it will work, because the main loop runs on the D0 counter. I think the "add.l d5,(a5)+" instruction would trash memory.
Old 24 October 2020, 15:26   #67
8bitbubsy
Registered User
 
 
Join Date: Sep 2009
Location: Norway
Posts: 1,712
Hm yes, you are right. This will not work correctly because it will keep branching to nc until the integer sampling position has advanced (i.e. the d0 counter is not respected until that has happened). Also, this macro is unrolled 4 times inside the actual inner loop.
Old 24 October 2020, 16:12   #68
chb
Registered User
 
Join Date: Dec 2014
Location: germany
Posts: 439
Quote:
Originally Posted by Don_Adan
Good idea, but I don't think it will work, because the main loop runs on the D0 counter. I think the "add.l d5,(a5)+" instruction would trash memory.
Yes, that's true. You'd either have to set up the main loop accordingly (first do a number of iterations of the modified loop with the branch, then without), or just reserve 1/(lowest sample read delta) longwords at the end of the buffer, or check a5 against some end position during each iteration. It might or might not be worth the hassle; it probably depends on how many bytes you typically mix, and on how slow memory reads are compared to instructions, although those repeated samples come from the data cache anyway on 030+. Hmmm.
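The overrun and two of the remedies listed above can be modelled in a few lines of Python. This is a sketch that assumes a 32-bit fraction and a delta below 1 (integer part zero); `run_length`, `worst_case_run` and the `room` parameter are hypothetical names. `run_length` counts how many outputs one pass of a "repeat until carry" loop writes, optionally stopping at the end of the buffer, and `worst_case_run` gives the longest possible run, which is the slack one would reserve at the end of the buffer.

```python
FRAC_ONE = 1 << 32  # one whole sample step in the assumed 32-bit fraction


def run_length(frac, delta_frac, room=None):
    """Outputs written before the integer sampling position advances.

    frac       -- current 32-bit sampling fraction
    delta_frac -- 32-bit fractional delta, 0 < delta_frac < FRAC_ONE
    room       -- output slots left in the buffer (>= 1), or None for no check
    Returns (outputs written, new fraction).
    """
    count = 0
    while True:
        count += 1                  # one output written, like add.l d5,(a5)+
        frac += delta_frac
        if frac >= FRAC_ONE:        # carry set: integer position advanced
            return count, frac - FRAC_ONE
        if room is not None and count >= room:
            return count, frac      # buffer end reached before the carry


def worst_case_run(delta_frac):
    """Longest run without a carry: ceil(1 / delta) outputs."""
    return -(-FRAC_ONE // delta_frac)
```

With a delta of 0.25 a single run writes four outputs, so a buffer with only two slots left would be overrun by two longwords unless the end check is used or that much slack is reserved.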
Old 28 October 2020, 13:36   #69
8bitbubsy
Registered User
 
 
Join Date: Sep 2009
Location: Norway
Posts: 1,712
I tried to move the two samples byte by byte instead of as a word, and it was slightly slower no matter what I did. I also ran a benchmark on my A1200 68030 to figure out how much slower a misaligned word read is, and the speed penalty seems to be quite small (if I did my tests right).