English Amiga Board


Old 01 June 2024, 01:47   #201
lmimmfn
Registered User
 
Join Date: May 2018
Location: Ireland
Posts: 692
Quote:
Originally Posted by Karlos
Just a warning since getting my 030 UAE config in better shape, none of the CACR manipulation versions will run without the datacache enabled anyway, it just freezes up. I should've tested that more carefully. However, the plan is to run with the cache enabled, provided the CACR trick fixes the Akiko read issue.
I'm lurking on this chat but you really need to hail Tony.
Old 01 June 2024, 01:50   #202
Karlos
Alien Bleed
 
Join Date: Aug 2022
Location: UK
Posts: 4,477
Quote:
Originally Posted by lmimmfn
I'm lurking on this chat but you really need to hail Tony.
I don't know that it's a bug in UAE at all, it seems quite reasonable that you might be able to lock up the real 68030 doing silly things in supervisor mode while interrupts are disabled.
Old 01 June 2024, 09:58   #203
Lunda
Registered User
 
Join Date: Jul 2023
Location: Domsjö/Sweden
Posts: 56
Quote:
Originally Posted by Karlos
@lunda if you do get to check this, it needs checking with the cache enabled as well as disabled. Each of the Akiko tests, including the verification has a CACR bashing version.
For some reason my machine could not run the test without the data cache (it crashes and reboots). It might be an issue with the beast. I tested with both SDRAM and SRAM.

edit: I found the reason after reading all new posts.
Attached thumbnail: AkikoCACR_DataCache.jpg

Last edited by Lunda; 01 June 2024 at 10:06. Reason: new info
Old 01 June 2024, 12:06   #204
Karlos
Alien Bleed
 
Join Date: Aug 2022
Location: UK
Posts: 4,477
Well, it's fair to say that disabling write allocation fixes the Akiko read-back problem, but the Akiko routine is clearly far behind the software C2P on this machine. The software C2P beats it by a clear 25%.

It certainly seems to be a pure I/O bandwidth limitation, i.e. the accepted wisdom. If the chip RAM writes were faster, the simplicity of the Akiko routine would theoretically allow it to beat the software conversion, which currently hides its ALU effort behind the slow writes. We can actually test that by just doing C2P from fast to fast.
Old 01 June 2024, 13:23   #205
abu_the_monkey
Registered User
 
Join Date: Oct 2020
Location: Bicester
Posts: 2,022
it would still be nice to know where the crossover point is between using Akiko vs CPU.
Is it an 030@50mhz or faster?
Old 01 June 2024, 13:26   #206
Karlos
Alien Bleed
 
Join Date: Aug 2022
Location: UK
Posts: 4,477
Quote:
Originally Posted by abu_the_monkey
it would still be nice to know where the crossover point is between using Akiko vs CPU.
Is it an 030@50mhz or faster?
Maybe Lunda can test different clock crystals?
Old 01 June 2024, 13:39   #207
abu_the_monkey
Registered User
 
Join Date: Oct 2020
Location: Bicester
Posts: 2,022
I guess Akiko will still perform the same?
Old 01 June 2024, 13:53   #208
Karlos
Alien Bleed
 
Join Date: Aug 2022
Location: UK
Posts: 4,477
Quote:
Originally Posted by abu_the_monkey
I guess Akiko will still perform the same?
I think it'll take the same number of cycles but the cycles will be longer. The total chip ram delay is the only real invariant. So they might just both end up converging to the same speed, limited by chip write bandwidth.
Old 01 June 2024, 15:56   #209
paraj
Registered User
 
Join Date: Feb 2017
Location: Denmark
Posts: 1,217
Unfortunately it looks to me like it's just not going to be worth it on 030 unless "normal" accelerator cards behave radically differently or some wizard comes up with a serious improvement to the instruction scheduling.

The time for "Naive (WA)" is still very close to "Null C2P + Akiko Limit (WA)", and "Kalms - Null C2P" is only 45314 ticks, so assuming just that part scales linearly with clock frequency, the software C2P would start being faster at around 25MHz.
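
A rough sketch of how such a crossover estimate can be worked out, assuming the CPU-only part scales inversely with clock speed while the Akiko path takes a fixed time. Only the 45314-tick figure comes from the numbers above; the measurement clock and the Akiko tick count below are hypothetical placeholders, and `crossover_mhz` is a made-up helper name.

```python
def crossover_mhz(cpu_ticks, measured_mhz, akiko_ticks):
    """Clock (MHz) at which the CPU-only part takes as long as the fixed Akiko path.

    Assumes cpu time at clock f is cpu_ticks * measured_mhz / f (linear scaling)
    and that the Akiko path does not get faster with the CPU clock.
    """
    return measured_mhz * cpu_ticks / akiko_ticks


NULL_C2P_TICKS = 45314   # "Kalms - Null C2P" figure quoted in the post
MEASURED_MHZ = 50.0      # ASSUMPTION: clock of the machine that produced the ticks
AKIKO_TICKS = 90000      # HYPOTHETICAL placeholder, not a measurement from this thread

estimate = crossover_mhz(NULL_C2P_TICKS, MEASURED_MHZ, AKIKO_TICKS)
print(f"CPU part matches the Akiko path at about {estimate:.1f} MHz")
```

Plugging in real measurements for the two placeholders would give the actual figure for a given machine.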
Old 01 June 2024, 16:53   #210
Karlos
Alien Bleed
 
Join Date: Aug 2022
Location: UK
Posts: 4,477
Not having DMA output to chip ram. What a missed opportunity. It's not as if there's much that runs on 020/14 + Fast that can use C2P that isn't just faster using chunky copper screen tricks, so it was only ever going to be truly useful with a faster CPU in the first place.

I know it was "for free", but it's also a bit of a chocolate teapot without being able to get the data out of it faster.
Old 01 June 2024, 17:04   #211
abu_the_monkey
Registered User
 
Join Date: Oct 2020
Location: Bicester
Posts: 2,022
yep.

still, it would be nice to have the numbers from a range of setups.
at least then it can be put to bed once and for all.
Old 01 June 2024, 19:25   #212
Karlos
Alien Bleed
 
Join Date: Aug 2022
Location: UK
Posts: 4,477
I think the bus is maxed out when talking to Akiko. If it's doing a transfer every 3 cycles and the bus is 14 MHz, that's 4*14/3 = 18.67 MB/s

The conversion does 9MB/s, but considering it's a write and read workload, that's your 18MB/s nommed up.
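
A quick check of that arithmetic, as a sketch. It assumes a 4-byte (longword) transfer every 3 bus cycles on a nominal 14 MHz bus, as stated above; the function and variable names are made up for illustration.

```python
def bus_bandwidth_mb_s(bytes_per_transfer, bus_mhz, cycles_per_transfer):
    """Peak throughput in MB/s for one transfer every `cycles_per_transfer` bus cycles."""
    return bytes_per_transfer * bus_mhz / cycles_per_transfer


peak = bus_bandwidth_mb_s(4, 14, 3)   # longword every 3 cycles at 14 MHz

conversion_rate = 9                   # MB/s of chunky data converted (figure from the post)
akiko_traffic = 2 * conversion_rate   # every byte is written to Akiko and read back again

print(f"peak {peak:.2f} MB/s, Akiko traffic {akiko_traffic} MB/s")
```

The write-plus-read traffic lands just under the peak figure, which is consistent with the bus being the limit.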
Old 01 June 2024, 19:42   #213
abu_the_monkey
Registered User
 
Join Date: Oct 2020
Location: Bicester
Posts: 2,022
random thunk.

Code:
; #############################################################################
; Akiko C2P of 2560 blocks (32 chunky pixels each) into 8 planes, 10240 bytes apart.
; in: a0 = chunky source, a1 = planar destination
    movem.l d1-d7/a2/a3/a6,-(sp)

    ; back up the inputs
    move.l  a0,a2
    move.l  a1,a3

    move.l  _SysBase,a6
    jsr     _LVOForbid(a6)
    jsr     _LVODisable(a6)

    move.l  #$00B80038,a0       ; Akiko C2P register
    move.w  #2559-1,d0          ; 2559 passes: the first block is primed before the
                                ; loop and the last one is drained after it
    move.l  a3,a1

    ; a0 Akiko C2P register
    ; a1 current plane pointer
    ; a2 chunky source
    ; a3 plane 0 destination for the block being read back
; #############################################################################
    ; prime Akiko with the first 32 pixels
    move.l  (a2)+,(a0)
    move.l  (a2)+,(a0)
    move.l  (a2)+,(a0)
    move.l  (a2)+,(a0)
    move.l  (a2)+,(a0)
    move.l  (a2)+,(a0)
    move.l  (a2)+,(a0)
    move.l  (a2)+,(a0)
; #############################################################################
.loop:
    ; read plane 0 straight into the destination
    move.l  (a0),(a1)
    adda.w  #10240,a1

    ; read planes 1-7 into registers
    move.l  (a0),d1
    move.l  (a0),d2
    move.l  (a0),d3
    move.l  (a0),d4
    move.l  (a0),d5
    move.l  (a0),d6
    move.l  (a0),d7

    ; feed Akiko the next 32 pixels
    move.l  (a2)+,(a0)
    move.l  (a2)+,(a0)
    move.l  (a2)+,(a0)
    move.l  (a2)+,(a0)
    move.l  (a2)+,(a0)
    move.l  (a2)+,(a0)
    move.l  (a2)+,(a0)
    move.l  (a2)+,(a0)

    ; write planes 1-7 of the block read above
    move.l  d1,(a1)
    adda.w  #10240,a1

    move.l  d2,(a1)
    adda.w  #10240,a1

    move.l  d3,(a1)
    adda.w  #10240,a1

    move.l  d4,(a1)
    adda.w  #10240,a1

    move.l  d5,(a1)
    adda.w  #10240,a1

    move.l  d6,(a1)
    adda.w  #10240,a1

    move.l  d7,(a1)
    addq.l  #4,a3               ; next longword within each plane
    move.l  a3,a1
    dbra    d0,.loop
; #############################################################################
    ; drain the last block: planes 0-7
    move.l  (a0),(a1)
    adda.w  #10240,a1

    move.l  (a0),(a1)
    adda.w  #10240,a1

    move.l  (a0),(a1)
    adda.w  #10240,a1

    move.l  (a0),(a1)
    adda.w  #10240,a1

    move.l  (a0),(a1)
    adda.w  #10240,a1

    move.l  (a0),(a1)
    adda.w  #10240,a1

    move.l  (a0),(a1)
    adda.w  #10240,a1

    move.l  (a0),(a1)
; #############################################################################
    jsr     _LVOEnable(a6)
    jsr     _LVOPermit(a6)

    movem.l (sp)+,d1-d7/a2/a3/a6
    rts
; #############################################################################
probably contains mistakes
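
As a sanity check on the loop count in the listing above, here is a small Python sketch of the bookkeeping. It assumes Akiko takes 8 longword writes (32 chunky pixels) per block and hands back 8 longwords, one per plane, and that the 10240-byte plane stride is also the plane size; the variable names are made up for illustration.

```python
PLANE_BYTES = 10240        # plane stride from the listing (assumed to equal the plane size)
PIXELS_PER_BLOCK = 32      # 8 longword writes x 4 one-byte pixels each

total_pixels = PLANE_BYTES * 8                 # each plane byte covers 8 pixels
total_blocks = total_pixels // PIXELS_PER_BLOCK

primed_before_loop = 1                         # the 8 writes ahead of .loop
loop_passes = (2559 - 1) + 1                   # dbra executes the body count+1 times
blocks_written = primed_before_loop + loop_passes
blocks_read = loop_passes + 1                  # one read-back per pass, plus the drain after the loop

print(total_blocks, blocks_written, blocks_read)
```

All three values come out the same, so under these assumptions every block written to Akiko is also read back exactly once.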
Old 01 June 2024, 20:46   #214
NorthWay
Registered User
 
Join Date: May 2013
Location: Grimstad / Norway
Posts: 853
Would it be beneficial for some specs to do every other decode with Akiko and then cpu? You would have to interleave all 16 Akiko reads and writes in-between the cpu c2p (i.e. not blindly do one and then the other but both at the same time).
Old 01 June 2024, 21:44   #215
Karlos
Alien Bleed
 
Join Date: Aug 2022
Location: UK
Posts: 4,477
@abu_the_monkey

Try it. The only things to say are you don't need Forbid/Permit since Disable achieves the same thing regardless. You will probably want to disable write allocate before talking to Akiko too. The latest code does this but it's basically identical to what paraj posted a bit earlier.
Old 01 June 2024, 21:47   #216
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,315
Quote:
Originally Posted by NorthWay
Would it be beneficial for some specs to do every other decode with Akiko and then cpu? You would have to interleave all 16 Akiko reads and writes in-between the cpu c2p (i.e. not blindly do one and then the other but both at the same time).
Hardly. Akiko is synchronous, and the slow part is attempting to read from its registers, as the CPU needs to wait for the relatively slow chip bus. For a conversion from fast mem to chip mem, the CPU does not need to wait for anything - it can retire chip bus accesses in its push buffer while continuing to work. That does not help for Akiko.
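
A toy timing model of that difference, purely as a sketch: it assumes the software path can fully overlap CPU work with buffered chip writes, while every Akiko register access stalls the CPU so its costs simply add up. The function names and the example numbers are hypothetical and in arbitrary time units, not measurements.

```python
def software_c2p_time(cpu_work, chip_write):
    # Buffered chip writes proceed while the CPU keeps working,
    # so the slower of the two determines the total.
    return max(cpu_work, chip_write)


def akiko_c2p_time(akiko_write, akiko_read, chip_write):
    # The CPU waits on each Akiko register access, so nothing overlaps
    # and the individual costs are summed.
    return akiko_write + akiko_read + chip_write


# Hypothetical numbers for illustration only.
print(software_c2p_time(cpu_work=3, chip_write=5))
print(akiko_c2p_time(akiko_write=2, akiko_read=2, chip_write=5))
```

In this model the software path can never be slower than the chip write time alone, whereas the Akiko path always pays for its register traffic on top.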
Old 01 June 2024, 22:14   #217
Karlos
Alien Bleed
 
Join Date: Aug 2022
Location: UK
Posts: 4,477
It has been fun but I think we've pretty effectively demonstrated the common wisdom. It's all in the bus, you just can't move the data around fast enough to beat code able to execute on the CPU behind pending writes.
Old 01 June 2024, 22:29   #218
abu_the_monkey
Registered User
 
Join Date: Oct 2020
Location: Bicester
Posts: 2,022
Quote:
Originally Posted by Karlos
@abu_the_monkey

Try it. The only things to say are you don't need Forbid/Permit since Disable achieves the same thing regardless. You will probably want to disable write allocate before talking to Akiko too. The latest code does this but it's basically identical to what paraj posted a bit earlier.
winuae will not be a good gauge and I don't have real hardware to test on. it would still have the overhead of using the Akiko, just wondered if some of the reads/writes could be done just after (during) a write to chip ram.

Quote:
Originally Posted by Karlos
It has been fun but I think we've pretty effectively demonstrated the common wisdom. It's all in the bus, you just can't move the data around fast enough to beat code able to execute on the CPU behind pending writes.
yes, but where is the point/speed where it becomes better to use the cpu is something I really wanted to know.
Old 01 June 2024, 22:35   #219
Karlos
Alien Bleed
 
Join Date: Aug 2022
Location: UK
Posts: 4,477
I don't know - isn't the bus logic busy servicing the chip ram write? I don't think you can just go and do a read from somewhere else (unless in cache I suppose) while you are waiting for it.

This isn't my area of expertise mind.

Last edited by Karlos; 01 June 2024 at 22:49.
Old 01 June 2024, 22:38   #220
abu_the_monkey
Registered User
 
Join Date: Oct 2020
Location: Bicester
Posts: 2,022
nor mine, just a thought that popped in my tired noggin