English Amiga Board


Old 25 May 2024, 14:14   #81
abu_the_monkey
Registered User
 
Join Date: Oct 2020
Location: Bicester
Posts: 2,018
If the numbers are correct, it's an even bigger shame that Akiko can't do direct DMA to chip ram.
Old 25 May 2024, 14:18   #82
Karlos
Alien Bleed
 
Join Date: Aug 2022
Location: UK
Posts: 4,413
Quote:
Originally Posted by abu_the_monkey View Post
If the numbers are correct, it's an even bigger shame that Akiko can't do direct DMA to chip ram.
True. Especially given the chip *does* have DMA already. I think it uses some ring buffers or something for the CD drive.

The numbers do look a bit startling though. Especially given that it's pixels converted per second, full round trip.

I'll put the code on GitHub later in the interests of transparency.
Old 25 May 2024, 15:16   #83
Karlos
Alien Bleed
 
Join Date: Aug 2022
Location: UK
Posts: 4,413
The numbers do seem a bit "unpossible" though. This has to be some sort of cache interference if the loop counts are correct and timing is accurate.

I'll move on to testing an actual block conversion when I get a minute later. That might be more informative.
Old 25 May 2024, 18:14   #84
paraj
Registered User
 
Join Date: Feb 2017
Location: Denmark
Posts: 1,189
Haven't spotted any errors in your code, but the results seem too good to be true. Unless my math is off (wouldn't be the first time), 53768 CIA ticks would correspond to 20*53768 = 1075360 14MHz clock ticks, with each loop iteration taking (20*53768)/100000 = 10.7536 of those. That seems suspiciously low for 16 accesses... The runs taking twice as long might be believable.
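
For anyone who wants to rerun that arithmetic with their own tick counts, here is a minimal sketch in C. The function names are made up for illustration, and the factor of 20 between the CIA E clock and the 14MHz clock is simply the one used in the post above.

```c
/* Assumption (taken from the post above): one CIA E clock tick spans
 * 20 ticks of the ~14MHz clock. Function names are hypothetical. */
#define CLOCKS_PER_ECLOCK_TICK 20.0

/* Average number of 14MHz clocks spent in one loop iteration. */
static double clocks_per_iteration(unsigned long eclock_ticks,
                                   unsigned long iterations)
{
    return (double)eclock_ticks * CLOCKS_PER_ECLOCK_TICK / (double)iterations;
}

/* Average number of 14MHz clocks available to each register access. */
static double clocks_per_access(unsigned long eclock_ticks,
                                unsigned long iterations,
                                unsigned accesses_per_iteration)
{
    return clocks_per_iteration(eclock_ticks, iterations)
           / (double)accesses_per_iteration;
}
```

With 53768 ticks, 100000 iterations and 16 accesses per iteration this works out to roughly 0.67 of a 14MHz clock per access, which is why the fast runs look implausible.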
Old 25 May 2024, 18:39   #85
Karlos
Alien Bleed
 
Join Date: Aug 2022
Location: UK
Posts: 4,413
Quote:
Originally Posted by paraj View Post
Haven't spotted any errors in your code, but the results seem too good to be true. Unless my math is off (wouldn't be the first time), 53768 CIA ticks would correspond to 20*53768 = 1075360 14MHz clock ticks, with each loop iteration taking (20*53768)/100000 = 10.7536 of those. That seems suspiciously low for 16 accesses... The runs taking twice as long might be believable.
Even the ~170ms runs seem a bit suspect - that would be 18.8 MPix/s, which implies 18.8MB/s written to Akiko and 18.8MB/s read back.
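
A small sketch of how that throughput figure falls out, in C. The helper name is hypothetical, and the value of 32 pixels per iteration is an inference from the 18.8 MPix/s figure and the 100000 iterations mentioned earlier, not something stated outright in the post.

```c
/* Hypothetical helper: pixel throughput in millions of pixels per second.
 * Assumes one byte per chunky pixel, so MPix/s is also MB/s written to
 * the hardware and MB/s read back. */
static double mpix_per_second(unsigned long iterations,
                              unsigned pixels_per_iteration,
                              double elapsed_ms)
{
    double pixels = (double)iterations * (double)pixels_per_iteration;
    /* pixels per millisecond divided by 1000 gives millions per second */
    return pixels / (elapsed_ms * 1000.0);
}
```

For 100000 iterations of 32 pixels in 170ms this gives about 18.8.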
Old 25 May 2024, 20:40   #86
robinsonb5
Registered User
 
Join Date: Mar 2012
Location: Norfolk, UK
Posts: 1,157
There's definitely something weird going on with the timing here.
Just out of curiosity I ran the test on the Minimig FPGA core and watched its operation using SignalTap - here's a trace.
(There are two accesses per longword because the TG68 CPU has a 16-bit bus, so my implementation of the Akiko cornerturn expects pairs of word accesses.)

So a complete iteration, from the start of the first write to the start of the next block of writes, takes roughly 352 cycles of the fastest clock in the system, which is 113.34MHz.
I make that 8.82ns per cycle times 352 cycles = 3105ns, or 3.105us for one complete round of the test.

The test is 100000 rounds long, so roughly 310.5ms for the entire test.
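
The same estimate as a minimal C sketch, in case anyone wants to plug in a different clock or cycle count. The function name is made up; the inputs are the ones read off the trace above.

```c
/* Hypothetical helper: expected wall-clock duration of the whole test,
 * given the fastest system clock, the cycles one round takes on that
 * clock, and the number of rounds. */
static double test_duration_ms(double clock_mhz,
                               unsigned cycles_per_round,
                               unsigned long rounds)
{
    double ns_per_cycle = 1000.0 / clock_mhz;          /* e.g. ~8.82ns */
    double ns_per_round = ns_per_cycle * (double)cycles_per_round;
    return ns_per_round * (double)rounds / 1.0e6;      /* ns -> ms */
}
```

Calling test_duration_ms(113.34, 352, 100000) lands at a little over 310ms.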

Yet the program usually reports about 89000 ticks, 126ms - and sometimes reports only 2,000 ticks, 30ms.

(I'm wondering if maybe ReadEClock() reacts badly to interrupts being disabled for an extended period?)
Attached thumbnail: Screenshot at 2024-05-25 19-27-10.png (14.5 KB)
Old 25 May 2024, 20:55   #87
paraj
Registered User
 
Join Date: Feb 2017
Location: Denmark
Posts: 1,189
FWIW here's my timing code that I've never had problems with on my 060 (note: hard coded assumption about 50MHz). Can't remember why I open util.library in the code though..
Attached file: timing.c (2.2 KB)
Old 25 May 2024, 21:26   #88
Karlos
Alien Bleed
 
Join Date: Aug 2022
Location: UK
Posts: 4,413
To my knowledge, the Eclock is driven by a divider on the bus. It shouldn't be impacted by interrupt status (unless using it to create a delay of course).
Old 25 May 2024, 21:40   #89
paraj
Registered User
 
Join Date: Feb 2017
Location: Denmark
Posts: 1,189
Quote:
Originally Posted by Karlos View Post
To my knowledge, the Eclock is driven by a divider on the bus. It shouldn't be impacted by interrupt status (unless using it to create a delay of course).
I don't remember exactly how the OS handles it, but there is the issue of rollover. Can't remember the details right now of how often that happens, but when timing I found it best to let the OS do its stuff regularly and sum the time (with DMA off!) of the inner loop.
Old 25 May 2024, 22:48   #90
Karlos
Alien Bleed
 
Join Date: Aug 2022
Location: UK
Posts: 4,413
Should be about every 101 minutes to wrap the lower 32-bit value at 709kHz or so.
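
A quick sketch of where that number comes from, in C. The helper name is hypothetical, and 709379 Hz is assumed here as the PAL E clock rate behind the "709kHz or so" above.

```c
#include <stdint.h>

/* Hypothetical helper: seconds until a free-running counter of the given
 * width wraps when clocked at clock_hz. Valid for widths below 64 bits. */
static double wrap_seconds(unsigned counter_bits, double clock_hz)
{
    return (double)((uint64_t)1 << counter_bits) / clock_hz;
}
```

wrap_seconds(32, 709379.0) / 60.0 comes out at about 101 minutes.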
Old 25 May 2024, 23:05   #91
robinsonb5
Registered User
 
Join Date: Mar 2012
Location: Norfolk, UK
Posts: 1,157
Quote:
Originally Posted by Karlos View Post
Should be about every 101 minutes to wrap the lower 32-bit value at 709kHz or so.
Except the CIA timer registers which are clocked by eclock are only 16 bits wide, so they overflow every 90ms or so. The OS checks for an overflow and increments the upper 48 bits of the timestamp if detected, but it can only detect a single overflow; if the counter wraps round several times while interrupts are disabled it will indeed lose time.

It might be better to time using the TOD clocks - they're 24 bits wide and on slower clocks (VSync or PSU tick for one CIA, and HSync for the other), so wraparound is rare.
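
To make the lost-time effect concrete, here is a simplified model in C. It assumes whole 16-bit wraps simply vanish from the reading and a PAL E clock of 709379 Hz; both are assumptions for illustration, the names are made up, and the real timer.device behaviour may well be more subtle.

```c
#include <stdint.h>

#define ECLOCK_HZ        709379.0  /* assumed PAL E clock rate */
#define TIMER_WRAP_TICKS 65536     /* one full wrap of a 16-bit CIA timer */

/* Convert a duration in milliseconds to whole E clock ticks. */
static int64_t ms_to_ticks(double ms)
{
    return (int64_t)(ms * ECLOCK_HZ / 1000.0);
}

/* Convert E clock ticks back to milliseconds. */
static double ticks_to_ms(int64_t ticks)
{
    return (double)ticks * 1000.0 / ECLOCK_HZ;
}

/* Ticks a reader would see if missed_wraps overflows went undetected. */
static int64_t apparent_ticks(int64_t true_ticks, unsigned missed_wraps)
{
    return true_ticks - (int64_t)missed_wraps * TIMER_WRAP_TICKS;
}
```

Under this model a true 310.5ms run with two missed wraps would read as 89190 ticks, or roughly 126ms, and with three missed wraps as roughly 33ms. That is in the same region as the readings reported earlier in the thread, which is suggestive rather than proof.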
Old 25 May 2024, 23:07   #92
Karlos
Alien Bleed
 
Join Date: Aug 2022
Location: UK
Posts: 4,413
Quote:
Originally Posted by robinsonb5 View Post
Except the CIA timer registers which are clocked by eclock are only 16 bits wide, so they overflow every 90ms or so. The OS checks for an overflow and increments the upper 48 bits of the timestamp if detected, but it can only detect a single overflow; if the counter wraps round several times while interrupts are disabled it will indeed lose time.
Right, that's got to be it. The only solution then is to use a smaller benchmark, call it multiple times and average the results.
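
A rough way to sanity-check how small "smaller" needs to be, sketched in C. It assumes a PAL E clock of 709379 Hz and a single 16-bit timer, and the per-round time has to come from your own measurement (the 3.105us figure earlier in the thread is specific to the Minimig core). The names are hypothetical.

```c
#define ECLOCK_HZ        709379.0  /* assumed PAL E clock rate */
#define TIMER_WRAP_TICKS 65536.0   /* one full wrap of a 16-bit CIA timer */

/* Time for one full wrap of the 16-bit timer, in microseconds. */
static double wrap_period_us(void)
{
    return TIMER_WRAP_TICKS * 1.0e6 / ECLOCK_HZ;
}

/* Non-zero if a timed chunk finishes before the timer can wrap even once,
 * so at most one overflow can occur while interrupts are disabled. */
static int chunk_fits_in_one_wrap(unsigned long rounds, double us_per_round)
{
    return (double)rounds * us_per_round < wrap_period_us();
}
```

At about 3.1us per round, a chunk of a few thousand rounds fits comfortably inside one wrap period, while the original 100000 rounds does not.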
Old 25 May 2024, 23:07   #93
paraj
Registered User
 
Join Date: Feb 2017
Location: Denmark
Posts: 1,189
Quote:
Originally Posted by Karlos View Post
Should be about every 101 minutes to wrap the lower 32-bit value at 709kHz or so.
It's only 32-bit if they're in cascade mode and you use both 16-bit timers. Probably something else going on, but IIRC timer.device only relies on one 16-bit counter and proper OS operation for wrap around.

EDIT: what robinsonb5 said
Old 25 May 2024, 23:12   #94
Karlos
Alien Bleed
 
Join Date: Aug 2022
Location: UK
Posts: 4,413
That's the first time the Eclock has ever let me down. Mind you, I don't normally write code that runs with interrupts disabled.

A better iteration count might be 2560 (equivalent to 320*256 pixels), accumulating a bunch of those runs. Then outliers can be identified and eliminated, and the remainder averaged.
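
One possible way to do the outlier rejection, as a minimal C sketch. This is not the code from the benchmark: the median-based filter, the tolerance parameter and all names are assumptions picked for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_RUNS 64  /* arbitrary cap so the scratch copy can live on the stack */

/* qsort comparator for unsigned 32-bit tick counts. */
static int cmp_u32(const void *a, const void *b)
{
    uint32_t x = *(const uint32_t *)a;
    uint32_t y = *(const uint32_t *)b;
    return (x > y) - (x < y);
}

/* Mean of the runs that lie within tolerance_percent of the median run.
 * Returns 0 if count is 0 or larger than MAX_RUNS. */
static uint32_t robust_mean_ticks(const uint32_t *ticks, size_t count,
                                  unsigned tolerance_percent)
{
    uint32_t sorted[MAX_RUNS];
    uint32_t median;
    uint64_t limit, sum = 0;
    size_t i, kept = 0;

    if (count == 0 || count > MAX_RUNS)
        return 0;

    /* Sort a copy so the caller's run order is left untouched. */
    memcpy(sorted, ticks, count * sizeof sorted[0]);
    qsort(sorted, count, sizeof sorted[0], cmp_u32);
    median = sorted[count / 2];
    limit = (uint64_t)median * tolerance_percent / 100;

    for (i = 0; i < count; i++) {
        uint64_t diff = ticks[i] > median ? (uint64_t)ticks[i] - median
                                          : (uint64_t)median - ticks[i];
        if (diff <= limit) {   /* keep runs close to the median */
            sum += ticks[i];
            kept++;
        }
    }
    /* kept is at least 1 because the median itself always qualifies. */
    return (uint32_t)(sum / kept);
}
```

With made-up tick counts of 1000, 1010, 990, 1005 and 200 and a 10 percent tolerance, the 200 is discarded (the sort of reading a missed timer wrap would produce) and the remaining four average to 1001.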

Last edited by Karlos; 25 May 2024 at 23:19.
Old 25 May 2024, 23:17   #95
paraj
Registered User
 
Join Date: Feb 2017
Location: Denmark
Posts: 1,189
Quote:
Originally Posted by Karlos View Post
That's the first time the Eclock has ever let me down. Mind you, I don't normally write code that runs with interrupts disabled.
It's because you're such a nice person that doesn't usually beat the system on the head.


My timing.c spaces the beatings for this reason.
Old 26 May 2024, 00:42   #96
Karlos
Alien Bleed
 
Join Date: Aug 2022
Location: UK
Posts: 4,413
Quote:
Originally Posted by paraj View Post
It's because you're such a nice person that doesn't usually beat the system on the head.


My timing.c spaces the beatings for this reason.
Yeah, I'm going to go for conversion cycles closer to a real workload size.
Old 26 May 2024, 00:45   #97
abu_the_monkey
Registered User
 
Join Date: Oct 2020
Location: Bicester
Posts: 2,018
learning new stuff all the time

Quote:
Yeah, I'm going to go for conversion cycles closer to a real workload size.
sounds like the way to go
Old 26 May 2024, 14:13   #98
Karlos
Alien Bleed
 
Join Date: Aug 2022
Location: UK
Posts: 4,413
Can we retry the version from here:

https://github.com/0xABADCAFE/akiko-fun/tree/main

This does 20 runs of 2560x32, i.e. 320x256, and derives the results from there.

Note that it's still only testing reg > hw > reg until we are getting sensible measurements. Then I will start doing actual c2p conversion tests.
Old 26 May 2024, 15:44   #99
Lunda
Registered User
 
Join Date: Jul 2023
Location: Domsjö/Sweden
Posts: 56
Quote:
Originally Posted by Karlos View Post
Can we retry the version from here:

https://github.com/0xABADCAFE/akiko-fun/tree/main

This does 20 runs of 2560x32, i.e. 320x256, and derives the results from there.

Note that it's still only testing reg > hw > reg until we are getting sensible measurements. Then I will start doing actual c2p conversion tests.
See attached pic.
Attached thumbnail: NewAkikotest.jpg (385.8 KB)
Old 26 May 2024, 16:04   #100
abu_the_monkey
Registered User
 
Join Date: Oct 2020
Location: Bicester
Posts: 2,018
that is more believable.

still almost 9MB/s ain't too shabby.

@Lunda is that with the 030 clock at 70MHz? (5 times the 14MHz cpu clock)