English Amiga Board


Old 01 June 2024, 22:52   #221
Karlos
Alien Bleed
 
 
Join Date: Aug 2022
Location: UK
Posts: 4,480
Quote:
Originally Posted by abu_the_monkey View Post
nor mine, just a thought that popped in my tired noggin
Well, that's pretty much how all this started. I was curious about whether or not there was a potential benefit.
Old 02 June 2024, 00:47   #222
abu_the_monkey
Registered User
 
Join Date: Oct 2020
Location: Bicester
Posts: 2,022
where are all the guys with a TF330 hiding?
Old 02 June 2024, 00:53   #223
abu_the_monkey
Registered User
 
Join Date: Oct 2020
Location: Bicester
Posts: 2,022
Quote:
Originally Posted by Thomas Richter View Post
Hardly. Akiko is synchronous, and the slow part is attempting to read from its registers as the CPU needs to wait for the relatively slow chip bus. For a conversion from fast mem to chip mem, the CPU does not need to wait for anything - it can retire chip bus accesses in its push buffer while continuing to work. That does not help for Akiko,
does this mean that I have things back to front?
after a write to chip ram I can read from fast with no penalty?
Old 02 June 2024, 01:28   #224
Karlos
Alien Bleed
 
 
Join Date: Aug 2022
Location: UK
Posts: 4,480
Quote:
Originally Posted by abu_the_monkey View Post
does this mean that I have things back to front?
after a write to chip ram I can read from fast with no penalty?
I think you can read as long as it's from cache. If it has to go over the bus, I dunno. This is why I was thinking along the lines of trying to prefill some of the cache. However the numbers don't look great. The NullC2P plus the Akiko Limit suggest that there's no wiggle room.
Old 02 June 2024, 05:51   #225
pipper
Registered User
 
Join Date: Jul 2017
Location: San Jose
Posts: 677
The whole issue is that the data has to cross the chip memory bus (as seen from the CPU) three times: write to Akiko, read from Akiko, write to the destination bitplanes.
If only they had made Akiko in a way that _it_ would forward the converted data asynchronously to the destination addresses, it would have been the best way. Maybe they could have made it so it can steal DMA cycles from the blitter…dunno. But I guess incorporating Akiko into the whole DMA scheme was a whole other ballgame for a late addition to the chip, so they didn’t.
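To put rough numbers on the three bus crossings, here is a minimal Python sketch of an Akiko-style pass over one block of 32 pixels. It is a toy model, not a description of the real chip: the function names are made up for illustration, and the bit ordering (pixel 0 ends up in bit 31, plane 0 is read back first) is an assumption that should be checked against real hardware documentation.

```python
# Toy model of an Akiko-style C2P pass (illustrative names, assumed bit order).

def akiko_c2p(chunky):
    """Convert 32 chunky 8-bit pixels into 8 planar 32-bit longwords.

    Assumption: pixel 0 maps to bit 31 of each plane longword and
    plane 0 is returned first.
    """
    assert len(chunky) == 32
    planes = []
    for plane in range(8):
        word = 0
        for x, pixel in enumerate(chunky):
            bit = (pixel >> plane) & 1
            word |= bit << (31 - x)
        planes.append(word)
    return planes


def bus_transfers_akiko(pixels):
    """Longword transfers over the slow bus when converting via Akiko."""
    blocks = pixels // 32
    writes_to_akiko = 8 * blocks   # 32 chunky bytes = 8 longwords
    reads_from_akiko = 8 * blocks  # one longword per bitplane
    writes_to_planes = 8 * blocks  # store the result in the bitplanes
    return writes_to_akiko + reads_from_akiko + writes_to_planes


def bus_transfers_cpu(pixels):
    """Software C2P from fast RAM only has to write the result to chip RAM."""
    return 8 * (pixels // 32)


if __name__ == "__main__":
    pixels = 320 * 200
    print("Akiko:", bus_transfers_akiko(pixels), "longword transfers")
    print("CPU  :", bus_transfers_cpu(pixels), "longword transfers")
```

Under these assumptions the Akiko route always moves three times as many longwords over the slow bus as a software C2P that reads its source from fast RAM, before any difference in per-access cost is counted.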
Old 02 June 2024, 09:09   #226
thellier
Registered User
 
Join Date: Sep 2011
Location: Paris/France
Posts: 278
Not really related to the topic:
Is it possible to use the Akiko registers as screen bitmaps (i.e. setting the screen pointers to those registers' addresses)? They would be seen as a 32-pixel line, as they are in chip mem, no? Then the copper could copy new pixels into those registers to define the next line, perhaps?
Old 02 June 2024, 10:04   #227
grond
Registered User
 
Join Date: Jun 2015
Location: Germany
Posts: 1,926
Quote:
Originally Posted by pipper View Post
If only they had made Akiko in a way that _it_ would forward the converted data asynchronously to the destination addresses, it would have been the best way. Maybe they could have made it so it can steal DMA cycles from the blitter…dunno. But I guess incorporating Akiko into the whole DMA scheme was a whole other ballgame for a late addition to the chip, so they didn’t.
Um, in that case they could simply have implemented chunky fetch. Remember, chunky fetch doesn't change the DMA scheme at all, it only means that each bit has to end up in a different place in the palette-lookup registers. It's a simple rewiring.
Old 02 June 2024, 11:03   #228
Karlos
Alien Bleed
 
 
Join Date: Aug 2022
Location: UK
Posts: 4,480
This brings me full circle to the odd outliers: games that were reportedly faster on Akiko. Were those errors in measurement, or were the games simply dominated by C2P time on a CPU too slow to finish the ALU work behind each chip write, which is what software C2P needs in order to come out ahead?

Last edited by Karlos; 02 June 2024 at 12:35.
Old 02 June 2024, 12:09   #229
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,317
Quote:
Originally Posted by thellier View Post
Not really related to the topic: Is it possible to use the Akiko registers as screen bitmaps (i.e. setting the screen pointers to those registers' addresses)? They would be seen as a 32-pixel line, as they are in chip mem, no? Then the copper could copy new pixels into those registers to define the next line, perhaps?
No, Akiko does not sit on the custom chip bus, so it cannot act as source (nor destination) of a custom chip DMA.
Old 02 June 2024, 12:10   #230
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,317
Quote:
Originally Posted by grond View Post
Um, in that case they could simply have implemented chunky fetch. Remember, chunky fetch doesn't change the DMA scheme at all, it only means that each bit has to end up in a different place in the palette-lookup registers. It's a simple rewiring.
As far as Denise is concerned, certainly. However, that still does not give you chunky. It gives you "chunky oddly distributed over 8 sources". Thus, in addition, the DMA logic would have to be touched to get a linear frame buffer.
Old 02 June 2024, 12:18   #231
Karlos
Alien Bleed
 
 
Join Date: Aug 2022
Location: UK
Posts: 4,480
Oddly distributed chunky. That reminds me of that RGB port device. What was it? Graffiti?
Old 02 June 2024, 12:33   #232
Karlos
Alien Bleed
 
 
Join Date: Aug 2022
Location: UK
Posts: 4,480
Other than determining the actual point at which Akiko becomes slower than the CPU, there's not much else I'm looking for here. The CPU requirements for TKG are high enough already that there's no meaningful intersection below 50MHz 030, except perhaps in 1x2 pixelmode 2/3 size on some slower part.

I'm fully open to suggestions from cycle counters and scheduling sleuths, but I think I'm going to put this to one side for now. I'm happy to add the option for Akiko C2P support to the engine for those that want to experiment with it, but I think it's going to be counterproductive.
Old 02 June 2024, 17:17   #233
grond
Registered User
 
Join Date: Jun 2015
Location: Germany
Posts: 1,926
Quote:
Originally Posted by Thomas Richter View Post
As far as Denise is concerned, certainly. However, that still does not give you chunky. It gives you "chunky oddly distributed over 8 sources". Thus, in addition, the DMA logic would have to be touched to get a linear frame buffer.
Why would that be? Just point the 1st bitplane pointer to the buffer, the 2nd to buffer+4, the 3rd to buffer+8 and so on, and then set the modulo accordingly. Whatever other adjustments might be necessary could be accomplished through a copperlist.

But I guess this is off-topic in this thread.
Old 02 June 2024, 20:26   #234
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,317
Because the bitmap DMA accesses memory contiguously, and not interleaved. With this setup, the first word accessed by the third bitplane will be identical to the third word accessed by the first bitplane, and this is surely not what you want. Sure, it seems "trivial enough", but I'm not sure how Agnus interacts with memory and whether it has to open pages. Thus, there is certainly something to do.
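The overlap described above can be sketched in a few lines of Python. This is only a toy address calculation, assuming eight plane pointers set to buffer, buffer+4, buffer+8 and so on, 4-byte fetches, and each plane reading contiguously along the scanline; `fetch_addresses` and its parameters are made-up names, not anything from the chipset documentation.

```python
# Toy address model: which addresses does each bitplane read on one scanline
# if every plane fetches contiguously from its own start pointer?

def fetch_addresses(base, planes, fetches_per_line, fetch_bytes=4, plane_stride=4):
    """Return {plane: [addresses]} for one scanline of contiguous fetches."""
    return {
        p: [base + p * plane_stride + k * fetch_bytes
            for k in range(fetches_per_line)]
        for p in range(planes)
    }


if __name__ == "__main__":
    addrs = fetch_addresses(base=0, planes=8, fetches_per_line=4)
    for plane, line in addrs.items():
        print(f"plane {plane}: {line}")
    # Addresses read by both plane 0 and plane 2 on the same scanline:
    print("shared by planes 0 and 2:", sorted(set(addrs[0]) & set(addrs[2])))
```

In this model the first fetch of the third plane hits the same address as the third fetch of the first plane, so the planes read overlapping data rather than eight interleaved streams.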
Old 03 June 2024, 08:32   #235
meynaf
son of 68k
 
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
Quote:
Originally Posted by abu_the_monkey View Post
after a write to chip ram I can read from fast with no penalty?
On 040 and 060 yes. On 030 no - any memory access will stall until the write is finished, even from cache.
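A rough way to see what that difference costs is a toy timing model in Python. Everything here is an assumption for illustration: the cycle counts are invented, the write buffer is modelled as a single pending write, and `simulate` is a made-up helper. It only encodes the two behaviours described above: either a memory read has to wait for the pending chip write, or it does not.

```python
# Toy timing model of a C2P-style loop: register work, some reads from
# fast RAM / cache, then one chip RAM write per iteration.
# All cycle counts are invented; only the relative behaviour matters.

def simulate(iterations, reg_cycles, mem_cycles, write_cycles,
             reads_stall_on_pending_write):
    """Return total cycles until the last chip write has completed."""
    t = 0            # current CPU time
    write_done = 0   # time at which the pending chip write completes
    for _ in range(iterations):
        t += reg_cycles                 # register-only work overlaps the write
        if reads_stall_on_pending_write:
            t = max(t, write_done)      # reads wait for the pending write
        t += mem_cycles                 # reads from fast RAM or cache
        t = max(t, write_done)          # only one write can be pending
        write_done = t + write_cycles   # issue the next chip write
    return max(t, write_done)


if __name__ == "__main__":
    args = dict(iterations=100, reg_cycles=4, mem_cycles=6, write_cycles=8)
    print("reads stall  :", simulate(reads_stall_on_pending_write=True, **args))
    print("reads overlap:", simulate(reads_stall_on_pending_write=False, **args))
```

With these made-up numbers the stalling case pays for the slow write on every iteration, while the overlapping case hides it completely behind the other work, which is the effect software C2P relies on.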
Old 03 June 2024, 08:57   #236
Karlos
Alien Bleed
 
 
Join Date: Aug 2022
Location: UK
Posts: 4,480
Quote:
Originally Posted by meynaf View Post
On 040 and 060 yes. On 030 no - any memory access will stall until the write is finished, even from cache.
That sucks.
Old 03 June 2024, 10:14   #237
grond
Registered User
 
Join Date: Jun 2015
Location: Germany
Posts: 1,926
Quote:
Originally Posted by Thomas Richter View Post
Because the bitmap DMA accesses memory contiguously, and not interleaved. With this setup, the first word accessed by the third bitplane will be identical to the third word accessed by the first bitplane, and this is surely not what you want.
I guess one could use the bitplane modulo to basically make each line just one 32bit fetch long and then add a modulo of 28 bytes to skip the other seven bitplanes.


Quote:
Sure, it seems "trivial enough", but I'm not sure how Agnus interacts with memory and whether it has to open pages. Thus, there is certainly something to do.
I still think chunky-fetch is a rather trivial change once you've got an 8bit planar mode and it certainly is easier to implement than a DMA-driven c2p-converter.
Old 03 June 2024, 10:46   #238
Karlos
Alien Bleed
 
 
Join Date: Aug 2022
Location: UK
Posts: 4,480
For the sake of completeness and curiosity, I've added Fast 2 Fast tests for the Akiko Naive, Akiko Naive (reg buffered) and Kalms versions.
Old 03 June 2024, 10:59   #239
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,317
Quote:
Originally Posted by grond View Post
I guess one could use the bitplane modulo to basically make each line just one 32bit fetch long and then add a modulo of 28 bytes to skip the other seven bitplanes.
While this is true, the modulo is added at the end of the scanline. So yes, this idea works, but only if the display is 16 pixels wide. Well, it's probably *a bit* too narrow to be useful. (-;
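As a sketch of why the modulo trick limits the width, the following Python walks eight hypothetical plane pointers through a few scanlines, with one 4-byte fetch per plane per line and a modulo of 28 bytes added at the end of each line. The helper name and parameters are illustrative assumptions, not chipset behaviour taken from documentation.

```python
# Toy model of the "one fetch per line plus modulo" idea: each plane pointer
# starts at buffer + 4 * plane, fetches once per scanline, then the modulo
# is added at the end of the line.

def walk_plane_pointers(base, planes, lines, fetch_bytes=4, modulo=28):
    """Return a list of scanlines, each a list of per-plane fetch addresses."""
    pointers = [base + p * fetch_bytes for p in range(planes)]
    table = []
    for _ in range(lines):
        table.append(list(pointers))        # the single fetch of this line
        pointers = [ptr + fetch_bytes + modulo for ptr in pointers]
    return table


if __name__ == "__main__":
    for line, row in enumerate(walk_plane_pointers(base=0, planes=8, lines=3)):
        print(f"line {line}: {row}")
```

The fetches do cover the buffer linearly in this model, but every scanline only ever receives a single fetch per plane, so the display can be no wider than what one fetch provides.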

Quote:
Originally Posted by grond View Post
I still think chunky-fetch is a rather trivial change once you've got an 8bit planar mode and it certainly is easier to implement than a DMA-driven c2p-converter.
Sure.
Old 03 June 2024, 11:08   #240
Karlos
Alien Bleed
 
 
Join Date: Aug 2022
Location: UK
Posts: 4,480
16 pixels wide you say? That's almost as good as the first generation chunky copper hacks

I take this back, whenever I see AB3D1. No matter how blocky it was, 12-bit RGB is still sweet.
 

