04 June 2024, 14:56 | #261 | |
Registered User
Join Date: Jul 2023
Location: Domsjö/Sweden
Posts: 59
|
Quote:
NTSC 320x200x8: Case 0: 70fps, Case 1: 66fps, Case 7: 42fps, Case 8: 47fps
PAL 320x256x8: Case 0: 69fps, Case 1: 69fps, Case 7: 42fps, Case 8: 46fps |
|
07 June 2024, 17:53 | #262 |
Registered User
Join Date: May 2013
Location: Grimstad / Norway
Posts: 854
|
BTW, does anyone remember how the Graffiti(?) made its chunky modes conversion?
How many pixels did it read before it did its conversion? This thread made me think of it, and IIRC it had a slightly convoluted pixel ordering/layout. I expect the reason for that was limited memory before the bit-shuffle was done; if you buffered a whole line you could just split the bits out while getting input and then dump the whole thing at the end of the line?
So I was thinking about making a linear output of bits for something like Graffiti that would make a straight non-shuffled chunky source possible, and I think there should be at least a few ways to do so:
SHRES, 2 bitplanes, FMODE=3, 64 pixels wide display, 8 byte modulo, bitplane 2 starting 8 bytes after bitplane 1, copper used to restart display every 64 pixels.
HRES, 4 bitplanes, FMODE=1, 32 pixels wide display, 16 byte modulo, bitplanes start 4 bytes after each other, copper used to restart display every 32 pixels.
These would require the device to buffer 64 or 32 pixels worth of data, and I suspect the Graffiti only cares about 16. But someone could run with the idea and scale a device up to 32/64/320... |
08 June 2024, 02:20 | #263 |
Registered User
Join Date: Oct 2020
Location: Bicester
Posts: 2,038
|
found a few moments to shoehorn the random thunk I had into the test Karlos wrote (brain damaged out most of the known / consistent tests)
this only has:
0: Vanilla Fast to Chip Copy, 8 longwords at a time.
1: Akiko C2P (Limit) Register to Akiko to Register throughput, CACR Write Allocation Disabled.
2: Akiko C2P (Naive) Chunky read from Fast, planar write to Chip, CACR Write Allocation Disabled.
3: Akiko C2P (AL_Buffer) Chunky read from Fast, planar write to Chip, register buffer to/from Akiko, CACR Write Allocation Disabled.
4: Kalms C2P (c2p1x1_8_c5_030_2) Chunky read from Fast, planar write to Chip.
run it from ram disk to be on the safe side. give it a go if you are brave. |
08 June 2024, 11:40 | #264 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
If possible, run while viewing a 320*256*8 bit screen as that will simulate the chip ram contention.
If you do find a useful speed up on 030/50MHz I'm all ears btw. Until then, I'm focusing on audio for a bit .. ("all ears" for video related issues, "focusing on" for audio. Weird phraseology) Last edited by Karlos; 08 June 2024 at 12:59. |
08 June 2024, 17:51 | #265 |
Registered User
Join Date: Oct 2020
Location: Bicester
Posts: 2,038
|
|
15 June 2024, 14:39 | #266 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
A wild thought appears:
Consider that the limit with Akiko is the transfer of converted planes from its register to chip RAM. GRIND only uses a 4-bit display depth. On an unexpanded CD32, you still have to write 8 longs to Akiko, but you only have to transfer 4 back out. So, it might give a significant boost to the already decent framerate on CD32. |
16 June 2024, 18:57 | #267 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
FWIW, to integrate Akiko into Grind requires a few assumptions on my part.
1. Chunky Pixel Framebuffer is 1 byte per pixel.
2. Just the lowest 4 bits are set in each byte.
3. It doesn't actually matter what the upper bits are.
4. Framebuffer is arranged in rows.
The Akiko workflow first involves detecting it, which is pretty easy. There's a magic value to read from a particular address. The code is in my test repo.
The C2P loop is trivial:
You read 8 consecutive longs from the chunky buffer, writing each one into the single register address. That's 32 input pixels.
You read back 4 consecutive longs from the address. Each long holds 32 bits for one of your planes. You write those to your chip ram planes.
You don't have to read back all 8, and planes are returned lowest plane first.
Despite the 14MHz bus, reading and writing the Akiko register location is basically uncontended; it's only chip ram writes that are the problem. That's it. |
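The loop described in this post can be sketched in portable C. This is only an illustration under stated assumptions: `akiko_model_convert` is a hypothetical software stand-in for the hardware register (on a real CD32 it would be a series of writes to, and reads from, the Akiko register address, which is not modelled here), `c2p_row` is an invented helper name, and the bit order within each plane word (first pixel in bit 31) is assumed from the usual planar convention rather than taken from the post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of the conversion, NOT the hardware interface:
 * takes 32 chunky pixels (one byte each) and produces 8 plane words,
 * lowest plane first. Assumes the first pixel lands in bit 31. */
static void akiko_model_convert(const uint8_t chunky[32], uint32_t planes[8])
{
    for (int p = 0; p < 8; ++p) {
        uint32_t word = 0;
        for (int i = 0; i < 32; ++i) {
            word = (word << 1) | ((uint32_t)(chunky[i] >> p) & 1u);
        }
        planes[p] = word;
    }
}

/* Convert one row of `width` pixels (a multiple of 32) into `depth` planes.
 * plane_rows[p] points at the destination row of plane p. Only the lowest
 * `depth` plane words are copied out, mirroring "write 8 longs, read back 4";
 * whatever sits in the upper bits of each pixel is simply never used. */
static void c2p_row(const uint8_t *chunky, size_t width,
                    uint32_t *plane_rows[], int depth)
{
    assert(depth >= 1 && depth <= 8);
    assert(width % 32 == 0);
    for (size_t x = 0; x < width; x += 32) {
        uint32_t planes[8];
        akiko_model_convert(chunky + x, planes);
        for (int p = 0; p < depth; ++p) {
            plane_rows[p][x / 32] = planes[p];
        }
    }
}
```

With `depth` set to 4, each group of 32 pixels costs four destination writes instead of eight, which is the saving the post is pointing at.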
16 June 2024, 19:07 | #268 |
Registered User
Join Date: Feb 2017
Location: Denmark
Posts: 1,239
|
Isn't grind 2x2? I also seem to recall that it uses (or used) an optimized "chunky" layout that would save or help one or more passes. Doesn't mean it's impossible or even hard to adapt, but I doubt implementing it will be as easy as it was in AB3DII (or say Breathless).
It's interesting to think about for sure, but my suspicion is that it will take a bit of effort (maybe much!) to get working, and would only be a late addition (probably via a plugin architecture thing where you might also be able to get better visuals on 030+ or something). |
16 June 2024, 19:38 | #269 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
I think it's 1x2, but that could be an artefact.
I also haven't bothered implementing it in AB3D2 because I haven't really seen any evidence that it would help by the time you are on 030/50, which is the absolute MVP processor option (1x2 pixels, 2/3 size). Last edited by Karlos; 16 June 2024 at 20:38. |
16 June 2024, 20:15 | #270 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
It goes without saying that if it is 2x2, Akiko may not be such a great idea.
|
16 June 2024, 23:40 | #271 | |
Pixelglass/Reimagine
Join Date: Jun 2012
Location: Athens
Posts: 1,059
|
Quote:
You say using Akiko isn't a good idea for 2x2? Also has anyone tested KK's blitter driven C2P against the others? Obviously not ideal for 030+ but generally speaking. |
|
17 June 2024, 00:19 | #272 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
The reason it may not be such a good idea is that you'd have to do some pixel doubling either on the input to Akiko, or the output. Depending on how this is done, it might nullify any gains. As @paraj says, the current C2P method is quite specialised and what you said here also gives me pause.
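To make the cost of output-side doubling concrete, here is a minimal C sketch that widens a 16-bit planar word to 32 bits by repeating every bit. It is not Grind's method (the thread says that uses a specialised layout); `double_bits16` is a name invented for this example, and on a real 68k a lookup table might well be cheaper than the shift-and-mask steps shown.

```c
#include <stdint.h>

/* Horizontally double a 16-bit planar word: bit n of the input becomes
 * bits 2n and 2n+1 of the result. */
static uint32_t double_bits16(uint16_t v)
{
    uint32_t x = v;
    /* Spread the 16 input bits so each one sits on an even bit position. */
    x = (x | (x << 8)) & 0x00FF00FFu;
    x = (x | (x << 4)) & 0x0F0F0F0Fu;
    x = (x | (x << 2)) & 0x33333333u;
    x = (x | (x << 1)) & 0x55555555u;
    /* Copy every even bit into the odd bit above it. */
    return x | (x << 1);
}
```

This work would have to be done for every plane word read back, on top of the conversion itself, which is why the doubling step could eat into any gain.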
|
17 June 2024, 19:22 | #273 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
I'm not sure I understand how Grind makes 2 horizontal pixels from 1 logical 2x2 one. I remember watching a video about the development where KK described an algorithm for that, but I'd have to revisit it.
|
18 June 2024, 02:03 | #274 | |
Registered User
Join Date: May 2018
Location: Ireland
Posts: 693
|
Quote:
I guess it's direct chipram writes and may be more efficient than Akiko when not doing 320x256, i.e. no half horizontal resolution. |
|
18 June 2024, 04:51 | #275 |
Total Chaos forever!
Join Date: Aug 2007
Location: Waterville, MN, USA
Age: 49
Posts: 2,207
|
Copper stretching is used vertically so the vertical resolution is about 100 pixels actual in Grind.
|
18 June 2024, 07:44 | #276 |
Registered User
Join Date: Feb 2010
Location: Espoo / Finland
Posts: 821
|
Because you're moving bytes around, you can pack two 4bit pixels into one byte which in turn gives you a kind of dithering basically for free.
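As a small illustration of the packing idea, the following C sketch stores two 4-bit pixel values in one byte using a plain high-nibble/low-nibble split. That split is an assumption made for the example (the post does not say how the two pixels are arranged inside the byte), and the function names are hypothetical.

```c
#include <stdint.h>

/* Pack two 4-bit pixel values into one byte: `left` in the high nibble,
 * `right` in the low nibble. Giving the pair two different colours is what
 * yields the dither-like effect when both end up side by side on screen. */
static uint8_t pack_pair(uint8_t left, uint8_t right)
{
    return (uint8_t)(((left & 0x0Fu) << 4) | (right & 0x0Fu));
}

/* Recover the individual pixel values from a packed byte. */
static uint8_t pair_left(uint8_t packed)  { return (uint8_t)(packed >> 4); }
static uint8_t pair_right(uint8_t packed) { return (uint8_t)(packed & 0x0Fu); }
```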
|
18 June 2024, 08:34 | #277 |
Registered User
Join Date: Feb 2017
Location: Denmark
Posts: 1,239
|
I seem to recall KK writing that the most significant pixel was already in place, so I think it's laid out something like this in memory (bits):
Code:
A3 a3 A2 a2 A1 a1 A0 a0 E3 e3 E2 e2 E1 e1 E0 e0 B3 b3 B2 b2 B1 b1 B0 b0 ...

From the code it seems to at least be something close to that (but not exactly): https://github.com/Krzysiek-K/Dread-...a_framework.s/ |
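Read literally, the first byte of that layout interleaves the bits of two 4-bit values, written A and a. A minimal C sketch of building such a byte is shown below; it only illustrates the layout as written in the post, which is itself described as approximate, and `interleave_nibbles` is a name made up for this example.

```c
#include <stdint.h>

/* Build one byte in the order A3 a3 A2 a2 A1 a1 A0 a0 (most significant
 * bit first) from two 4-bit values `upper` (A) and `lower` (a). */
static uint8_t interleave_nibbles(uint8_t upper, uint8_t lower)
{
    uint8_t out = 0;
    for (int bit = 3; bit >= 0; --bit) {
        out = (uint8_t)((out << 1) | ((upper >> bit) & 1u));
        out = (uint8_t)((out << 1) | ((lower >> bit) & 1u));
    }
    return out;
}
```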
18 June 2024, 16:49 | #278 | |
Registered User
Join Date: Sep 2019
Location: Gdansk / Poland
Posts: 135
|
Quote:
But the columns are interleaved in Dread engine, so it's not followed by B3 but A3[1] - or however you would denote the logical pixel just below A. |
|
18 June 2024, 17:02 | #279 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,571
|
|
19 June 2024, 11:51 | #280 | |
Registered User
Join Date: Jul 2014
Location: Warsaw/Poland
Posts: 195
|
Quote:
|
|