07 April 2013, 21:16 | #1 |
Registered User
Join Date: Apr 2013
Location: paris
Posts: 133
|
Blitter fill timing
Hi,
I just saw a really cool demo from the "Revision 2013" party ( http://pouet.net/prod.php?which=61182 ). It's an Amiga 500 (OCS chipset) demo featuring some really nice effects. In the end scroller, the author explains some of the tricks he uses, and I'm curious about one thing. In a glenz vector part, the author claims he has to use a bitplane trick because the blitter is not able to fill a three-bitplane screen at 50 Hz. I'm an Atari ST programmer. I coded Amiga stuff too (never released), but that was on an A1200, so I guess the timings are not the same. Can someone tell me exactly what percentage of a VBL a complete 3D fill of a 320*256 screen takes, in one bitplane, on an A500 OCS? (I didn't find that on Google.) Thanks in advance, Leonard / OXYGENE |
08 April 2013, 10:58 | #2 |
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,502
|
A1200 chipset DMA timing is exactly the same as long as FMODE=0. The CPU can be much faster (instruction cache, faster instructions and a 32-bit wide bus to chip memory).
A 320x256 single-bitplane blitter fill takes about 70 scanlines if all DMA slots are free. I'd estimate (too lazy to calculate anything) that it is possible to fill 3 planes in one frame (with a 3-bitplane display visible), but there would be no time left for anything else. It gets much worse if overscan is used. |
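As a rough sanity check on the 70-scanline figure, here is a back-of-envelope sketch in Python. It assumes about 227 DMA cycles per PAL scanline and 3 blitter cycles per word for a fill (the minimum quoted later in this thread). The function and constant names are made up for illustration, and the estimate ignores any DMA slots taken by bitplanes, sprites, audio or refresh, so treat it as a lower bound only.

```python
import math

CYCLES_PER_LINE = 227      # assumption: DMA cycles in one PAL scanline
FILL_CYCLES_PER_WORD = 3   # assumption: minimum blitter cycles per word for a fill


def fill_scanlines(width_px, height_px, planes=1,
                   cycles_per_word=FILL_CYCLES_PER_WORD):
    """Estimate how many scanlines of DMA time a blitter fill needs,
    assuming every DMA slot is free for the blitter."""
    words = math.ceil(width_px / 16) * height_px * planes
    return words * cycles_per_word / CYCLES_PER_LINE


print(round(fill_scanlines(320, 256), 1))   # 67.7 -> "about 70 scanlines"
print(round(fill_scanlines(352, 272), 1))   # 79.1 -> one overscan bitplane
```

With a display visible the blitter loses slots to bitplane DMA, so the real numbers would be higher than these.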
08 April 2013, 14:01 | #3 |
Registered User
Join Date: Apr 2013
Location: paris
Posts: 133
|
Thanks Toni.
So when we see classic "glenz vector" objects, such as the famous 48-face glenz vector of the HardWeird demo, it's only possible because the glenz vector does not cover the complete screen (maybe it's 200*200 pixels?). If it covered the whole screen, there would not be enough blitter time to draw 320*256 pixels. Other questions: imagine the CPU is filling some memory. Does that slow down the blitter a bit, or is the CPU on totally different cycles from the blitter? In other words, would it be possible to draw some polygons with the CPU for free while the blitter does all its work, driven by a pre-computed copper list? |
08 April 2013, 14:23 | #4 |
Registered User
Join Date: Apr 2013
Location: paris
Posts: 133
|
Oh BTW, there are 312 scanlines of time per VBL in PAL, am I right? So three bitplanes * 70 scanlines = 210 scanlines to fill 3*320*256, which leaves 102 free scanlines, right? (About 1/3.)
|
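The frame-budget arithmetic above can be written out as a tiny Python sketch. The 312-line frame and the 70 lines per plane are the figures from the posts. The function name is hypothetical, and the 70-line figure was quoted for the case where all DMA slots are free, so with a 3-bitplane display visible the real remainder would be smaller.

```python
def frame_budget(planes, lines_per_plane=70, lines_per_frame=312):
    """Return (lines used by fills, lines left, fraction of frame left)."""
    used = planes * lines_per_plane
    free = lines_per_frame - used
    return used, free, free / lines_per_frame


used, free, fraction = frame_budget(3)
print(used, free, f"{fraction:.0%}")   # 210 102 33%
```

That remaining third still has to cover clearing the buffers and drawing the lines, which is why the estimate of "no time left for anything else" is plausible.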
08 April 2013, 16:37 | #5 |
Registered User
Join Date: Dec 2011
Location: Northamptonshire, UK
Age: 41
Posts: 1,236
|
Very impressive demo. I'm curious to know how they did it too.
|
09 April 2013, 07:38 | #6 | ||
Rock Lobster
Join Date: Nov 2012
Location: Macclesfield
Age: 49
Posts: 40
|
Quote:
Quote:
Note that the blitter can slow down the CPU if the code is running from chip RAM. If the blitter-nasty bit is set it can even stop the CPU, a feature I rely upon in the plasmas near the beginning of the demo. Last edited by Paradroid; 09 April 2013 at 07:51. |
||
09 April 2013, 08:10 | #7 |
Rock Lobster
Join Date: Nov 2012
Location: Macclesfield
Age: 49
Posts: 40
|
Ah, just realised Leonard is the same guy I was talking to about this via email.
For the benefit of others, here's some of the relevant info I'd passed on...
> "blitter can't fill 3 bitplanes at 50hz"
That was referring to overscan bitplanes. Redux uses a 352x272 display most of the time, and the blitter wouldn't even be able to clear and fill 2 bitplanes at that resolution, even when using the CPU to help with the clear (well, it might just about do it, but not when drawing a lot of lines too). The area I'm filling with the blitter is clamped around the object, which is why I'm able to keep it at 50Hz.
> But then you say that when the clip arrives, you switch to a four-bitplane blitter routine.
At this point I've switched to a smaller display area, 192x192. At this size I can fill 4 bitplanes in just under half a frame, leaving the other half for clearing (which uses both blitter and CPU) and drawing the lines.
> How much time does the blitter need to fill a one-bitplane, 320*256 pixel screen?
It's totally dependent on what else is active and using the DMA bus, such as the number of active bitplanes, audio, sprites, etc. If you stick to a 2-bitplane display, write a fast clear and don't draw too many lines, you could stay at 50Hz at that size. I just about managed it in the demo Deja-vu with a full-screen glenz (it only needed 2 bitplanes because you couldn't see the outline of the object), but I just couldn't get it fast enough in overscan... Hmmm, that was a very long time ago, maybe I should try again ^_^ |
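The idea of clamping the fill area around the object could be sketched like this. It is only an illustration, assuming the fill area is the word-aligned bounding box of the projected vertices and that the object is at least partly on screen. `clamped_fill_area` and its parameters are hypothetical names, not code from the demo.

```python
def clamped_fill_area(points, screen_w=352, screen_h=272):
    """Return (first_word, top_line, width_in_words, height_in_lines) of the
    word-aligned bounding box around `points`, clipped to the screen.
    Assumes at least part of the object is on screen."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x0 = max(min(xs), 0)
    x1 = min(max(xs), screen_w - 1)
    y0 = max(min(ys), 0)
    y1 = min(max(ys), screen_h - 1)
    first_word = x0 // 16          # the blitter works on 16-pixel words
    last_word = x1 // 16
    return first_word, y0, last_word - first_word + 1, y1 - y0 + 1


# Hypothetical projected vertices of an object on a 352x272 screen.
verts = [(100, 60), (250, 40), (230, 200), (90, 180)]
first_word, top, width_words, height = clamped_fill_area(verts)
print(first_word, top, width_words, height)          # 5 40 11 161
print(width_words * height, "words instead of", (352 // 16) * 272)
```

For these made-up vertices the blit covers 1771 words rather than the 5984 of a full overscan bitplane, which shows why the clamp matters so much.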
09 April 2013, 10:17 | #8 |
Registered User
Join Date: Apr 2013
Location: paris
Posts: 133
|
Hi Paradroid
Yes, I'm the same guy, glad you are on this forum too! I love world records in demos (I hold some on the Atari ST) and I always thought 3D was "easy" on the Amiga. Now I see that it could be a world record to get a 320*256 glenz vector on a standard A500 OCS. Thanks for all the explanations. I see now that even the mythic HardWeird 48-face glenz is quite small on the screen. |
09 April 2013, 10:41 | #9 | |
Rock Lobster
Join Date: Nov 2012
Location: Macclesfield
Age: 49
Posts: 40
|
Quote:
Actually, IIRC I wasn't using the copper or interrupts for rendering the glenzes in Deja-vu, so it shouldn't be too hard to make them bigger, as the CPU was probably just waiting for the blitter to finish half the time. EDIT: FYI, the record for OCS glenz faces is at least 192 (see Anarchy's 3D Demo II). Doing that full screen would be a nice challenge to take on. Last edited by Paradroid; 09 April 2013 at 11:06. |
|
09 April 2013, 12:42 | #10 | |
Registered User
Join Date: Apr 2013
Location: paris
Posts: 133
|
Oh yes, but the glenz is only 2 bitplanes, as you said, right? (I mean, it works just because the shape is zoomed so that we don't see the borders.)
Quote:
BTW, could you tell me how much time it takes to CLEAR with the blitter, compared to a FILL (under the same conditions of bitplanes, sound, copper, etc.)? Is a CLEAR twice as fast as a FILL, or something else? |
|
09 April 2013, 13:53 | #11 |
Rock Lobster
Join Date: Nov 2012
Location: Macclesfield
Age: 49
Posts: 40
|
A clear would be more than double the speed of a fill, although even then I wouldn't usually use a pure blitter clear myself. Depending on how you draw the object, you may not need a traditional clear at all. For example, it might be quicker to redraw the lines again to wipe the old ones. That would require the fill to do a copy to another buffer rather than writing the result back to itself...
Then again, you might want to use a technique that doesn't need a fill at all. This is why I love programming the Amiga: as with every effect, there are loads of ways to go about rendering 3D, using the CPU, blitter, copper, interrupts, etc. for various tasks in various configurations and orders. So I suggest you just grab yourself a framework if you don't have one already and experiment. If all you have is blitter memory bandwidth numbers, you sure ain't going to get anywhere near the potential of the machine. |
09 April 2013, 14:09 | #12 | ||
Registered User
Join Date: Apr 2013
Location: paris
Posts: 133
|
Quote:
Quote:
I tried the built-in debugger of WinUAE. It's not really bad, but far from good enough for development. What debugger do you use when you develop Amiga stuff on Windows? |
||
09 April 2013, 14:19 | #13 |
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,502
|
Technical info here because it is always interesting!
Fill is always at least 3 blitter cycles/word. A plain clear takes 2 blitter cycles. Both have one idle cycle, which is usable by the CPU and only by the CPU. Both idle and non-idle blitter cycles require a DMA cycle that was not used by any other higher-priority DMA channel. If that cycle was not actually used by the blitter (i.e. it was a blitter idle cycle), it becomes available to the CPU. This is a very important undocumented feature that should help to optimize bitplane/blitter/CPU usage even better. BTW, the WinUAE "dma" debugger can be used to check DMA channel usage ("v" command). |
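To make those per-word figures concrete, here is a small Python sketch that splits a blit into bus cycles and idle cycles. It uses only the numbers given above (3 cycles/word for a fill as a minimum, 2 for a plain clear, one idle cycle each). The table and function names are illustrative assumptions, and real blits can take longer when other DMA channels steal slots.

```python
# op name -> (total blitter cycles per word, idle cycles per word)
# Figures as quoted in the post; the fill value is a minimum.
BLITTER_OPS = {
    "fill": (3, 1),
    "clear": (2, 1),
}


def cycle_breakdown(op, words):
    """Split a blit of `words` words into the slots it needs in total,
    the slots where the blitter really uses the bus, and the idle slots
    that can go to the CPU."""
    total, idle = BLITTER_OPS[op]
    return {
        "slots": total * words,
        "bus": (total - idle) * words,
        "cpu": idle * words,
    }


words = 20 * 256   # one 320x256 bitplane
print(cycle_breakdown("fill", words))    # {'slots': 15360, 'bus': 10240, 'cpu': 5120}
print(cycle_breakdown("clear", words))   # {'slots': 10240, 'bus': 5120, 'cpu': 5120}
```

On these numbers alone, a third of the slots in a fill and half of the slots in a clear are idle cycles that the CPU could pick up.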
09 April 2013, 14:23 | #14 |
Rock Lobster
Join Date: Nov 2012
Location: Macclesfield
Age: 49
Posts: 40
|
I just use the WinUAE debugger, plus a whole load of verification and unit testing code so I don't need to visit it too often. Oh how I miss source-level debugging, lol.
I've got a SNASM devkit here (the same one I was using when making the original RaD) that allows source-level debugging of code running on the actual hardware, but I don't have a PC old enough to put it in. For some reason I've kept my 386 and 486 mobos, but not the memory chips or power supplies zzz -_- Last edited by Paradroid; 09 April 2013 at 14:56. Reason: missing words :P |
09 April 2013, 18:42 | #15 | |
Registered User
Join Date: Apr 2013
Location: paris
Posts: 133
|
Quote:
|
|
09 April 2013, 20:52 | #16 | |
Registered User
Join Date: Jan 2012
Location: USA
Posts: 372
|
Quote:
This suggests that triple buffering is worth trying when using area fill:
Buffer 1 -- display
Buffer 2 -- area fill poly
Buffer 3 -- CPU clear with MOVEMs |
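The buffer rotation this implies could look roughly like the sketch below. It only models which buffer has which role on each frame. `TripleBuffer` and its attribute names are hypothetical, and the actual display pointer setup, blitter fill and MOVEM clear are left out.

```python
from collections import deque


class TripleBuffer:
    """Tracks which of three screen buffers is displayed, filled by the
    blitter, and cleared by the CPU in the current frame."""

    def __init__(self, buffers):
        if len(buffers) != 3:
            raise ValueError("exactly three buffers are required")
        self._ring = deque(buffers)

    @property
    def display(self):
        return self._ring[0]

    @property
    def fill(self):
        return self._ring[1]

    @property
    def clear(self):
        return self._ring[2]

    def next_frame(self):
        # filled buffer -> display, cleared buffer -> fill,
        # old display buffer -> clear
        self._ring.rotate(-1)


tb = TripleBuffer(["A", "B", "C"])
for _ in range(3):
    print(tb.display, tb.fill, tb.clear)
    tb.next_frame()
```

Each buffer is cleared in one frame, drawn and filled in the next, and shown in the one after, so the blitter fill and the CPU clear never touch the same memory in the same frame.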
|
09 April 2013, 21:15 | #17 | ||
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,502
|
Quote:
Quote:
|
||
15 April 2013, 18:49 | #18 |
Registered User
Join Date: Apr 2013
Location: paris
Posts: 133
|
Hi Toni,
I'm working on a test version of doing 3D on an A500 (OCS). I ran some tests, showing timings using raster colours (old-school). I just wonder how accurate WinUAE is. I mean, I did all my timing tests in WinUAE (I don't have an A500). I'm interested in the blitter interrupt (the blitter is running, the CPU too, and the blitter interrupt is used). Do you think I can "trust" WinUAE in this configuration? (I set "cycle exact".) |
15 April 2013, 19:01 | #19 |
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,502
|
A500 cycle-exact "should" have perfect timing but I am 100% sure there are some CPU instructions that have wrong cycle usage. Chipset timing should be perfect.
I don't recommend blitter interrupts, at least if there are lots of small blits. 68000 exceptions (including interrupts) have a long "startup", ~50 cycles or so, and that does not even include saving/restoring registers and the RTE. Small blits get finished before the interrupt even starts. |
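To get a feel for what "small" means here, the following Python sketch converts the interrupt startup cost into blitter words. The ~50 CPU cycles come from the post, and the per-word blitter cycles come from earlier in the thread. The ratio of 2 CPU cycles per blitter cycle is an assumption for a stock 7 MHz A500, and the function name is made up. It also assumes the blitter gets every DMA slot.

```python
CPU_CYCLES_PER_BLITTER_CYCLE = 2   # assumption: 7 MHz 68000 vs. chipset DMA cycle
IRQ_ENTRY_CPU_CYCLES = 50          # "~50 cycles or so", from the post


def words_done_during_irq_entry(blitter_cycles_per_word,
                                irq_cpu_cycles=IRQ_ENTRY_CPU_CYCLES):
    """How many words the blitter could process while the 68000 is
    still busy getting into the interrupt handler."""
    blitter_cycles = irq_cpu_cycles / CPU_CYCLES_PER_BLITTER_CYCLE
    return blitter_cycles / blitter_cycles_per_word


print(round(words_done_during_irq_entry(3), 1))   # fill: 8.3 words
print(round(words_done_during_irq_entry(2), 1))   # clear: 12.5 words
```

Under these assumptions, a fill of fewer than about eight words is shorter than the interrupt entry alone, before any register saving or the RTE is counted.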
15 April 2013, 22:15 | #20 | |
Registered User
Join Date: Apr 2013
Location: paris
Posts: 133
|
Quote:
The CPU interrupt overhead seems very long compared to the Atari ST, but if you confirm it's normal, then I have to take that into account. Thanks! |
|