15 April 2013, 23:40 | #21 | |
Registered User
Join Date: Jan 2012
Location: USA
Posts: 372
|
Quote:
If you're doing 3d, my guess is that the latency you're seeing is a result of interrupts occurring during the instruction just before a DIVS or MULS. Worst case is interrupt during MOVEM.L m->D0-D7/A0-A7 with DIVS in prefetch queue. Well over 300 cycles on a 68000. |
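A rough back-of-envelope check of the "well over 300 cycles" figure, following the scenario in the post. The cycle counts are assumptions taken from the commonly quoted 68000 timing tables (MOVEM.L memory-to-registers at 12 + 8 per register, DIVS at up to 158, interrupt exception processing at about 44), and the function names are made up for illustration. Chip RAM contention is ignored, and it would only make the number larger.

```python
# Back-of-envelope worst-case interrupt latency on a 68000.
# All cycle counts are assumptions from commonly quoted timing tables.

DIVS_WORST = 158       # assumed worst case for DIVS with a register source
INTERRUPT_ENTRY = 44   # assumed cost of interrupt exception processing


def movem_l_mem_to_regs(n_regs: int) -> int:
    """Assumed timing for MOVEM.L (An),<list>: 12 cycles + 8 per register."""
    return 12 + 8 * n_regs


def worst_case_latency(n_regs: int = 16) -> int:
    """MOVEM.L of n_regs registers, then a DIVS, then the interrupt entry."""
    return movem_l_mem_to_regs(n_regs) + DIVS_WORST + INTERRUPT_ENTRY


if __name__ == "__main__":
    print(worst_case_latency())  # D0-D7/A0-A7 = 16 registers
```

Under these assumptions the total is 342 cycles, which is consistent with the estimate above.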
|
20 April 2013, 21:56 | #22 |
Posts: n/a
|
One trick about clearing: for a lot of 3D stuff, I (and probably many other people) did the clearing with the linedraw instead of clearing the full buffer, so it worked like this:
1. Draw lines with xor into an empty screen-size buffer (back buffer).
2. Blitter fill from the back buffer into a front buffer. The size of this blit is calculated by the overlap of the bounding box for the current frame's object and the bounding box for whatever was drawn on this front buffer last frame. I often recalculated the bounding box for each bitplane, as it sometimes saved a little extra.
3. Draw the lines with xor again into the back buffer. This will clear it.
Also, about the 3D Demo II glenz, I cheated the heck out of it. If I remember correctly, I pre-rotated the vertices, and I may even have precalced the front/back facing. I was just trying to see how many lines I could push through by using the blitter linedraw. I did, however, have quite a few CPU cycles left over, so I added 2 small texturemapped faces on there, if you watch carefully. I didn't understand back then why the CPU work didn't hurt the blitter time, but Toni's explanation makes perfect sense.
Also, big props to Paradroid for using bitplane pointers for the third bitplane. It works perfectly because his glenz (like most others) are exactly 5 colors (4+bg). In 3D Demo II I used 7 colors (6+bg), so it wouldn't have worked for me.
Also, I don't think I ever used blitter interrupts. I was a big fan of copper blits. If you want to use blitter interrupts, I think it makes sense to use them just for the big blits (fill and clear operations), but not for every line etc.
BTW, it's of course really hard to use copper blits if you have effects that run slower than 50fps. Also, if you have raster bars only at some parts of the screen (so not on every single line), you can still use copper blits during parts of the screen, and interrupt blits during the rest.
Last edited by prowler; 20 April 2013 at 23:59. Reason: Back-to-back posts merged; please use the Edit button. |
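The bounding-box step in the clearing trick above could be sketched roughly as follows. This is only an illustration, not the poster's code: it reads "overlap" as the union of the two boxes (the fill has to cover both the new object and whatever was left from last frame), and the names `Box`, `union` and `blit_size` are hypothetical. Widths are rounded out to 16-pixel words because the blitter works on whole words.

```python
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    x0: int  # left edge in pixels, inclusive
    y0: int  # top line, inclusive
    x1: int  # right edge in pixels, exclusive
    y1: int  # bottom line, exclusive


def union(a: Box, b: Box) -> Box:
    """Smallest box covering both a and b (assumed meaning of 'overlap')."""
    return Box(min(a.x0, b.x0), min(a.y0, b.y0),
               max(a.x1, b.x1), max(a.y1, b.y1))


def blit_size(box: Box) -> Tuple[int, int, int]:
    """Return (first_word, width_in_words, height_in_lines) for the blit."""
    first_word = box.x0 // 16
    end_word = (box.x1 + 15) // 16  # exclusive, rounded up to a whole word
    return first_word, end_word - first_word, box.y1 - box.y0


if __name__ == "__main__":
    current = Box(100, 40, 220, 180)   # this frame's object
    previous = Box(90, 50, 200, 190)   # what this front buffer showed last frame
    print(blit_size(union(current, previous)))
```

Doing this per bitplane, as the post mentions, would simply mean keeping one pair of boxes per plane.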
21 April 2013, 09:25 | #23 |
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,553
|
Use copper to trigger interrupts. Interrupt routine can do multiple CPU blits "normally". (Of course this also gets too slow if you need lots of separate interrupts but normally you only need one or two)
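As a rough illustration of the copper-triggered interrupt idea, here is a small sketch that builds the copper list words in Python rather than assembly. It assumes the usual encodings: a WAIT is `(vpos << 8) | hpos | 1` followed by the mask `$FFFE`, a MOVE is the register offset followed by the data word, and writing `$8010` to INTREQ (`$DFF09C`) sets the COPER interrupt request. The helper names are hypothetical, and the sketch only handles lines 0-255.

```python
INTREQ = 0x09C       # custom register offset from $DFF000
SET_COPER = 0x8010   # SET/CLR (bit 15) + COPER (bit 4)


def cop_wait(vpos: int, hpos: int = 0) -> list:
    """Copper WAIT for a beam position, comparing all position bits."""
    return [((vpos & 0xFF) << 8) | (hpos & 0xFE) | 1, 0xFFFE]


def cop_move(reg: int, value: int) -> list:
    """Copper MOVE of a 16-bit value into a custom register."""
    return [reg & 0x1FE, value & 0xFFFF]


def build_list(trigger_lines) -> list:
    """One WAIT + interrupt request per trigger line, then end of list."""
    words = []
    for line in sorted(trigger_lines):
        words += cop_wait(line)
        words += cop_move(INTREQ, SET_COPER)
    words += [0xFFFF, 0xFFFE]  # conventional end-of-list wait
    return words


if __name__ == "__main__":
    print([f"${w:04X}" for w in build_list([0x40, 0xA0])])
```

The level 3 interrupt handler would still have to check and acknowledge the COPER bit before starting its blits, and the interrupt has to be enabled in INTENA. Neither is shown here.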
|
21 April 2013, 22:01 | #24 | |
ex. demoscener "Bigmama"
Join Date: Jun 2012
Location: Fyn / Denmark
Posts: 1,635
|
Quote:
Last edited by hooverphonique; 22 April 2013 at 10:16. |
|
22 April 2013, 19:28 | #25 | |
Rock Lobster
Join Date: Nov 2012
Location: Macclesfield
Age: 50
Posts: 40
|
Quote:
Once you start clipping an object you can no longer use a single plane to represent the shape, as the inner and outer outlines will be different. That means you have to draw the entirety of the inner and outer surfaces to separate planes, which in my effect's case meant 2 planes each. For example: (image not preserved)
Of course I could have drawn some holes into the planes represented on the right to get more colours into it, but that would have caused me a major headache for the glenz that zooms in at the beginning, which uses the copper/bitplane spans to define the convex shape of the entire object.
Last edited by Paradroid; 22 April 2013 at 19:34. |
|
29 April 2013, 10:32 | #26 |
Amos Basic
Join Date: Feb 2013
Location: Orleans | France
Age: 49
Posts: 85
|
This discussion is so insanely cool and full of the technical details I'm so fond of.
|
13 May 2013, 22:36 | #27 | |
Registered User
Join Date: Jan 2012
Location: USA
Posts: 372
|
Quote:
The visualization of DMA channel allocation is great! Thanks for this! Highly recommended.
Your comments about blitter clears and the A channel are especially interesting. The visualization shows that blitter clears are mostly a waste of time unless they're done during the vertical overscan areas. I noticed some programs/demos that carefully timed this to make the most of the available accesses.
Also interesting is the number of empty DMA slots in some programs, especially Atari ST conversions. Starglider, even though there's some blitter usage, really doesn't do much to overlap blits with computation, leaving the bus idle more than it needs to be. And Starglider II doesn't appear to use blitter area fill at all, which surprised me, though sprites are used (unlike Starglider I).
Anyway, incredibly enlightening. Made my weekend! |
|
14 May 2013, 22:04 | #28 | |
ex. demoscener "Bigmama"
Join Date: Jun 2012
Location: Fyn / Denmark
Posts: 1,635
|
Quote:
|
|
14 May 2013, 22:32 | #29 | |
Registered User
Join Date: Jan 2012
Location: USA
Posts: 372
|
Quote:
At one time I had my Amiga hooked up to an old rough black and white CRT via the composite output. Anything above the top visible line or below the bottom visible line disappeared under the edges of the display, so for me, anything outside those 200 lines (NTSC) was overscan. |
|
15 May 2013, 21:08 | #30 | ||
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,553
|
Quote:
Quote:
This also explains why copper-started blits are optimal: very high chip bus usage, both CPU and blitter can run at the same time, and cycles are never wasted on blitter waits. It would be interesting to see how much time different programs waste on CPU blitter waits. The result may be quite unexpected... |
||
26 February 2014, 08:38 | #31 | ||
Registered User
Join Date: Jan 2012
Location: USA
Posts: 372
|
Not sure how I missed answering this...
Quote:
Maybe I just misunderstood, but the current emulator seems to support the idea. Starglider shows this. It does nearly the worst possible thing to clear the buffer, btw. It starts a buffer clear with the blitter at around vpos 5 and busy waits so that about half way through the clear it hits the first visible scan line and starts running at a little faster than half speed (or quarter speed compared to MOVEMs+blitter running in the vertical overscan/blanking area.) |
||
26 February 2014, 10:56 | #32 |
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,553
|
Yeah, it is wrong, I am not sure what I was thinking. Blitter cycles (idle or not) always require a free cycle. Blitter idle cycles are usable by the CPU.
Cycle diagram with 4 lores planes + D clear would be:
PDP-PDP-PDP-PDP-
(P = bitplane, D = blitter D, - = blitter idle cycle, CPU can use it)
-> It is always a waste of free cycles if a program starts a D clear and then immediately starts waiting for the blitter. |
21 August 2014, 19:55 | #33 |
Registered User
Join Date: Jul 2014
Location: Warsaw/Poland
Posts: 192
|
Toni, what about the cycle diagram for lines without active video DMA (top/bottom border)?
Will it be: -D---D---D---D-- or -D-D-D-D- ? Thanks |
21 August 2014, 21:27 | #34 | |
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,553
|
Quote:
= -D-D-D-D.. |
|
21 August 2014, 22:59 | #35 |
Registered User
Join Date: Jan 2012
Location: USA
Posts: 372
|
Not sure if it's obvious or not, Cyprian, but each of those DMA cycles is two CPU cycles long.
Since some of those blitter cycles are idle cycles during a clear, there are times, like during overscan/blanking, when the CPU can run at full speed while the blitter clears at the same time. Some have even used both the CPU's MOVEM instruction and the blitter in combination to clear buffers up to twice as fast as with the CPU alone. For example, the DMA sequence would look something like:
DwDwDwDwDwD
a a a a a a
where 'w' is a CPU write to memory, 'D' is the D channel of the blitter writing to memory, and 'a' is approximately when the CPU puts the address on the bus during the first two CPU cycles of a CPU memory access.
I think one of the things that prevented programmers from getting the most out of the Amiga early on was an overemphasis on the "odd cycle/even cycle" description of the relation between the CPU, blitter and other DMA. The CPU and blitter are very dynamic when it comes to accessing chipram. A three-plane display with a blitter clear might look like this:
DwD2w3D1wDw2D3w1DwD2w3D1wDw2D3w1DwD2w3D1
In this case the blitter and CPU take turns using the DMA cycle opened up by the missing fourth plane. The odd/even model suggests that this is impossible, but it happens on real hardware. |
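The slot diagrams in this thread can be reproduced with a very small toy model. This is a sketch for illustration only, not how the hardware or WinUAE arbitrates the bus: it assumes display DMA owns fixed slots, and every remaining free slot is handed out in strict rotation to whoever is listed in `users`. The function name `fill_free_slots` and the '.' marker for a free slot are made up, and the three-plane slot layout `...2.3.1` is inferred from the diagram in the post above.

```python
from itertools import cycle


def fill_free_slots(display: str, users: str) -> str:
    """Replace each '.' (free slot) with the next character from users, in rotation."""
    turn = cycle(users)
    return "".join(next(turn) if slot == "." else slot for slot in display)


if __name__ == "__main__":
    # Border lines, no bitplane DMA: blitter idle cycle, then D write.
    print(fill_free_slots("." * 8, "-D"))
    # 4 lores planes ('P') + D clear: blitter D write, then blitter idle cycle.
    print(fill_free_slots("P.P." * 4, "D-"))
    # 3 planes + D clear + CPU writes ('w') using the blitter's idle cycles.
    print(fill_free_slots("...2.3.1" * 5, "Dw"))
```

With five free slots per eight in the three-plane case, the blitter and CPU swap phase on every group of eight slots, which is exactly the pattern that the fixed odd/even picture cannot produce.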
22 August 2014, 00:03 | #36 | |
Registered User
Join Date: Jul 2014
Location: Warsaw/Poland
Posts: 192
|
Quote:
We know that in this case the D channel can write data to memory every second memory slot. I'm just wondering why it can't do that during the bitplane area, like this: PDPDPDPDPDPDPDPD Is it caused by that? If yes, what is behind that strategy? Why does it need idle cycles during the bitplane area? Thanks |
|
22 August 2014, 08:10 | #37 |
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,553
|
|
22 August 2014, 09:47 | #38 | |
Registered User
Join Date: Jan 2012
Location: USA
Posts: 372
|
Quote:
Are there any known issues with blitter/copper DMA happening in the cycle before a disk write DMA cycle? It would be interesting if during disk write DMA the previous cycle was an idle cycle, too. Edit: well, during a disk read I guess since a read fills memory. Last edited by mc6809e; 22 August 2014 at 21:00. Reason: Duh |
|
23 August 2014, 01:27 | #39 |
Moderator
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,650
|
If you mean the twist-scroller, it's filled not with the normal vector filling method but with a copy-and-xor-to-line-above blit.
If you mean filled vectors, well, a fill blit takes exactly the same time as a copy blit ($9f0) of the same area. The formula for calculating the time is in the Hardware Reference Manual, or you could just do it and measure the time by setting and clearing the background color. Something like, IDK, 60-80 scanlines maybe (out of 312)? That is if you do it during the vertical blanking, when there's no competing bitplane DMA running. Copy blits (and therefore fill blits) leave cycles free for the CPU, so you can do things while it fills (hint hint!) |
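A quick way to sanity-check a scanline estimate like this is to count words and DMA cycles. The sketch below is an approximation under stated assumptions, not the HRM formula: it assumes roughly 227 DMA slots per PAL scanline, no competing DMA, a single 320x256 bitplane, and a per-word cost passed in as `cycles_per_word` (2 for a plain A to D copy, or 3 if the fill is assumed to cost an extra idle cycle per word). The function name is hypothetical.

```python
def blit_scanlines(width_px: int, height: int, cycles_per_word: int,
                   slots_per_line: int = 227) -> float:
    """Rough blit duration in scanlines, assuming no competing DMA."""
    words = (width_px // 16) * height        # 16 pixels per blitter word
    return words * cycles_per_word / slots_per_line


if __name__ == "__main__":
    for cpw in (2, 3):
        lines = blit_scanlines(320, 256, cpw)
        print(f"{cpw} cycles/word: about {lines:.0f} scanlines")
```

Measuring with the background colour, as suggested above, remains the more reliable approach, since the real figure depends on what else is using the bus.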
23 August 2014, 08:34 | #40 | |
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,553
|
Quote:
This explains why the CPU has free cycles. A normal copy will not give any cycles to the CPU (if the nasty bit is set). The HRM diagram is only correct if there is no other DMA activity, no fill and no linedraw. (EDIT: Only the very first, rare HRM revision has the extra fill information!) Last edited by Toni Wilen; 23 August 2014 at 08:56. |
|