22 September 2018, 13:12 | #1 |
Registered User
Join Date: Oct 2017
Location: Sunderland, England
Posts: 2,702
|
Blitter engine working on Interrupt Request
Hi All,
I'm trying to get the most out of my sprite engine and wondered if any of the experienced coders here could help by explaining how a blitter/bob engine works with interrupt requests? The aim is to not have the CPU waiting for the blitter all the time; I do quite a lot of large blits in my project and I want the CPU to be getting on with other things in the background if possible. Cheers, Geezer |
22 September 2018, 14:53 | #2 |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
|
Hi geezer, just some hints, because the solutions can vary and are sometimes not so productive.
Practically, what you want to avoid is the blitter wait code. You can simply build a growing queue containing the values to be written into the blitter's registers, and write an IRQ routine that checks the BLIT bit during IRQ3 (level 3 is shared between copper, VBL and blitter). Of course you must have set the same bit in INTENA so that an interrupt is raised when the blitter finishes its job. In the IRQ code, if the IRQ source is confirmed, read the values from the blitter queue (filled by the normal main code routine) and write all the hardware registers, with BLTSIZE last of course. The blitter starts as usual. Then you can remove the entry at the head of the queue and set the pointer to the next values. It is better to do the acknowledge (move.w #$40,INTREQ) before the BLTSIZE write, because if you have BLTPRI set and only chip/slow RAM, the blitter operation can end before the INTREQ write. Then you're good to RTE. But there is a fundamental point that often moves you away from this method: the latency of IRQ management. Apart from the cycles needed to enter the routine, there is also the saving of the registers and the various checking code during the IRQ. But there is a Holy Grail: use the copper's ability to wait for blitter completion (WAIT with the BFD bit cleared) and use the very same copper to set up the blitter registers. In practice this is very seldom implemented, because it is really hard to set up the copper (with the various video-synchronised effects) and at the same time enqueue blitter commands. Good job! Last edited by ross; 22 September 2018 at 15:02. Reason: some grammar..., be patient with my english :( |
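A minimal sketch of the queue-plus-interrupt scheme described in the post above, not a tested implementation. It assumes the system has been taken over, the handler is installed on the level 3 autovector, the BLIT bit is enabled in INTENA, and every blit goes through the queue. The entry layout and the names `Level3Handler`, `KickBlitter`, `QueueHead`, `QueueTail`, `BlitActive` and `BlitQueue` are hypothetical; only the register offsets are the standard custom-chip ones.

```asm
CUSTOM		equ	$dff000
INTREQR		equ	$01e
INTREQ		equ	$09c
BLTCON0		equ	$040
BLTSIZE		equ	$058
BLTCMOD		equ	$060
BLTAMOD		equ	$064

; One queue entry = 34 bytes, stored in hardware register order:
;   6 longs  BLTCON0/1, BLTAFWM/ALWM, BLTCPT, BLTBPT, BLTAPT, BLTDPT
;   2 longs  BLTCMOD/BMOD, BLTAMOD/DMOD
;   1 word   BLTSIZE

Level3Handler:
	movem.l	d0/a0-a2,-(sp)
	lea	CUSTOM,a2
	move.w	INTREQR(a2),d0
	btst	#6,d0			; BLIT request pending?
	beq.s	.other
	move.w	#$0040,INTREQ(a2)	; acknowledge BLIT before starting the next blit
	move.l	QueueHead,a0
	cmpa.l	QueueTail,a0
	beq.s	.empty			; nothing left to start
	lea	BLTCON0(a2),a1
	moveq	#6-1,d0
.copy:	move.l	(a0)+,(a1)+		; $040-$057: control, masks, C/B/A/D pointers
	dbf	d0,.copy
	move.l	(a0)+,BLTCMOD(a2)	; BLTCMOD + BLTBMOD
	move.l	(a0)+,BLTAMOD(a2)	; BLTAMOD + BLTDMOD
	move.w	(a0)+,BLTSIZE(a2)	; written last: this starts the blit
	move.l	a0,QueueHead		; drop the entry that was just started
	bra.s	.done
.empty:	sf	BlitActive		; chain has stopped, main code must kick it again
	bra.s	.done
.other:	; VERTB / COPER handling would go here (they share level 3)
.done:	movem.l	(sp)+,d0/a0-a2
	rte

; Main code: call after appending an entry and advancing QueueTail.
KickBlitter:
	tst.b	BlitActive
	bne.s	.running		; the handler will reach the new entry by itself
	st	BlitActive
	move.w	#$8040,CUSTOM+INTREQ	; SET + BLIT: raise the interrupt by software
.running:
	rts

QueueHead:	dc.l	BlitQueue
QueueTail:	dc.l	BlitQueue
BlitActive:	dc.b	0
		even
BlitQueue:	ds.b	34*32		; room for 32 entries (assumed size)
```

The queue here is linear rather than circular, so the main code would have to reset `QueueHead` and `QueueTail` once the queue has drained (for example once per frame). The acknowledge is done before the BLTSIZE write, as the post recommends.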
22 September 2018, 15:17 | #3 | |
Moderator
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,602
|
Quote:
And if it's on, chances are you are not waiting for the Blitter. I.e. the CPU doesn't get control back until the bob has been blitted anyway. You can test this by making a blitwait that sets the background color before waiting and resets it after. If the color slivers are only 1px high and not wider than say 1/6th of a scanline, what you're seeing is the execution time of just 1 loop of the blitwait: it's already finished and you're waiting for nothing. You can also count the repetitions of the blitwait loop and reset the counter each frame. |
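The colour test described above could be sketched like this. It is only an illustration: `BlitWaitTimed`, `WaitLoops` and `BG_COLOR` are made-up names, a6 is assumed to hold the custom base ($dff000), and the copper list is assumed not to overwrite COLOR00 while the wait is running.

```asm
DMACONR		equ	$002
COLOR00		equ	$180
BG_COLOR	equ	$0000		; assumed normal background colour

; a6 = custom base. Trashes d0.
BlitWaitTimed:
	move.w	#$0f00,COLOR00(a6)	; red for as long as we sit in the wait
	moveq	#0,d0			; d0 counts repetitions of the loop
	tst.b	DMACONR(a6)		; dummy read, commonly done for early Agnus revisions
.wait:	addq.w	#1,d0
	btst	#6,DMACONR(a6)		; bit 14 of DMACONR = blitter busy
	bne.s	.wait
	move.w	#BG_COLOR,COLOR00(a6)	; back to the normal background
	add.w	d0,WaitLoops		; main loop reads and clears this once per frame
	rts

WaitLoops:	dc.w	0
```

If the red slivers stay tiny and `WaitLoops` only grows by one per call, the blit had already finished when the wait was reached.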
|
22 September 2018, 15:29 | #4 |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
|
Yes, the BLTPRI bit can completely change your coding style and flow.
It is the same difference as thinking single-task versus multitask (this is not always true, because many blitter modes do not use all the bus cycles, but you can view it as a general rule) |
22 September 2018, 16:10 | #5 | |
Registered User
Join Date: Oct 2017
Location: Sunderland, England
Posts: 2,702
|
Quote:
Edit: Oddly, if I set BLTPRI on, my large blits take longer to complete. |
|
22 September 2018, 16:44 | #6 | ||
Natteravn
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,496
|
Quote:
Quote:
|
||
22 September 2018, 17:01 | #7 | |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
|
Quote:
A blitter wait, or something that has the same effect, should always be used, even with BLTPRI set. Well, if you optimize something to death for the A500, the missing wait can be acceptable [EDIT: and then the poor WHDLoad coder needs to patch your code for the speedy people..] Last edited by ross; 22 September 2018 at 17:14. |
|
22 September 2018, 17:09 | #8 | |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
|
Quote:
Are you sure you're not reading the timing while the blitter is still running? Remember that with BLTPRI=0 the processor gets cycles interleaved with the blitter on the internal bus. |
|
22 September 2018, 17:22 | #9 |
Registered User
Join Date: Oct 2017
Location: Sunderland, England
Posts: 2,702
|
I've done a little video showing what is going on.
The background bitplanes are done in 3 blits; I set the colour to red, green or blue prior to blitting each plane respectively. The first run is without the blitter priority set and the second is with it. As you can see, without it I don't see any CPU clocks for the first blit. I have cycle-exact set in WinUAE too, so I'm not sure what is going on. |
22 September 2018, 17:23 | #10 |
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,506
|
You still need to wait for blit even when running in chip ram with blitter nasty because not all channel mode combinations use all cycles. (for example D only, most fill modes, line draw)
Also, due to CPU prefetch and blitter pipelining, there are 3-4 cycles available before the blitter "really starts" after the write to BLTSIZE, which allows the CPU to execute the following instruction, at least partially. |
22 September 2018, 17:46 | #11 | |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
|
Quote:
If you do the wait before the blit (as indeed it must be done), then you actually change the color (the yellow) practically immediately. If you do the blitter wait after the blit (the right way to take a timing), then you get the latter effect, showing the actual time of the blit (with BLTPRI=1 the internal bus is hogged by the blitter, so you cannot write to the color register, and the effect is the same as a blitter wait done after the blit). As said before, you still have the blitter active after the third blit. Last edited by ross; 22 September 2018 at 18:10. |
|
22 September 2018, 18:02 | #12 | |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
|
Quote:
(a6=custombase)
	tst.w	(a6)
	tst.w	(a6)
	tst.w	(a6)
	move.w	#something,blitter_reg(a6)
|
|
22 September 2018, 18:04 | #13 |
Moderator
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,602
|
Yes, if BLTPRI is off, control will return to the CPU after a few cycles (and will then execute the instructions that measure the blit time prematurely).
Blitwaits must always precede a blit. For measurement purposes (or other result-use purposes such as collision detection), you should temporarily add a Blitwait after, as well. |
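As a rough illustration of that rule, the wait sits directly in front of the first blitter register write, and a second wait after BLTSIZE is only assembled in while measuring. The `MEASURE` switch, the `WAITBLIT` macro name, `DoCopyBlit` and its register usage (a0 source, a1 destination, d0/d1 modulos, d2 BLTSIZE value, a6 custom base) are assumptions made for this sketch.

```asm
DMACONR	equ	$002
BLTCON0	equ	$040
BLTAFWM	equ	$044
BLTAPTH	equ	$050
BLTDPTH	equ	$054
BLTSIZE	equ	$058
BLTAMOD	equ	$064
BLTDMOD	equ	$066

MEASURE	equ	0			; set to 1 while timing blits (hypothetical switch)

WAITBLIT macro
	tst.b	DMACONR(a6)		; dummy read, commonly done for early Agnus revisions
.w\@:	btst	#6,DMACONR(a6)		; bit 14 of DMACONR = blitter busy
	bne.s	.w\@
	endm

DoCopyBlit:
	WAITBLIT			; previous blit must be done before its registers change
	move.l	#$09f00000,BLTCON0(a6)	; BLTCON0 = A->D copy, BLTCON1 = 0
	move.l	#$ffffffff,BLTAFWM(a6)	; no first/last word masking
	move.l	a0,BLTAPTH(a6)
	move.l	a1,BLTDPTH(a6)
	move.w	d0,BLTAMOD(a6)
	move.w	d1,BLTDMOD(a6)
	move.w	d2,BLTSIZE(a6)		; starts the blit
	ifne	MEASURE
	WAITBLIT			; only for timing: CPU sits here until the blit is done
	endc
	rts
```

With `MEASURE` at 0 the CPU carries on straight after BLTSIZE and only stalls if it reaches the next blit too early.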
29 September 2018, 17:54 | #14 |
Registered User
Join Date: May 2017
Location: AmigaLand
Posts: 459
|
Just wondering, is it necessary to set BLTPRI on? I mean, if the blitter uses all 3 sources it will leave very few cycles free for the CPU. The advantage of an IRQ-driven blitter is to simulate "multitasking" between blit operations and other calculations on the CPU.
The answer would obviously be yes if the target were an Amiga 1200 with fast RAM or an accelerator, but I'm not sure (I mean I don't know the answer) with a stock A1200. |
29 September 2018, 22:45 | #15 | |
Registered User
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,410
|
Quote:
Now make no mistake, your blit will take longer if you do this (it'll suffer a 25% or so penalty). However, your overall system performance will likely go up, more so if your useful work includes expensive instructions such as multiplications/divisions (on 68000 shifts/rotates as well). However, if you can't or can't easily do useful work during a blit and instead need to spend a large percentage of the blit waiting on it to finish, then keeping BLTPRI on is always the better option. What I personally do is have my blitwait macro set BLTPRI to ON at the start, wait for the blit to finish regularly (because a fast processor might outrun this switch) and then after waiting switch it back off. Then, I only call the blitwait macro immediately prior to setting the actual blitter registers. This way - in my experiments anyway - performance tends to be highest. |
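Read literally, the macro described above might look something like this sketch: BLTPRI on, an ordinary busy-wait loop, BLTPRI off again. The macro name and the use of `\1` for the register holding the custom base are assumptions; the $8400/$0400 values are SET+BLTPRI and BLTPRI for DMACON.

```asm
DMACONR	equ	$002
DMACON	equ	$096

; \1 = address register holding the custom base ($dff000)
BLITWAIT_NASTY macro
	move.w	#$8400,DMACON(\1)	; SET + BLTPRI: blitter gets priority over the CPU
.bw\@:	btst	#6,DMACONR(\1)		; ordinary wait, in case the CPU outruns the switch
	bne.s	.bw\@
	move.w	#$0400,DMACON(\1)	; clear BLTPRI so the CPU can share the bus again
	endm
```

It would be invoked as `BLITWAIT_NASTY a6` immediately before the first blitter register write of the next blit.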
|
30 September 2018, 10:05 | #16 | |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
|
Quote:
Considering that disabling BLTPRI at the end of the macro is itself an internal bus access, you can remove a tst.w and possibly get even more performance. [only for blits that use all the cycles] |
|
01 October 2018, 00:03 | #17 | |
Registered User
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,410
|
Quote:
Code:
BlitWait MACRO
	move.w	#$8400,dmacon(\1)
	tst.w	dmaconr(\1)
	move.w	#$0400,dmacon(\1)
	ENDM
|
|
15 January 2019, 19:17 | #18 | |||
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
|
I'm reopening this thread because I have an interesting case study and maybe Toni could provide further explanations.
In a message he wrote: Quote:
Quote:
Obviously this is for the case of a blitter sequence that covers all the available cycles (for example ABCD, ABD, ACD; I don't consider the others, since BLTPRI would leave me with free cycles anyway). Graphic example:
start blitter -> - - A0 B0 C0 - A1 B1 C1 D0 A2 B2 C2 D1 D2
with tst.w (as X):
start blitter -> X X A0 B0 C0 X A1 B1 C1 D0 A2 B2 C2 D1 D2
So no wait blit needed. Then roondar wrote: Quote:
Then today a surprise. I have a very tight blitter routine with the A, B and D channels active and with this blitter wait: Code:
	move.w	#$8400,DMACON(A6)
	tst.w	(a6)
	move.w	#$0400,DMACON(A6)
	move.l	a0,BLTBPTH(a6)
If I use: Code:
	move.w	#$8400,DMACON(A6)
	tst.w	(a6)
	tst.w	(a6)
	move.w	#$0400,DMACON(A6)
	move.l	a0,BLTBPTH(a6)
I'm curious to understand why that kind of blitter wait always worked for roondar instead. Is there any particular situation (his or mine) that gives different results? I'm on the latest WinUAE 4.1.0, 030 with custom x16 frequency (but this frequency is not significant; the bottleneck for instructions that require few cycles is only the internal bus access), all CPU caches active, CE options active, no Wait for or Immediate Blitter (so at maximum real-machine compatibility for an expanded emulated machine). |
|||
15 January 2019, 20:27 | #19 |
Registered User
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,410
|
The fault is entirely mine, I worded my reply poorly.
I tested only a few combinations (AD and ABCD, to be exact). These use all available cycles even at the start, and they work. It obviously wasn’t clear that I only meant this particular use case and hadn’t tested all options. |
15 January 2019, 20:44 | #20 |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
|
Hi roondar, mine was not a criticism of you
I just want to understand what the best way is for this particular blithog/blitwait "combo". This: Code:
	move.w	#$8400,DMACON(A6)
	tst.w	(a6)
	tst.w	(a6)
	move.w	#$0400,DMACON(A6)
|