24 March 2011, 12:28 | #1 |
Registered User
Join Date: Aug 2008
Location: Salisbury
Posts: 744
|
Blitter queues
I've been thinking about blitter queues lately. Looking online, I've not been able to find any examples of it so far.
My guess is that it uses the same interrupt as the copper/vblank interrupt (level 3?). I've seen in the AHRM that you can set the blitter up to trigger this interrupt when it has finished. Am I right in thinking that, in this scenario, you cannot then use a copper or vblank interrupt, as it's being used for the blitter queue? I'm also guessing that a blitter queue is only really any good when the blits are quite large? Any thoughts or suggestions would be great |
24 March 2011, 12:46 | #2 | ||
move.l #$c0ff33,throat
Join Date: Dec 2005
Location: Berlin/Joymoney
Posts: 6,863
|
Quote:
Quote:
They can also be useful when you have lots of small blits. |
||
24 March 2011, 12:47 | #3 |
gone
Join Date: Apr 2007
Location: completely gone
Posts: 1,596
|
You can use both at the same time - it's just, when the interrupt is triggered, you need to check where the interrupt came from (check relevant bits in intreqr) and deal with it accordingly.
EDIT: Doh! Beaten by The Stinger! |
03 April 2011, 21:57 | #4 |
Registered User
Join Date: Aug 2008
Location: Salisbury
Posts: 744
|
Right, wrote a blitter queueing system. My problem now is the level 3 interrupt.
Firstly, the entire demo type thing I'm working on runs all the main code on the vblank interrupt. The blit queues are added to during the vblank interrupt. However, while it is executing this, the interrupt for the blitter finishing does not start until the vblank routine is finished. Is there a way of allowing the blitter-finished interrupt to interrupt the vblank routine while it's running? Hope that makes sense? Last edited by h0ffman; 03 April 2011 at 21:58. Reason: whoops |
04 April 2011, 01:15 | #5 |
Registered User
Join Date: Nov 2006
Location: Stockholm, Sweden
Posts: 237
|
I recommend that you move your processing code out of the VBLANK. Let the VBLANK handler just set a flag whenever it is triggered, that is all. The code outside of the VBLANK does all the heavy lifting and queues up blits.
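A minimal sketch of that arrangement, in the same 68000 assembler style as the rest of the thread. The names lev3_handler, vblank_flag, main_loop and do_frame are made up for illustration, and it assumes the system has already been taken over, the handler is installed at the level 3 autovector, and VERTB is the only level 3 source handled here (BLIT dispatch is left out to keep it short):

```asm
; Hypothetical sketch: the VERTB handler only sets a flag,
; the main loop does all the heavy lifting.
INTREQ          equ     $dff09c
INTF_VERTB      equ     $0020

lev3_handler:
        move.w  #INTF_VERTB,INTREQ      ; acknowledge/clear VERTB
        move.w  #INTF_VERTB,INTREQ      ; repeated as a precaution on fast CPUs
        st      vblank_flag             ; tell the main loop a new frame has started
        rte

main_loop:
.wait   tst.b   vblank_flag             ; spin until the handler sets the flag
        beq.s   .wait
        sf      vblank_flag             ; consume the flag
        bsr     do_frame                ; per-frame work, including queueing blits
        bra.s   main_loop

do_frame:                               ; placeholder for the real per-frame code
        rts

vblank_flag:
        dc.b    0
        even
```

Because the handler returns almost immediately, the interrupt priority mask is only raised for a handful of instructions, so a blitter-finished interrupt is never held off for long.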
Then your BLIT interrupts will work wonderfully without any extra complications.
Or you can decide to handle nested interrupts...
First, some background. Let's take the A500 with a 68000 as an example.
When the hardware raises VERTB, the VERTB bit gets set in INTREQ(R). INTREQ(R) is a register in one of the Amiga's custom chips. It is not part of the 68000 CPU.
Then there are a bunch of logic gates (again in the custom Amiga hardware), which take all the bits from INTREQ, mask them against INTENA, locate the highest-priority bit that is set in the result, decode that into a 68000-style Interrupt Priority Level (a value from 0 through 7), and then feed that value to the Interrupt Priority Level (IPL0-IPL2) pins on the processor. (See M68000UM chapter 3.5 "Interrupt Control".) This operation is redone as soon as INTREQ/INTENA changes.
So, the IPL0-IPL2 input pins of the processor always have the priority of the current highest-priority requested interrupt.
(The entire following section is detailed in M68000UM 6.3.2 "Interrupts".)
Now, every time the CPU completes processing of an instruction, it compares the value on the IPL0-IPL2 pins against the value in the Interrupt Priority Mask (I0-I2) bits in the Status Register (see M68000PRM 1.3.2 "Status Register"). If the value on IPL0-IPL2 is less than or equal to the value in I0-I2, then nothing happens - the CPU continues processing normally. However, if IPL0-IPL2 has a higher value, then exception processing is invoked. This means:
* SR is backed up internally in the CPU
* SR is modified (Supervisor mode on, trace mode off, I0-I2 raised to IPL0-IPL2 level)
* exception handler vector is fetched from $60+IPL*4
* SR is stored to stack
* PC of the next instruction to be executed is stored on stack
* PC is set to the exception handler vector location
... and then, the CPU will process the next instruction which the PC points at.
So what does this mean?
Well, since I0-I2 got raised, the CPU is in a state where it ignores any further external interrupt requests of the same or lower priority.
A while later, the exception processing code will reach an RTE. Since the RTE instruction restores PC + SR, the I0-I2 bits in SR will drop back to the level they were at before the exception processing was invoked.
Now. Let's say that you want to allow BLIT interrupts while the VERTB interrupt is being processed.
The blitter will set the BLIT bit in INTREQ(R). BLIT raises IPL to 3. Same as VERTB. But if the CPU is inside of the VERTB handler, then the Interrupt Priority Mask is already set to 3. Therefore any BLIT interrupts are ignored.
So what do you do? Lower the Interrupt Priority Mask.
Code:
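mylev3handler:
        move.w  #$2000,sr               ; re-enable interrupts
        .. do BLIT/VERTB processing ..
        rte
On its own this does not work: the VERTB bit is still set in INTREQ(R) at this point, so the handler is re-entered as soon as the mask is lowered.
So first acknowledge/clear INTREQ(R), then lower interrupt priority. New try:
Code: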
mylev3handler:
        move.w  d0,-(sp)
        move.w  intreqr+$dff000,d0
        and.w   #INTF_VERTB|INTF_BLIT|INTF_COPER,d0
        btst    #INTB_VERTB,d0
        bne.s   .handle_vertb
        btst    #INTB_BLIT,d0
        bne.s   .handle_blit
        move.w  d0,intreq+$dff000       ; acknowledge anything else (COPER or nothing)
        move.w  (sp)+,d0
        rte
.handle_vertb
        move.w  #INTF_VERTB,intreq+$dff000      ; acknowledge/clear interrupt
        move.w  #$2000,sr                       ; re-enable interrupts
        .. VERTB processing ..
        move.w  (sp)+,d0
        rte
.handle_blit
        move.w  #INTF_BLIT,intreq+$dff000       ; acknowledge/clear interrupt
        .. BLIT processing ..
        move.w  (sp)+,d0
        rte
This can still fail, because the write to INTREQ, and the subsequent recomputation of IPL, takes a long time for the slow parts of the A4000 hardware to complete. (Which are the slow parts? Well, all the hardware that doesn't go faster when a turbocharged 68060 board is plugged in.)
So even though you do clear the VERTB flag in INTREQ, and the CPU is supposed to wait for the custom chips to acknowledge that the write to INTREQ has completed, it still takes "some time" until the logic gates which compute the new value for IPL0-IPL2 complete their work, and the CPU indeed sees that there is no longer a level-3 interrupt being requested.
So, there is a race condition between these two lines:
Code:
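        move.w  #INTF_VERTB,intreq+$dff000      ; acknowledge/clear interrupt
        move.w  #$2000,sr                       ; re-enable interrupts
One way to give the hardware more time is to write to INTREQ twice before lowering the priority mask:
Code: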
        move.w  #INTF_VERTB,intreq+$dff000      ; acknowledge/clear interrupt
        move.w  #INTF_VERTB,intreq+$dff000      ; acknowledge/clear interrupt
        move.w  #$2000,sr                       ; re-enable interrupts
Note that you should write your interrupt handler such that it handles spurious interrupts (your handler gets entered even though there is no corresponding bit set in INTREQ), and it should not lose interrupts if multiple interrupt requests (say a COPER and a BLIT) get set in INTREQ(R) at exactly the same moment.
Now. Exact choice of method is left to you. Godspeed. |
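As an illustration of that last note, here is one possible shape for a level 3 handler that tolerates spurious entries and services every pending source in a single pass. This is only a sketch, not the code from the post above: on_vertb, on_blit and on_coper are hypothetical routines that may trash d0-d1/a0-a1 but must preserve d2, and the handler is assumed to run without lowering the interrupt priority mask.

```asm
; Hypothetical level 3 handler: each pending, enabled source is
; acknowledged and serviced; if nothing is pending it just returns.
INTENAR         equ     $dff01c
INTREQR         equ     $dff01e
INTREQ          equ     $dff09c
INTB_COPER      equ     4
INTB_VERTB      equ     5
INTB_BLIT       equ     6
INTF_COPER      equ     $0010
INTF_VERTB      equ     $0020
INTF_BLIT       equ     $0040

lev3_handler:
        movem.l d0-d2/a0-a1,-(sp)
        move.w  INTREQR,d2              ; snapshot of requested interrupts
        and.w   INTENAR,d2              ; keep only the sources that are enabled

        btst    #INTB_VERTB,d2
        beq.s   .no_vertb
        move.w  #INTF_VERTB,INTREQ      ; acknowledge before servicing
        move.w  #INTF_VERTB,INTREQ      ; second write as a precaution on fast CPUs
        bsr     on_vertb
.no_vertb
        btst    #INTB_BLIT,d2
        beq.s   .no_blit
        move.w  #INTF_BLIT,INTREQ
        move.w  #INTF_BLIT,INTREQ
        bsr     on_blit
.no_blit
        btst    #INTB_COPER,d2
        beq.s   .no_coper
        move.w  #INTF_COPER,INTREQ
        move.w  #INTF_COPER,INTREQ
        bsr     on_coper
.no_coper
        movem.l (sp)+,d0-d2/a0-a1       ; spurious entry falls straight through to here
        rte

on_vertb:                               ; placeholders for the real service routines
        rts
on_blit:
        rts
on_coper:
        rts
```

A request that arrives after the snapshot was taken stays latched in INTREQ, so it simply causes another entry once the RTE has restored the old priority mask.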
04 April 2011, 09:13 | #6 |
gone
Join Date: Apr 2007
Location: completely gone
Posts: 1,596
|
@ Kalms - what a great explanation.
@ h0ffman - when you say you've written a blitter queueing system, what are the mechanics of something like that...? You're using blitter interrupts to signal "blitter done" rather than manually waiting for the blitter. So, in your blitter interrupt processing code you keep some kind of counter that you read off to give an indication as to which values to load the blitter registers with for the next blit operation...? I've never done / thought about doing a blitter queue and, as usual, I'm interested in how things work. |
04 April 2011, 11:58 | #7 |
Registered User
Join Date: Aug 2008
Location: Salisbury
Posts: 744
|
@ Kalms - cheers for the explanation, of course I opted for the easy option but thanks very much for that, might come in handy one day.
@PMC - The blitter queue system I've made is probably pretty inefficient at the moment, but I wanted something that would make it easy to translate my current blits into the queue.
I have two data areas. The first is a temp area, $70 bytes in size. The reason is that it maps from $dff000 to $dff070, which gets you to the end of the blit registers. I also have a queue area which is $2a bytes multiplied by the number of concurrent queued blits you want. Doing it this way means all you have to do to queue a blit is swap the $dff000 for the temp data area and then call the blit queue routine.
The blit queue routine then picks up the next available slot in the queue and dumps the data for the blit onto it. At this point it also checks to see if the blitter is live; if not, it starts the next waiting blit. The interrupt then confirms the last blit in the queue is done, checks for the next available blit and kicks it off if it's ready. The queue kinda rotates, so when you get to the end of your slots it goes round to the beginning again.
Additionally I added a separate call to the blit queue which returns an address in A0. This is so that if you need to know when that specific blit has finished, you can store the pointer and check it later.
I was quite pleased with the method, although running on cycle exact it's looking like it takes over a whole raster line to queue the next blit. Think I can speed that up by doing a movem on and off the queue. However, more importantly, the blits I'm queueing aren't working properly at the moment, so it needs more debugging. In a way this method is using more CPU time for the blits, but there are a lot of CPU savings from not hanging around waiting for the blitter to finish.
Actually, I have a quick question regarding the blitter. Do any of the registers need to be written to as words? Currently I'm dumping the calls with a move.l, and the only blit that seems to work is the clear screen!
Edit - with the exception of blitsize of course... that's the last word that needs to be written. Last edited by h0ffman; 04 April 2011 at 12:47. |
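For readers who want to see the general idea in code, here is a rough sketch of a rotating blit queue. It is not h0ffman's implementation: the 36-byte slot layout, the label names and the routine split are all assumptions made for illustration. It stores only the registers a blit needs and writes BLTSIZE last, since that write is what starts the blitter.

```asm
; Hypothetical ring-buffer blit queue. Assumed slot layout (36 bytes):
;   +0  BLTCON0/BLTCON1     +4  BLTAFWM/BLTALWM
;   +8  BLTCPT   +12 BLTBPT   +16 BLTAPT   +20 BLTDPT
;   +24 BLTCMOD/BLTBMOD     +28 BLTAMOD/BLTDMOD
;   +32 BLTSIZE             +34 padding
BLTCON0         equ     $dff040
BLTSIZE         equ     $dff058
BLTCMOD         equ     $dff060
BLTAMOD         equ     $dff064
SLOT_SIZE       equ     36
NUM_SLOTS       equ     8
QUEUE_BYTES     equ     SLOT_SIZE*NUM_SLOTS

; queue_blit: a0 = filled-in 36-byte record.
; Returns d0 = 0 if queued, -1 if the queue is full.
; Call with the BLIT interrupt disabled.
queue_blit:
        movem.l a0-a1,-(sp)
        moveq   #-1,d0
        cmp.w   #NUM_SLOTS,q_count
        beq.s   .exit                   ; no free slot
        lea     queue,a1
        adda.w  q_tail,a1
        moveq   #SLOT_SIZE/4-1,d0
.copy   move.l  (a0)+,(a1)+             ; copy the record into the tail slot
        dbf     d0,.copy
        move.w  q_tail,d0
        bsr     next_slot
        move.w  d0,q_tail
        addq.w  #1,q_count
        tst.b   blit_active
        bne.s   .queued                 ; blitter busy: the interrupt will pick it up
        bsr     start_next
.queued moveq   #0,d0
.exit   movem.l (sp)+,a0-a1
        rts

; next_slot: d0 = slot offset in, next slot offset out (wraps around).
next_slot:
        add.w   #SLOT_SIZE,d0
        cmp.w   #QUEUE_BYTES,d0
        bne.s   .ok
        moveq   #0,d0
.ok     rts

; start_next: start the oldest queued blit, or mark the blitter idle.
; Called from queue_blit and from the BLIT interrupt after acknowledging it.
start_next:
        tst.w   q_count
        bne.s   .go
        sf      blit_active
        rts
.go     movem.l d0/a0-a1,-(sp)
        lea     queue,a0
        adda.w  q_head,a0
        lea     BLTCON0,a1
        moveq   #6-1,d0
.regs   move.l  (a0)+,(a1)+             ; six longs cover $dff040-$dff057
        dbf     d0,.regs
        move.l  (a0)+,BLTCMOD           ; BLTCMOD + BLTBMOD
        move.l  (a0)+,BLTAMOD           ; BLTAMOD + BLTDMOD
        st      blit_active
        move.w  (a0),BLTSIZE            ; written last: this starts the blit
        move.w  q_head,d0
        bsr     next_slot
        move.w  d0,q_head
        subq.w  #1,q_count
        movem.l (sp)+,d0/a0-a1
        rts

q_head:         dc.w    0               ; byte offset of the oldest queued slot
q_tail:         dc.w    0               ; byte offset of the next free slot
q_count:        dc.w    0               ; number of queued, not yet started blits
blit_active:    dc.b    0               ; software flag, not the hardware BBUSY bit
                even
queue:          ds.b    QUEUE_BYTES
```

A slot is released as soon as its values have been loaded into the blitter, since the hardware keeps its own copy from that point on. The trade-off against a straight register dump is a slightly more involved copy in exchange for never touching BLTSIZE early.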
04 April 2011, 12:45 | #8 |
gone
Join Date: Apr 2007
Location: completely gone
Posts: 1,596
|
h0ffman - adjacent word regs can be written as a single .l to the first reg in the pair which then, obviously, writes both regs:
bltcon0 & bltcon1
bltafwm & bltalwm
bltamod & bltdmod
Those are the ones off the top of my head that'll work OK like that. Of course, there's nothing stopping you writing all the regs as word writes as necessary, and non-adjacent regs may need to be written with separate word writes as required.
PS. Your blitter queue sounds pretty cool - any chance of dropping me the source code to the usual place for a little look see...? |
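To make the pairing concrete, a small fragment along these lines would set six word registers with three long writes. The values are arbitrary examples, and it assumes the blitter is idle when the writes happen:

```asm
; Illustrative only: each move.l covers two adjacent word registers.
; The high word of the long lands in the lower-addressed register.
CUSTOM          equ     $dff000
BLTCON0         equ     $040            ; BLTCON1 follows at $042
BLTAFWM         equ     $044            ; BLTALWM follows at $046
BLTAMOD         equ     $064            ; BLTDMOD follows at $066

        lea     CUSTOM,a5
        move.l  #$09f00000,BLTCON0(a5)  ; BLTCON0=$09f0 (plain A->D copy), BLTCON1=$0000
        move.l  #$ffffffff,BLTAFWM(a5)  ; BLTAFWM=$ffff, BLTALWM=$ffff (no masking)
        move.l  #$00020002,BLTAMOD(a5)  ; BLTAMOD=2, BLTDMOD=2
```

The channel pointers (BLTAPT and friends) are register pairs as well, so they are normally written with a single move.l anyway.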
04 April 2011, 12:57 | #9 |
Registered User
Join Date: Aug 2008
Location: Salisbury
Posts: 744
|
Sure mate, once I've done some lunch time debugging, i'll fire the mess over to you
|
04 April 2011, 14:58 | #10 |
Registered User
Join Date: Aug 2008
Location: Salisbury
Posts: 744
|
Found my issue: I was setting bltcon0l with each blit. As soon as I took that out, all the blits started working. I had an additional issue where the blit interrupt would stop working, but looking at Kalms' response above, I simply acknowledge the interrupt before processing my blit queue and everything was happy. Not bad for a day's work, I think.
Now I just hope that my laptop at work isn't powerful enough to run cycle exact as running my prod on here when at full pelt is slightly dropping frames |
04 April 2011, 15:33 | #11 |
gone
Join Date: Apr 2007
Location: completely gone
Posts: 1,596
|
Top skills h0ff
You're becoming a bug fixer extraordinaire! |
04 April 2011, 22:55 | #12 | |
Registered User
Join Date: Nov 2006
Location: Stockholm, Sweden
Posts: 237
|
Quote:
Also beware these two passages from HRM:
"NOTE If a blit has just been started but has been locked out of memory access because of, for instance, display fetches, this bit may not yet be set. The processor, on the other hand, may be running completely uninhibited out of FAST memory or its internal cache, so it will continue to have memory cycles."
"NOTE Starting with the Fat Agnus the blitter busy bit has been "fixed" to be set as soon as you write to BLTSIZE to start the blit, rather than when the blitter gets its first DMA cycle. However, not all machines will use these newer chips, so it is best to rely on the above method of testing."
so be careful when using the BBUSY bit to detect blitter activity. |
|
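For reference, a commonly seen form of the hardware wait loop that these notes apply to looks roughly like the following. It is a sketch rather than the HRM's own listing; the extra read before the test is the usual precaution for older Agnus chips, and wait_blit is just an illustrative name:

```asm
; Hypothetical blitter wait routine using the BBUSY bit in DMACONR.
DMACONR         equ     $dff002

wait_blit:
        tst.w   DMACONR                 ; dummy read first, for older Agnus chips
.busy   btst    #6,DMACONR              ; byte access: bit 6 of the high byte is BBUSY (bit 14)
        bne.s   .busy                   ; loop while the blitter is busy
        rts
```

A queue that tracks activity with its own software flag does not depend on this bit, but any code that falls back to polling the hardware does.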
05 April 2011, 02:57 | #13 |
Registered User
Join Date: Aug 2008
Location: Salisbury
Posts: 744
|
Cheers Kalms, its actually an internal flag, not a hardware test so hopefully I should be fine.
|
05 April 2011, 04:23 | #14 |
Registered User
Join Date: Nov 2006
Location: Stockholm, Sweden
Posts: 237
|
Ok cool. One more thing though:
Do you have any blitter-interrupt-disabling code around your "add an item to the queue and trigger blitter if necessary" code? If not, then you have a race condition there. |
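One way to close that window, sketched here under the assumption that the BLIT interrupt is normally enabled and that queue_blit is the (hypothetical) routine that adds an entry and starts the blitter if it is idle, is to mask the blitter interrupt in INTENA around the call:

```asm
; Hypothetical wrapper: keep the BLIT interrupt out while the queue is modified.
INTENA          equ     $dff09a
INTF_SETCLR     equ     $8000
INTF_BLIT       equ     $0040

add_blit_safely:
        move.w  #INTF_BLIT,INTENA               ; SET/CLR bit clear: disable BLIT interrupt
        bsr     queue_blit                      ; queue state is not touched by the handler now
        move.w  #INTF_SETCLR|INTF_BLIT,INTENA   ; SET/CLR bit set: re-enable BLIT interrupt
        rts

queue_blit:                                     ; placeholder for the real queue routine
        rts
```

A blit that finishes while the interrupt is masked is not lost: its request stays latched in INTREQ and the handler runs as soon as the interrupt is enabled again.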
29 July 2019, 21:11 | #15 | ||
Moderator
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,604
|
Quote:
Quote:
The two most important points are: an interrupt interrupts main (maybe in the middle of loading Blitter registers(!)), and a blit can under some circumstances lock out the CPU completely and delay the next (any) interrupt until finished (and not when you expect it). |
||
06 August 2019, 23:40 | #16 |
Registered User
Join Date: Oct 2015
Location: Landsberg / Germany
Posts: 526
|
Thought I'd share my experience with the interrupt-driven Blitter code I used in RESHOOT R. Until just a few weeks prior to release, this was my approach:
While this code worked just fine on lower-end machines, on a 68060 CPU it seemed to crash occasionally for reasons I never really understood. I therefore decided to change my whole approach and ditch the IRQ-driven blitter code for a more traditional approach. This code was simpler, ran more stably and was comparably fast, since the IRQ code really needs a whole lot of overhead, for example saving and restoring CPU registers at the entry and exit of each IRQ call. |
07 August 2019, 11:06 | #17 |
Registered User
Join Date: Jun 2016
Location: UK
Posts: 428
|
IRQ handling speed is a bit of a weakness on the Amiga. By the time you saved registers, figured out where the IRQ came from and jumped to the right code you might as well not have bothered in many cases. Polling is actually faster.
|
07 August 2019, 20:04 | #18 | |
Moderator
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,604
|
Quote:
There's a veritable bouquet of lovely interrupts to use on the Amiga. Use the flowers wisely (consider the overhead and don't use them for things that take < 1 scanline of time), and you maximize CPU time, simple. On other computers, it's just not possible to get this tight a fit of the glove of performance to the hand of the hardware specs. Last edited by Photon; 07 August 2019 at 20:14. |
|
08 August 2019, 11:58 | #19 | |
Registered User
Join Date: Jun 2016
Location: UK
Posts: 428
|
Quote:
Obviously you must pipeline as best you can, rather than just start the blitter and then busy-wait. If you do it right it's going to be a lot more efficient than entering and exiting an interrupt. The issue is the 68000: it's just not designed for fast interrupt response. Some other CPUs, especially 8-bit ones, are much better. |
|
08 August 2019, 12:21 | #20 |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,474
|
The fastest way to run Blitter queues is plainly with the Copper, using BFD.
But it is also very complicated and has heavy side effects. There is a related thread on the board somewhere. |