English Amiga Board


h0ffman 24 March 2011 12:28

Blitter queues
 
I've been thinking about blitter queues lately. Looking online, I've not been able to find any examples of it so far.

My guess is that it uses the same interrupt as the copper/vblank interrupt (level 3?). I've seen in the AHRM that you can set it up to trigger this interrupt when the blitter has finished. Am I right in thinking that in this scenario you cannot then use a copper or vblank interrupt, as it's being used for the blitter queue?

I'm also guessing that a blitter queue is only really any good when the blits are quite large?

Any thoughts or suggestions would be great :)

StingRay 24 March 2011 12:46

Quote:

Originally Posted by h0ffman (Post 744669)
My guess is that it uses the same interrupt as the copper/vblank interrupt (level 3?).

That's correct.

Quote:

Originally Posted by h0ffman (Post 744669)
Am I right in thinking that in this scenario you cannot then use a copper or vblank interrupt, as it's being used for the blitter queue?

You can still use copper or vblank interrupts, you just need to check which interrupt was triggered in INTREQR.
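To make the INTREQR check concrete, here is a minimal host-side C model rather than real Amiga code. The bit positions (COPER = 4, VERTB = 5, BLIT = 6) are the hardware's; the helper name `level3_pending` is made up for illustration. On the real machine the two inputs would be read from INTREQR ($DFF01E) and INTENAR ($DFF01C).

```c
#include <assert.h>
#include <stdint.h>

/* INTREQ/INTENA bit masks of the three sources that share level 3. */
#define INTF_COPER  0x0010u   /* bit 4: Copper             */
#define INTF_VERTB  0x0020u   /* bit 5: vertical blank     */
#define INTF_BLIT   0x0040u   /* bit 6: blitter finished   */
#define INTF_LEVEL3 (INTF_COPER | INTF_VERTB | INTF_BLIT)

/* Hypothetical helper: returns the level-3 sources that are both
   requested (INTREQR) and enabled (INTENAR). A handler would test
   each returned bit and service the matching source. */
uint16_t level3_pending(uint16_t intreqr, uint16_t intenar)
{
    return (uint16_t)(intreqr & intenar & INTF_LEVEL3);
}
```

A handler that sees both INTF_VERTB and INTF_BLIT in the result has to service both, not just the first one it tests.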


Quote:

Originally Posted by h0ffman (Post 744669)
I'm also guessing that a blitter queue is only really any good when the blits are quite large?

They can also be useful when you have lots of small blits.

pmc 24 March 2011 12:47

You can use both at the same time - it's just, when the interrupt is triggered, you need to check where the interrupt came from (check relevant bits in intreqr) and deal with it accordingly. :)

EDIT: Doh! Beaten by The Stinger!

h0ffman 03 April 2011 21:57

Right, I wrote a blitter queueing system. My problem now is the level 3 interrupt.

Firstly, the entire demo type thing I'm working on runs all the main code in the vblank interrupt. The blits are added to the queue during the vblank interrupt. However, while this is executing, the interrupt for the blitter finishing does not start until the vblank routine is finished.

Is there a way of allowing the blitter-finished interrupt to interrupt the vblank routine while it's running?

Hope that makes sense?

Kalms 04 April 2011 01:15

I recommend that you move your processing code out of the VBLANK. Let the VBLANK handler just set a flag whenever it is triggered, that is all. The code outside of the VBLANK does all the heavy lifting and queues up blits.
Then your BLIT interrupts will work wonderfully without any extra complications.
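A minimal C sketch of that structure, as a host-side model and not Amiga code. All names (`on_vertb`, `main_loop_step`, `frames_done`) are invented for the example; on real hardware `on_vertb` would be the VERTB branch of the level-3 handler and `main_loop_step` would be called in a loop from the main program.

```c
#include <assert.h>
#include <stdbool.h>

static volatile bool vblank_flag;   /* set by the handler, cleared by main */
static unsigned frames_processed;

/* Stand-in for the VERTB part of the interrupt handler: set a flag,
   nothing else, so the handler returns almost immediately. */
void on_vertb(void)
{
    vblank_flag = true;
}

/* Main-program side. Returns true if a frame's worth of work was done,
   false if no new frame has started yet. */
bool main_loop_step(void)
{
    if (!vblank_flag)
        return false;
    vblank_flag = false;
    frames_processed++;   /* heavy lifting and blit queueing would go here */
    return true;
}

unsigned frames_done(void)
{
    return frames_processed;
}
```

Because the heavy work now runs at interrupt mask 0, a BLIT interrupt can come in at any point during it without any nesting tricks.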





Or you can decide to handle nested interrupts...


First, some background. Let's take the A500 with a 68000 as an example.


When the hardware raises VERTB, the VERTB bit gets set in INTREQ(R). INTREQ(R) is a register in one of the Amiga's custom chips. It is not part of the 68000 CPU.

Then there are a bunch of logic gates (again in the custom Amiga hardware), which take all the bits from INTREQ, mask them against INTENA, locate the highest-priority bit that is set in the result, decode that into a 68000-style Interrupt Priority Level (a value from 0 through 7), and then feed that value to the Interrupt Priority Level (IPL0-IPL2) pins on the processor. (See M68000UM chapter 3.5 "Interrupt Control"). This operation is redone as soon as INTREQ/INTENA changes.

So, the IPL0-IPL2 input pins of the processor always have the priority of the current highest-priority requested interrupt.
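The priority logic described above can be modelled in a few lines of C. This is a sketch of the behaviour, not of the actual gates: the function name `amiga_ipl` is invented, the returned value is the logical level (the electrical polarity of the pins is ignored), and the INTEN master-enable bit (bit 14 of INTENA) is included, which the text above leaves out.

```c
#include <assert.h>
#include <stdint.h>

#define INTF_INTEN 0x4000u   /* bit 14 of INTENA: master enable */

/* Returns the interrupt priority level (0-7) presented to the CPU
   for a given INTREQ/INTENA state. */
int amiga_ipl(uint16_t intreq, uint16_t intena)
{
    /* CPU level of each INTREQ bit 0..13:
       TBE,DSKBLK,SOFT=1  PORTS=2  COPER,VERTB,BLIT=3
       AUD0-3=4  RBF,DSKSYN=5  EXTER=6 */
    static const int level_of_bit[14] =
        { 1, 1, 1, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 6 };
    uint16_t active;
    int level = 0;

    if (!(intena & INTF_INTEN))
        return 0;                         /* everything masked */

    active = (uint16_t)(intreq & intena & 0x3FFF);
    for (int bit = 0; bit < 14; bit++) {
        if (((active >> bit) & 1) && level_of_bit[bit] > level)
            level = level_of_bit[bit];
    }
    return level;
}
```

Note that VERTB and BLIT both map to 3, which is why the CPU cannot tell them apart without reading INTREQR.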

(The entire following section is detailed in M68000UM 6.3.2 "Interrupts".)
Now, every time that the CPU completes processing of an instruction, it will then compare the value on the IPL0-IPL2 pins against the value in the Interrupt Priority Mask (I0-I2) bits in the Status Register (see M68000PRM 1.3.2 "Status Register"). If the value on IPL0-IPL2 is less than or equal to the value in I0-I2, then nothing happens - the CPU continues processing normally. However, if IPL0-IPL2 has a higher value, then exception processing is invoked.
This means:
* SR is backed up internally in the CPU
* SR is modified (Supervisor mode on, trace mode off, I0-I2 raised to IPL0-IPL2 level)
* exception handler vector is fetched from $60+IPL*4
* SR is stored to stack
* PC of the next instruction to be executed is stored on stack
* PC is set to the exception handler vector location
... and then, the CPU will process the next instruction which the PC points at.

So what does this mean? Well, since I0-I2 got raised, then the CPU is in a state where it ignores any further external interrupt requests of the same or lower priority.

A while later, the exception processing code will reach an RTE. Since the RTE instruction restores PC + SR, then the I0-I2 bits in SR will drop down to a level where they were before the exception processing was invoked.
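Here is a toy C model of that accept/ignore decision and of what RTE undoes. It is a sketch under simplifying assumptions: the `Cpu` struct and function names are invented, the stack only keeps SR (no PC), and the special treatment of level 7 on a real 68000 is not modelled.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint16_t sr;            /* status register; bits 8-10 are I0-I2 */
    uint16_t saved_sr[8];   /* toy stand-in for the supervisor stack */
    int      depth;
} Cpu;

int cpu_mask(const Cpu *c)
{
    return (c->sr >> 8) & 7;
}

/* Called "between instructions". Returns the autovector address that
   was taken ($60 + level*4), or 0 if the request was ignored. */
uint32_t cpu_poll_irq(Cpu *c, int ipl)
{
    if (ipl <= cpu_mask(c))
        return 0;                              /* same or lower: ignored */
    assert(c->depth < 8);
    c->saved_sr[c->depth++] = c->sr;           /* SR goes to the stack   */
    /* trace off, I0-I2 cleared, then supervisor on and mask = ipl */
    c->sr = (uint16_t)((c->sr & ~0x8700u) | 0x2000u | ((unsigned)ipl << 8));
    return 0x60u + (uint32_t)ipl * 4u;
}

/* RTE restores SR, so the mask drops back to its previous value. */
void cpu_rte(Cpu *c)
{
    assert(c->depth > 0);
    c->sr = c->saved_sr[--c->depth];
}
```

With the mask at 3, a second level-3 request returns 0 (ignored) while a level-4 request is still taken; that is the whole nesting rule.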


Now. Let's say that you want to allow BLIT interrupts while the VERTB interrupt is being processed.
The blitter will set the BLIT bit in INTREQ(R). BLIT raises IPL to 3. Same as VERTB. But if the CPU is inside of the VERTB handler, then the Interrupt Priority Mask is already set to 3. Therefore any BLIT interrupts are ignored.

So what do you do? Lower the Interrupt Priority Mask.

Code:

mylev3handler:
        move.w        #$2000,sr                ; re-enable interrupts

        .. do BLIT/VERTB processing ..

        rte

Whoops, that's no good. Your machine just hung, without reaching the BLIT/VERTB processing code. Why? Because the VERTB bit is set in INTREQ(R) -> IPL0-IPL2 is set to 3 -> if you lower I0-I2 in the exception handler, then the exception immediately gets triggered again!
So first acknowledge/clear INTREQ(R), then lower interrupt priority.

New try:

Code:

mylev3handler:
        move.w        d0,-(sp)
        move.w        intreqr+$dff000,d0
        and.w        #INTF_VERTB|INTF_BLIT|INTF_COPER,d0
        btst        #INTB_VERTB,d0
        bne.s        .handle_vertb
        btst        #INTB_BLIT,d0
        bne.s        .handle_blit

        move.w        d0,intreq+$dff000
        move.w        (sp)+,d0
        rte

.handle_vertb
        move.w        #INTF_VERTB,intreq+$dff000        ; acknowledge/clear interrupt
        move.w        #$2000,sr                ; re-enable interrupts

        .. VERTB processing ..

        move.w        (sp)+,d0
        rte

.handle_blit
        move.w        #INTF_BLIT,intreq+$dff000        ; acknowledge/clear interrupt

        .. BLIT processing ..

        move.w        (sp)+,d0
        rte

That's better. Works on your A500. But not on your friend's A4000/68060. Why is that?

Because the write to INTREQ, and subsequent recomputation of IPL, takes a long time for the slow parts of the A4000 hardware to complete. (Which are the slow parts? Well, all the hardware that doesn't go faster when a turbocharged 68060 board is plugged in.)
So even though you do clear the VERTB flag in INTREQ, and the CPU is supposed to wait for the custom chips to acknowledge that the write to INTREQ has completed, it still takes "some time" until the logic gates which compute the new value for IPL0-IPL2 complete their work, and the CPU indeed sees that there is no longer a level-3 interrupt being requested.

So, there is a race condition between these two lines:

Code:

        move.w        #INTF_VERTB,intreq+$dff000        ; acknowledge/clear interrupt
        move.w        #$2000,sr                ; re-enable interrupts

You need a delay which you know is guaranteed to do the job. What you find in most code which has this problem is the following:


Code:

        move.w        #INTF_VERTB,intreq+$dff000        ; acknowledge/clear interrupt
        move.w        #INTF_VERTB,intreq+$dff000        ; acknowledge/clear interrupt
        move.w        #$2000,sr                ; re-enable interrupts

... that is, putting in an extra write to the custom chips, because that will take roughly as long regardless of how fast the CPU is. In practice it will take like 3-5 "chipset" clock cycles (if we presume the chipset clock frequency to be 7.14MHz). That ought to be enough for the change in INTREQ(R) to get crunched and propagated back to the IPL0-IPL2 pins on the CPU.

Note that you should write your interrupt handler such that it handles spurious interrupts (your handler gets entered even though there is no corresponding bit set in INTREQ) and it should not lose interrupts if multiple interrupt requests (say a COPER and a BLIT) get set in INTREQ(R) at exactly the same moment.
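As an illustration of those two requirements, here is a host-side C model of a level-3 handler. It is a sketch, not the handler from the listings above: INTREQ is simulated by a variable, the counters stand in for the real COPER/VERTB/BLIT work, and it assumes all three sources are enabled in INTENA. The only hardware fact it relies on is the INTREQ write rule (bit 15 set means "set these bits", bit 15 clear means "clear these bits").

```c
#include <assert.h>
#include <stdint.h>

#define INTF_SETCLR 0x8000u
#define INTF_COPER  0x0010u
#define INTF_VERTB  0x0020u
#define INTF_BLIT   0x0040u

static uint16_t intreq_bits;                       /* simulated INTREQ */
static unsigned n_coper, n_vertb, n_blit, n_spurious;

/* INTREQ write semantics: bit 15 selects set or clear of the other bits. */
void intreq_write(uint16_t v)
{
    if (v & INTF_SETCLR)
        intreq_bits |= (uint16_t)(v & 0x7FFF);
    else
        intreq_bits &= (uint16_t)~v;
}

uint16_t intreqr_read(void)
{
    return intreq_bits;
}

void level3_handler(void)
{
    uint16_t pending = (uint16_t)(intreqr_read()
                       & (INTF_COPER | INTF_VERTB | INTF_BLIT));

    if (pending == 0) {        /* spurious entry: nothing to do */
        n_spurious++;
        return;
    }
    intreq_write(pending);     /* acknowledge exactly what we service */

    /* Service every source that was pending, not just the first. */
    if (pending & INTF_BLIT)  n_blit++;
    if (pending & INTF_VERTB) n_vertb++;
    if (pending & INTF_COPER) n_coper++;
}
```

Acknowledging only the bits that were read means a request arriving after the read stays set in INTREQ and triggers the handler again instead of being lost.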


Now. Exact choice of method is left to you. Godspeed.

pmc 04 April 2011 09:13

@ Kalms - what a great explanation. :great

@ h0ffman - when you say you've written a blitter queueing system, what are the mechanics of something like that...?

You're using blitter interrupts to signal "blitter done" rather than manually waiting for the blitter.

So, in your blitter interrupt processing code you keep some kind of counter that you read off to give an indication as to which values to load the blitter registers with for the next blit operation...?

I've never done / thought about doing a blitter queue and, as usual, I'm interested in how things work. :)

h0ffman 04 April 2011 11:58

@ Kalms - cheers for the explanation, of course I opted for the easy option but thanks very much for that, might come in handy one day.

@PMC - The blitter queue system I've made is probably pretty inefficient at the moment, but I wanted to do something that would be easy to translate my current blits into the queue.

I have two data areas, the first is a temp area at a size of $70. The reason is that it maps from $dff000 to $dff070, which gets you to the end of the blit registers. I also have a queue area which is $2a large multiplied by the number of concurrent queued blits you want.

Doing it this way means all you have to do to queue a blit is swap the $dff000 to the temp data area and then call the blit queue routine. The blit queue routine then picks up the next available slot in the queue and dumps the data for the blit onto it. At this point it also checks to see if the blitter is live, if not then it starts the next awaiting blit.

The interrupt then confirms the last blit in the queue is done, checks for the next available blit and kicks it off if its ready.

The queue kinda rotates, so when you get to the end of your slots it goes round to the beginning again.

Additionally I added a separate call to the blit queue which returns an address in A0. This is so that if you need to know when that specific blit has finished, you can store the pointer and check it later.

I was quite pleased with the method, although running on cycle exact it's looking like it takes over a whole raster line to queue the next blit. Think I can speed that up by doing a movem on and off the queue.

However, more importantly, the blits I'm queueing aren't working properly at the moment. So it needs more debugging.
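For readers who want to see the mechanics, here is a rough C model of a rotating queue like the one described. It is a guess at the structure, not the actual source: only the $2a slot size comes from the post, while the slot count, the `done` field, the names and the full-queue check are assumptions. Starting a blit is reduced to bumping a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SLOT_BYTES 0x2a   /* register snapshot per blit, size from the post */
#define NUM_SLOTS  4      /* hypothetical queue depth */

typedef struct {
    uint8_t       regs[SLOT_BYTES];
    volatile bool done;   /* set by the interrupt when this blit finished */
} BlitSlot;

static BlitSlot queue[NUM_SLOTS];
static unsigned head, tail, count;  /* head: running/next blit, tail: next free */
static bool     blitter_live;       /* internal flag, not the hardware busy bit */
static unsigned blits_started;

static void start_blit(const BlitSlot *s)
{
    (void)s;              /* real code would load the registers from s->regs */
    blits_started++;
    blitter_live = true;
}

/* Queue one blit. Returns a handle that can be polled via ->done,
   or NULL if every slot is still in use. */
BlitSlot *queue_blit(const uint8_t *regs)
{
    BlitSlot *s;

    if (count == NUM_SLOTS)
        return NULL;
    s = &queue[tail];
    memcpy(s->regs, regs, SLOT_BYTES);
    s->done = false;
    tail = (tail + 1) % NUM_SLOTS;   /* the queue "rotates" */
    count++;
    if (!blitter_live)
        start_blit(&queue[head]);
    return s;
}

/* Blitter-finished interrupt: retire the current blit, kick the next. */
void on_blit_done(void)
{
    if (!blitter_live)
        return;                      /* nothing was running */
    queue[head].done = true;
    head = (head + 1) % NUM_SLOTS;
    count--;
    blitter_live = false;
    if (count > 0)
        start_blit(&queue[head]);
}
```

In this model a handle is only valid until its slot is reused, and `queue_blit` must not be interrupted by `on_blit_done` halfway through.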

In a way this method is using more CPU time for the blits, but there is a lot of cpu savings not hanging around waiting for the blitter to finish.

Actually I have a quick question regarding the blitter. Do any of the registers need to be written to as words? Currently I'm dumping the calls with a move.l, and the only blit that seems to work is the clear screen!

Edit - with the exception of bltsize of course... that's the last word that needs to be written

pmc 04 April 2011 12:45

h0ffman - adjacent word regs can be written as a single .l to the first reg in the pair which then, obviously, writes both regs:

bltcon0 & bltcon1
bltafwm & bltalwm
bltamod & bltdmod

those are off the top of my head that'll work OK like that. Of course, there's nothing stopping you writing all the regs as word writes as necessary and non adjacent regs may need to be written with separate word writes as required.
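A small C model of what a long write does to a pair of adjacent word registers. The register offsets are the real ones (relative to $DFF000); the register file, the function names and the values are just for illustration, and the order in which the two halves reach the bus is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Custom-chip register offsets from $DFF000. */
enum {
    BLTCON0  = 0x040, BLTCON1 = 0x042,
    BLTAFWM  = 0x044, BLTALWM = 0x046,
    BLTSIZE  = 0x058, BLTCON0L = 0x05A,   /* BLTCON0L exists on ECS only */
    BLTAMOD  = 0x064, BLTDMOD = 0x066
};

static uint16_t custom[0x100];   /* toy register file, indexed by offset/2 */

void write_word(unsigned off, uint16_t v)
{
    custom[off >> 1] = v;
}

/* A 68000 long write is big-endian: the high word lands at `off`,
   the low word at `off + 2`. */
void write_long(unsigned off, uint32_t v)
{
    write_word(off,     (uint16_t)(v >> 16));
    write_word(off + 2, (uint16_t)(v & 0xFFFF));
}

uint16_t reg(unsigned off)
{
    return custom[off >> 1];
}
```

The flip side is that a long write always hits the neighbouring register too, so in this model a long write to BLTSIZE also overwrites BLTCON0L at $05A.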

PS. Your blitter queue sounds pretty cool - any chance of dropping me the source code to the usual place for a little look see...? :)

h0ffman 04 April 2011 12:57

Sure mate, once I've done some lunch time debugging, i'll fire the mess over to you ;)

h0ffman 04 April 2011 14:58

Found my issue: I was setting bltcon0l with each blit. As soon as I took that out, all the blits started working. I had an additional issue where the blit interrupt would stop working, but looking at Kalms' response above, I now simply acknowledge the interrupt before processing my blit queue and everything is happy :) Not bad for a day's work I think :)

Now I just hope that my laptop at work isn't powerful enough to run cycle exact as running my prod on here when at full pelt is slightly dropping frames :(

pmc 04 April 2011 15:33

Top skills h0ff :)

You're becoming a bug fixer extraordinaire! :D

Kalms 04 April 2011 22:55

Quote:

Doing it this way means all you have to do to queue a blit is swap the $dff000 to the temp data area and then call the blit queue routine. The blit queue routine then picks up the next available slot in the queue and dumps the data for the blit onto it. At this point it also checks to see if the blitter is live, if not then it starts the next awaiting blit.

The interrupt then confirms the last blit in the queue is done, checks for the next available blit and kicks it off if its ready.

Marked the part in bold. Make sure you don't have a race condition there.

Also beware these two passages from HRM:

"NOTE If a blit has just been started but has been locked out of memory access because of, for instance, display fetches, this bit may not yet be set. The processor, on the other hand, may be running completely uninhibited out of FAST memory or its internal cache, so it will continue to have memory cycles."

"NOTE
Starting with the Fat Agnus the blitter busy bit has been "fixed" to be
set as soon as you write to BLTSIZE to start the blit, rather than when
the blitter gets its first DMA cycle. However, not all machines will use
these newer chips, so it is best to rely on the above method of testing."

so be careful when using the BBUSY bit to detect blitter activity.

h0ffman 05 April 2011 02:57

Cheers Kalms, it's actually an internal flag, not a hardware test, so hopefully I should be fine.

Kalms 05 April 2011 04:23

Ok cool. One more thing though:

Do you have any blitter-interrupt-disabling code around your "add an item to the queue and trigger blitter if necessary" code? If not, then you have a race condition there.
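To show what such a race can look like, here is a small C model. It is not anyone's actual queue code: the indices, the `hook` parameter (which lets the test make the blit finish at the worst moment) and all names are invented. On real hardware `irq_disable` and `irq_enable` would be writes of INTF_BLIT and INTF_SETCLR|INTF_BLIT to INTENA; a blit that finishes while masked stays latched in INTREQ and is handled on re-enable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static unsigned head, tail;        /* tail - head = blits not yet finished */
static bool     blitter_live;
static bool     blit_irq_enabled = true;
static bool     blit_irq_pending;  /* finished while masked, not yet handled */
static unsigned started;

static void start_next(void) { started++; blitter_live = true; }

static void blit_handler(void)
{
    head++;
    if (head != tail) start_next();
    else              blitter_live = false;
}

/* The running blit finishes: handler runs now, or later if masked. */
void blit_finishes(void)
{
    if (blit_irq_enabled) blit_handler();
    else                  blit_irq_pending = true;
}

static void irq_disable(void) { blit_irq_enabled = false; }
static void irq_enable(void)
{
    blit_irq_enabled = true;
    if (blit_irq_pending) { blit_irq_pending = false; blit_handler(); }
}

/* Add one blit. `hook` (may be NULL) fires between the "is the blitter
   live?" check and the publishing of the new item. */
void enqueue(bool protect, void (*hook)(void))
{
    bool was_live;

    if (protect) irq_disable();
    was_live = blitter_live;       /* check ...                   */
    if (hook != NULL) hook();      /* ... blit may finish here ... */
    tail++;                        /* ... then publish the item   */
    if (!was_live) start_next();
    if (protect) irq_enable();
}

bool stalled(void) { return tail != head && !blitter_live; }

void reset_model(void)
{
    head = tail = started = 0;
    blitter_live = blit_irq_pending = false;
    blit_irq_enabled = true;
}
```

Unprotected, the handler sees an empty queue and goes idle just before the new item is published, so that blit never starts; with the interrupt masked around the whole sequence, the deferred handler finds the item and starts it.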

Photon 29 July 2019 21:11

Quote:

Originally Posted by Kalms (Post 747241)
I recommend that you move your processing code out of the VBLANK. Let the VBLANK handler just set a flag whenever it is triggered, that is all. The code outside of the VBLANK does all the heavy lifting and queues up blits.
Then your BLIT interrupts will work wonderfully without any extra complications.

I recommend this and this

Quote:

Originally Posted by Kalms (Post 747489)
Do you have any blitter-interrupt-disabling code around your "add an item to the queue and trigger blitter if necessary" code? If not, then you have a race condition there.

or else you must know exactly when each blit executes, so main and interrupt don't fight over the Blitter.

The two most important points are: an interrupt interrupts main (maybe in the middle of loading Blitter registers(!)), and a blit can under some circumstances lock out the CPU completely and delay the next (any) interrupt until finished (and not when you expect it).

buzzybee 06 August 2019 23:40

Thought I'd share my experience with interrupt driven Blitter code I used in RESHOOT R. Until just a few weeks prior to release, this was my approach:
  • Main code: Queries position and pixel data of each object, then builds a structure of consecutive data pieces, each data piece carrying relevant data for loading blitter regs needed to draw one object
  • Main: Data structure is double buffered, List A and List B. While List A is built in main, List B is locked
  • IRQ: With every IRQ level 6 call, code queries the origin of the IRQ, then jumps to blitter control code if relevant
  • IRQ - blitter: Each call reads one data piece from the locked (!) List B data structure, loads Blitter regs with it, starts the Blitter, then exits IRQ. Double buffered structure lists ensure that main code can't modify the data structure being processed by IRQ blitter code
  • IRQ - blitter: As soon as every consecutive data piece from the locked data structure has been processed and drawn by the Blitter, IRQ code sets the code's own Blitter Finished flag (not Blitter Busy!)
  • Main: Waits and queries the Blitter Finished flag. As soon as it is set, pointers to the two structures are swapped
  • Main: Now builds a structure of consecutive data pieces again, this time into list B, each data piece carrying relevant data for loading blitter regs / drawing one object ...
  • IRQ - blitter: Reads data from List A ... and so on ...
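The scheme in the list above can be sketched as a small C model. This is an interpretation, not the RESHOOT R source: the list capacity, the `BlitItem` contents and all function names are made up, "drawing" is reduced to recording an object id, and the first blit of a freshly locked list is kicked from the swap because no interrupt is pending at that point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ITEMS 8                       /* hypothetical capacity */

typedef struct { int object_id; } BlitItem;   /* stand-in for register data */
typedef struct { BlitItem items[MAX_ITEMS]; size_t count; } BlitList;

static BlitList  lists[2];
static BlitList *build  = &lists[0];      /* main code writes here   */
static BlitList *locked = &lists[1];      /* IRQ code reads here     */
static size_t    next_item;               /* IRQ position in `locked` */
static volatile bool blitter_finished = true;  /* own flag, not Blitter Busy */
static int       last_drawn = -1;

/* Main: append one object's blit data to the list being built. */
bool build_add(int object_id)
{
    if (build->count == MAX_ITEMS)
        return false;
    build->items[build->count++].object_id = object_id;
    return true;
}

/* IRQ - blitter: one data piece per call; set the flag when out of data. */
void irq_blitter(void)
{
    if (next_item < locked->count)
        last_drawn = locked->items[next_item++].object_id;
    else
        blitter_finished = true;
}

/* Main: swap the lists once the IRQ side is done. Returns false if the
   locked list is still being drawn. */
bool swap_lists(void)
{
    BlitList *t;

    if (!blitter_finished)
        return false;
    t = build; build = locked; locked = t;
    build->count = 0;
    next_item = 0;
    blitter_finished = false;
    irq_blitter();                        /* start the first blit */
    return true;
}
```

Main only ever touches `build` and the IRQ side only ever touches `locked`, so the one shared item is the finished flag.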

While this code worked just fine on lower end machines, on a 68060 CPU it seemed to crash occasionally for reasons I never really understood. I therefore decided to change my whole approach and ditch the IRQ driven blitter code for a more traditional approach.

This code was simple, ran more stable and worked comparably fast, since the IRQ code really needs a whole lot of overhead, for example saving and restoring cpu registers at the entry and the exit of each IRQ call.

zero 07 August 2019 11:06

IRQ handling speed is a bit of a weakness on the Amiga. By the time you saved registers, figured out where the IRQ came from and jumped to the right code you might as well not have bothered in many cases. Polling is actually faster.

Photon 07 August 2019 20:04

Quote:

Originally Posted by zero (Post 1336865)
IRQ handling speed is a bit of a weakness on the Amiga. By the time you saved registers, figured out where the IRQ came from and jumped to the right code you might as well not have bothered in many cases. Polling is actually faster.

Polling = busy-waiting, dev courses for any environment thumps it into you that such should be avoided, and for good reason.

There's a veritable bouquet of lovely interrupts to use on the Amiga. Use the flowers wisely (consider the overhead and don't use for things that take < 1 scanline of time), and you maximize CPU time, simple. :great

On other computers, it's just not possible to get this tight a fit of the glove of performance to the hand of the hardware specs.

zero 08 August 2019 11:58

Quote:

Originally Posted by Photon (Post 1336961)
Polling = busy-waiting, dev courses for any environment thumps it into you that such should be avoided, and for good reason.

There's a veritable bouquet of lovely interrupts to use on the Amiga. Use the flowers wisely (consider the overhead and don't use for things that take < 1 scanline of time), and you maximize CPU time, simple. :great

On other computers, it's just not possible to get this tight a fit of the glove of performance to the hand of the hardware specs.


Obviously you must pipeline as best you can, rather than just start the blitter and then busy-wait. If you do it right it's going to be a lot more efficient than entering and exiting an interrupt.

The issue is the 68000, it's just not designed for fast interrupt response. Some other CPUs, especially 8 bit ones, are much better.

ross 08 August 2019 12:21

The fastest way to do Blitter queues is plainly with the Copper and BFD (the blitter-finished-disable bit of its WAIT instruction).
But it is also very complicated and has heavy side effects.
There is a related thread on the board somewhere.
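For anyone unfamiliar with BFD, here is a C sketch of how the Copper instruction words are put together. The bit layout is from the hardware manual (BFD is bit 15 of the second WAIT word, and clearing it makes the WAIT also hold until the blitter has finished); the struct and function names are invented. A Copper-driven queue would be a WAIT like this followed by MOVEs into the blitter registers, ending with BLTSIZE. As far as I know the Copper is only allowed to write those registers when the CDANG bit in COPCON is set, so treat that part as something to check.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint16_t w1, w2; } CopIns;   /* one Copper instruction */

/* MOVE: first word is the register offset (bit 0 clear), second the data. */
CopIns cop_move(uint16_t reg_offset, uint16_t value)
{
    CopIns i;
    i.w1 = (uint16_t)(reg_offset & 0x01FE);
    i.w2 = value;
    return i;
}

/* WAIT: first word  = VP<<8 | HP | 1
         second word = BFD<<15 | VE<<8 | HE   (bit 0 clear)
   vp/hp: beam position to wait for, ve/he: compare-enable masks,
   bfd: 1 = ignore the blitter, 0 = also wait until it has finished. */
CopIns cop_wait(uint8_t vp, uint8_t hp, uint8_t ve, uint8_t he, int bfd)
{
    CopIns i;
    i.w1 = (uint16_t)((vp << 8) | (hp & 0xFE) | 1);
    i.w2 = (uint16_t)((bfd ? 0x8000 : 0) | ((ve & 0x7F) << 8) | (he & 0xFE));
    return i;
}

/* Wait for the blitter only: beam position 0,0 has always been reached,
   so with BFD clear the blitter-finished condition is all that is left. */
CopIns cop_wait_blitter(void)
{
    return cop_wait(0x00, 0x00, 0x7F, 0xFE, 0);
}
```

The blitter wait comes out as $0001,$7FFE, and the usual end-of-list marker $FFFF,$FFFE is the same instruction type with BFD set.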


All times are GMT +2.
