English Amiga Board


Old 24 March 2011, 12:28   #1
h0ffman
Registered User
 
Join Date: Aug 2008
Location: Salisbury
Posts: 744
Blitter queues

I've been thinking about blitter queues lately. Looking online, I've not been able to find any examples of it so far.

My guess is that it uses the same interrupt as the copper/vblank interrupt (level 3?). I've seen in the AHRM that you can set it up to trigger this interrupt when the blitter has finished. Am I right in thinking that in this scenario you cannot then use a copper or vblank interrupt, as it's being used for the blitter queue?

I'm also guessing that a blitter queue is only really any good when the blits are quite large?

Any thoughts or suggestions would be great.
Old 24 March 2011, 12:46   #2
StingRay
move.l #$c0ff33,throat
 
Join Date: Dec 2005
Location: Berlin/Joymoney
Posts: 6,863
Quote:
Originally Posted by h0ffman
My guess is that it uses the same interrupt as the copper/vblank interrupt (level 3?).
That's correct.

Quote:
Originally Posted by h0ffman
Am I right in thinking that in this scenario you cannot then use a copper or vblank interrupt, as it's being used for the blitter queue?
You can still use copper or vblank interrupts, you just need to check which interrupt was triggered in INTREQR.


Quote:
Originally Posted by h0ffman
I'm also guessing that a blitter queue is only really any good when the blits are quite large?
They can also be useful when you have lots of small blits.
Old 24 March 2011, 12:47   #3
pmc
gone
 
Join Date: Apr 2007
Location: completely gone
Posts: 1,596
You can use both at the same time - it's just, when the interrupt is triggered, you need to check where the interrupt came from (check relevant bits in intreqr) and deal with it accordingly.

EDIT: Doh! Beaten by The Stinger!
Old 03 April 2011, 21:57   #4
h0ffman
Registered User
 
Join Date: Aug 2008
Location: Salisbury
Posts: 744
Right, wrote a blitter queueing system. My problem now is the level 3 interrupt.

Firstly, the entire demo type thing I'm working on runs all the main code on the vblank interrupt. Blits are added to the queue during the vblank interrupt. However, while that is executing, the interrupt for the blitter finishing does not start until the vblank routine is finished.

Is there a way of allowing the blitter-finished interrupt to interrupt the vblank handler while it's running?

Hope that makes sense?

Last edited by h0ffman; 03 April 2011 at 21:58. Reason: whoops
Old 04 April 2011, 01:15   #5
Kalms
Registered User
 
Join Date: Nov 2006
Location: Stockholm, Sweden
Posts: 237
I recommend that you move your processing code out of the VBLANK. Let the VBLANK handler just set a flag whenever it is triggered, that is all. The code outside of the VBLANK does all the heavy lifting and queues up blits.
Then your BLIT interrupts will work wonderfully without any extra complications.
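A minimal sketch of that flag approach, assuming a bare-metal setup with the OS switched off and the handler installed directly at the level 3 autovector. The labels (Lev3Handler, MainLoop, DoFrame, VBlankFlag) are made up for illustration, and the equates simply repeat the usual hardware include values so the snippet stands alone.

```asm
intreqr		equ	$01e
intreq		equ	$09c
INTB_VERTB	equ	5
INTF_COPER	equ	$0010
INTF_VERTB	equ	$0020
INTF_BLIT	equ	$0040

Lev3Handler:
	move.w	d0,-(sp)
	move.w	intreqr+$dff000,d0
	and.w	#INTF_VERTB|INTF_BLIT|INTF_COPER,d0	; keep level 3 sources only
	move.w	d0,intreq+$dff000	; acknowledge what was pending
	btst	#INTB_VERTB,d0
	beq.s	.done
	st	VBlankFlag		; tell the main loop a new frame has started
.done
	; a real handler would also pass BLIT on to the queue code here
	move.w	(sp)+,d0
	rte

MainLoop:
	tst.b	VBlankFlag
	beq.s	MainLoop		; wait for the next vertical blank
	sf	VBlankFlag
	bsr	DoFrame			; heavy lifting and blit queueing, outside the interrupt
	bra.s	MainLoop

DoFrame:				; hypothetical placeholder for the per-frame work
	rts

VBlankFlag:
	dc.b	0
	even
```

Because the handler returns almost immediately, the interrupt mask is back at its normal level while DoFrame runs, so BLIT interrupts can get through without any nesting tricks.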





Or you can decide to handle nested interrupts...


First, some background. Let's take the A500 with a 68000 as an example.


When the hardware raises VERTB, the VERTB bit gets set in INTREQ(R). INTREQ(R) is a register in one of the Amiga's custom chips. It is not part of the 68000 CPU.

Then there are a bunch of logic gates (again in the custom Amiga hardware), which take all the bits from INTREQ, mask them against INTENA, locate the highest-priority bit that is set in the result, decode that into a 68000-style Interrupt Priority Level (a value from 0 through 7), and then feed that value to the Interrupt Priority Level (IPL0-IPL2) pins on the processor. (See M68000UM chapter 3.5 "Interrupt Control"). This operation is redone as soon as INTREQ/INTENA changes.

So, the IPL0-IPL2 input pins of the processor always have the priority of the current highest-priority requested interrupt.

(The entire following section is detailed in M68000UM 6.3.2 "Interrupts".)
Now, every time that the CPU completes processing of an instruction, it will then compare the value on the IPL0-IPL2 pins against the value in the Interrupt Priority Mask (I0-I2) bits in the Status Register (see M68000PRM 1.3.2 "Status Register"). If the value on IPL0-IPL2 is less than or equal to the value in I0-I2, then nothing happens - the CPU continues processing normally. However, if IPL0-IPL2 has a higher value, then exception processing is invoked.
This means:
* SR is backed up internally in the CPU
* SR is modified (Supervisor mode on, trace mode off, I0-I2 raised to IPL0-IPL2 level)
* exception handler vector is fetched from $60+IPL*4
* SR is stored to stack
* PC of the next instruction to be executed is stored on stack
* PC is set to the exception handler vector location
... and then, the CPU will process the next instruction which the PC points at.

So what does this mean? Well, since I0-I2 got raised, then the CPU is in a state where it ignores any further external interrupt requests of the same or lower priority.

A while later, the exception processing code will reach an RTE. Since the RTE instruction restores PC + SR, then the I0-I2 bits in SR will drop down to a level where they were before the exception processing was invoked.


Now. Let's say that you want to allow BLIT interrupts while the VERTB interrupt is being processed.
The blitter will set the BLIT bit in INTREQ(R). BLIT raises IPL to 3. Same as VERTB. But if the CPU is inside of the VERTB handler, then the Interrupt Priority Mask is already set to 3. Therefore any BLIT interrupts are ignored.

So what do you do? Lower the Interrupt Priority Mask.

Code:
mylev3handler:
	move.w	#$2000,sr		; re-enable interrupts

	.. do BLIT/VERTB processing ..

	rte
Whoops, that's no good. Your machine just hung, without reaching the BLIT/VERTB processing code. Why? Because the VERTB bit is set in INTREQ(R) -> IPL0-IPL2 is set to 3 -> if you lower I0-I2 in the exception handler, then the exception immediately gets triggered again!
So first acknowledge/clear INTREQ(R), then lower interrupt priority.

New try:

Code:
mylev3handler:
	move.w	d0,-(sp)
	move.w	intreqr+$dff000,d0
	and.w	#INTF_VERTB|INTF_BLIT|INTF_COPER,d0
	btst	#INTB_VERTB,d0
	bne.s	.handle_vertb
	btst	#INTB_BLIT,d0
	bne.s	.handle_blit

	move.w	d0,intreq+$dff000
	move.w	(sp)+,d0
	rte

.handle_vertb
	move.w	#INTF_VERTB,intreq+$dff000	; acknowledge/clear interrupt
	move.w	#$2000,sr		; re-enable interrupts

	.. VERTB processing ..

	move.w	(sp)+,d0
	rte

.handle_blit
	move.w	#INTF_BLIT,intreq+$dff000	; acknowledge/clear interrupt

	.. BLIT processing ..

	move.w	(sp)+,d0
	rte
That's better. Works on your A500. But not on your friend's A4000/68060. Why is that?

Because the write to INTREQ, and subsequent recomputation of IPL, takes a long time for the slow parts of the A4000 hardware to complete. (Which are the slow parts? Well, all the hardware that doesn't go faster when a turbocharged 68060 board is plugged in.)
So even though you do clear the VERTB flag in INTREQ, and the CPU is supposed to wait for the custom chips to acknowledge that the write to INTREQ has completed, it still takes "some time" until the logic gates which compute the new value for IPL0-IPL2 complete their work, and the CPU indeed sees that there is no longer a level-3 interrupt being requested.

So, there is a race condition between these two lines:

Code:
	move.w	#INTF_VERTB,intreq+$dff000	; acknowledge/clear interrupt
	move.w	#$2000,sr		; re-enable interrupts
You need a delay which you know is guaranteed to do the job. What you find in most code which has this problem is the following:


Code:
	move.w	#INTF_VERTB,intreq+$dff000	; acknowledge/clear interrupt
	move.w	#INTF_VERTB,intreq+$dff000	; acknowledge/clear interrupt
	move.w	#$2000,sr		; re-enable interrupts
... that is, putting in an extra write to the custom chips, because that will take roughly as long regardless of how fast the CPU is. In practice it will take like 3-5 "chipset" clock cycles (if we presume the chipset clock frequency to be 7.14MHz). That ought to be enough for the change in INTREQ(R) to get crunched and propagated back to the IPL0-IPL2 pins on the CPU.

Note that you should write your interrupt handler such that it handles spurious interrupts (your handler gets entered even though there is no corresponding bit set in INTREQ) and it should not lose interrupts if multiple interrupt requests (say a COPER and a BLIT) get set in INTREQ(R) at exactly the same moment.
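One way to meet both requirements, sketched here without nesting to keep it short: take a single snapshot of INTREQR, return at once if nothing is pending, acknowledge only the bits that were actually seen, and then service every one of them. The Handle* routines and LEV3_BITS are hypothetical names; the equates repeat the usual hardware include values.

```asm
intreqr		equ	$01e
intreq		equ	$09c
INTB_COPER	equ	4
INTB_VERTB	equ	5
INTB_BLIT	equ	6
LEV3_BITS	equ	$0070		; COPER, VERTB and BLIT

Lev3Handler:
	move.w	d0,-(sp)
	move.w	intreqr+$dff000,d0
	and.w	#LEV3_BITS,d0
	beq.s	.exit			; spurious entry: nothing pending, just return
	move.w	d0,intreq+$dff000	; acknowledge only the bits seen above
	btst	#INTB_BLIT,d0
	beq.s	.noblit
	bsr	HandleBlit
.noblit
	btst	#INTB_VERTB,d0
	beq.s	.novertb
	bsr	HandleVertb
.novertb
	btst	#INTB_COPER,d0
	beq.s	.exit
	bsr	HandleCoper
.exit
	move.w	(sp)+,d0
	rte

; Hypothetical placeholders; each one must preserve d0.
HandleBlit:
	rts
HandleVertb:
	rts
HandleCoper:
	rts
```

A request that arrives after the snapshot is not in d0, so it is not acknowledged and stays set in INTREQ; the handler is simply entered again after the rte.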


Now. Exact choice of method is left to you. Godspeed.
Old 04 April 2011, 09:13   #6
pmc
gone
 
Join Date: Apr 2007
Location: completely gone
Posts: 1,596
@ Kalms - what a great explanation.

@ h0ffman - when you say you've written a blitter queueing system, what are the mechanics of something like that...?

You're using blitter interrupts to signal "blitter done" rather than manually waiting for the blitter.

So, in your blitter interrupt processing code you keep some kind of counter that you read off to give an indication as to which values to load the blitter registers with for the next blit operation...?

I've never done / thought about doing a blitter queue and, as usual, I'm interested in how things work.
Old 04 April 2011, 11:58   #7
h0ffman
Registered User
 
Join Date: Aug 2008
Location: Salisbury
Posts: 744
@ Kalms - cheers for the explanation, of course I opted for the easy option but thanks very much for that, might come in handy one day.

@PMC - The blitter queue system I've made is probably pretty inefficient at the moment, but I wanted to do something that would be easy to translate my current blits into the queue.

I have two data areas, the first is a temp area at a size of $70. The reason is that it maps from $dff000 to $dff070, which gets you to the end of the blit registers. I also have a queue area which is $2a large multiplied by the number of concurrent queued blits you want.

Doing it this way means all you have to do to queue a blit is swap the $dff000 to the temp data area and then call the blit queue routine. The blit queue routine then picks up the next available slot in the queue and dumps the data for the blit onto it. At this point it also checks to see if the blitter is live, if not then it starts the next awaiting blit.

The interrupt then confirms the last blit in the queue is done, checks for the next available blit and kicks it off if it's ready.

The queue kinda rotates, so when you get to the end of your slots it goes round to the beginning again.

Additionally I added a separate call to the blit queue which returns an address in A0. This is so that if you need to know when that specific blit has finished, you can store the pointer and check it later.

I was quite pleased with the method, although running on cycle exact it's looking like it takes over a whole raster line to queue the next blit. Think I can speed that up by doing a movem on and off the queue.

However, more importantly, the blits I'm queueing aren't working properly at the moment, so it needs more debugging.

In a way this method is using more CPU time for the blits, but there is a lot of cpu savings not hanging around waiting for the blitter to finish.

Actually I have a quick question regarding the blitter. Do any of the registers need to be written to as words? Currently I'm dumping the calls with a move.l, and the only blit that seems to work is the clear screen!

Edit - with the exception of blitsize of course... that's the last word that needs to be written

Last edited by h0ffman; 04 April 2011 at 12:47.
Old 04 April 2011, 12:45   #8
pmc
gone
 
Join Date: Apr 2007
Location: completely gone
Posts: 1,596
h0ffman - adjacent word regs can be written as a single .l to the first reg in the pair which then, obviously, writes both regs:

bltcon0 & bltcon1
bltafwm & bltalwm
bltamod & bltdmod

those are off the top of my head that'll work OK like that. Of course, there's nothing stopping you writing all the regs as word writes as necessary and non adjacent regs may need to be written with separate word writes as required.
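As a rough illustration of those pairs, here is a plain A-to-D copy set up with long writes where the registers are adjacent and a final word write to bltsize. This is only a sketch: the routine name CopyWords and the 16-line by 1-word size are arbitrary, and it assumes the blitter is idle and blitter DMA is already enabled.

```asm
bltcon0		equ	$040		; bltcon1 follows at $042
bltafwm		equ	$044		; bltalwm follows at $046
bltapth		equ	$050
bltdpth		equ	$054
bltsize		equ	$058		; bltcon0l (ECS) follows at $05a
bltamod		equ	$064		; bltdmod follows at $066

; a0 = source, a1 = destination, a6 is clobbered
CopyWords:
	lea	$dff000,a6
	move.l	#$09f00000,bltcon0(a6)	; bltcon0 = $09f0 (use A and D, D = A), bltcon1 = 0
	move.l	#$ffffffff,bltafwm(a6)	; first and last word masks: keep all bits
	move.l	#0,bltamod(a6)		; A and D modulos both 0
	move.l	a0,bltapth(a6)		; pointer registers are long anyway
	move.l	a1,bltdpth(a6)
	move.w	#16*64+1,bltsize(a6)	; 16 lines x 1 word; word write, done last
	rts
```

bltsize is kept as a word write because the word after it, at $05a, is a different register (bltcon0l on ECS), so a long write there would touch both.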

PS. Your blitter queue sounds pretty cool - any chance of dropping me the source code to the usual place for a little look see...?
Old 04 April 2011, 12:57   #9
h0ffman
Registered User
 
Join Date: Aug 2008
Location: Salisbury
Posts: 744
Sure mate, once I've done some lunchtime debugging, I'll fire the mess over to you.
Old 04 April 2011, 14:58   #10
h0ffman
Registered User
 
Join Date: Aug 2008
Location: Salisbury
Posts: 744
Found my issue: I was setting bltcon0l with each blit. As soon as I took that out, all the blits started working. I had an additional issue where the blit interrupt would stop working, but following Kalms's response above, I simply acknowledge the interrupt before processing my blit queue and everything was happy. Not bad for a day's work, I think.

Now I just hope it's only that my laptop at work isn't powerful enough to run cycle-exact emulation, as running my prod on here at full pelt is slightly dropping frames.
Old 04 April 2011, 15:33   #11
pmc
gone
 
Join Date: Apr 2007
Location: completely gone
Posts: 1,596
Top skills h0ff

You're becoming a bug fixer extraordinaire!
Old 04 April 2011, 22:55   #12
Kalms
Registered User
 
Join Date: Nov 2006
Location: Stockholm, Sweden
Posts: 237
Quote:
Doing it this way means all you have to do to queue a blit is swap the $dff000 to the temp data area and then call the blit queue routine. The blit queue routine then picks up the next available slot in the queue and dumps the data for the blit onto it. At this point it also checks to see if the blitter is live, if not then it starts the next awaiting blit.

The interrupt then confirms the last blit in the queue is done, checks for the next available blit and kicks it off if it's ready.
Marked the part in bold. Make sure you don't have a race condition there.

Also beware these two passages from HRM:

"NOTE If a blit has just been started but has been locked out of memory access because of, for instance, display fetches, this bit may not yet be set. The processor, on the other hand, may be running completely uninhibited out of FAST memory or its internal cache, so it will continue to have memory cycles."

"NOTE Starting with the Fat Agnus the blitter busy bit has been "fixed" to be set as soon as you write to BLTSIZE to start the blit, rather than when the blitter gets its first DMA cycle. However, not all machines will use the newer chips, so it is best to rely on the above method of testing."

so be careful when using the BBUSY bit to detect blitter activity.
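For reference, the wait loop commonly used to cope with this tests the bit once and throws the result away before entering the real polling loop. The sketch below follows that pattern as I understand it; the label WaitBlit is just a name, and the equates repeat the usual hardware include values.

```asm
dmaconr		equ	$002
DMAB_BLTDONE	equ	14		; BBUSY, bit 14 of DMACONR

WaitBlit:
	btst	#DMAB_BLTDONE-8,dmaconr+$dff000	; first test, result ignored
.wait
	btst	#DMAB_BLTDONE-8,dmaconr+$dff000	; bit 6 of the high byte = bit 14 of the word
	bne.s	.wait			; still busy
	rts
```

This only covers waiting on a blit you know you have started. For deciding whether the queue may start the next blit, a software flag maintained by your own code avoids reading BBUSY at all.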
Old 05 April 2011, 02:57   #13
h0ffman
Registered User
 
Join Date: Aug 2008
Location: Salisbury
Posts: 744
Cheers Kalms, it's actually an internal flag, not a hardware test, so hopefully I should be fine.
Old 05 April 2011, 04:23   #14
Kalms
Registered User
 
Join Date: Nov 2006
Location: Stockholm, Sweden
Posts: 237
Ok cool. One more thing though:

Do you have any blitter-interrupt-disabling code around your "add an item to the queue and trigger blitter if necessary" code? If not, then you have a race condition there.
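A sketch of what such a guard could look like, assuming a ring of $2a-byte slots as described earlier in the thread and a software "blitter active" flag. All names here (QueueBlit, StartNextBlit, QueueTail, BlitterActive, QueueSlots, NUM_SLOTS) are invented for the example, and the queue-full check is left out.

```asm
intena		equ	$09a
INTF_SETCLR	equ	$8000
INTF_BLIT	equ	$0040
SLOT_SIZE	equ	$2a
NUM_SLOTS	equ	16		; must be a power of two for the wrap below

; a0 = pointer to SLOT_SIZE bytes of register data for the new blit
QueueBlit:
	movem.l	d0/a0-a1,-(sp)
	move.w	#INTF_BLIT,intena+$dff000	; BLIT interrupt off (SET/CLR bit clear)
	move.w	QueueTail,d0
	mulu	#SLOT_SIZE,d0		; byte offset of the free slot
	lea	QueueSlots,a1
	adda.l	d0,a1
	moveq	#SLOT_SIZE/2-1,d0
.copy
	move.w	(a0)+,(a1)+		; copy the register image into the slot
	dbf	d0,.copy
	move.w	QueueTail,d0
	addq.w	#1,d0
	and.w	#NUM_SLOTS-1,d0		; wrap round to the first slot
	move.w	d0,QueueTail
	tst.b	BlitterActive
	bne.s	.busy			; the interrupt will pick this entry up later
	st	BlitterActive
	bsr	StartNextBlit
.busy
	move.w	#INTF_SETCLR|INTF_BLIT,intena+$dff000	; BLIT interrupt back on
	movem.l	(sp)+,d0/a0-a1
	rts

StartNextBlit:				; hypothetical: load the blitter from the head slot
	rts

QueueTail:	dc.w	0
BlitterActive:	dc.b	0
	even
QueueSlots:	ds.b	SLOT_SIZE*NUM_SLOTS
```

With the BLIT interrupt masked, the test of BlitterActive and the decision to start a blit cannot be interleaved with the interrupt handler making the same decision, which is the race being warned about.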
Old 29 July 2019, 21:11   #15
Photon
Moderator
 
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,602
Quote:
Originally Posted by Kalms
I recommend that you move your processing code out of the VBLANK. Let the VBLANK handler just set a flag whenever it is triggered, that is all. The code outside of the VBLANK does all the heavy lifting and queues up blits.
Then your BLIT interrupts will work wonderfully without any extra complications.
I recommend this and this

Quote:
Originally Posted by Kalms
Do you have any blitter-interrupt-disabling code around your "add an item to the queue and trigger blitter if necessary" code? If not, then you have a race condition there.
or else you must know exactly when each blit executes, so main and interrupt don't fight over the Blitter.

The two most important points are: an interrupt interrupts main (maybe in the middle of loading Blitter registers(!)), and a blit can under some circumstances lock out the CPU completely and delay the next (any) interrupt until finished (and not when you expect it).
Old 06 August 2019, 23:40   #16
buzzybee
Registered User
 
Join Date: Oct 2015
Location: Landsberg / Germany
Posts: 526
Thought I'd share my experience with the interrupt-driven Blitter code I used in RESHOOT R. Until just a few weeks prior to release, this was my approach:
  • Main code: Queries position and pixel data of each object, then builds a structure of consecutive data pieces, each data piece carrying relevant data for loading blitter regs needed to draw one object
  • Main: Data structure is double buffered, List A and List B. While List A is built in main, List B is locked
  • IRQ: With every IRQ level 6 call, code queries the origin of the IRQ, then jumps to blitter control code if relevant
  • IRQ - blitter: Each call reads one data piece from the locked (!) List B data structure, loads Blitter regs with it, inits Blitter, then exits IRQ. Double buffered structure lists ensure that main code can't modify the data structure being processed by IRQ blitter code
  • IRQ - blitter: As soon as each consecutive data piece from the locked data structure has been processed and drawn by the Blitter, IRQ code sets the code's own Blitter Finished flag (not Blitter Busy!)
  • Main: Waits and queries the Blitter Finished flag. As soon as it is set, pointers to the two structures are being swapped
  • Main: Now builds a structure of consecutive data pieces again, this time into list B, each data piece carrying relevant data for loading blitter regs / drawing one object ...
  • IRQ - blitter: Reads data from List A ... and so on ...
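The swap step in that scheme might look roughly like the sketch below. It is not the RESHOOT R code: the names (SwapLists, BuildList, DrawList, BlitsFinished, LIST_BYTES) are invented, and kicking off the first blit of the newly locked list is left out.

```asm
LIST_BYTES	equ	1024		; arbitrary size for the example

; Called by main once it has finished building its list. Clobbers d0.
SwapLists:
	tst.b	BlitsFinished
	beq.s	SwapLists		; IRQ code is still working through the locked list
	sf	BlitsFinished
	move.l	BuildList,d0
	move.l	DrawList,BuildList	; main builds into the old draw list next
	move.l	d0,DrawList		; IRQ code reads from the list just built
	rts

BuildList:	dc.l	ListA
DrawList:	dc.l	ListB
BlitsFinished:	dc.b	$ff		; nothing is being drawn at start-up
	even
ListA:		ds.b	LIST_BYTES
ListB:		ds.b	LIST_BYTES
```

Main only ever writes through BuildList and the interrupt code only reads through DrawList, so the two never touch the same list between swaps.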

While this code worked just fine on lower-end machines, on a 68060 CPU it seemed to crash occasionally for reasons I never really understood. I therefore decided to change my whole approach and ditch the IRQ-driven blitter code for a more traditional approach.

This code was simple, ran more stable and worked comparably fast, since the IRQ code really needs a whole lot of overhead, for example saving and restoring cpu registers at the entry and the exit of each IRQ call.
Old 07 August 2019, 11:06   #17
zero
Registered User
 
Join Date: Jun 2016
Location: UK
Posts: 428
IRQ handling speed is a bit of a weakness on the Amiga. By the time you saved registers, figured out where the IRQ came from and jumped to the right code you might as well not have bothered in many cases. Polling is actually faster.
Old 07 August 2019, 20:04   #18
Photon
Moderator
 
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,602
Quote:
Originally Posted by zero
IRQ handling speed is a bit of a weakness on the Amiga. By the time you saved registers, figured out where the IRQ came from and jumped to the right code you might as well not have bothered in many cases. Polling is actually faster.
Polling = busy-waiting; dev courses for any environment drum it into you that it should be avoided, and for good reason.

There's a veritable bouquet of lovely interrupts to use on the Amiga. Use the flowers wisely (consider the overhead and don't use for things that take < 1 scanline of time), and you maximize CPU time, simple.

On other computers, it's just not possible to get this tight a fit of the glove of performance to the hand of the hardware specs.

Last edited by Photon; 07 August 2019 at 20:14.
Old 08 August 2019, 11:58   #19
zero
Registered User
 
Join Date: Jun 2016
Location: UK
Posts: 428
Quote:
Originally Posted by Photon
Polling = busy-waiting; dev courses for any environment drum it into you that it should be avoided, and for good reason.

There's a veritable bouquet of lovely interrupts to use on the Amiga. Use the flowers wisely (consider the overhead and don't use for things that take < 1 scanline of time), and you maximize CPU time, simple.

On other computers, it's just not possible to get this tight a fit of the glove of performance to the hand of the hardware specs.

Obviously you must pipeline as best you can, rather than just start the blitter and then busy-wait. If you do it right it's going to be a lot more efficient than entering and exiting an interrupt.

The issue is the 68000, it's just not designed for fast interrupt response. Some other CPUs, especially 8 bit ones, are much better.
Old 08 August 2019, 12:21   #20
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
The fastest way to run blitter queues is plainly with the Copper, using BFD (the Blitter Finished Disable bit in its WAIT instruction).
But it is also very complicated and has heavy side effects.
There is a related thread on the board somewhere.