Linedraw blitter vs. CPU on 68000

pmc · 10 February 2012, 12:56

Watcha people

I guess I'm not really expecting much of a definite (or even any?) answer to this, but here goes:

I've been writing some code lately that prompted me to think about this - I'll probably just do my own testing and get my own answers regardless - but I thought I'd canvass some opinion here. Who knows? Maybe one of you guys already found the definitive answer anyway?

If I'm drawing lots of short lines (maybe 10 or so pixels each in length), although the blitter will draw the lines quickly, all the register setup required to get the blitter *ready* to draw the line causes quite an overhead.

So, my question is: even on 68000 is it quicker to draw short lines of this type with a software Bresenham's routine ie. spending the processor time actually *doing* the drawing rather than getting the blitter *ready* to do the drawing?

If so, is this true only for short lines? And if yes, where's the break even point? ie. how long does a line need to become before the time taken to set the blitter to draw it is less than the time taken to draw it with the processor assuming I'd be using a "fast as it can be" software Bresenham's?
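One way to frame the break-even question is as a simple linear cost model: each method pays a fixed setup cost per line plus a per-pixel cost. The sketch below is in Python purely for illustration. The function name and every cycle figure passed to it are hypothetical placeholders, not measurements, so the real answer still has to come from timing tests on hardware.

```python
import math

def break_even_length(blit_setup, blit_per_pixel, cpu_setup, cpu_per_pixel):
    """Shortest line length (pixels) at which the blitter is no slower
    than the CPU, or None if it never catches up.

    Model for each method: cost = setup + per_pixel * length.
    """
    extra_setup = blit_setup - cpu_setup
    saving_per_pixel = cpu_per_pixel - blit_per_pixel
    if extra_setup <= 0:
        return 0        # blitter is already no slower at zero length
    if saving_per_pixel <= 0:
        return None     # blitter never recovers its extra setup cost
    return math.ceil(extra_setup / saving_per_pixel)

# Placeholder cycle counts, chosen only to exercise the formula.
print(break_even_length(blit_setup=300, blit_per_pixel=8,
                        cpu_setup=60, cpu_per_pixel=40))
```

The model ignores bus contention and anything the CPU could be doing while the blitter runs, so treat it as a way to organise measurements rather than a prediction.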

KevG · 10 February 2012, 18:46

Why don't you do some time-tests yourself and find out.

Galahad/FLT · 10 February 2012, 18:55

Quote:

Originally Posted by pmc

Watcha people

I guess I'm not really expecting much of a definite (or even any?) answer to this, but here goes:

I've been writing some code lately that prompted me to think about this - I'll probably just do my own testing and get my own answers regardless - but I thought I'd canvass some opinion here. Who knows? Maybe one of you guys already found the definitive answer anyway?

If I'm drawing lots of short lines (maybe 10 or so pixels each in length), although the blitter will draw the lines quickly, all the register setup required to get the blitter *ready* to draw the line causes quite an overhead.

So, my question is: even on 68000 is it quicker to draw short lines of this type with a software Bresenham's routine ie. spending the processor time actually *doing* the drawing rather than getting the blitter *ready* to do the drawing?

If so, is this true only for short lines? And if yes, where's the break even point? ie. how long does a line need to become before the time taken to set the blitter to draw it is less than the time taken to draw it with the processor assuming I'd be using a "fast as it can be" software Bresenham's?

With good use of the movem.l instruction, setting up registers needn't be a big overhead

pmc · 10 February 2012, 19:19

Quote:

Originally Posted by KevG

Why don't you do some time-tests yourself and find out.

Thanks for your helpful advice KevG.

Maybe you skim read my initial post? I'm often guilty of that myself.

Quote:

Originally Posted by pmc

I'll probably just do my own testing and get my own answers regardless - but I thought I'd canvass some opinion here. Who knows? Maybe one of you guys already found the definitive answer anyway?

Clearer now?

Quote:

Originally Posted by Galahad

With good use of the movem.l instruction, setting up registers needn't be a big overhead

Nice tip mate, thanks. Another one for me to play around with.

KevG · 11 February 2012, 19:18

By the way. I do know what the solution is and I found out myself by doing lots of time-tests on a 68000 a while ago. Nobody told me, I worked it out for myself based on my results. If you are proficient at programming then you should be able to find out for yourself.

Galahad/FLT · 11 February 2012, 19:57

Quote:

Originally Posted by KevG

By the way. I do know what the solution is and I found out myself by doing lots of time-tests on a 68000 a while ago. Nobody told me, I worked it out for myself based on my results. If you are proficient at programming then you should be able to find out for yourself.

I genuinely can't fathom if you're trying to be a conscientious helpful soul... or not.

Which is it?

pmc · 11 February 2012, 21:16

Quote:

Originally Posted by KevG

By the way. I do know what the solution is and I found out myself by doing lots of time-tests on a 68000 a while ago.

Well done you.

That was kind of the point, which I think you might have missed, of my post - to provoke discussion with other coders on here.

Quote:

Originally Posted by KevG

If you are proficient at programming then you should be able to find out for yourself.

Also true. And while I think I've reached some level of proficiency at coding I've never been foolish or arrogant enough to believe that my way is the best or only way to do something. There are some pretty talented guys on here who always will be better than me at this so why not ask them to share their experiences?

I kind of agree with Galahad that you seem to have ignored the spirit of my initial post purely to be pedantic which is a shame especially as, from what you say you know, we could have had an interesting discussion instead.

I still can't help thinking all of this could have been avoided by you just openly sharing the experiences you gained from your own testing, or if you didn't want to do that, just saying nothing and letting me and anyone else interested get on with the discussion.

korruptor · 11 February 2012, 22:33

I'm actually curious about this myself: in the Amiga 3D graphics book, the author advises using the CPU for simplicity. Personally I think he's ported a load of code from the ST.

But I've not tried it myself

Galahad/FLT · 11 February 2012, 23:50

I think that, except for lots of very tiny blits, the blitter is virtually always faster than the processor on the 68000.

So your ten-pixel lines might well be ever so slightly quicker with the CPU.

korruptor · 12 February 2012, 01:24

I was going to try triple buffering, clear the 2nd back buffer with the blitter while doing the lines on the 1st back buffer with the cpu, then blitter fill it.

If there's not that much odds in it, then I might save that for later...

mc6809e · 12 February 2012, 05:58

Quote:

Originally Posted by pmc

Well done you.

That was kind of the point, which I think you might have missed, of my post - to provoke discussion with other coders on here.

[snip]

I kind of agree with Galahad that you seem to have ignored the spirit of my initial post purely to be pedantic which is a shame especially as, from what you say you know, we could have had an interesting discussion instead.

I still can't help thinking all of this could have been avoided by you just openly sharing the experiences you gained from your own testing, or if you didn't want to do that, just saying nothing and letting me and anyone else interested get on with the discussion.

I don't think many of us have much confidence that our own experiments will give you any real help.

So many things affect the decision. How many bit planes are in use, for example? Six plane EHB mode is going to require six line draws. That's a lot of blitter set ups. The CPU might be faster for some shorter lines since one set of calcs can be done per point and that point can be plotted with a series of six load/mask/stores. But what if you're just using one plane? Then the answer is probably different. It may be that all but lines of length one should be done with the blitter. Maybe you use the CPU for short lines and the blitter for long lines.
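The "one set of calcs per point, then six load/mask/stores" idea can be sketched roughly as follows. This is Python for illustration only: `plot_pixel` is a hypothetical helper, the bytearrays merely stand in for bitplanes in chip RAM, and nothing here says anything about actual 68000 cycle counts.

```python
def plot_pixel(planes, bytes_per_row, x, y, colour):
    """Plot one pixel into a planar bitmap.

    planes: list of bytearrays, one per bitplane (plane 0 = colour bit 0).
    The byte offset and bit mask are computed once and reused per plane.
    """
    offset = y * bytes_per_row + (x >> 3)   # shared by every plane
    mask = 0x80 >> (x & 7)                  # leftmost pixel is the top bit
    for bit, plane in enumerate(planes):
        if (colour >> bit) & 1:
            plane[offset] |= mask           # set this plane's bit
        else:
            plane[offset] &= ~mask & 0xFF   # clear this plane's bit

# Six planes of a tiny 320-pixel-wide (40 bytes per row), 2-row bitmap.
planes = [bytearray(40 * 2) for _ in range(6)]
plot_pixel(planes, 40, 9, 1, 0b100101)
```

The point of the layout is that the address and mask arithmetic is paid once per pixel, while only the cheap read-modify-write is repeated per plane.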

Or maybe the copper is available to load the blitter for you. Use the CPU to generate a copper list with all the needed blitter register loads. While the copper is controlling blitter line drawing, the CPU can do something else. Here the blitter might even be slower than the CPU, but you can free the CPU for other things. A 3d program requires plenty of MULs. Slower line draw might be okay since the CPU can be made busy with calcs while the blitter is line drawing. Maybe blitter line drawing even in EHB mode is the right call in such a case.
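As a rough sketch of what "use the CPU to generate a copper list with all the needed blitter register loads" could look like, the Python below encodes copper MOVE instructions and a WAIT that also waits for the blitter (a WAIT whose blitter-finished-disable bit, bit 15 of the second word, is clear). The helper names are hypothetical and the register values in the example are arbitrary placeholders, not a worked-out line-mode setup. On real hardware the copper's danger bit in COPCON also has to be set before the copper may write blitter registers, and BLTSIZE has to be the last register loaded for each blit because writing it starts the blitter.

```python
# Custom-chip register offsets (from $DFF000) for a few blitter registers.
BLTCON0 = 0x040
BLTCON1 = 0x042
BLTSIZE = 0x058

COPPER_END = [0xFFFF, 0xFFFE]   # conventional end-of-list WAIT

def copper_move(reg_offset, value):
    """Encode a copper MOVE: first word = register offset (bit 0 clear),
    second word = the 16-bit data to write."""
    return [reg_offset & 0x01FE, value & 0xFFFF]

def copper_wait_blitter():
    """Encode a WAIT for beam position (0, 0) with bit 15 (BFD) of the
    second word clear, so the copper also waits for the blitter to finish."""
    return [0x0001, 0x7FFE]

def build_blit_list(blits):
    """blits: iterable of lists of (register_offset, value) pairs, one list
    per blit, already in the order the registers must be written."""
    words = []
    for register_loads in blits:
        words += copper_wait_blitter()      # previous blit must be done
        for reg, value in register_loads:
            words += copper_move(reg, value)
    words += COPPER_END
    return words

# Placeholder values only; a real line draw needs the full register set.
example = build_blit_list([[(BLTCON0, 0x1234), (BLTSIZE, 0x0282)]])
```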

But suppose the copper isn't available to you? Or maybe there are too many lines to draw making the copper list too long. It's going to reset before all the blitter work is done. Maybe some combinations of shorter lists and CPU interrupts can be made to work. Get the copper to int after X lines. What about the overhead of interrupting the CPU? Should five or ten or twenty lines be drawn before interrupting the CPU and rebuilding a list? Maybe the copper idea should be avoided altogether and the blitter should interrupt the CPU after every line draw. But interrupt overhead can be large. Maybe for short lines interrupts should be avoided and the CPU instead should poll the blitter to see if it's finished. Maybe two lists of coordinates should be made: a list for short lines and polling, and a list for long lines and interrupts. How long does the line need to be to go on the interrupt based list? Well, that might depend on how many bit planes are being drawn, etc.
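The "two lists of coordinates" idea amounts to partitioning the lines by length before drawing. A minimal Python sketch, assuming a hypothetical `split_by_length` helper and a threshold that would have to be found by measurement:

```python
def split_by_length(lines, threshold):
    """Partition lines into (short, long) lists.

    lines: iterable of (x0, y0, x1, y1) tuples.
    Length is the major-axis pixel count, max(|dx|, |dy|) + 1.
    Lines shorter than threshold go on the short (polling) list,
    the rest on the long (interrupt-driven) list.
    """
    short, long_ = [], []
    for line in lines:
        x0, y0, x1, y1 = line
        length = max(abs(x1 - x0), abs(y1 - y0)) + 1
        (short if length < threshold else long_).append(line)
    return short, long_

# The threshold of 16 is an arbitrary placeholder.
short_lines, long_lines = split_by_length(
    [(0, 0, 5, 3), (0, 0, 100, 20), (10, 10, 10, 10)], 16)
```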

Maybe this. Maybe that. The decision is so context sensitive that our own experiments probably mean little.

h0ffman · 15 February 2012, 14:30

Actually, just to throw this into the mix.. I've heard people talking about using the copper to push the blitter.. But what happens to your blit waits? Is there a blit wait instruction for the copper?

EDIT: Holy s*&t!!! I've just spotted that in the AHRM!!!! Waiting for the blitter with the copper!!!...

Well then here's an idea Paul.... In the frame you spend the CPU time building your copper list for all the blits you want to do. Once the CPU has done all of that, you tell the next frame to use that copper list. While the copper is happily running away with your blits, your CPU is then building the next frame's worth of blits.

Now that's threading.. old skoool!

korruptor · 15 February 2012, 23:40

Wonder what the quickest way to clear the back buffer would be, doing that.

h0ffman · 16 February 2012, 00:11

That depends on where the load is. You could use the CPU with a movem if your blitter is well busy, which means you could do a CPU clear while preparing the blits for the next frame.

mc6809e · 16 February 2012, 02:53

Quote:

Originally Posted by korruptor

Wonder what the quickest way to clear the back buffer would be, doing that.

You mean clearing it using some hybrid CPU/blitter technique?

I think there's an old thread somewhere discussing this.

The consensus seemed to be that it was best to clear one half of the buffer with the blitter and the other half with the CPU, the two running simultaneously.

There certainly seems to be a strong argument for mixing CPU and blitter here since blitter mem clear only uses every other memory bus cycle and you certainly don't want the rest of the cycles to go to waste if they're available (while the video beam is in the overscan area, for example). You want the CPU in there to take advantage of any cycles left over. I'm not sure about the 50/50 split, though. My guess would be more like 2 planes cleared by the blitter for every 1 plane by the CPU. But like line-drawing, it depends on context, I'd think.

Now here's another crazy idea to maximize CPU/blitter concurrency: how about hunting for useful code sequences in kickstart that end in an RTS and chaining them together? The idea would be to build a program as a list of calls to these routines so that the CPU would spend time executing from kickstart while the blitter is altering chipram.
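The arithmetic behind a 50/50 versus 2:1 split is just proportional sharing: give each side work in proportion to its clearing rate so both finish at about the same time. A small Python sketch, assuming a hypothetical `split_clear` helper and treating the two rates as independent, which they are not really, since CPU and blitter compete for the same bus:

```python
def split_clear(total_rows, blitter_rate, cpu_rate):
    """Split total_rows between blitter and CPU so that both finish at
    about the same time. Rates are in any consistent unit, for example
    words cleared per unit time."""
    blitter_rows = round(total_rows * blitter_rate / (blitter_rate + cpu_rate))
    return blitter_rows, total_rows - blitter_rows

# Equal rates give the 50/50 split; a blitter twice as fast gives 2:1.
even_split = split_clear(256, 1, 1)
two_to_one = split_clear(768, 2, 1)
```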

korruptor · 16 February 2012, 11:40

Quote:

Originally Posted by mc6809e

I think there's an old thread somewhere discussing this.

The consensus seemed to be that it was best to clear one half of the buffer with the blitter and the other half with the CPU, the two running simultaneously.

Actually, yeah, I remember reading something about that. I'll have a search.

kamelito · 16 February 2012, 14:05

I read that too, using movem.l ...
http://www.mways.co.uk/amiga/howtocode/text/blitter.php

look for blitter clear

Photon · 29 February 2012, 15:02

Re OT: Bresenham is not the fastest. The CPU is always much slower at linedrawing than the blitter, since normally the average slope is 22.5° from vertical or horizontal. If you have a 10x faster CPU and draw in fastmem, it can beat the blitter, but in your case the fastest alternative is to generate 2*54 10x10 bobs and blit the ones that fit the deltas.

CPU competes better at block ops than pixel ops, as others have mentioned.
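The pre-rendered bob idea can be sketched as a one-off table build: render each small line once into a 10x10 bitmap, then look it up by its deltas at draw time. The Python below is only an illustration under stated assumptions. `line_points`, `render_bob` and `build_bob_table` are hypothetical helpers, Bresenham is used here only because speed does not matter when precomputing, and the table makes no attempt to reproduce the 2*54 count mentioned above (it does not exploit symmetry at all).

```python
def line_points(dx, dy):
    """Bresenham from (0, 0) to (dx, dy), for any signs of dx and dy."""
    x, y = 0, 0
    sx = 1 if dx >= 0 else -1
    sy = 1 if dy >= 0 else -1
    ax, ay = abs(dx), abs(dy)
    err = ax - ay
    points = [(x, y)]
    while (x, y) != (dx, dy):
        e2 = 2 * err
        if e2 >= -ay:       # step along x
            err -= ay
            x += sx
        if e2 <= ax:        # step along y
            err += ax
            y += sy
        points.append((x, y))
    return points

def render_bob(dx, dy, size=10):
    """Render the line (0,0)-(dx,dy) into a size x size 1-bit bitmap,
    returned as a list of row integers (bit size-1 = leftmost pixel)."""
    if max(abs(dx), abs(dy)) >= size:
        raise ValueError("line does not fit in the bob")
    ox = -min(0, dx)    # shift so negative deltas still land inside the bob
    oy = -min(0, dy)
    rows = [0] * size
    for x, y in line_points(dx, dy):
        rows[y + oy] |= 1 << (size - 1 - (x + ox))
    return rows

def build_bob_table(size=10):
    """One bob per (dx, dy) with dy >= 0; duplicates are not merged."""
    return {(dx, dy): render_bob(dx, dy, size)
            for dx in range(-(size - 1), size)
            for dy in range(size)}
```

At draw time the line's deltas select the bob, which is then blitted at the line's start position like any other small object.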
