Direct Blitter wait vs Gfx Library

sandruzzo · 29 March 2023, 04:43

Is it true that on a stock A1200, waiting for the blitter with WaitBlit() is faster than a direct wait, since the ROM is 32-bit?

Galahad/FLT · 29 March 2023, 08:43

Quote:

Originally Posted by sandruzzo

Is it true that on a stock A1200, waiting for the blitter with WaitBlit() is faster than a direct wait, since the ROM is 32-bit?

I think you have to consider all the extra instructions you need to execute to get to the system blit wait; those would probably offset any gain.

hooverphonique · 29 March 2023, 11:04

Depends on what you mean by "faster". The gfx lib WaitBlit is multitasking-compatible; doing it directly (i.e. a dff002 polling loop) just burns CPU cycles until the blitter is done.

alkis · 29 March 2023, 11:27

I think the OP is referring to this:

Quote:

Always use the graphics.library WaitBlit() routine for your end of blitter code. It does not change any registers, it takes into account any revision of blitter chip and any unusual circumstances, and on an Amiga 1200 will execute faster (because in 32-bit ROM) than any code that you could write in chipram.

source: https://jvaltane.kapsi.fi/amiga/howtocode/blitter.html

a/b · 29 March 2023, 14:27

That doesn't take the icache and the blitter nasty bit into account. In the best case for using a gfx call, it's situationally better; it really depends on what you are doing.
Realistically, no, it isn't faster than an embedded "optional" read + bit test + branch.
If you are super concerned about compatibility with every chipset in existence, then it makes sense to call WaitBlit(); otherwise it's slower.
You could simply follow the trail, look at the actual ROM code, compare it to what you are doing, and decide which is faster.
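
For reference, the "optional read + bit test + branch" sequence is usually written along these lines. This is a minimal sketch, not code from any post in this thread: the use of A5 for the custom chip base is an arbitrary choice, and the first read is the "optional" one, commonly included as a workaround for early Agnus revisions.

Code:

        lea     $dff000,a5          ; custom chip base (register choice is an assumption)
        tst.w   $002(a5)            ; optional dummy read of DMACONR
.wait:  btst    #6,$002(a5)         ; byte-sized test: bit 6 of the high byte is bit 14 (BBUSY)
        bne.s   .wait               ; still busy, poll again

Because BTST on a memory operand is a byte operation, bit 6 at $dff002 corresponds to bit 14 of the 16-bit DMACONR register.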

paraj · 29 March 2023, 19:16

I checked KS 1.3, 2.0, 3.0 and 3.1, and there seems to be very little overhead in calling WaitBlit. The only difference from what you'd normally write yourself is that 3.0/3.1 do a CIA access before polling DMACONR (if the blit wasn't finished when the routine started); 1.3/2.0 use NOPs instead.

If you're doing a blit where the polling loop might steal cycles from the blitter, then it seems to me that it would nearly always be best to use WaitBlit rather than rolling your own loop. There is little or no benefit on 020+ (for the reasons a/b mentions) if your code is equivalent, but it could be quite a bit faster on 68000.
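
A hand-written loop can be made gentler on the blitter in the same spirit as the 3.0/3.1 behaviour described above, by putting a slow CIA read between polls. The following is only a sketch under that assumption and is not the actual ROM code: the choice of CIA-A PRA at $bfe001 as the dummy read target and of A5 as the custom chip base are illustrative.

Code:

        lea     $dff000,a5          ; custom chip base (register choice is an assumption)
        btst    #6,$002(a5)         ; blit already finished on entry?
        beq.s   .done               ; yes: skip the slow path entirely
.wait:  tst.b   $bfe001             ; dummy CIA-A read, spaces out the DMACONR polls
        btst    #6,$002(a5)         ; test BBUSY (bit 14 of DMACONR)
        bne.s   .wait               ; still busy, go round again
.done:

The idea is that the CPU spends the time between polls on a slow CIA cycle rather than competing with the blitter for chip bus slots. Whether that helps in practice depends on the CPU, the caches, and the blit itself.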

mark_k · 29 March 2023, 20:23

You could put the WaitBlit routine address in an address register, to allow you to do e.g. JSR (A5) to call it, to eliminate a long JMP. And if the call is at the end of a subroutine, do JMP (A5) instead.

Bruce Abbott · 10 April 2023, 01:53

The most efficient WaitBlit() is the one you don't call. If you are worried about how long it will take, you shouldn't be doing it.

Galahad/FLT · 10 April 2023, 11:43

Quote:

Originally Posted by mark_k

You could put the WaitBlit routine address in an address register, to allow you to do e.g. JSR (A5) to call it, to eliminate a long JMP. And if the call is at the end of a subroutine, do JMP (A5) instead.

There would be no long JMP; it would be a -xxx(a6) JSR.
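
For context, the conventional library call being discussed looks roughly like this. It is a sketch under assumptions: GfxBase is a hypothetical variable that has been filled in by OpenLibrary() elsewhere, and the offset is defined inline here only to keep the fragment self-contained (it normally comes from the graphics LVO include file).

Code:

_LVOWaitBlit    EQU     -228                ; library vector offset of WaitBlit()

        move.l  GfxBase,a6                  ; load the graphics.library base
        jsr     _LVOWaitBlit(a6)            ; JSR into the jump table entry
        ; ... the jump table entry is itself a JMP to the real routine ...

GfxBase:        dc.l    0                   ; hypothetical: set up by OpenLibrary() elsewhere

The JSR lands on a six-byte jump table entry (a JMP opcode followed by a 32-bit address), so each call costs one extra jump before the ROM routine itself starts running.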

mark_k · 10 April 2023, 14:32

I meant, having the address of WaitBlit() in a register allows you to avoid the indirect absolute long jump (which involves chip RAM access if GfxBase is in chip), e.g.

Code:

; GfxBase in A6
MOVEA.L (_LVOWaitBlit+2,A6),A5   ; Put address of (ROM) WaitBlit() routine in A5
; ... set up and start blit ...
JSR (A5)
; ... set up next blit ...
JSR (A5)
; ... set up last blit ...
JMP (A5)     ; Instead of JSR then RTS
