How to code shadebobs with source.

Lekman · 27 December 2013, 18:20

I was very proud back in '95 when I figured out how to use the blitter to add. I have Google-translated my comments, so excuse my bad English :o

The routine is based on the binary system: to add one, flip the first bit (EOR with 1). If the bit is now 1, stop; if it is 0, continue with the next bit. This goes on until there are no more bits (bitplanes) or a bit becomes 1. First I EOR the first bitplane with the mask (the bob). Then I create a new mask in which the bits that became 1 are masked off. I do the same with all the bitplanes. This routine is optimized so that I do not create a new mask for each plane. Instead I use a combination of the mask and NOT of the plane that has not been masked off yet. That saves two blits, but since those blits also use the C channel, the net saving is about the time of one blit.
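As a rough illustration of the idea described above, here is a small Python model. It is not Amiga code and not the routine from this post: each bitplane is modeled as an integer whose bits are pixels, and the names `add_one_masked` and `pixel_value` are made up for this sketch. It also builds a fresh carry mask for every plane, which is the unoptimized form of the method.

```python
def add_one_masked(planes, mask):
    """Add 1 to every pixel selected by `mask`, wrapping at 2**len(planes).

    `planes[0]` is the least significant bitplane; each plane is an int
    whose bits are pixels. Returns the new list of planes.
    """
    out = []
    carry = mask                   # pixels that still need a bit flipped
    for plane in planes:
        new_plane = plane ^ carry  # flip the bit wherever a carry arrives
        carry &= ~new_plane        # carry continues only where the bit became 0
        out.append(new_plane)
    return out


def pixel_value(planes, x):
    """Read the colour index of pixel `x` back out of the planes."""
    return sum(((p >> x) & 1) << i for i, p in enumerate(planes))


# Eight pixels holding the values 0..7 in three bitplanes.
planes = [0b10101010, 0b11001100, 0b11110000]
mask = 0b11110101                  # pixels 1 and 3 lie outside the bob
planes = add_one_masked(planes, mask)
values = [pixel_value(planes, x) for x in range(8)]
```

Pixels under the mask move up by one colour (7 wraps to 0), and pixels outside the mask keep their value.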

Here it goes:)

Code:

MakeShade:	move.w	XPos(pc),d0
		moveq	#$f,d7		; Mask for shift
		and.b	d0,d7		; d7=Shift value
		lsr.w	#3,d0		; /8
		and.b	#$fe,d0		; Word Pos
		
		move.w	YPos(pc),d1
		lea	YMulTable(pc),a0
		add.w	d1,d1
		move.w	(a0,d1.w),d1	; d1=YPos*ScreenWide
		add.w	d0,d1		; d1=BlitPos

		lea	Screen,a0
		add.w	d1,a0		; BlitPos
		lea	BobData(pc),a1
		lea	MaskBuffer(pc),a2
		
		ror.w	#4,d7		; Shift it to right pos

		moveq	#0,d0		; Used to clear registers
		moveq	#ScreenWide-BobWide,d1	; BLTXMOD
		move.w	#(BobTall<<6)+(BobWide/2),d2 ; BLTSIZE
					;                    _   _
		move.w	#$0d3c,d3	; Use: ABD, Minterm: AB+AB
					;                     _ _  _
		move.w	#$0fb4,d4	; Use: ABCD, Minterm: ABC+AB+AC
					;                     _ _
		move.w	#$0f04,d5	; Use: ABCD, Minterm: ABC

; EOR plane1 with BobData
		
		tst.b	(a6)		; Agnus Bug
.wait		btst.b	#14-8,DMACONR(a6)
		bne.s	.wait

		move.l	a0,BLTAPTH(a6)	; Plane1
		move.l	a1,BLTBPTH(a6)	; BobData
		move.l	a0,BLTDPTH(a6)	; Plane1 (Destination)
		move.w	d3,BLTCON0(a6)	; EOR
		move.w	d7,BLTCON1(a6)
		move.l	#-1,BLTAFWM(a6)	; No mask (hit both masks at once)
		move.w	d0,BLTBMOD(a6)
		move.w	d1,BLTAMOD(a6)	; Clear BMOD, AMOD=ScreenWide-BobWide
		move.w	d1,BLTDMOD(a6)
		move.w	d2,BLTSIZE(a6)	; (BobTall<<6)+(BobWide/2)
				
; EOR plane 2 with NOT(PLANE1) and use BobData as mask

		lea	(a0),a3			; Plane1
		lea	ScreenSize(a0),a0	; Plane 2

		tst.b	(a6)		; Agnus Bug
.wait2		btst.b	#14-8,DMACONR(a6)
		bne.s	.wait2

		move.l	a0,BLTAPTH(a6)	; Plane2
		move.l	a1,BLTBPTH(a6)	; BobData
		move.l	a3,BLTCPTH(a6)	; Plane1
		move.l	a0,BLTDPTH(a6)	; Plane2 (Destination)
		move.w	d4,BLTCON0(a6)
		move.w	d1,BLTCMOD(a6)	; ScreenWide-BobWide
		move.w	d2,BLTSIZE(a6)	; (BobTall<<6)+(BobWide/2)

; AND (NOT(Plane1) AND NOT(Plane2)) with BobData so that all the pixels that
; were 1 in plane 1 or 2 are excluded for the next plane.

		tst.b	(a6)		; Agnus Bug
.wait3		btst.b	#14-8,DMACONR(a6)
		bne.s	.wait3

		move.l	a3,BLTAPTH(a6)	; Plane1
		move.l	a1,BLTBPTH(a6)	; BobData
		move.l	a0,BLTCPTH(a6)	; Plane2
		move.l	a2,BLTDPTH(a6)	; MaskBuff (Destination)
		move.w	d5,BLTCON0(a6)
		move.w	d0,BLTDMOD(a6)	; Clear
		move.w	d2,BLTSIZE(a6)	; (BobTall<<6)+(BobWide/2)

; Here I EOR plane 3 with MaskBuff

		lea	ScreenSize(a0),a0	; Plane 3

		tst.b	(a6)		; Agnus Bug
.wait4		btst.b	#14-8,DMACONR(a6)
		bne.s	.wait4

		move.l	a0,BLTAPTH(a6)	; Plane3
		move.l	a2,BLTBPTH(a6)	; MaskBuff
		move.l	a0,BLTDPTH(a6)	; Plane3 (Destination)
		move.w	d3,BLTCON0(a6)	; EOR
		move.w	d0,BLTCON1(a6)	; Clear
		move.w	d1,BLTDMOD(a6)	; ScreenWide-BobWide
		move.w	d2,BLTSIZE(a6)	; (BobTall<<6)+(BobWide/2)

; Here I EOR plane 4 with (NOT(Plane3) AND MaskBuff). Plane 4 is left
; unchanged where MaskBuff is 0 and also where plane 3 is 1.

		lea	(a0),a3			; Plane 3
		lea	ScreenSize(a0),a0	; Plane 4

		tst.b	(a6)		; Agnus Bug
.wait5		btst.b	#14-8,DMACONR(a6)
		bne.s	.wait5

		move.l	a0,BLTAPTH(a6)	; Plane4
		move.l	a2,BLTBPTH(a6)	; MaskBuff
		move.l	a3,BLTCPTH(a6)	; Plane3
		move.l	a0,BLTDPTH(a6)	; Plane4 (Destination)
		move.w	d4,BLTCON0(a6)
		move.w	d2,BLTSIZE(a6)	; (BobTall<<6)+(BobWide/2)

; Here I AND (NOT(Plane3) AND NOT(Plane4)) with MaskBuff so that all the pixels
; that were 1 in plane 3 or 4 are excluded for the next plane.

		tst.b	(a6)		; Agnus Bug
.wait6		btst.b	#14-8,DMACONR(a6)
		bne.s	.wait6

		move.l	a3,BLTAPTH(a6)	; Plane 3
		move.l	a2,BLTBPTH(a6)	; MaskBuff
		move.l	a0,BLTCPTH(a6)	; Plane 4
		move.l	a2,BLTDPTH(a6)	; MaskBuff (Destination)
		move.w	d5,BLTCON0(a6)
		move.w	d0,BLTDMOD(a6)	; Clear
		move.w	d2,BLTSIZE(a6)	; (BobTall<<6)+(BobWide/2)
		
; Here I EOR plane 5 with MaskBuff

		lea	ScreenSize(a0),a0	; Plane 5

		tst.b	(a6)		; Agnus Bug
.wait7		btst.b	#14-8,DMACONR(a6)
		bne.s	.wait7

		move.l	a0,BLTAPTH(a6)	; Plane5
		move.l	a2,BLTBPTH(a6)	; MaskBuff
		move.l	a0,BLTDPTH(a6)	; Plane5 (Destination)
		move.w	d3,BLTCON0(a6)	; EOR
		move.w	d0,BLTCON1(a6)	; Clear
		move.w	d1,BLTDMOD(a6)	; ScreenWide-BobWide
		move.w	d2,BLTSIZE(a6)	; (BobTall<<6)+(BobWide/2)
		rts

****************************************************************************
** Set Up Mouse:
****************************************************************************

SetUpMouse:	move.w	JOY0DAT(a6),d0	; Set the Mouse Position
		move.b	d0,OldX		;
		lsr.w	#8,d0		;
		move.b	d0,OldY		;
		lea	XPos(pc),a0
		move.w	#152,(a0)+
		move.w	#120,(a0)
		rts

*******************************************************************************
**Make Y MulTable: Precomputes ScreenWide * Y for every Y in the range
**0 to ScreenTall-1. For example, if ScreenTall is 256, the routine stores
**the products of ScreenWide with 0-255 (256 entries).
*******************************************************************************

MakeYMulTable:	lea	YMulTable(pc),a0
		move.w	#ScreenTall-1,d0
		moveq	#0,d1
		moveq	#ScreenWide,d2		
.loop		move.w	d1,(a0)+
		add.w	d2,d1
		dbf	d0,.loop
		rts
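The three minterm bytes used in MakeShade ($3c, $b4 and $04) can be checked against the logic described in the comments with a short Python sketch. It assumes the usual minterm layout, where bit number 4*A + 2*B + C of the byte gives the output for that input combination; the helper name `blitter_logic` is made up for this example.

```python
from itertools import product


def blitter_logic(lf, a, b, c):
    """Evaluate the 8-bit minterm byte `lf` for one-bit inputs a, b, c."""
    return (lf >> ((a << 2) | (b << 1) | c)) & 1


# The logic each minterm byte is expected to implement, per the comments.
EXPECTED = {
    0x3C: lambda a, b, c: a ^ b,             # plane EOR mask
    0xB4: lambda a, b, c: a ^ (b & ~c & 1),  # plane EOR (mask AND NOT lower plane)
    0x04: lambda a, b, c: ~a & b & ~c & 1,   # NOT A AND B AND NOT C (next mask)
}

# True for a minterm byte if it matches on all eight input combinations.
results = {
    lf: all(blitter_logic(lf, a, b, c) == f(a, b, c)
            for a, b, c in product((0, 1), repeat=3))
    for lf, f in EXPECTED.items()
}
```

Every entry in `results` comes out `True`, so the minterm values agree with the overbar notation in the listing.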

Lekman · 29 December 2013, 13:39

There may be a faster method discussed here: http://ada.untergrund.net/forum/inde...um=4&topic=270

And of course you can use copper blits to free the CPU.

copse · 29 December 2013, 20:38

Thanks for sharing the code. There's one comment at the end which you may as well translate, just so it's all in English :-)

Lekman · 29 December 2013, 20:44

I will do.

The framework is old, so it may crash on some systems. It was developed on my Amiga 600.

Leffmann · 29 December 2013, 21:25

Quote:

Originally Posted by Lekman

There may be a faster method discussed here: http://ada.untergrund.net/forum/inde...um=4&topic=270

Yep, using a simplified LFSR (linear feedback shift register) should be a great way to optimize this graphics effect! An LFSR is basically a sequence of bits which we repeatedly shift one step, using an XOR of some of the bits to determine the value of the new input bit, resulting in a deterministic series of numbers.

The idea behind this effect is to use a pattern to draw on the screen, but instead of drawing in a single color, we cycle the underlying pixels through a gradient. The solution on Amiga is to use a palette with gradients, and increase the value of the pixels we draw on by 1. The best-looking example of this on Amiga is probably in Hardwired. I think the original inspiration may have been the "Shade" drawing mode in Deluxe Paint, which does exactly this.

The traditional method on Amiga has been to do binary addition using the Blitter or the CPU. Since we don't care about adding together any arbitrary numbers, we can optimize for this special case of always adding 1, and this is exactly what Lekman's code does.

This can in a way be thought of as a Galois LFSR, but we can make a simpler LFSR which is much faster:

The realization is that we don't care about the pixel values specifically increasing from 1, 2, 3 to 4 and so on. What we really need is just for the pixels to enumerate over all values, in any order. The order doesn't matter because we can always arrange the palette accordingly so the colors look nice and flush on the screen.

One fast configuration of a 4-bit LFSR like this is, for example, the following, where A through D represent bits 0 through 3:
A' = A^D
B' = A
C' = B
D' = C

which says that you set the input to bit 0 XOR bit 3, and shift, or in the context of Amiga bitplanes, copy bitplane 3 to bitplane 4, 2 to 3, 1 to 2, and set bitplane 1 to bitplane 1 XOR bitplane 4.
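To make the 4-bit configuration concrete, here is a small Python sketch that steps a single pixel value through it and derives a palette ordering from the resulting sequence. The function names and the `gradient_index` table are made up for this illustration; they are not from any Amiga source.

```python
def lfsr4_step(v):
    """One step of the 4-bit register: A' = A^D, B' = A, C' = B, D' = C."""
    a = v & 1                # bit 0
    d = (v >> 3) & 1         # bit 3
    return ((v << 1) & 0b1110) | (a ^ d)


def lfsr_cycle(step, start=1):
    """Collect the values visited from `start` until the sequence repeats."""
    seq = [start]
    v = step(start)
    while v != start:
        seq.append(v)
        v = step(v)
    return seq


cycle = lfsr_cycle(lfsr4_step)

# Colour register v should hold gradient step gradient_index[v], so that
# successive LFSR steps walk through the gradient in order.
gradient_index = {v: i for i, v in enumerate(cycle)}
```

The cycle visits 15 distinct values and never reaches 0, which is why the palette has to be arranged by position in the cycle rather than by numeric pixel value.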

To reverse this cycle, try for example:
A' = B
B' = C
C' = D
D' = A^B

Or, for a period of 2^5-1 (i.e. 5 bitplanes):
A' = B^E
B' = A
C' = B
D' = C
E' = D

For a period of 2^6-1 (i.e. 6 bitplanes):
A' = E^F
B' = A
C' = B
D' = C
E' = D
F' = E

One flaw of this is that the value 0 is never enumerated (a pixel that is 0 stays 0, so the pixels must start from a non-zero value), but this is negligible.
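The periods of the configurations listed above, and the claim that the second 4-bit configuration runs the first one backwards, can be checked with a brute-force Python sketch. The tap tuples are read straight off the equations (bit 0 = A, bit 1 = B, and so on); `make_step`, `period` and `lfsr4_reverse` are names invented for this example.

```python
def make_step(nbits, taps):
    """Shift-left LFSR: new bit 0 = XOR of the tapped bits, bit k+1 = old bit k."""
    mask = (1 << nbits) - 1

    def step(v):
        fb = 0
        for t in taps:
            fb ^= (v >> t) & 1
        return ((v << 1) & mask) | fb

    return step


def period(step, start=1):
    """Number of steps until the register returns to `start`."""
    n, v = 1, step(start)
    while v != start:
        v = step(v)
        n += 1
    return n


def lfsr4_reverse(v):
    """A' = B, B' = C, C' = D, D' = A^B: shift right, new top bit = A^B."""
    fb = (v ^ (v >> 1)) & 1
    return (v >> 1) | (fb << 3)


# Taps: 4-bit A^D, 5-bit B^E, 6-bit E^F.
CONFIGS = {4: (0, 3), 5: (1, 4), 6: (4, 5)}
periods = {n: period(make_step(n, taps)) for n, taps in CONFIGS.items()}

forward4 = make_step(4, (0, 3))
reversible = all(lfsr4_reverse(forward4(v)) == v for v in range(16))
```

With these taps each register walks through every non-zero value before repeating, and the reverse step undoes the forward step for all 16 values.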

Lekman · 29 December 2013, 21:55

Quote:

which says that you set the input to bit 0 XOR bit 3, and shift, or in the context of Amiga bitplanes, copy bitplane 3 to bitplane 4, 2 to 3, 1 to 2, and set bitplane 1 to bitplane 1 XOR bitplane 4.

Just to be sure: copy bitplane 4 to a buffer first, and finally XOR bitplane 1 with the buffer?

Leffmann · 30 December 2013, 17:37

Yes that should work. Just look at the equations and you can figure out the best order of operations.
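One possible order of operations, sketched in Python with each bitplane modeled as an integer and a shape mask selecting the pixels to change. This is only a model of the plane-by-plane ordering discussed here, not blitter code, and `lfsr4_planes` is a made-up name.

```python
def lfsr4_planes(p1, p2, p3, p4, mask):
    """Apply A' = A^D, B' = A, C' = B, D' = C to the pixels under `mask`."""
    def put(dst, src):
        # Masked copy: take `src` under the mask, keep `dst` elsewhere.
        return (dst & ~mask) | (src & mask)

    buf = p4              # save old plane 4 before it is overwritten
    p4 = put(p4, p3)      # D' = C
    p3 = put(p3, p2)      # C' = B
    p2 = put(p2, p1)      # B' = A
    p1 ^= buf & mask      # A' = A ^ D, using the saved copy of D
    return p1, p2, p3, p4


# Sixteen pixels holding the values 0..15; pixel 3 lies outside the mask.
planes = lfsr4_planes(0xAAAA, 0xCCCC, 0xF0F0, 0xFF00, mask=0xFFF7)
values = [sum(((p >> x) & 1) << i for i, p in enumerate(planes))
          for x in range(16)]
```

Copying from the highest plane downwards means only plane 4 needs to be saved, because every other source plane is read before it is overwritten.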

I have it running in 32 colors now, but implemented like this using the Blitter it's not that much faster actually. It's probably much more efficient if you do it with the CPU, or in a situation where you do square blits without the shape mask. On the upside, you can do it all in just 2 blits, regardless of number of bitplanes.

Lekman · 30 December 2013, 20:24

With a simple mask you can use interleaved bitplanes (2 blits), but with real-time calculated vector shades you need 6 blits. I will write that routine soon, using the CPU to draw lines and fill/clear the vector. I remember my old 5-bitplane vector shade routine was too slow, and you can't double-buffer a shadebob routine. I've forgotten a lot since 1997, but I remember more and more. I took the Amiga down from the attic a month ago; my HDD with the latest sources was broken, of course. But I had some backups on floppies.

DanScott · 11 March 2018, 21:57

Apologies for resurrecting this old thread.

Is it possible to clamp a simple 4-bit LFSR to the final output value somehow?
