English Amiga Board


Old 30 September 2011, 20:43   #1
Thorham
Computer Nerd
 
 
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,767
Layered tile engine optimizing.

Hi,

I've been working on my Advance Wars 2 conversion and have written a new layered tile engine (the old one sucked) that needs to run as fast as possible. Basically, the lowest target is an A1200 with some trapdoor fastmem (2 MB?) and an HD.

The question is: can the code below be optimized further? I think it's fast enough already (haven't tested it yet), but some input on the subject would be greatly appreciated. The code should be optimized for 68020s and 68030s; anything above will run this fast enough anyway.

The code simply reads four layers of 16x16 pixel bitmaps with masks (except the first layer, which has none). The masks are interleaved into the bitmap data, so you read 32 mask bits and then 32 tile bits (the routine reads two lines of 16 pixels at a time). This is done twice, once for each of two tiles, followed by a simple transpose (from Kalms' c2p), and the two resulting longwords are then written to chipmem.
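
As a rough C model of the layering step described above (a sketch only: `MaskedLong` and `composite` are made-up names for illustration, and the convention that a mask bit of 1 keeps what is underneath is read off the and/or order in the routine):

```c
#include <stdint.h>

/* Hypothetical view of one longword step of a masked layer:
   32 mask bits followed by 32 tile bits, as in the interleaved data. */
typedef struct {
    uint32_t mask;  /* 1 = keep the layers underneath, 0 = cut a hole */
    uint32_t data;  /* tile bits that are OR-ed into the result */
} MaskedLong;

/* Combine the unmasked bottom layer with `count` masked layers,
   bottom to top, using the same and/or order as the assembly. */
static uint32_t composite(uint32_t base, const MaskedLong *layers, int count)
{
    uint32_t out = base;
    for (int i = 0; i < count; i++) {
        out &= layers[i].mask;  /* and.l (aN)+,d0 */
        out |= layers[i].data;  /* or.l  (aN)+,d0 */
    }
    return out;
}
```

For example, a base of 0xFFFFFFFF under the layers {0xFFFF0000, 0x00001234}, {0xFFFFFFFF, 0} and {0x0FFFFFFF, 0xA0000000} comes out as 0xAFFF1234. The assembly unrolls this loop for three masked layers and runs it once per tile.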

Note that this routine does not handle movement of individual sprites, everything is simply 16x16 pixel aligned. The required frame rate is about 6 or 7 frames per second (I'll write other code for things that require super smoothness).
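
To put a number on that frame rate, here is a back-of-the-envelope budget in C. It is only a sketch under stated assumptions: a stock PAL A1200 clock of about 14.18758 MHz, the loop counts taken from the routine below (160 x 8 x 8 passes through the inner loop per update), and no allowance for setup code, memory wait states or anything else running. The function names are made up for illustration.

```c
/* Inner-loop passes per screen update in the posted routine. */
static long passes_per_update(void)
{
    return 160L * 8L * 8L;
}

/* CPU cycles available per inner-loop pass at a given clock and frame rate. */
static double cycles_per_pass(double clock_hz, double fps, long passes)
{
    return clock_hz / fps / (double)passes;
}
```

With those assumptions, `cycles_per_pass(14187580.0, 7.0, passes_per_update())` gives roughly 198 cycles per pass, and about 231 at 6 frames per second. That is an upper bound, since the rest of the game needs CPU time too.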

If anyone can see a way to do it better, then let's hear it! Any questions? Please ask, and sorry about the sparse comments.

Code:
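update
	movem.l	d0-a6,-(sp)		; save all registers

	; Locals on the stack: (sp) = tile pair counter, 4(sp) = screen map
	; position, 8(sp) = bank table pointer. All three are popped at .exit.
	; a6 (chipmem destination) and d4 (destination step) are not set in
	; this listing and must already be valid here.
	move.l	gfx_bank_table,-(sp)
	move.l	screen_map,-(sp)
	move.l	#10240-16*4,d3	; may be wrong, check

	move.l	#160,-(sp)		; 160 passes, two tiles per pass
.loopz
	move.l	4(sp),a5		; a5 = screen map position
	move.l	8(sp),a4		; a4 = bank table

	; First tile: pointer = bank table entry + screen map entry
	move.l	(a4)+,a0
	add.l	(a5)+,a0		; a0 = layer 0 (no mask)
	move.l	(a4)+,a1
	add.l	(a5)+,a1		; a1 = layer 1
	move.l	(a4)+,a2
	add.l	(a5)+,a2		; a2 = layer 2
	move.l	(a4)+,d2
	add.l	(a5)+,d2		; d2 = layer 3 (parked in a data register)

	; Second tile
	move.l	(a4)+,a3
	add.l	(a5)+,a3		; a3 = layer 0 (no mask)
	move.l	(a4)+,d0
	add.l	(a5)+,d0		; layer 1, moved to a4 below
	move.l	(a4)+,d1
	add.l	(a5)+,d1		; layer 2, moved to a5 below
	move.l	(a4)+,d5
	add.l	(a5)+,d5		; d5 = layer 3 (parked in a data register)

	move.l	d0,a4			; a4 = layer 1
	move.l	a5,4(sp)		; save screen map position for the next pass
	move.l	d1,a5			; a5 = layer 2

	moveq	#8-1,d6
.loopy
	moveq	#8-1,d7
.loopx
	; First tile: layer 0, then mask/data pairs of layers 1 to 3
	move.l	(a0)+,d0
	and.l	(a1)+,d0
	or.l	(a1)+,d0
	and.l	(a2)+,d0
	or.l	(a2)+,d0
	exg	d2,a2			; a2 = layer 3, layer 2 parked in d2
	and.l	(a2)+,d0
	or.l	(a2)+,d0

	; Second tile, same scheme
	move.l	(a3)+,d1
	and.l	(a4)+,d1
	or.l	(a4)+,d1
	and.l	(a5)+,d1
	or.l	(a5)+,d1
	exg	d5,a5			; a5 = layer 3, layer 2 parked in d5
	and.l	(a5)+,d1
	or.l	(a5)+,d1

	; Transpose: exchange the low word of d0 with the high word of d1
	swap	d1
	eor.w	d0,d1			; three-step xor exchange of the low words
	eor.w	d1,d0			; d0 = high words of both tiles
	move.l	d0,(a6)
	add.l	d4,a6
	exg	d2,a2			; back to a2 = layer 2, d2 = layer 3
	eor.w	d0,d1
	swap	d1			; d1 = low words of both tiles
	move.l	d1,(a6)
	add.l	d4,a6
	exg	d5,a5			; back to a5 = layer 2, d5 = layer 3
.nextx
	dbra	d7,.loopx
	add.l	d3,a6
.nexty
	dbra	d6,.loopy
	sub.l	#81920+40*16-4,a6	; may be wrong, check
.nextz
	move.l	(sp),d0			; count down the tile pairs
	subq.l	#1,d0
	move.l	d0,(sp)
	bne	.loopz
.exit
	add.l	#12,sp			; drop the three locals
	movem.l	(sp)+,d0-a6
	rts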
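
The swap/eor.w sequence near the end of .loopx can be checked with a small C model. This is only a sketch: `swap16`, `eor_w` and `merge_words` are made-up helper names that imitate the 68k instructions on 32-bit values, they are not part of the routine.

```c
#include <stdint.h>

/* swap Dn: exchange the two 16-bit halves of a 32-bit register. */
static uint32_t swap16(uint32_t v)
{
    return (v << 16) | (v >> 16);
}

/* eor.w src,dst: XOR the low words only, the high word of dst is untouched. */
static uint32_t eor_w(uint32_t src, uint32_t dst)
{
    return dst ^ (src & 0xFFFFu);
}

/* Same instruction order as the inner loop. On entry d0 and d1 hold the
   composited longwords of the first and second tile. */
static void merge_words(uint32_t *d0, uint32_t *d1)
{
    *d1 = swap16(*d1);
    *d1 = eor_w(*d0, *d1);
    *d0 = eor_w(*d1, *d0);  /* d0 is final here: high words of both tiles */
    *d1 = eor_w(*d0, *d1);
    *d1 = swap16(*d1);      /* d1: low words of both tiles */
}
```

Starting from d0 = 0xAAAA1111 and d1 = 0xBBBB2222, this ends with d0 = 0xAAAABBBB and d1 = 0x11112222, so each output longword holds one 16-pixel line from each of the two tiles.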

Last edited by Thorham; 01 October 2011 at 00:49.