English Amiga Board


Old 26 November 2019, 23:25   #41
TCH
Newbie Amiga programmer
 
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
@a/b, @Don_Adan:
Thanks, this latest version is finally faster than the C one, by ~3.6%.
Old 27 November 2019, 11:38   #42
deimos
It's coming back!
 
Join Date: Jul 2018
Location: comp.sys.amiga
Posts: 762
Quote:
Originally Posted by TCH View Post
@a/b, @Don_Adan:
Thanks, this latest version is finally faster than the C one, by ~3.6%.
Am I missing something, or is the C version within 3 to 4% of the current best assembly version? Which C compiler and what flags?
Old 27 November 2019, 11:49   #43
Antiriad_UK
OCS forever!
 
Join Date: Mar 2019
Location: Birmingham, UK
Posts: 418
Makes me want to ditch assembler and go to C (which I'm better at anyway, lol).
Old 27 November 2019, 12:17   #44
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,474
This is a rather particular case, due to the fact that the compiler generates perfect code in the main loop.
But yes, latest GCC is very good on 68k (I only saw the generated code, my laziness prevented me from installing it for Amiga).
I suppose it's GCC, because I compiled on x86 and the generated asm code is pretty much the same.
Old 27 November 2019, 12:33   #45
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,975
You can check this version if you want, but it is perhaps the same speed.
Code:
_PolygonBitmapToPlanes32:
	movem.l	d2-d7/a2-a6,-(a7)

	move.l	d2,a6
	add.l	a6,a6
	add.l	a6,a6		; a6 = Modulo<<2 = (BitplaneSize-Width)<<2
	add.w	d1,d2
	mulu.w	d4,d2
	lsl.l	#2,d2		; longwords to bytes
	sub.l	d3,d2		; d2 = Depth*BitplaneSize-RowSize
	subq.w	#1,d0		; Height--;
	subq.w	#1,d1		; Width--;
	subq.w	#1,d4		; Depth--;
	swap	d4		; upper word of d4 = Depth
	move.w	d1,d4		; lower word of d4 = Width

	ext.l	d0		; d0.w cannot be negative? If it can, use and.l #$ffff,d0
c_h:
	swap	d0		; lower word = pattern offset, upper word = Height
	move.w	d0,a4
	add.l	a2,a4		; a4 = a2 + pattern offset
	eor.w	#8<<2,d0	; alternate between 0 and 8<<2
	move.l	d4,d1
	swap	d1		; PlaneCounter = Depth;
c_p:
	movea.l	a1,a3		; SrcPtr = TempArea;
	move.l	(a4)+,a5	; CurrentPattern
	move.w	d4,d5		; WidthCounter = Width-1;
c_w:
	move.l	(a0),d7
	move.l	a5,d6
	eor.l	d7,d6
	and.l	(a3)+,d6
	eor.l	d7,d6
	move.l	d6,(a0)+	; *DestPtr++ = (*DestPtr&~Temp)|(CurrentPattern&Temp);
	dbf	d5,c_w		; if (--WidthCounter >= 0) goto c_w;

	adda.l	a6,a0		; DestArea += (BitplaneSize-Width)<<2;
	dbf	d1,c_p		; if (--PlaneCounter >= 0) goto c_p;

	sub.l	d2,a0		; DestArea += RowSize-Depth*BitplaneSize;
	adda.l	d3,a1		; TempArea += RowSize;
	swap	d0		; Height back into the lower word for dbf
	dbf	d0,c_h		; if (--Height >= 0) goto c_h;

	movem.l	(a7)+,d2-d7/a2-a6
	rts
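For readers following along in C: the eor/and/eor sequence in the inner loop is a masked merge that avoids a separate NOT and OR. Below is a minimal sketch of the two equivalent forms, assuming 32-bit words. The function names are made up for illustration and do not come from TCH's source.

```c
#include <stdint.h>

/* Form from the comment in the listing: keep dest where the mask bit
   is 0, take pattern where the mask bit is 1. */
static uint32_t merge_and_or(uint32_t dest, uint32_t pattern, uint32_t mask)
{
    return (dest & ~mask) | (pattern & mask);
}

/* Form the eor/and/eor instructions compute: the XOR marks the bits
   where dest and pattern differ, the AND keeps only the differing bits
   selected by the mask, and the final XOR flips exactly those bits. */
static uint32_t merge_xor(uint32_t dest, uint32_t pattern, uint32_t mask)
{
    return ((dest ^ pattern) & mask) ^ dest;
}
```

Both functions return the same value for every input, so a C version can use either one; which of them a given compiler turns into the shorter 68k sequence has to be checked in the generated assembly.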
Old 27 November 2019, 12:50   #46
TCH
Newbie Amiga programmer
 
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
@deimos:
Yes, bebbo's GCC 6 is producing these fast results with -O2.

@Don_Adan:
It's faster by around 0.1% than your previous version, thanks.
Old 27 November 2019, 13:10   #47
deimos
It's coming back!
 
Join Date: Jul 2018
Location: comp.sys.amiga
Posts: 762
Quote:
Originally Posted by TCH View Post
@deimos:
Yes, bebbo's GCC 6 is producing these fast results with -O2.
I'd be interested to see if GCC 8.3 can do even better. Would it be hard to try?
Old 27 November 2019, 13:30   #48
TCH
Newbie Amiga programmer
 
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
No idea. Where can I get GCC 8.3?
Old 27 November 2019, 13:33   #49
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,474
Quote:
Originally Posted by TCH View Post
No idea. Where can I get GCC 8.3?
https://github.com/BartmanAbyss/vscode-amiga-debug
Old 27 November 2019, 13:35   #50
Antiriad_UK
OCS forever!
 
Join Date: Mar 2019
Location: Birmingham, UK
Posts: 418
The Bartman lecture said they went through various compilers and that 8.3 was pretty sweet. https://www.twitch.tv/videos/468413972?t=02h20m09s

I'd probably still stay with assembler; it's part of the charm of retro coding on an A500.
Old 27 November 2019, 13:37   #51
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,474
Quote:
Originally Posted by Antiriad_UK View Post
I'd probably still stay with assembler; it's part of the charm of retro coding on an A500.
This
Old 27 November 2019, 13:37   #52
deimos
It's coming back!
 
Join Date: Jul 2018
Location: comp.sys.amiga
Posts: 762
Quote:
Originally Posted by TCH View Post
No idea. Where can I get GCC 8.3?
What ross said, but here's the original thread about it too: http://eab.abime.net/showthread.php?t=98525

It's not as established as bebbo's, and I'm not sure if I find the VS Code integration all that useful, but 8 > 6?
Old 27 November 2019, 14:14   #53
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,975
Quote:
Originally Posted by TCH View Post
@deimos:
Yes, bebbo's GCC 6 is producing these fast results with -O2.

@Don_Adan:
It's faster by around 0.1% than your previous version, thanks.
Interesting. Something must be 2-4 cycles faster. Is it (SP) vs. two swaps, and/or lea vs. add/move?
Old 27 November 2019, 14:35   #54
TCH
Newbie Amiga programmer
 
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
@ross, @deimos:

This seems to be Windows-only. Am I missing something?

@Don_Adan:
I think it's because you spared the stack operations on d2. As for d0, it can be negative, as it is a coordinate.
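The point about d0 matters because the listing packs two values into one register: after the swap at c_h, the lower word of d0 is used as the pattern offset, so the upper word must start out as 0. Below is a minimal C model of the three instructions involved. The helper names are hypothetical, and the cast to int16_t assumes the usual two's-complement behaviour.

```c
#include <stdint.h>

/* ext.l d0: sign-extend the low 16 bits to 32 bits. */
static uint32_t ext_l(uint16_t w)
{
    return (uint32_t)(int32_t)(int16_t)w;
}

/* and.l #$ffff,d0: zero-extend, so the upper word is always 0. */
static uint32_t and_l_ffff(uint32_t d)
{
    return d & 0xFFFFu;
}

/* swap d0: exchange the upper and lower 16-bit halves. */
static uint32_t swap_words(uint32_t d)
{
    return (d << 16) | (d >> 16);
}
```

With a non-negative word, `swap_words(ext_l(w))` leaves 0 in the lower word, which is the initial pattern offset the loop expects. With a negative word, the sign extension fills the upper word with ones, so the offset after the swap is $ffff instead of 0; masking with `and_l_ffff` avoids that.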
Old 27 November 2019, 14:37   #55
deimos
It's coming back!
 
Join Date: Jul 2018
Location: comp.sys.amiga
Posts: 762
Quote:
Originally Posted by TCH View Post
@ross, @deimos:

This seems to be Windows-only. Am I missing something?
No, all the cool kids use Windows nowadays.
Old 27 November 2019, 15:10   #56
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,474
Quote:
Originally Posted by TCH View Post
@ross, @deimos:

This seems to be Windows-only. Am I missing something?
No, you have to suffer.
Old 27 November 2019, 15:24   #57
hooverphonique
ex. demoscener "Bigmama"
 
Join Date: Jun 2012
Location: Fyn / Denmark
Posts: 1,624
Quote:
Originally Posted by deimos View Post
I'd be interested to see if GCC 8.3 can do even better. Would it be hard to try?
According to Bartman, it can. After benchmarking GCC 4, 6, and 8 (I think, or maybe it was 6/7/8), he deemed that only 8 was good enough for demo making, and thus did the 8.3 VS Code thing.

EDIT: Basically what Antiriad said
Old 27 November 2019, 15:28   #58
a/b
Registered User
 
Join Date: Jun 2016
Location: europe
Posts: 1,039
Here is the thing with C vs. asm. And it's not meant as a critique of Bartman, the original poster, or anyone else. It's good to have more people working with the Amiga, regardless of the language.
Take a simple 16->1 loop. You can write it in C in at least 16 different ways: for, while, do/while, predec, postdec, ...
It *does* matter. You cannot simply write C code and assume the compiler will produce optimal code. You kind of have to give it hints, as has been demonstrated in this thread and the other one (XOR fill optimization), such as adjusting the counter to lead the compiler to use dbf.
And experienced asm coders generally do that on the fly. They see what the 'optimal' code should look like in asm and they write similar C constructs. Less experienced people don't do that, and the output can be moderately slower, even if the compiler is pretty good.
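As an illustration of "adjusting the counter", here is a minimal sketch of the same summing loop written in two ways. The function names are invented for this example, and whether a particular GCC version actually emits dbf for the second form is an assumption that has to be verified in the generated assembly.

```c
#include <stdint.h>

/* Count-up form: the loop condition compares i against n on every pass. */
static uint32_t sum_count_up(const uint32_t *p, uint16_t n)
{
    uint32_t sum = 0;
    for (uint16_t i = 0; i < n; i++)
        sum += p[i];
    return sum;
}

/* Count-down form shaped like dbf: a 16-bit counter preloaded with n-1,
   decremented at the bottom, looping until it drops below zero.
   Assumes 1 <= n <= 32768. */
static uint32_t sum_count_down(const uint32_t *p, uint16_t n)
{
    uint32_t sum = 0;
    int16_t count = (int16_t)(n - 1);
    do {
        sum += *p++;
    } while (--count >= 0);
    return sum;
}
```

The second form mirrors the `if (--WidthCounter >= 0) goto c_w;` comments in the assembly listings earlier in the thread: the counter is a word, it starts at count-1, and the test sits at the bottom of the loop.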

Last edited by a/b; 27 November 2019 at 15:34. Reason: typo
Old 27 November 2019, 15:32   #59
deimos
It's coming back!
 
Join Date: Jul 2018
Location: comp.sys.amiga
Posts: 762
Quote:
Originally Posted by TCH View Post
This seems to be Windows-only. Am I missing something?
If you don't do Windows, and if your code can run and output two numbers (C vs. asm) for a valid comparison, then I don't mind doing it for you, as long as it's that easy.
Old 27 November 2019, 15:40   #60
deimos
It's coming back!
 
deimos's Avatar
 
Join Date: Jul 2018
Location: comp.sys.amiga
Posts: 762
Quote:
Originally Posted by a/b View Post
Here is the thing with C vs. asm. And it's not meant as a critique of Bartman, the original poster, or anyone else. It's good to have more people working with the Amiga, regardless of the language.
Take a simple 16->1 loop. You can write it in C in at least 16 different ways: for, while, do/while, predec, postdec, ...
It *does* matter. You cannot simply write C code and assume the compiler will produce optimal code. You kind of have to give it hints, as has been demonstrated in this thread and the other one (XOR fill optimization), such as adjusting the counter to lead the compiler to use dbf.
And experienced asm coders generally do that on the fly. They see what the 'optimal' code should look like in asm and they write similar C constructs. Less experienced people don't do that, and the output can be moderately slower, even if the compiler is pretty good.
Been there.

Even us sub-optimal people can rewrite our C code so that mostly decent assembly is produced. But we need constant reminders not to use complex indexes into arrays instead of incrementing pointers, not to use modulos instead of handling the wrap at the end of an array explicitly, etc.

But, it usually only matters for a very small percentage of code that makes up the hot spots, which is easy to forget.
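A minimal sketch of the two reminders above, assuming a byte buffer filled from a small repeating source. All names here are hypothetical and chosen only for the example.

```c
#include <stdint.h>

/* Index form: every store recomputes y * width + x, and the source
   position is wrapped with a modulo, which costs a division. */
static void fill_indexed(uint8_t *dst, uint16_t width, uint16_t height,
                         const uint8_t *ring, uint16_t ring_len)
{
    uint32_t pos = 0;
    for (uint16_t y = 0; y < height; y++) {
        for (uint16_t x = 0; x < width; x++) {
            dst[(uint32_t)y * width + x] = ring[pos % ring_len];
            pos++;
        }
    }
}

/* Pointer form: both pointers simply increment, and the wrap is an
   explicit compare against the end of the source array. */
static void fill_pointer(uint8_t *dst, uint16_t width, uint16_t height,
                         const uint8_t *ring, uint16_t ring_len)
{
    const uint8_t *src = ring;
    const uint8_t *ring_end = ring + ring_len;
    uint32_t count = (uint32_t)width * height;
    while (count--) {
        *dst++ = *src++;
        if (src == ring_end)
            src = ring;     /* wrap without a modulo */
    }
}
```

Both functions write the same bytes. The second one maps more directly onto post-increment addressing such as `(a0)+`, though how much a given compiler gains from it still needs measuring, and as noted it only pays off in the hot spots.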