English Amiga Board


Old 27 November 2019, 15:50   #61
ross
Defendit numerus
 
ross's Avatar
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 54
Posts: 4,488
https://franke.ms/cex/

EDIT: but it fails to compile many snippets in GCC 8 and 9 :|

Last edited by ross; 27 November 2019 at 16:21.
ross is offline  
Old 27 November 2019, 16:03   #62
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 56
Posts: 2,033
Quote:
Originally Posted by TCH View Post
@ross, @deimos:

This seems to be Windows only. Am I missing something?

@Don_Adan:
I think it's because you spared the stack operations of d2. As for d0, it can be negative as it is a coordinate.
If you can replace
Move.w d0,a4
Add.l A2,a4
with
Lea (a2,d0.w),a4
and check, that would be nice.
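For anyone who wants to convince themselves that the two sequences compute the same address, here is a rough C model. It assumes that MOVE.W into an address register (i.e. MOVEA.W) sign-extends the word to 32 bits, and that the d0.w index in LEA is sign-extended the same way. The function names are made up for illustration only.

```c
#include <stdint.h>

/* Sign-extend the low 16 bits of a 32-bit register value. */
static int32_t sign_extend_word(uint32_t reg)
{
    int32_t w = (int32_t)(reg & 0xFFFFu);
    if (w & 0x8000)
        w -= 0x10000;
    return w;
}

/* Model of: Move.w d0,a4 / Add.l a2,a4 */
static uint32_t addr_move_add(uint32_t a2, uint32_t d0)
{
    uint32_t a4 = (uint32_t)sign_extend_word(d0); /* Move.w d0,a4 */
    a4 += a2;                                     /* Add.l a2,a4  */
    return a4;
}

/* Model of: Lea (a2,d0.w),a4 */
static uint32_t addr_lea(uint32_t a2, uint32_t d0)
{
    return a2 + (uint32_t)sign_extend_word(d0);
}
```

Both models ignore the upper word of d0, so a negative coordinate in d0.w moves the address backwards in either version.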
Don_Adan is offline  
Old 27 November 2019, 16:32   #63
TCH
Newbie Amiga programmer
 
TCH's Avatar
 
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
Quote:
Originally Posted by deimos View Post
No, all the cool kids use Windows nowadays.
Poor cool kids.
Quote:
Originally Posted by ross View Post
No, you've to suffer
On the contrary, i have my alternatives.
Quote:
Originally Posted by deimos View Post
If you don't do Windows, and if your code can run and output two numbers (C vs asm) for valid comparison, then I don't mind doing it for you, as long as it's that easy.
It's not that easy, but it's not that hard. As I explained previously, I have a main loop of 64 iterations which calls the drawing function. Inside the function I switch between the approaches by commenting out the other one. Then I compile it and measure the running time with 'amtime'.
You can download the sources here:
http://oscomp.hu/depot/polygon1.c
http://oscomp.hu/depot/ClearBlock32.68k
http://oscomp.hu/depot/PolygonBitmapToPlanes32.3.68k
You may need to pass the -spaces option when assembling the sources, because otherwise VAsm chokes on whitespace.
Also, I added the USE_ASMPBTP definition as a macro switch, to be able to switch between the C and ASM copying routines. (Never mind the colour change; it is due to the reversed order of the pattern LUT and is irrelevant to the test.)
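A minimal sketch of what such a macro switch can look like. Only the USE_ASMPBTP name comes from the post; the routine names and signatures below are hypothetical placeholders and are not taken from polygon1.c.

```c
#include <stddef.h>

/* Hypothetical plain C copy routine (placeholder, not from polygon1.c). */
static void copy_block_c(unsigned char *dst, const unsigned char *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = src[i];
}

#ifdef USE_ASMPBTP
/* Hypothetical prototype of a routine supplied by a separately assembled object. */
void copy_block_asm(unsigned char *dst, const unsigned char *src, size_t n);
#define COPY_BLOCK copy_block_asm
#else
#define COPY_BLOCK copy_block_c
#endif
```

The calling code uses COPY_BLOCK everywhere; building with -DUSE_ASMPBTP selects the assembly path, and building without it selects the C routine, so nothing has to be commented out between runs.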
Quote:
Originally Posted by ross View Post
Thanks, I did not know that this site had a 68k version.
Quote:
Originally Posted by Don_Adan View Post
If you can replace
Move.w d0,a4
Add.l A2,a4
with
Lea (a2,d0.w),a4
and check, that would be nice.
Well, another 0.034% gain, thanks.
TCH is offline  
Old 27 November 2019, 16:40   #64
deimos
It's coming back!
 
deimos's Avatar
 
Join Date: Jul 2018
Location: comp.sys.amiga
Posts: 762
Quote:
Originally Posted by TCH View Post
It's not that easy, but it's not that hard. As I explained previously, I have a main loop of 64 iterations which calls the drawing function. Inside the function I switch between the approaches by commenting out the other one. Then I compile it and measure the running time with 'amtime'.
You're measuring the total elapsed time, including loading and initialising the executable?
deimos is offline  
Old 27 November 2019, 17:07   #65
deimos
It's coming back!
 
deimos's Avatar
 
Join Date: Jul 2018
Location: comp.sys.amiga
Posts: 762
Quote:
Originally Posted by TCH View Post
It's not that easy.
The way you're calling your assembly routines won't work under GCC 8.3.
deimos is offline  
Old 27 November 2019, 18:14   #66
TCH
Newbie Amiga programmer
 
TCH's Avatar
 
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
Quote:
Originally Posted by deimos View Post
You're measuring the total elapsed time, including loading and initialising the executable?
Yes. Since I run the function multiple times, the time taken by the rest of the main program is insignificant. But even if it were not, it is the same in both cases, so the difference between the two versions' total times gives the difference between the two algorithms.
Quote:
Originally Posted by deimos View Post
The way you're calling your assembly routines won't work under GCC 8.3.
Long live backward compatibility and versatility through optionality...

Is it even possible to do something similar in GCC 8?
TCH is offline  
Old 27 November 2019, 18:22   #67
deimos
It's coming back!
 
deimos's Avatar
 
Join Date: Jul 2018
Location: comp.sys.amiga
Posts: 762
Quote:
Originally Posted by TCH View Post
Yes. Since I run the function multiple times, the time taken by the rest of the main program is insignificant. But even if it were not, it is the same in both cases, so the difference between the two versions' total times gives the difference between the two algorithms.
Maybe, but possibly not in percentage terms, which is how we've been shown times so far. There may have been far more significant gains in the assembly language versions than we were led to believe.

Quote:
Originally Posted by TCH View Post
Long live backward compatibility and versatility through optionality...

Is it even possible to do something similar in GCC 8?
Yes, but in a more sophisticated and versatile way.

Edit:

If you feel like continuing with GCC 8 yourself, here's how I do my assembly integration - should get you far enough to google the rest.

Code:
void ClipAndFillPolygon2D(const UWORD n, const Point2D * polygon, const UWORD colour) {
    WaitBlit();

    {
        volatile register APTR * _a0 __asm("a0") = display.backBufferBitplanes;
        volatile register UWORD _d0 __asm("d0") = n;
        volatile register Point2D * _a1 __asm("a1") = polygon;
        volatile register UWORD _d1 __asm("d1") = colour;

        __asm volatile (
            "        jsr     _scanlineFill"
            : // no outputs
            : "r" (_d0), "r" (_d1), "r" (_a0), "r" (_a1) // inputs
            : "memory" // no registers clobbered, but the routine reads the polygon and writes the bitplanes
        );
    }
}

Last edited by deimos; 27 November 2019 at 18:29.
deimos is offline  
Old 27 November 2019, 18:30   #68
TCH
Newbie Amiga programmer
 
TCH's Avatar
 
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
Quote:
Originally Posted by deimos View Post
Maybe, but possibly not in percentage terms, which is how we've been shown times so far.
Yeah, that may be true.
Quote:
Originally Posted by deimos View Post
There may have been far more significant gains in the assembly language versions than we were led to believe.
Now, that cannot be true. What would impact the total time so much that there would be a significant delay in the whole runtime? And why would it occur only when I use the assembly routines here? Because if it occurred all the time, the differences would still be the same.
Besides, when I measure the time twice with the same setup, the difference can be positive or negative, but it is always in the microsecond range.
Quote:
Originally Posted by deimos View Post
Yes, but in a more sophisticated and versatile way.
Okay, but how? Also, if it is more versatile, then why can't it work this way?

Edit: Isn't this approach available in GCC 6 too? (Because I remember it from GCC 4.) Besides, this is not what I meant; I was talking about including a VAsm (or whatever) assembled object and calling its internal routines.
Also, I cannot continue with GCC 8 until it is ported to macOS, Linux, Solaris, or any BSD.

Last edited by TCH; 27 November 2019 at 18:38. Reason: deimos edited his post, before i finished my answer
TCH is offline  
Old 27 November 2019, 18:46   #69
deimos
It's coming back!
 
deimos's Avatar
 
Join Date: Jul 2018
Location: comp.sys.amiga
Posts: 762
I really think we should just measure the thing we're trying to measure.

It's useful to know other measurements, so we know where to put our effort, but they don't help to measure the improvement that's been made to the function people have been working on. It also doesn't help that no direct measurements are being taken, only measurements through an external 'amtime' tool.

Regarding calling assembly code, of course you can code an assembly routine to accept parameters C style, but if you're trying to optimise a heavily used function that probably isn't what you want to do. You want to give everything possible to gcc and let it inline and optimise - that's its job.
deimos is offline  
Old 27 November 2019, 18:47   #70
deimos
It's coming back!
 
deimos's Avatar
 
Join Date: Jul 2018
Location: comp.sys.amiga
Posts: 762
Quote:
Originally Posted by TCH View Post
Also, I cannot continue with GCC 8 until it is ported to macOS, Linux, Solaris, or any BSD.
Well then, I guess you're going to have a busy weekend.
deimos is offline  
Old 27 November 2019, 19:13   #71
TCH
Newbie Amiga programmer
 
TCH's Avatar
 
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
Quote:
Originally Posted by deimos View Post
I really think we should just measure the thing we're trying to measure.
Perhaps... but I think the precision it would yield is not really significant. The more iterations the main loop has, the less significant the rest of the code becomes. In running time, of course.
Quote:
Originally Posted by deimos View Post
It's useful to know other measurements, so we know where to put our effort, but they don't help to measure the improvement that's been made to the function people have been working on. It also doesn't help that no direct measurements are being taken, only measurements through an external 'amtime' tool.
If I have a program with one algorithm and the same program with that algorithm changed, then the difference between the running times will show whether there is an improvement or not. You cannot measure microsecond-level improvements at all, not even by direct measurement, unless you do it in a loop with a high iteration count. But then the difference will be almost identical to the result obtained with the external tool. Granted, your approach must be more precise, but unless the external measurement induces some kind of uncertainty which affects the measuring precision by several percent (which I think is not really possible), it is as acceptable as the direct approach.
Quote:
Originally Posted by deimos View Post
Regarding calling assembly code, of course you can code an assembly routine to accept parameters C style, but if you're trying to optimise a heavily used function that probably isn't what you want to do. You want to give everything possible to gcc and let it inline and optimise - that's its job.
You said it will not work. I took that literally. As for the rest, well, I only use C as a frame: I do the non-speed-critical parts in it, and I first write the algorithm for the speed-critical parts in it too, to have PoC code to test everything; when that's done, I rewrite it in assembly.
It may not be perfect or the most optimized, but I am a newbie regarding this kind of stuff; I am on the learning curve, and with time it will be more optimized.
Quote:
Originally Posted by deimos View Post
Well then, I guess you're going to have a busy weekend.
Yeah, i really have to tidy my flat already...

Or did you mean porting the GCC 8 crosscompiler to some UNIX? Then guess again, this is way beyond my expertise.
TCH is offline  
Old 27 November 2019, 19:30   #72
deimos
It's coming back!
 
deimos's Avatar
 
Join Date: Jul 2018
Location: comp.sys.amiga
Posts: 762
Quote:
Perhaps... but I think the precision it would yield is not really significant. The more iterations the main loop has, the less significant the rest of the code becomes. In running time, of course.
In one of your posts you reported a performance improvement of 0.034%. My issue is with numbers like that and with the way you've measured them. You haven't reported the actual improvement of the code that people have worked so hard on. If you disagree with me, fine. Not bothered.

Regarding your messy flat? I can't help you there, but maybe your time would be better spent installing a copy of Windows so that you have more spare time.
deimos is offline  
Old 27 November 2019, 19:39   #73
TCH
Newbie Amiga programmer
 
TCH's Avatar
 
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
Quote:
Originally Posted by deimos View Post
In one of your posts you reported a performance improvement of 0.034%. My issue is with numbers like that and with the way you've measured them. You haven't reported the actual improvement of the code that people have worked so hard on.
I've run the program with both algorithms several times, summed the total runtimes, divided by the number of calls and calculated the difference. The improvement was obvious. Besides, if you are worried about hiccups, those can interfere with the measurement in the direct approach too.
Quote:
Originally Posted by deimos View Post
Regarding your messy flat? I can't help you there, but maybe your time would be better spent installing a copy of Windows so that you have more spare time.
I do not own a copy of Windows, but even if I did, I'd rather tidy the flat.
TCH is offline  
Old 27 November 2019, 19:47   #74
deimos
It's coming back!
 
deimos's Avatar
 
Join Date: Jul 2018
Location: comp.sys.amiga
Posts: 762
Quote:
Originally Posted by TCH View Post
The improvement was obvious.
Then why not tell us the improvement in the actual routine that was optimised?

Quote:
Originally Posted by TCH View Post
I do not own a copy of Windows, but even if I did, I'd rather tidy the flat.
Everyone is allowed their own foibles, however misguided.
deimos is offline  
Old 27 November 2019, 19:56   #75
TCH
Newbie Amiga programmer
 
TCH's Avatar
 
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
Quote:
Originally Posted by deimos View Post
Then why not tell us the improvement in the actual routine that was optimised?
I did. You questioned it, without any actual counter-measurements. All the sources are here in the topic; you can measure anything you want and post the sources and results. If you're right and measurements done by external tools ('gnutime', 'amtime', whatever) lead to vast imprecision (up to measurable percentages), then I will change my measuring approach. I can be convinced.
Quote:
Originally Posted by deimos View Post
Everyone is allowed their own foibles, however misguided.
What do you mean? Flats need to be tidied sometimes.
TCH is offline  
Old 27 November 2019, 20:00   #76
deimos
It's coming back!
 
deimos's Avatar
 
Join Date: Jul 2018
Location: comp.sys.amiga
Posts: 762
Quote:
Originally Posted by TCH View Post
I did.
Where did you do that?
deimos is offline  
Old 27 November 2019, 20:01   #77
TCH
Newbie Amiga programmer
 
TCH's Avatar
 
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
What? Whenever somebody sent in a new iteration of the routine, I did say whether there was an improvement. Did you read the topic?
TCH is offline  
Old 27 November 2019, 20:05   #78
deimos
It's coming back!
 
deimos's Avatar
 
Join Date: Jul 2018
Location: comp.sys.amiga
Posts: 762
Quote:
Originally Posted by TCH View Post
What? Whenever somebody sent in a new iteration of the routine, I did say whether there was an improvement. Did you read the topic?
You've obviously chosen not to hear what I've said. You've said whether or not there were improvements, and given percentage measurements. But the percentage measurements were not of the thing that was changing, i.e. you gave only the time for the entire executable, taken with an outside tool, giving the impression that the assembly optimisations were giving minute 3-4% improvements, when in reality they were probably 5 or 10 times that. But we'll never know because you won't measure.
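To put numbers on that, a small sketch of the arithmetic. The figures in the note below are made up purely for illustration and are not measurements from this thread; the function names are mine.

```c
/* Percentage by which `after` is faster than `before`. */
static double percent_gain(double before, double after)
{
    return 100.0 * (before - after) / before;
}

/* Gain as seen by an external timer: the fixed overhead (loading,
   initialisation, everything outside the routine) is part of both totals. */
static double whole_program_gain(double overhead,
                                 double routine_before,
                                 double routine_after)
{
    return percent_gain(overhead + routine_before, overhead + routine_after);
}
```

With a hypothetical 90 seconds of fixed overhead and a routine that drops from 10 seconds to 5, whole_program_gain(90.0, 10.0, 5.0) reports 5% while percent_gain(10.0, 5.0) reports 50%. The absolute difference is the same 5 seconds either way; only the percentage is diluted by whatever else is in the total.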
deimos is offline  
Old 27 November 2019, 20:07   #79
TCH
Newbie Amiga programmer
 
TCH's Avatar
 
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
No, I just did not understand what you wanted from me. And I am still not sure. If by "change" you mean the change in the source, that was not done by me but by other forum members. And it is posted here, so you can see for yourself.

Edit: If the program with the old algorithm runs for 100 secs and with the new one it runs for 95, and everything else is unchanged, then it's a 5% speedup. What's your problem?

Last edited by TCH; 27 November 2019 at 20:13. Reason: deimos edited his post
TCH is offline  
Old 27 November 2019, 20:16   #80
deimos
It's coming back!
 
deimos's Avatar
 
Join Date: Jul 2018
Location: comp.sys.amiga
Posts: 762
Quote:
Originally Posted by TCH View Post
No, I just did not understand what you wanted from me. And I am still not sure. If by "change" you mean the change in the source, that was not done by me but by other forum members. And it is posted here, so you can see for yourself.
No, not the change in source code. The change in time spent in the affected parts of the code. But I'm sure you know that. If you really want to know how the code has improved, you'll measure around that - it's not hard; if your code is O/S friendly it's only a few lines.
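Something along these lines, as a rough sketch in portable C. It assumes the standard clock() is precise enough on your setup; on AmigaOS you would more likely read a timer.device counter instead, which is not shown here. routine_under_test is a hypothetical stand-in, not the real drawing function.

```c
#include <time.h>

/* Hypothetical stand-in for the routine being optimised. */
static volatile unsigned long sink;

static void routine_under_test(void)
{
    unsigned long i;
    for (i = 0; i < 1000UL; i++)
        sink += i;
}

/* Returns the CPU seconds spent in `iterations` calls of fn,
   or -1.0 if clock() is not available. Only the calls themselves
   sit between the two clock() reads. */
static double time_calls(void (*fn)(void), unsigned long iterations)
{
    clock_t start, end;
    unsigned long i;

    start = clock();
    if (start == (clock_t)-1)
        return -1.0;
    for (i = 0; i < iterations; i++)
        fn();
    end = clock();
    if (end == (clock_t)-1)
        return -1.0;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_calls(routine_under_test, 64) once per variant gives two numbers that cover only the routine, so the percentage is no longer mixed with loading and initialisation time.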
deimos is offline  
 


All times are GMT +2. The time now is 17:58.
