English Amiga Board


Old 10 January 2021, 18:01   #1
Jobbo
Registered User
 
Join Date: Jun 2020
Location: Druidia
Posts: 387
GCC optimization bug?

[Edit] It wasn't an optimization bug; it turned out to be a bad copper setup.

I've been using the awesome VSCode C/C++ environment by Bartman/Abyss.

However, now that my project has reached a certain size, I'm running into a weird bug that I can't easily attribute to any particular code, and it's very hard to determine exactly what is going on. The bug seems to be some optimization problem causing my copper driven blitter lines to flicker.

I've been able to narrow the problem down to the use of -fwhole-program. If I turn that option off everything seems fine. While trying to investigate why this might be I found this documentation:

-fwhole-program
Assume that the current compilation unit represents the whole program being compiled. All public functions and variables with the exception of main and those merged by attribute externally_visible become static functions and in effect are optimized more aggressively by interprocedural optimizers.
This option should not be used in combination with -flto. Instead relying on a linker plugin should provide safer and more precise information.

The documentation says you should not combine -fwhole-program with -flto, but that is what the sample makefile does, so I'm doing the same.

This is where I found the information about -fwhole-program

https://gcc.gnu.org/onlinedocs/gcc/O...e-Options.html

It later states: "If the program does not require any symbols to be exported, it is possible to combine -flto and -fwhole-program..." So it all seems quite murky.

Does anyone know more about gcc compiler options than I do and have some thoughts on what might be going on, or suggestions for further narrowing the problem down?

Last edited by Jobbo; 14 January 2021 at 05:44.
Old 10 January 2021, 18:20   #2
Ernst Blofeld
<optimized out>
 
Join Date: Sep 2020
Location: <optimized out>
Posts: 321
Quote:
Originally Posted by Jobbo View Post
copper driven blitter lines
Oh my, I can't even think how you would start debugging that.

The only thing I can think to suggest, based on what you've said about the optimisation flags you've tried toggling, is to check whether you need to add any memory barriers to force the optimiser to keep things in the order they're written: https://stackoverflow.com/questions/...0the%20barrier.
Old 10 January 2021, 18:40   #3
pipper
Registered User
 
Join Date: Jul 2017
Location: San Jose
Posts: 652
Are the hardware registers all marked volatile? If not, the compiler may omit seemingly superfluous reads and reorder writes to them.
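
As a rough illustration of the volatile point, here is a minimal sketch. The names (`FakeRegs`, `flash_background`, `wait_for_bit`) and the two-register layout are made up for this example, and an ordinary static variable stands in for the chip registers so the snippet builds on any machine. On real hardware the pointer would refer to the fixed register address instead.

```c
#include <stdint.h>

/* Hypothetical two-register block; a stand-in for real hardware registers. */
typedef struct {
    volatile uint16_t color0;  /* volatile: every write must actually be emitted */
    volatile uint16_t status;  /* volatile: every read must actually be performed */
} FakeRegs;

/* Ordinary RAM used as a fake register block (assumption for this sketch). */
static FakeRegs fake_regs;

/* Two writes to the same register. Without volatile the compiler is free
   to drop the first store because it is immediately overwritten. */
static void flash_background(FakeRegs *regs)
{
    regs->color0 = 0x0f00;
    regs->color0 = 0x0000;
}

/* Poll a status bit. Without volatile the read could be hoisted out of the
   loop, turning this into a check of a single stale value. */
static int wait_for_bit(const FakeRegs *regs, uint16_t mask, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        if (regs->status & mask)
            return 1;  /* bit seen */
    }
    return 0;          /* gave up */
}
```

The `max_polls` limit is only there so the sketch cannot hang when run against fake memory.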
Old 10 January 2021, 20:35   #4
Jobbo
Registered User
 
Join Date: Jun 2020
Location: Druidia
Posts: 387
Adding -fno-tree-loop-vectorize to the gcc options seems to have fixed it instead of removing -fwhole-program.

Not really sure what that does.
Old 10 January 2021, 21:30   #5
Ernst Blofeld
<optimized out>
 
Join Date: Sep 2020
Location: <optimized out>
Posts: 321
Quote:
Originally Posted by Jobbo View Post
Adding -fno-tree-loop-vectorize to the gcc options seems to have fixed it instead of removing -fwhole-program.

Not really sure what that does.
If disabling certain compiler optimisations makes your code work, it's possible your code actually has undefined behaviour, and the non-working version is just as valid a compilation. Typical culprits are signed integer overflow, or copies between overlapping memory areas that get optimised into memcpy calls. Or you may have disabled reordering that memory barriers would prevent, i.e. updates to your copper lists may happen out of order with your wait for vertical blank.
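
A small sketch of the overlapping-copy case, with a made-up helper name (`shift_up`). Calling memcpy on regions that overlap is undefined, while memmove is specified to handle overlap, so it is the safe choice whenever source and destination may share memory.

```c
#include <stddef.h>
#include <string.h>

/* Shift the first `count` bytes of `buf` up by `shift` positions inside the
   same buffer. Source and destination overlap whenever shift < count, so
   memmove is used; memcpy would be undefined behaviour here.
   The caller must make sure buf has room for count + shift bytes. */
static void shift_up(unsigned char *buf, size_t count, size_t shift)
{
    memmove(buf + shift, buf, count);
}
```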

Last edited by Ernst Blofeld; 11 January 2021 at 09:08.
Old 11 January 2021, 01:04   #6
Jobbo
Registered User
 
Join Date: Jun 2020
Location: Druidia
Posts: 387
I’ve been trying all sorts of things with no luck.

However, turning off warp mode, which was on during the init phase, also fixes the problem.

Not sure if that gives any clues?
Old 11 January 2021, 01:34   #7
Antiriad_UK
OCS forever!
 
Join Date: Mar 2019
Location: Birmingham, UK
Posts: 418
I’ve only had warp mode break something when floppy emulation speed was not set to 100%. Doesn’t seem likely in this case though.

The only other issue I’ve had with gcc is with a macro change of mulsw in Amigaklang and a recent change in Bartman's macro definition. If optimisation is on, something breaks in mulsw, but we’ve not worked out why. When I turn optimisations off, or roll back to the old macro, it works. Long shot, but if you're using that macro it may be worth a look.
Old 11 January 2021, 11:57   #8
hooverphonique
ex. demoscener "Bigmama"
 
Join Date: Jun 2012
Location: Fyn / Denmark
Posts: 1,624
I agree with Ernst that it sounds like you have a bug/undefined behavior, which can manifest itself differently depending on the optimization options (i.e. messing with them is probably not going to fix it).
Old 11 January 2021, 15:22   #9
Jobbo
Registered User
 
Join Date: Jun 2020
Location: Druidia
Posts: 387
I agree that it has that stink about it, I'm just not really sure what sort of undefined behavior might lead to problems?

So far I've been able to fix the problem in lots of different ways.
- Adjust optimization settings
- Remove debug asserts
- Remove debug KPrint calls
- Move the warpmode call to earlier

The problem persists, however, when I change my many inline asm functions to use C versions.

I'm not sure that any of this gives me anything to go on. What it might suggest is that the problem is in the initialization phase where I allocate and set up my buffers, because that is where the debug asserts and prints mostly live and that is what the warpmode is used to speed up.

The per-frame code is the inline asm, so it kind of makes sense that changing that code isn't helping if the problem is all in initialization.

I've tried examining the different .s files from a working and non-working compile and so far there's nothing that really jumps out as the problem.

Any further suggestions on how to narrow this down would be much appreciated.
Old 11 January 2021, 15:29   #10
Jobbo
Registered User
 
Join Date: Jun 2020
Location: Druidia
Posts: 387
I should also point out that I've tried turning on a whole bunch of extra warnings for undefined behavior and haven't found anything in my code.

Some of the gcc_support code causes warnings, so not sure if I am tripping something in there?
Old 11 January 2021, 15:37   #11
Ernst Blofeld
<optimized out>
 
Join Date: Sep 2020
Location: <optimized out>
Posts: 321
Have you tried adding memory barriers all over the place?
Old 11 January 2021, 15:45   #12
Jobbo
Registered User
 
Join Date: Jun 2020
Location: Druidia
Posts: 387
I have with no luck.
Old 11 January 2021, 15:47   #13
Jobbo
Registered User
 
Join Date: Jun 2020
Location: Druidia
Posts: 387
I'm using:

asm volatile("" ::: "memory");

as a memory barrier. Not sure if there is some other method?
Old 11 January 2021, 16:23   #14
Ernst Blofeld
<optimized out>
 
Join Date: Sep 2020
Location: <optimized out>
Posts: 321
Quote:
Originally Posted by Jobbo View Post
I'm using:

asm volatile("" ::: "memory");

as a memory barrier. Not sure if there is some other method?
That's the only method I know of:

Code:
#define BARRIER() asm volatile ("" ::: "memory")
Are all your inline assembly blocks declared with volatile too, and "memory" if they need that?
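
To make the two points concrete, here is a minimal sketch under some assumptions: `list_words`, `go_flag`, `publish_list` and `consume_list` are invented names, ordinary variables stand in for a copper list and a hardware register, and the asm body is left empty so it builds on any GCC target. `__asm__` is the same keyword as `asm`, spelled so it also compiles in strict ISO mode. Note that this is a compiler barrier only; it does not emit any CPU instruction.

```c
#include <stdint.h>

/* Compiler-only barrier: memory accesses may not be moved across it. */
#define BARRIER() __asm__ volatile ("" ::: "memory")

static uint16_t list_words[4];     /* hypothetical copper list (plain RAM)   */
static volatile uint16_t go_flag;  /* stand-in for a "start" hardware write  */

/* Fill the list, then signal. The stores to list_words are not volatile,
   so without the barrier the compiler could legally sink them below the
   volatile store to go_flag. */
static void publish_list(uint16_t a, uint16_t b)
{
    list_words[0] = a;
    list_words[1] = b;
    BARRIER();
    go_flag = 1;
}

/* An inline asm block that reads memory through a pointer should carry the
   "memory" clobber so pending stores are written out before it runs, and
   volatile so it is not deleted as having no outputs. */
static void consume_list(const uint16_t *list)
{
    __asm__ volatile ("" : : "r"(list) : "memory");
}
```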

It's all guesswork without seeing your code, or even the effect you're describing.

It could be something as simple as how you wait for vertical blank.
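
For reference, a hedged sketch of a polling wait. It assumes the usual layout of a 32-bit read covering VPOSR/VHPOSR (bit 16 = V8, bits 15..8 = V7..V0); the function names and the `max_polls` limit are invented for this example, and the register is passed in as a pointer so the code can be tried against fake memory.

```c
#include <stdint.h>

/* Extract the vertical beam position from a combined 32-bit position read.
   Assumed layout: bit 16 = V8, bits 15..8 = V7..V0. */
static unsigned beam_line(uint32_t vpos)
{
    return (unsigned)((vpos >> 8) & 0x1ff);
}

/* Busy-wait until the beam reaches `line`. The pointer is volatile so each
   iteration performs a fresh read; without it the compiler could read once
   and spin forever on the cached value. Returns 0 if max_polls ran out. */
static int wait_for_line(const volatile uint32_t *vposr, unsigned line,
                         long max_polls)
{
    while (max_polls-- > 0) {
        if (beam_line(*vposr) == line)
            return 1;
    }
    return 0;
}
```

A wait that only tests for equality can fall straight through twice in the same frame if the per-frame work finishes before the beam leaves that line, which is one way a wait like this can cause flicker.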

Last edited by Ernst Blofeld; 12 January 2021 at 10:06.
Old 11 January 2021, 16:35   #15
Jobbo
Registered User
 
Join Date: Jun 2020
Location: Druidia
Posts: 387
All but one of the inline assembly functions have been switched to their C versions, and the problem persists.

I think I will look into more compiler options that might help narrow down any undefined behavior. Stuff like signed overflow seems like it would be easy to trip without knowing.
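
As one way to rule out signed overflow in a suspect calculation, here is a small sketch using the GCC builtin `__builtin_add_overflow`; the wrapper name `checked_add` is made up. GCC also documents `-fwrapv` (define signed overflow as wrapping) and `-ftrapv` (trap on it), which can be toggled to see whether the symptom changes, though I have not checked how well they behave on the Amiga target.

```c
#include <limits.h>
#include <stdbool.h>

/* Add two ints without ever performing an overflowing signed addition.
   Returns true and stores the sum in *out on success, false if the
   mathematical result does not fit in an int. */
static bool checked_add(int a, int b, int *out)
{
    return !__builtin_add_overflow(a, b, out);
}
```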
Old 14 January 2021, 05:45   #16
Jobbo
Registered User
 
Join Date: Jun 2020
Location: Druidia
Posts: 387
I resolved this: it was a copper setup bug.

http://eab.abime.net/showthread.php?t=105358