English Amiga Board


Old 11 January 2017, 20:44   #21
bebbo
bye
 
Join Date: Jun 2016
Location: Some / Where
Posts: 680
Quote:
Originally Posted by nogginthenog View Post
Late home today due to a train drivers' strike in Southern England. Two hours to get home from work :-(

This seems to be the cause of the problems.
Error: Unknown pseudo-op: `.section'

Confirmed that m68k-amigaos-as does not support .section

Example:
Code:
paul@debian:cd ~/source/gcc6.2/amigaos-cross-toolchain/submodules/libnix/sources/stubs/stubs

/opt/m68k-amigaos/bin/m68k-amigaos-gcc __dtor_list__.c
/tmp/cctMPpYM.s: Assembler messages:
/tmp/cctMPpYM.s:3: Error: Unknown pseudo-op:  `.section'

/opt/m68k-amigaos/bin/m68k-amigaos-gcc -S __dtor_list__.c
cat __dtor_list__.s
#NO_APP
        .globl  ___DTOR_LIST__
        .section        .bss
        .align  2
___DTOR_LIST__:
        .skip 8
I will try to investigate some more tomorrow.
Thanks for the great work Bebbo, we appreciate it :-)

You need the patched binutils 2.14, which is included in the toolchain:
Code:
./toolchain-m68k --gcc 6 --binutils 2.14
... it's just a start.

Bebbo
Old 18 January 2017, 17:33   #22
alkis
Registered User
 
Join Date: Dec 2010
Location: Athens/Greece
Age: 53
Posts: 719
Can we see some generated code?

For reference here is a small function strcpy.c

Code:
char *strcpy(char *dst, const char *src) {
  char *ret = dst;
  while(*dst++=*src++)
    ;
  return ret;
}
With amiga-gcc cross-compiler 3.4.0
Code:
m68k-amigaos-gcc -fno-builtin -S strcpy.c -c -fverbose-asm -fomit-frame-pointer -O3
The resulting strcpy.s
Code:
_strcpy:
        movel sp@(4),a0 ;# dst, dst
        movel sp@(8),a1 ;# src, src
        movel a0,d1     ;# dst, ret
        .even
L2:
        moveb a1@+,d0   ;#, tmp36
        moveb d0,a0@+   ;# tmp36,
        jne L2  ;#
        movel d1,d0     ;# ret, <result>
        rts
And with gcc-m68k (not for amiga) 5.4.0
Code:
strcpy:
        move.l %a2,-(%sp)       |,
        move.l 8(%sp),%a0       | dst, dst
        move.l 12(%sp),%a2      | src, src
        move.l %a0,%a1  | dst, ivtmp.12
.L2:
        move.b (%a2)+,%d0       | MEM[base: src_8, offset: 4294967295B], D.1040
        move.b %d0,(%a1)+       | D.1040, MEM[base: _14, offset: 0B]
        jne .L2 |
        move.l %a0,%d0  |,
        move.l (%sp)+,%a2       |,
        rts
Can we see the relevant strcpy.s from 6.x please?
Thanks.
Old 18 January 2017, 20:09   #23
nogginthenog
Amigan
 
Join Date: Feb 2012
Location: London
Posts: 1,309
I don't have libnix built yet but GCC 6.2.1 produces this:

Code:
_strcpy:
        move.l a2,-(sp) |,
        move.l a0,d0    | dst, dst
        move.l a0,a2    | dst, ivtmp.11
.L2:
        move.b (a1)+,d1 | MEM[base: src_8, offset: 4294967295B], _9
        move.b d1,(a2)+ | _9, MEM[base: _14, offset: 0B]
        jne .L2 |
        move.l (sp)+,a2 |,
        rts
Old 18 January 2017, 21:15   #24
matthey
Banned
 
Join Date: Jan 2010
Location: Kansas
Posts: 1,284
Quote:
Originally Posted by nogginthenog View Post
I don't have libnix built yet but GCC 6.2.1 produces this:

Code:
_strcpy:
        move.l a2,-(sp) |,
        move.l a0,d0    | dst, dst
        move.l a0,a2    | dst, ivtmp.11
.L2:
        move.b (a1)+,d1 | MEM[base: src_8, offset: 4294967295B], _9
        move.b d1,(a2)+ | _9, MEM[base: _14, offset: 0B]
        jne .L2 |
        move.l (sp)+,a2 |,
        rts
This is not using the AT&T ABI so is not comparable to what alkis posted. What option did you use to get it to pass arguments in registers?

The code is flawed of course (as usual with GCC) and should probably be the following using register arguments as above with a0=dst a1=src (inlining would be better where possible).

Code:
_strcpy:
        move.l a0,d0
.L2:
        move.b (a1)+,(a0)+
        jne .L2
        rts
Old 19 January 2017, 13:15   #25
nogginthenog
Amigan
 
Join Date: Feb 2012
Location: London
Posts: 1,309
Quote:
Originally Posted by matthey View Post
This is not using the AT&T ABI so is not comparable to what alkis posted. What option did you use to get it to pass arguments in registers?
Same as alkis.
Old 19 January 2017, 17:49   #26
matthey
Banned
 
Join Date: Jan 2010
Location: Kansas
Posts: 1,284
Quote:
Originally Posted by matthey View Post
This is not using the AT&T ABI so is not comparable to what alkis posted. What option did you use to get it to pass arguments in registers?
Quote:
Originally Posted by nogginthenog View Post
Same as alkis.
Very strange that the AT&T (stack args) ABI would not be used, especially with the same options. The Amiga Geek Gadgets guys tried to introduce a more efficient ABI using registers in the unofficial Amiga GCC versions up to 3.4.0 (good idea but buggy in my experience as was the RTD support). The official GCC maintainers refused to consider anything but the AT&T ABI for the 68k, including customizable register arguments for functions, while they supported several ABIs for the x86/x86_64 for years (regparm and fastcall among a wide selection). No new ABI was introduced even for the ColdFire although the MOV3Q was introduced primarily for popping the arguments off the stack since RTD was removed (because RTD support was buggy in GCC?). I don't know if the failed support for the 68k/CF is because of incompetence or bias but probably some of both by the GCC developers and the 68k/CF developers.
Old 19 January 2017, 17:51   #27
alkis
Registered User
 
Join Date: Dec 2010
Location: Athens/Greece
Age: 53
Posts: 719
Quote:
Originally Posted by matthey View Post
...
The code is flawed of course (as usual with GCC) ..
Are you taking a stand that C compilers are flawed on the Amiga?

Or... what code does your favorite Amiga C compiler produce?
Old 19 January 2017, 22:37   #28
matthey
Banned
 
Join Date: Jan 2010
Location: Kansas
Posts: 1,284
Quote:
Originally Posted by alkis View Post
Are you taking a stand that C compilers are flawed on the Amiga?
The code generation is flawed in comparison to what modern compilers are capable of.

Quote:
Originally Posted by alkis View Post
Or... what code does your favorite Amiga C compiler produce?
Vbcc uses assembler inlines by default giving the following code for strcpy().

Code:
   move.l a0,d0
.l1:
   move.b (a1)+,(a0)+
   bne .l1
Beautiful! Perfect! Short sweet and inlined. However, if you compile the C code above you get the following.

Code:
strcpy:
   movem.l a2-a3,-(sp)
   movea.l ($10,sp),a3
   movea.l ($c,sp),a2
   move.l a2,d0
   movea.l a3,a1
   addq.l #1,a3
   movea.l a2,a0
   addq.l #1,a2
   move.b (a1),(a0)
   beq.b .l2
.l1:
   movea.l a3,a1
   addq.l #1,a3
   movea.l a2,a0
   addq.l #1,a2
   move.b (a1),(a0)
   bne.b .l1
.l2:
   movem.l (sp)+,a2-a3
   rts
Doh! Epic fail! Dr. Barthelmann and his vbcc compiler are capable of more but realistically, as I told Frank Wille, it is not going to happen without support for the 68k and Amiga. Who wants to waste time on a dead platform for a handful of people?
Old 20 January 2017, 01:34   #29
Thorham
Computer Nerd
 
 
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,749
Wow, that's bad. I don't think even old compilers like SAS/C produce that kind of code. Just wow

I certainly know which compiler to avoid now. What a shame

Last edited by Thorham; 20 January 2017 at 01:59.
Old 20 January 2017, 02:16   #30
matthey
Banned
 
Join Date: Jan 2010
Location: Kansas
Posts: 1,284
Quote:
Originally Posted by Thorham View Post
Wow, that's bad. I don't think even old compilers like SAS/C produce that kind of code. Just wow

I certainly know which compiler to avoid now. What a shame
It is embarrassing and frustrating. Vbcc can generate some of the best code and some of the worst. I don't know what happened here. It was generating better code in an earlier version. It is capable of using advanced instructions and addressing modes like move.b (a1)+,(a0)+ but then so is GCC. Maybe there is a fix already for some of the problems but more problems seem to creep back in with both vbcc and GCC. Too much complexity and not enough developers any more. Sad.
Old 20 January 2017, 02:36   #31
Thorham
Computer Nerd
 
 
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,749
At least that means SAS/C isn't useless

Here's what SAS/C produces, for those who are interested.
Code:
              SECTION      text,CODE
__code:
@strcopy:
              MOVE.L         A2,-(A7)                 ;2f0a 
___strcopy__1:
              MOVE.L         A1,A2                    ;2449 
___strcopy__2:
              MOVE.B         (A0)+,D0                 ;1018 
              MOVE.B         D0,(A1)+                 ;12c0 
              BNE.B          ___strcopy__2            ;66fa 
___strcopy__3:
              MOVE.L         A2,D0                    ;200a 
___strcopy__4:
              MOVE.L         (A7)+,A2                 ;245f 
              RTS                                     ;4e75 
__const:
__strings:
              XDEF           @strcopy
              END
Old 20 January 2017, 08:28   #32
alkis
Registered User
 
Join Date: Dec 2010
Location: Athens/Greece
Age: 53
Posts: 719
Seems like the gcc-3.4.0 produces the most efficient code then, as it avoids saving/restoring a register to stack.
Old 20 January 2017, 09:45   #33
Thorham
Computer Nerd
 
 
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,749
Quote:
Originally Posted by alkis View Post
Seems like the gcc-3.4.0 produces the most efficient code then, as it avoids saving/restoring a register to stack.
For code this trivial compilers should produce:
Code:
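strcpy
    move.l  a0,d0           ; a0 = dst, which is also the return value
.loop
    move.b  (a1)+,(a0)+     ; a1 = src, copy one byte and set the flags
    bne.s   .loop           ; repeat until the terminating zero is copied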

    rts
Or better:
Code:
strcpy
    move.l  a0,d0           ; a0 = dst, which is also the return value
.loop
    move.b  (a1)+,(a0)+     ; a1 = src
    beq.s   .end
    move.b  (a1)+,(a0)+
    beq.s   .end
    move.b  (a1)+,(a0)+
    beq.s   .end
    move.b  (a1)+,(a0)+
    bne.s   .loop
.end
    rts
Old 20 January 2017, 10:50   #34
alkis
Registered User
 
Join Date: Dec 2010
Location: Athens/Greece
Age: 53
Posts: 719
Well, with

Code:
m68k-amigaos-gcc -S -O3 strcpy.c -fomit-frame-pointer -funroll-all-loops
it unrolls it 8 times

Code:
_strcpy:
        movel sp@(4),a1
        movel sp@(8),a0
        movel a1,d1
L2:
        moveb a0@+,d0
        moveb d0,a1@+
        jeq L12
        moveb a0@+,d0
        moveb d0,a1@+
        jeq L12
        moveb a0@+,d0
        moveb d0,a1@+
        jeq L12
        moveb a0@+,d0
        moveb d0,a1@+
        jeq L12
        moveb a0@+,d0
        moveb d0,a1@+
        jeq L12
        moveb a0@+,d0
        moveb d0,a1@+
        jeq L12
        moveb a0@+,d0
        moveb d0,a1@+
        jeq L12
        moveb a0@+,d0
        moveb d0,a1@+
        jne L2
        .even
L12:
        movel d1,d0
        rts
I don't understand why the peephole optimiser doesn't convert
Code:
        moveb a0@+,d0
        moveb d0,a1@+
to moveb a0@+,a1@+ since d0 is dead afterwards, but hey...
Old 20 January 2017, 11:20   #35
wawa
Registered User
 
Join Date: Aug 2007
Location: berlin/germany
Posts: 1,054
Not that I understand much, but for the sake of it, here are the results from both aros68k compilers I'm currently using.

GCC 6.1.0:
Code:
strcpy:
        move.l 4(%sp),%d0       | dst, dst
        move.l 8(%sp),%a1       | src, src
        move.l %d0,%a0  | dst, ivtmp.11
.L2:
        move.b (%a1)+,%d1       | MEM[base: src_8, offset: 4294967295B], _9
        move.b %d1,(%a0)+       | _9, MEM[base: _14, offset: 0B]
        jne .L2 |
        rts
        .size strcpy, .-strcpy
        .ident "GCC: (GNU) 6.1.0"
GCC 4.6.4:
Code:
strcpy:
        move.l 4(%sp),%d0       | dst, dst
        move.l 8(%sp),%a1       | src, ivtmp.9
        move.l %d0,%a0  | dst, dst
.L2:
        move.b (%a1)+,%d1       | MEM[base: D.779_19, offset: 0B], D.759
        move.b %d1,(%a0)+       | D.759, MEM[base: dst_1, offset: 0B]
        jne .L2 |
        rts
        .size strcpy, .-strcpy
        .ident "GCC: (GNU) 4.6.4"
Old 20 January 2017, 14:13   #36
hooverphonique
ex. demoscener "Bigmama"
 
Join Date: Jun 2012
Location: Fyn / Denmark
Posts: 1,624
Quote:
Originally Posted by Thorham View Post
For code this trivial compilers should produce:
Code:
strcpy
    move.l  a0,d0
what is
Code:
strcpy
    move.l  a0,d0
for?
Old 20 January 2017, 17:15   #37
idrougge
Registered User
 
Join Date: Sep 2007
Location: Stockholm
Posts: 4,332
Quote:
Originally Posted by Thorham View Post
Code:
strcpy
    move.l  a0,d0           ; a0 = dst, which is also the return value
.loop
    move.b  (a1)+,(a0)+     ; a1 = src
    beq.s   .end
    move.b  (a1)+,(a0)+
    beq.s   .end
    move.b  (a1)+,(a0)+
    beq.s   .end
    move.b  (a1)+,(a0)+
    bne.s   .loop
.end
    rts
What is gained by unrolling one loop into four loops?
Old 20 January 2017, 17:20   #38
Samurai_Crow
Total Chaos forever!
 
 
Join Date: Aug 2007
Location: Waterville, MN, USA
Age: 49
Posts: 2,186
Quote:
Originally Posted by idrougge View Post
What is gained by unrolling one loop into four loops?
Duff's device is not cache friendly and offers no speed benefit in this case.
Old 20 January 2017, 18:01   #39
Thorham
Computer Nerd
 
 
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,749
Quote:
Originally Posted by alkis View Post
Well, with m68k-amigaos-gcc -S -O3 strcpy.c -fomit-frame-pointer -funroll-all-loops it unrolls it 8 times
That's something, at least.

Quote:
Originally Posted by alkis View Post
I don't understand why the peephole optimiser doesn't convert
Code:
        moveb a0@+,d0
        moveb d0,a1@+
to moveb a0@+,a1@+ since d0 is dead afterwards, but hey...
Yeah, that's pretty shitty and makes no sense.

Quote:
Originally Posted by hooverphonique View Post
what is
Code:
strcpy
    move.l  a0,d0
for?
D0 is the return value.

Quote:
Originally Posted by idrougge View Post
What is gained by unrolling one loop into four loops?
On 68020s and 68030s taken branches are 8 cycles, not taken byte branches are 4 cycles.
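
For anyone who prefers to see the four-way unrolling in C, here is a rough sketch. The name strcpy_unrolled4 is made up for this post, and there is no guarantee that any particular compiler turns it into the hand-written assembly. The point is only the branch structure: three of every four terminator checks fall through (not taken) until the end of the string is reached, and only the last check in the body jumps backwards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of any library: strcpy with the copy
 * loop unrolled four times by hand. Each early exit is a forward branch
 * that is taken at most once, when the terminator has been copied. */
char *strcpy_unrolled4(char *dst, const char *src)
{
    char *ret = dst;                 /* strcpy returns the original dst */

    assert(dst != NULL && src != NULL);
    for (;;) {
        if ((*dst++ = *src++) == '\0') break;
        if ((*dst++ = *src++) == '\0') break;
        if ((*dst++ = *src++) == '\0') break;
        if ((*dst++ = *src++) == '\0') break;
    }
    return ret;
}
```

Whether this pays off depends on the CPU and on typical string lengths, so it is worth measuring rather than assuming.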
Old 20 January 2017, 18:24   #40
matthey
Banned
 
Join Date: Jan 2010
Location: Kansas
Posts: 1,284
Quote:
Originally Posted by alkis View Post
Well, with

Code:
m68k-amigaos-gcc -S -O3 strcpy.c -fomit-frame-pointer -funroll-all-loops
it unrolls it 8 times

Code:
_strcpy:
        movel sp@(4),a1
        movel sp@(8),a0
        movel a1,d1
L2:
        moveb a0@+,d0
        moveb d0,a1@+
        jeq L12
        moveb a0@+,d0
        moveb d0,a1@+
        jeq L12
        moveb a0@+,d0
        moveb d0,a1@+
        jeq L12
        moveb a0@+,d0
        moveb d0,a1@+
        jeq L12
        moveb a0@+,d0
        moveb d0,a1@+
        jeq L12
        moveb a0@+,d0
        moveb d0,a1@+
        jeq L12
        moveb a0@+,d0
        moveb d0,a1@+
        jeq L12
        moveb a0@+,d0
        moveb d0,a1@+
        jne L2
        .even
L12:
        movel d1,d0
        rts
I expect it is generally better to inline strcpy() where it is called than unroll strcpy() after a costly push/bsr/rts/pop (assuming strings are not excessively long). This is what vbcc does with the assembler inlines.

Quote:
Originally Posted by alkis View Post
I don't understand why the peephole optimiser doesn't convert
Code:
        moveb a0@+,d0
        moveb d0,a1@+
to moveb a0@+,a1@+ since d0 is dead afterwards, but hey...
That is not a peephole optimization. Further analysis outside of this code snippet is needed to conclude the optimization can be done (it is not guaranteed to be an equivalent replacement). Two instruction input peephole optimizations are aggressive too. Vasm only looks at one instruction max (but can output multiple instructions) and it is currently the best 68k peephole optimizing assembler.
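
To make the "further analysis" point concrete, here is a toy sketch in C. It is not how vasm or GCC work internally; the ToyInsn model and both function names are invented for this post. It assumes a single straight-line block with no branches and assumes no register is live at the end of the block, which is a big simplification. It shows that fusing a load and a store through a register is only safe once the code after the pair has been scanned to prove the register is dead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction model, invented for this sketch: each instruction may
 * read one data register, write one data register, or neither.
 * -1 means "no register". */
typedef struct {
    int reads;   /* data register read, or -1 */
    int writes;  /* data register written, or -1 */
} ToyInsn;

/* Returns true if reg is dead after position pos: it is overwritten, or
 * the block ends, before anything reads it.
 * Assumption: code[] is one basic block and nothing is live on exit. */
bool reg_dead_after(const ToyInsn *code, size_t n, size_t pos, int reg)
{
    assert(code != NULL && pos < n);
    for (size_t i = pos + 1; i < n; i++) {
        if (code[i].reads == reg)
            return false;   /* the value is still needed later */
        if (code[i].writes == reg)
            return true;    /* overwritten before any read */
    }
    return true;
}

/* A load into a register at position i followed by a store from the same
 * register at i + 1 may be fused into one memory-to-memory move only if
 * that register is dead after the store. */
bool can_fuse_load_store(const ToyInsn *code, size_t n, size_t i)
{
    assert(code != NULL);
    if (i + 1 >= n)
        return false;
    int reg = code[i].writes;
    if (reg < 0 || code[i + 1].reads != reg)
        return false;
    return reg_dead_after(code, n, i + 1, reg);
}
```

In real code the scan has to follow branches and calls as well, which is why this goes beyond what a one- or two-instruction peephole window can decide.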

@wawa
The aros68k GCC compiler is doing a good job here. It should be able to merge 2 lines inside the loop but overall good.

Quote:
Originally Posted by hooverphonique View Post
what is
Code:
strcpy
    move.l  a0,d0
for?
strcpy() returns a pointer to the destination. The return is rarely used because it is already available and doesn't change but that is how it is defined.

Code:
char *strcpy(char *dst, const char *src);
It is much more useful to return a pointer to the end of the string like stpcpy().

Code:
char *stpcpy(char *dst, const char *src);

char *stpcpy(char *dst, const char *src)
{
    while ((*dst++ = *src++) != '\0')
        ;
    return dst - 1;
}
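stpcpy() first appeared in Lattice C for the Amiga (the predecessor of SAS/C), was later picked up by the GNU and BSD C libraries, and was standardized in POSIX.1-2008, but it is not part of C99 or C11.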
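
A small sketch of why the end pointer is handy: joining several strings without rescanning what has already been copied. The names copy_end and join3 are made up for this example, and a local helper is used instead of the libc stpcpy() so the sketch does not depend on the C library providing it. The caller is assumed to pass a buffer that is large enough.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for stpcpy(): copies src to dst and returns a
 * pointer to the terminating '\0' it wrote, not to the start of dst. */
char *copy_end(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    while ((*dst = *src++) != '\0')
        dst++;
    return dst;
}

/* Joins three strings into out. Each call continues where the previous
 * one stopped, so nothing is rescanned the way repeated strcat() would.
 * Assumption: out has room for all three strings plus the terminator. */
char *join3(char *out, const char *a, const char *b, const char *c)
{
    char *end = out;

    end = copy_end(end, a);
    end = copy_end(end, b);
    end = copy_end(end, c);
    return end;   /* end - out is the length of the joined string */
}
```

With strcpy() the same job needs either strlen() after every copy or strcat(), both of which walk the destination string again.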

Last edited by matthey; 20 January 2017 at 18:34.
 

