English Amiga Board - GCC 6.2 toolchain for AmigaOS 3

Page 2 of 78

bebbo

11 January 2017 20:44

Quote:

Originally Posted by nogginthenog (Post 1133858)

Late home today due to a train drivers' strike in Southern England. 2 hours to get home from work :-(

This seems to be the cause of the problems.
Error: Unknown pseudo-op: `.section'

Confirmed that m68k-amigaos-as does not support .section

Example:

Code:

paul@debian:cd ~/source/gcc6.2/amigaos-cross-toolchain/submodules/libnix/sources/stubs/stubs



/opt/m68k-amigaos/bin/m68k-amigaos-gcc __dtor_list__.c

/tmp/cctMPpYM.s: Assembler messages:

/tmp/cctMPpYM.s:3: Error: Unknown pseudo-op:  `.section'



/opt/m68k-amigaos/bin/m68k-amigaos-gcc -S __dtor_list__.c

cat __dtor_list__.s

#NO_APP

        .globl  ___DTOR_LIST__

        .section        .bss

        .align  2

___DTOR_LIST__:

        .skip 8

I will try to investigate some more tomorrow.
Thanks for the great work Bebbo, we appreciate it :-)

You need the patched binutils 2.14, which is part of the toolchain:

Code:

./toolchain-m68k --gcc 6 --binutils 2.14

... it's just a start.

Bebbo

alkis

18 January 2017 17:33

Can we see some generated code?

For reference here is a small function strcpy.c

Code:

char *strcpy(char *dst, const char *src) {

  char *ret = dst;

  while(*dst++=*src++)

    ;

  return ret;

}

With amiga-gcc cross-compiler 3.4.0

Code:

m68k-amigaos-gcc -fno-builtin -S strcpy.c -c -fverbose-asm -fomit-frame-pointer -O3

The resulting strcpy.s

Code:

_strcpy:

        movel sp@(4),a0 ;# dst, dst

        movel sp@(8),a1 ;# src, src

        movel a0,d1     ;# dst, ret

        .even

L2:

        moveb a1@+,d0   ;#, tmp36

        moveb d0,a0@+   ;# tmp36,

        jne L2  ;#

        movel d1,d0     ;# ret, <result>

        rts

And with gcc-m68k (not for amiga) 5.4.0

Code:

strcpy:

        move.l %a2,-(%sp)       |,

        move.l 8(%sp),%a0       | dst, dst

        move.l 12(%sp),%a2      | src, src

        move.l %a0,%a1  | dst, ivtmp.12

.L2:

        move.b (%a2)+,%d0       | MEM[base: src_8, offset: 4294967295B], D.1040

        move.b %d0,(%a1)+       | D.1040, MEM[base: _14, offset: 0B]

        jne .L2 |

        move.l %a0,%d0  |,

        move.l (%sp)+,%a2       |,

        rts

Can we see the relevant strcpy.s from 6.x please?
Thanks.

nogginthenog

18 January 2017 20:09

I don't have libnix built yet but GCC 6.2.1 produces this:

Code:

_strcpy:

        move.l a2,-(sp) |,

        move.l a0,d0    | dst, dst

        move.l a0,a2    | dst, ivtmp.11

.L2:

        move.b (a1)+,d1 | MEM[base: src_8, offset: 4294967295B], _9

        move.b d1,(a2)+ | _9, MEM[base: _14, offset: 0B]

        jne .L2 |

        move.l (sp)+,a2 |,

        rts

matthey

18 January 2017 21:15

Quote:

Originally Posted by nogginthenog (Post 1135693)

I don't have libnix built yet but GCC 6.2.1 produces this:

Code:

_strcpy:

        move.l a2,-(sp) |,

        move.l a0,d0    | dst, dst

        move.l a0,a2    | dst, ivtmp.11

.L2:

        move.b (a1)+,d1 | MEM[base: src_8, offset: 4294967295B], _9

        move.b d1,(a2)+ | _9, MEM[base: _14, offset: 0B]

        jne .L2 |

        move.l (sp)+,a2 |,

        rts

This is not using the AT&T ABI so is not comparable to what alkis posted. What option did you use to get it to pass arguments in registers?

The code is flawed of course (as usual with GCC) and should probably be the following using register arguments as above with a0=dst a1=src (inlining would be better where possible).

Code:

_strcpy:

        move.l a0,d0

.L2:

        move.b (a1)+,(a0)+

        jne .L2

        rts

nogginthenog

19 January 2017 13:15

Quote:

Originally Posted by matthey (Post 1135714)

This is not using the AT&T ABI so is not comparable to what alkis posted. What option did you use to get it to pass arguments in registers?

Same as alkis.

matthey

19 January 2017 17:49

Quote:

Originally Posted by matthey (Post 1135714)

This is not using the AT&T ABI so is not comparable to what alkis posted. What option did you use to get it to pass arguments in registers?

Quote:

Originally Posted by nogginthenog (Post 1135851)

Same as alkis.

Very strange that the AT&T (stack args) ABI would not be used, especially with the same options. The Amiga Geek Gadgets guys tried to introduce a more efficient ABI using registers in the unofficial Amiga GCC versions up to 3.4.0 (good idea but buggy in my experience as was the RTD support). The official GCC maintainers refused to consider anything but the AT&T ABI for the 68k, including customizable register arguments for functions, while they supported several ABIs for the x86/x86_64 for years (regparm and fastcall among a wide selection). No new ABI was introduced even for the ColdFire although the MOV3Q was introduced primarily for popping the arguments off the stack since RTD was removed (because RTD support was buggy in GCC?). I don't know if the failed support for the 68k/CF is because of incompetence or bias but probably some of both by the GCC developers and the 68k/CF developers.

alkis

19 January 2017 17:51

Quote:

Originally Posted by matthey (Post 1135714)

...
The code is flawed of course (as usual with GCC) ..

Are you taking the position that C compilers on the Amiga are flawed? :)

Or... what code does your favorite Amiga C compiler produce?

matthey

19 January 2017 22:37

Quote:

Originally Posted by alkis (Post 1135942)

Are you taking the position that C compilers on the Amiga are flawed? :)

The code generation is flawed in comparison to what modern compilers are capable of.

Quote:

Originally Posted by alkis (Post 1135942)

Or... what code does your favorite Amiga C compiler produce?

Vbcc uses assembler inlines by default giving the following code for strcpy().

Code:

   move.l a0,d0

.l1:

   move.b (a1)+,(a0)+

   bne .l1

Beautiful! Perfect! Short sweet and inlined. However, if you compile the C code above you get the following.

Code:

strcpy:

   movem.l a2-a3,-(sp)

   movea.l ($10,sp),a3

   movea.l ($c,sp),a2

   move.l a2,d0

   movea.l a3,a1

   addq.l #1,a3

   movea.l a2,a0

   addq.l #1,a2

   move.b (a1),(a0)

   beq.b .l2

.l1:

   movea.l a3,a1

   addq.l #1,a3

   movea.l a2,a0

   addq.l #1,a2

   move.b (a1),(a0)

   bne.b .l1

.l2:

   movem.l (sp)+,a2-a3

   rts

Doh! Epic fail! Dr. Barthelmann and his vbcc compiler are capable of more but realistically, as I told Frank Wille, it is not going to happen without support for the 68k and Amiga. Who wants to waste time on a dead platform for a handful of people?

Thorham

20 January 2017 01:34

Wow, that's bad. I don't think even old compilers like SAS/C produce that kind of code. Just wow :rolleyes

I certainly know which compiler to avoid now. What a shame :(

matthey

20 January 2017 02:16

Quote:

Originally Posted by Thorham (Post 1136024)

Wow, that's bad. I don't think even old compilers like SAS/C produce that kind of code. Just wow :rolleyes

I certainly know which compiler to avoid now. What a shame :(

It is embarrassing and frustrating. Vbcc can generate some of the best code and some of the worst. I don't know what happened here. It was generating better code in an earlier version. It is capable of using advanced instructions and addressing modes like move.b (a1)+,(a0)+ but then so is GCC. Maybe there is a fix already for some of the problems but more problems seem to creep back in with both vbcc and GCC. Too much complexity and not enough developers any more. Sad :sad.

Thorham

20 January 2017 02:36

At least that means SAS/C isn't useless ;)

Here's what SAS/C produces, for those who are interested.

Code:

              SECTION      text,CODE

__code:

@strcopy:

              MOVE.L         A2,-(A7)                 ;2f0a 

___strcopy__1:

              MOVE.L         A1,A2                    ;2449 

___strcopy__2:

              MOVE.B         (A0)+,D0                 ;1018 

              MOVE.B         D0,(A1)+                 ;12c0 

              BNE.B          ___strcopy__2            ;66fa 

___strcopy__3:

              MOVE.L         A2,D0                    ;200a 

___strcopy__4:

              MOVE.L         (A7)+,A2                 ;245f 

              RTS                                     ;4e75 

__const:

__strings:

              XDEF           @strcopy

              END

alkis

20 January 2017 08:28

Seems like the gcc-3.4.0 produces the most efficient code then, as it avoids saving/restoring a register to stack.

Thorham

20 January 2017 09:45

Quote:

Originally Posted by alkis (Post 1136057)

Seems like the gcc-3.4.0 produces the most efficient code then, as it avoids saving/restoring a register to stack.

For code this trivial compilers should produce:

Code:

strcpy

    move.l  a0,d0

.loop

    move.b  (a1)+,(a0)+

    bne.s   .loop



    rts

Or better:

Code:

strcpy

    move.l  a0,d0

.loop

    move.b  (a1)+,(a0)+

    beq.s   .end

    move.b  (a1)+,(a0)+

    beq.s   .end

    move.b  (a1)+,(a0)+

    beq.s   .end

    move.b  (a1)+,(a0)+

    bne.s   .loop

.end

    rts

alkis

20 January 2017 10:50

Well, with

Code:

m68k-amigaos-gcc -S -O3 strcpy.c -fomit-frame-pointer -funroll-all-loops

it unrolls it 8 times

Code:

_strcpy:

        movel sp@(4),a1

        movel sp@(8),a0

        movel a1,d1

L2:

        moveb a0@+,d0

        moveb d0,a1@+

        jeq L12

        moveb a0@+,d0

        moveb d0,a1@+

        jeq L12

        moveb a0@+,d0

        moveb d0,a1@+

        jeq L12

        moveb a0@+,d0

        moveb d0,a1@+

        jeq L12

        moveb a0@+,d0

        moveb d0,a1@+

        jeq L12

        moveb a0@+,d0

        moveb d0,a1@+

        jeq L12

        moveb a0@+,d0

        moveb d0,a1@+

        jeq L12

        moveb a0@+,d0

        moveb d0,a1@+

        jne L2

        .even

L12:

        movel d1,d0

        rts

I don't understand why the peephole optimiser doesn't convert

Code:

        moveb a0@+,d0

        moveb d0,a1@+

to moveb a0@+,a1@+, since d0 is dead afterwards, but hey...

wawa

20 January 2017 11:20

Not that I understand much, but for the sake of it, here are the results from both aros68k compilers I'm currently using:

Code:

strcpy:

        move.l 4(%sp),%d0       | dst, dst

        move.l 8(%sp),%a1       | src, src

        move.l %d0,%a0  | dst, ivtmp.11

.L2:

        move.b (%a1)+,%d1       | MEM[base: src_8, offset: 4294967295B], _9

        move.b %d1,(%a0)+       | _9, MEM[base: _14, offset: 0B]

        jne .L2 |

        rts

        .size strcpy, .-strcpy

        .ident "GCC: (GNU) 6.1.0"

Code:

strcpy:

        move.l 4(%sp),%d0       | dst, dst

        move.l 8(%sp),%a1       | src, ivtmp.9

        move.l %d0,%a0  | dst, dst

.L2:

        move.b (%a1)+,%d1       | MEM[base: D.779_19, offset: 0B], D.759

        move.b %d1,(%a0)+       | D.759, MEM[base: dst_1, offset: 0B]

        jne .L2 |

        rts

        .size strcpy, .-strcpy

        .ident "GCC: (GNU) 4.6.4"

hooverphonique

20 January 2017 14:13

Quote:

Originally Posted by Thorham (Post 1136065)

For code this trivial compilers should produce:

Code:

strcpy

    move.l  a0,d0

what is

Code:

strcpy

    move.l  a0,d0

for?

idrougge

20 January 2017 17:15

Quote:

Originally Posted by Thorham (Post 1136065)

Code:

strcpy

    move.l  a0,d0

.loop

    move.b  (a1)+,(a0)+

    beq.s   .end

    move.b  (a1)+,(a0)+

    beq.s   .end

    move.b  (a1)+,(a0)+

    beq.s   .end

    move.b  (a1)+,(a0)+

    bne.s   .loop

.end

    rts

What is gained by unrolling one loop into four loops?

Samurai_Crow

20 January 2017 17:20

Quote:

Originally Posted by idrougge (Post 1136139)

What is gained by unrolling one loop into four loops?

Duff's device is not cache friendly and offers no speed benefit in this case.

Thorham

20 January 2017 18:01

Quote:

Originally Posted by alkis (Post 1136075)

Well, with m68k-amigaos-gcc -S -O3 strcpy.c -fomit-frame-pointer -funroll-all-loops it unrolls it 8 times

That's something, at least.

Quote:

Originally Posted by alkis (Post 1136075)

I don't understand why the peephole optimiser doesn't convert

Code:

        moveb a0@+,d0

        moveb d0,a1@+

to moveb a0@+,a1@+, since d0 is dead afterwards, but hey...

Yeah, that's pretty shitty and makes no sense.

Quote:

Originally Posted by hooverphonique (Post 1136104)

what is

Code:

strcpy

    move.l  a0,d0

for?

D0 is the return value.

Quote:

Originally Posted by idrougge (Post 1136139)

What is gained by unrolling one loop into four loops?

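On 68020s and 68030s, a taken branch costs 8 cycles, while a byte branch that is not taken costs 4 cycles, so the unrolled loop replaces three out of every four taken branches with cheaper not-taken ones.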

matthey

20 January 2017 18:24

Quote:

Originally Posted by alkis (Post 1136075)

Well, with

Code:

m68k-amigaos-gcc -S -O3 strcpy.c -fomit-frame-pointer -funroll-all-loops

it unrolls it 8 times

Code:

_strcpy:

        movel sp@(4),a1

        movel sp@(8),a0

        movel a1,d1

L2:

        moveb a0@+,d0

        moveb d0,a1@+

        jeq L12

        moveb a0@+,d0

        moveb d0,a1@+

        jeq L12

        moveb a0@+,d0

        moveb d0,a1@+

        jeq L12

        moveb a0@+,d0

        moveb d0,a1@+

        jeq L12

        moveb a0@+,d0

        moveb d0,a1@+

        jeq L12

        moveb a0@+,d0

        moveb d0,a1@+

        jeq L12

        moveb a0@+,d0

        moveb d0,a1@+

        jeq L12

        moveb a0@+,d0

        moveb d0,a1@+

        jne L2

        .even

L12:

        movel d1,d0

        rts

I expect it is generally better to inline strcpy() where it is called than unroll strcpy() after a costly push/bsr/rts/pop (assuming strings are not excessively long). This is what vbcc does with the assembler inlines.

Quote:

Originally Posted by alkis (Post 1136075)

I don't understand why the peephole optimiser doesn't convert

Code:

        moveb a0@+,d0

        moveb d0,a1@+

to moveb a0@+,a1@+, since d0 is dead afterwards, but hey...

That is not a peephole optimization. Further analysis outside of this code snippet is needed to conclude the optimization can be done (it is not guaranteed to be an equivalent replacement). Two instruction input peephole optimizations are aggressive too. Vasm only looks at one instruction max (but can output multiple instructions) and it is currently the best 68k peephole optimizing assembler.

@wawa
The aros68k GCC compiler is doing a good job here. It should be able to merge 2 lines inside the loop but overall good.

Quote:

Originally Posted by hooverphonique (Post 1136104)

what is

Code:

strcpy

    move.l  a0,d0

for?

strcpy() returns a pointer to the destination. The return is rarely used because it is already available and doesn't change but that is how it is defined.

Code:

char *strcpy(char *dst, const char *src);

It is much more useful to return a pointer to the end of the string like stpcpy().

Code:

char *stpcpy(char *dst, const char *src);



char *stpcpy(char *dst, const char *src)

{

while (*dst++ = *src++);

return (dst-1);

}

stpcpy() came from SAS/C and made its way to BSD; it later became part of POSIX, but it is not in C99 or C11.
