Looking for C/C++ programs to benchmark compilers

bebbo · 07 April 2023, 18:14

I am looking for sources of C (and maybe C++) programs that can be used to compare the aspects of the generated code. Such a program should have only one source file and not use any parameters, to keep the scripts simple.

I want to compare size and execution time - something else?

To compare I'll use Vamos, WinUAE and real Amigas. Also an archive with the executables plus a script to run them all should be available at the end.

Any interest in contributing such a test program?

Attach it here or mail it to me.

THX

bebbo · 17 April 2023, 22:41

I'm starting with
* sieve
* tscp182

and the first results are here: https://franke.ms/bench2/chart.html

Thomas Richter · 17 April 2023, 23:04

Sorry, I do not understand. Code generated by what? The code generated by an Amiga C compiler does not depend on the (real or virtual) environment it was compiled on, only the execution speed differs. vamos itself does not generate any code. It is an interpreter around Musashi, except for all OS calls, which are executed in Python on the native machine. WinUAE does have a JIT compiler.

bebbo · 17 April 2023, 23:39

Quote:

Originally Posted by Thomas Richter

Sorry, I do not understand. Code generated by what?

by the different compilers

Quote:

Originally Posted by Thomas Richter

The code generated by an Amiga C compiler does not depend on the (real or virtual) environment it was compiled on, only the execution speed differs.

hopefully, yes :-)

Quote:

Originally Posted by Thomas Richter

vamos itself does not generate any code. It is an interpreter around Musashi, except for all OS calls, which are executed in Python on the native machine. WinUAE does have a JIT compiler.

And vamos is used to count the cpu cycles for the programs.

Fewer cycles used is better.

If someone provides a more reasonable text for what I am doing, I might consider using that text...

btw: the benchmark sources and all generated programs can be downloaded here: https://franke.ms/bench2/bench2.zip

Thomas Richter · 18 April 2023, 08:57

Quote:

Originally Posted by bebbo

by the different compilers

WinUAE, vamos and a real machine are not "compilers". They can all execute compilers. The compiled code does depend on the compiler, but not on the environment in which it was compiled.

Quote:

Originally Posted by bebbo

hopefully, yes :-)

Definitely yes.

Quote:

Originally Posted by bebbo

And vamos is used to count the cpu cycles for the programs.

Not really. First, I am not aware that Musashi has this option, but even if it had, the result would be wrong. From the 68020 onwards, the number of cycles spent by a processor on an instruction no longer depends on the instruction alone. It depends on what is in the cache, and whether the instruction could partially overlap with the previous and the next instruction.

Cycle counting is a very bad idea to learn about software performance. Look at the source code, and learn about algorithmic complexity and big-O notation.

alkis · 18 April 2023, 09:35

Thomas, I am pretty sure Bebbo knows what a compiler is, since you know....he ported gcc to amigaos-target.

Anyways, the point of the chart is to compare compiler-produced code from various compilers. It's a "let's see which compiler comes up with the best code" situation.

Cycle counting from vamos for the 68000 seems pretty good from my experiments.

For example

Code:

vamos -v prime 1000000 1
10:18:27.122       main:   INFO:  done. exit code=0
10:18:27.123       main:   INFO:  total cycles: 50779208
10:18:27.123       main:   INFO:  vamos is exiting
Counted 78498 primes up to 1000000. (Did it 1 times)

which for an A500 (PAL) translates to 7.something seconds

Code:

50779208/7.09e6
	~7.16208857545839210155

Now, if I fire up fs-uae and emulate an A500 and run the code, it does run in 7.something seconds.

The benefit of using vamos cycle counting is that you "measure" the same number no matter what your host machine is. I get the above number on Ryzen 9, Intel, and Raspberry Pi. So if any future run gives a different number, it is because the compiler-produced code changed. And let's say you are working on tuning a compiler, then you could see if your modifications drive the numbers down or up. I think that's the objective here.

bebbo · 18 April 2023, 17:16

Alkis, thank you for stepping in. There are those who have learned to ask politely, and those who have not.

Back to the topics - I omit the rants...

1. Does the same compiler produce identical code when running on different platforms?

If you are precise: it's not the same compiler. It gets compiled from the same sources. And I don't have an example at hand (I think it was on the RasPi with 32bit), but if you consider that the compiler sometimes has to trade off between statements that it considers equivalent, then different memory addresses and resulting hashes can lead to different results.

But that's not a topic here.

2. Is cycle counting a good idea?

I agree with Alkis: It's a good idea for simple CPUs like the 68000.
But what about more complex CPUs, which contain caches and whatever else?
From my point of view it is a good idea there too, because the cycles per instruction are the essential basis that the compiler can use to select the best instructions from its point of view. In some compilers - like gcc - you can further model the CPU, which can then be used to schedule the instructions.
In this respect, the total cycles per program are still a good indication, even though they will not match the real values perfectly. One can still run these tests on real systems at any time and evaluate the results. TBD.

That's a reasonable topic for me.

3. For me it's interesting to observe different compilers plus the evolution of the gcc compiler and maybe more compilers like LLVM - if I can get these to work.

For example, if you look at SIEVE, you find that -Os from gcc-6.5.0b is slower than -Os from gcc-10.2.1b. This effect can also be observed from gcc-9.5.0-elf to gcc 10.4.0-elf. The difference results from the fact that as of version 10 the built-in function memset is also recognized with -Os and -O2 and memset is significantly faster than the loops generated by the compiler. So backporting this change might be an option.
It also shows that the old gcc-2.95.3 does a really good job.

Looking at TSCP182 is also interesting.

For -O2 and -O3 the gcc-6.5.0-elf yields faster code than all successors. I want to find out why.
Also, gcc-13av2 and gcc-13 (both experimental branches for the Amiga) differ only in the provided cost model, so comparing them is interesting too.

Maybe there is a benchmark where a recent gcc version provides a quantum leap in performance?

... next is fixing gcc-2.95.3-elf for tscp182...

AnimaInCorpore · 18 April 2023, 17:31

https://netlib.org/benchmark/linpackc

bebbo · 18 April 2023, 17:41

Quote:

Originally Posted by AnimaInCorpore

https://netlib.org/benchmark/linpackc

Thank you, interesting.
Uses ~2MB stack and a lot of floating point stuff, hm. Maybe not so ideal for the 68000?

EDIT: err, those are static variables ^^

bebbo · 19 April 2023, 20:22

I added clang-17 - the experimental m68k target of llvm - and got sieve to work. The tscp182 benchmark fails with an internal error...

alkis · 19 April 2023, 21:10

Double numbers performance and some io.

Code:

/* The Computer Language Benchmarks Game
 * https://salsa.debian.org/benchmarksgame-team/benchmarksgame/

   contributed by Greg Buchholz
*/

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
  int w, h, bit_num = 0;
  char byte_acc = 0;
  int i, iter = 50;
  double x, y, limit = 2.0;
  double Zr, Zi, Cr, Ci, Tr, Ti;

  if (argc != 2)
    w = h = 100;
  else
    w = h = atoi(argv[1]);

  printf("P4\n%d %d\n", w, h);

  for (y = 0; y < h; ++y) {
    for (x = 0; x < w; ++x) {
      Zr = Zi = Tr = Ti = 0.0;
      Cr = (2.0 * x / w - 1.5);
      Ci = (2.0 * y / h - 1.0);

      for (i = 0; i < iter && (Tr + Ti <= limit * limit); ++i) {
        Zi = 2.0 * Zr * Zi + Ci;
        Zr = Tr - Ti + Cr;
        Tr = Zr * Zr;
        Ti = Zi * Zi;
      }

      byte_acc <<= 1;
      if (Tr + Ti <= limit * limit)
        byte_acc |= 0x01;

      ++bit_num;

      if (bit_num == 8) {
        putc(byte_acc, stdout);
        byte_acc = 0;
        bit_num = 0;
      } else if (x == w - 1) {
        byte_acc <<= (8 - w % 8);
        putc(byte_acc, stdout);
        byte_acc = 0;
        bit_num = 0;
      }
    }
  }
}

sample compile & run

Code:

m68k-amigaos-gcc -mcrt=nix13 -O3 -o mandelbrot mandelbrot.c -lm
vamos -v ./mandelbrot >foo
22:02:38.801       main:   INFO:  done. exit code=0
22:02:38.801       main:   INFO:  total cycles: 693865522
22:02:38.801       main:   INFO:  vamos is exiting

file foo
foo: Netpbm image data, size = 100 x 100, rawbits, bitmap

The produced 'foo' can be viewed with 'xdg-open foo'. The output file foo from the amigaos-gcc-6.5 build matches the one from Linux gcc 11.3.0.
