04 April 2022, 17:24 | #121 | ||||||
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,350
|
Quote:
Quote:
And even when you don't read it anymore, asm knowledge is useful for writing better code regardless of the language. Quote:
Quote:
Anyhow, large projects involving several programmers are a PITA regardless of the language. Quote:
Coding in asm needs a very different mindset, which some programmers simply don't have. Register allocation is indeed typical, in the sense that it's easy to handle while writing the code with the d8/a8 trick. The mistake is doing this allocation too early. On the other hand, signed/unsigned mismatch is a typical C problem that does not occur in asm. Different languages, different issues. Quote:
In asm we can perform computations on 8-bit and 16-bit entities directly, whereas C insists on converting everything to "int" (usually 32-bit), and programmers usually don't care about size (after all, we've got plenty of memory!). This leads to more data to handle, and of course it means more pressure on the data cache. |
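To put a rough number on the size argument, here is a minimal sketch in C. The element count and the two function names are made up for illustration, and `sizeof(int)` is whatever the target defines (commonly 4), so the exact ratio is an assumption about the platform.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element count, chosen only for illustration. */
enum { SAMPLE_COUNT = 1024 };

/* Footprint of a table whose elements were left as plain int. */
size_t table_bytes_int(void)
{
    return sizeof(int[SAMPLE_COUNT]);
}

/* Footprint of the same table when 8 bits per element are enough. */
size_t table_bytes_u8(void)
{
    return sizeof(uint8_t[SAMPLE_COUNT]);
}
```

On a target with a 32-bit int the first table takes four times the space of the second, which is the extra data the cache has to carry.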
||||||
04 April 2022, 17:34 | #122 | |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,350
|
Quote:
But I reckon game design appears to have run a little short of new ideas. It seems to me that games were more innovative in the 8-bit days. |
|
04 April 2022, 17:44 | #123 | |
Registered User
Join Date: Dec 2019
Location: Ur, Atlantis
Posts: 2,009
|
Quote:
|
|
04 April 2022, 19:39 | #124 | ||
Registered User
Join Date: Sep 2013
Location: Poland
Posts: 867
|
Quote:
Quote:
Last edited by Promilus; 04 April 2022 at 19:46. |
||
04 April 2022, 20:12 | #125 | |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,350
|
Quote:
Usually the programmer doesn't know what he's doing. In addition, compilers must follow the specs and limitations of the source language, whereas asm programmers only have to get the right result. This means some optimizations are forever "forbidden" to even the best compilers. |
|
04 April 2022, 21:04 | #126 | |
Registered User
Join Date: Feb 2017
Location: Denmark
Posts: 1,172
|
Quote:
Don't know if that factors in to the estimate or how that's handled on the 080. |
|
04 April 2022, 21:31 | #127 | |||||||
Registered User
Join Date: Jan 2019
Location: Germany
Posts: 3,289
|
Quote:
I can't write x86 assembler, but I can read it perfectly fine. I don't want to write it, actually. Neither do I want to write arm code, but I can read it fine. Yet, I don't want to care about all the details at instruction level. Quote:
Quote:
Quote:
Quote:
Quote:
Quote:
Cache-friendly code is not about low-level optimizations like that. Cache-friendly code means that you need to create a "data-flow" architecture within which the data remains "hot" all the time. Cache-friendly means having the right architecture, not worrying about up-conversion of data. The compiler takes care of the latter just fine by itself. |
|||||||
04 April 2022, 21:32 | #128 | ||
Registered User
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,424
|
Quote:
Quote:
|
||
04 April 2022, 22:57 | #129 |
Registered User
Join Date: Apr 2018
Location: Glasgow
Posts: 161
|
|
04 April 2022, 23:00 | #130 |
Registered User
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,424
|
|
04 April 2022, 23:07 | #131 |
Registered User
Join Date: Apr 2018
Location: Glasgow
Posts: 161
|
|
04 April 2022, 23:21 | #132 |
Registered User
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,424
|
|
05 April 2022, 08:32 | #133 | |||||||
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,350
|
You sure do, but it is OT not only on this thread but also on this whole site.
Going for a strawman fallacy, maybe? Quote:
You could benchmark my flac decoder and my picture viewer, for example. Good luck with your compiled code to even approach their performance. Or you could open a thread here for a nice asm-vs-compiler coding contest. Quote:
Quote:
What you are creating with your "large" projects is actually just waste, aka bloatware. I don't create bloatware (and when I have to work on some, I don't make things worse). Even so, the complexity of a project shouldn't rise much with its size; there is something called modularity to cope with that. Quote:
Actually, a good asm program has more structure than your average compiled code, simply because it's less tolerant of bad programming. It's like a violin: no room for mediocrity. Of course software is about structure and architecture, but those do not depend that much on the language used. And please do not make assumptions about the projects I worked on; you are clearly far from reality. Quote:
I'm not watching recent developments for x86 or whatever, if it's what you meant. Quote:
Quote:
Having the right architecture is of course mandatory but you won't see if your architecture is right or not by just trusting what the compiler does. |
|||||||
05 April 2022, 09:26 | #134 | ||
Registered User
Join Date: Aug 2014
Location: Netherlands
Posts: 699
|
Quote:
68K is more flexible in that regard, as it can do arithmetic on 8/16/32-bit values. As for the discussion about code size: the reason ARM has the "Thumb" mode is all about code size. Classic ARM code uses 32-bit instruction words while Thumb mode uses 16-bit instruction words. And 32-bit instruction words kinda add up when you are programming on a 16 KB Flash / 4 KB RAM device. (And don't get me started on GCC with its "newlib-nano". There is nothing "nano" about it.) Last edited by Mathesar; 05 April 2022 at 09:46. |
||
05 April 2022, 09:49 | #135 |
Registered User
Join Date: Aug 2014
Location: Netherlands
Posts: 699
|
Case in point:
Code:
int8_t Mul_8 (int8_t a, int8_t b)
{
    return (a*b);
}
Code:
  26:main.c    **** int8_t Mul_8 (int8_t a, int8_t b)
  27:main.c    **** {
  54              .loc 1 27 0
  55              .cfi_startproc
  56              @ args = 0, pretend = 0, frame = 0
  57              @ frame_needed = 0, uses_anonymous_args = 0
  58              @ link register save eliminated.
  59              .LVL0:
  28:main.c    **** return (a*b);
  60              .loc 1 28 0
  61 0000 4843    muls r0, r1
  62              .LVL1:
  63 0002 40B2    sxtb r0, r0
  29:main.c    **** } |
05 April 2022, 10:01 | #136 | |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,350
|
Quote:
But even when the compiler is able to, that does not mean it really will. Consider the case of a complex enough computation whose intermediate results could potentially overflow (but do not in our use case). If a and b are int8 and you write a+b, what is the type of the result? Furthermore, it's easy to end up with 'int' implicitly by several means: consider writing 'c' for example, the spec says it's int and not char. That, you have to verify. It might be stupid enough to sign-extend parameters while passing them! |
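Both questions can be checked directly rather than argued about. The following is a minimal sketch that assumes a C11 compiler (it relies on `_Generic`); the function names are hypothetical and only there for illustration.

```c
#include <stdint.h>

/* Returns 1 if the expression a + b has type int, 0 otherwise.
   The operands of _Generic are not evaluated, only their type is inspected. */
int sum_of_int8_is_int(void)
{
    int8_t a = 100, b = 100;
    return _Generic(a + b, int: 1, default: 0);
}

/* Returns 1 if a character constant such as 'c' has type int (true in C). */
int char_constant_is_int(void)
{
    return _Generic('c', int: 1, default: 0);
}

/* The addition is carried out in int, so 100 + 100 does not wrap here.
   Narrowing only happens if the result is stored back into an int8_t. */
int sum_wide(int8_t a, int8_t b)
{
    return a + b;
}
```

Both predicates return 1 under the integer promotion rules, and `sum_wide(100, 100)` gives 200 because the intermediate value is an int even though both operands are 8-bit.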
|
05 April 2022, 10:30 | #137 | |
Registered User
Join Date: Aug 2014
Location: Netherlands
Posts: 699
|
Quote:
Oh, I am sure it will! In fact, it depends on the calling convention. But what I've learned over time is to look (for critical parts) at the compiler output and adjust the C code accordingly. Since ARM has become so widespread I often use 32-bit variables (for a simple loop counter in a tight loop, for example) even when 16-bit or 8-bit would have sufficed. When passing parameters around in registers it doesn't matter anyway, and it prevents usage of the dreaded SXTB instruction and its variants. |
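A minimal sketch of that habit, with made-up function names. Whether the narrow version really costs extra extension instructions is an assumption that depends on the compiler, its version and the flags used, so the generated code still has to be inspected (for example with `-S`).

```c
#include <stdint.h>

/* Sum the first n bytes using an 8-bit counter. The compiler may have to
   keep the counter truncated to 8 bits, e.g. with an extension instruction. */
uint32_t sum_narrow_counter(const uint8_t *data, uint8_t n)
{
    uint32_t total = 0;
    for (uint8_t i = 0; i < n; i++)
        total += data[i];
    return total;
}

/* Same loop with a register-width counter, so no truncation is needed. */
uint32_t sum_wide_counter(const uint8_t *data, uint32_t n)
{
    uint32_t total = 0;
    for (uint32_t i = 0; i < n; i++)
        total += data[i];
    return total;
}
```

Both functions return the same sum for any n up to 255; the only intended difference is what the compiler has to do to maintain the counter.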
|
05 April 2022, 10:53 | #138 | ||
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,350
|
Quote:
My guess is that they perform reasonably well on small, straightforward code so they give the illusion of being "good enough" - but when code starts to become more complicated and demanding, havoc is unleashed. Quote:
In registers, being full size is not normally problematic, but in memory it can be. For aarch64 I don't know, but IIRC arm32 before ARMv4 could not perform 16-bit memory accesses. It could also be interesting to see in your example what happens to r0, r1 if a function call is added before the multiply. Normally they should be considered scratch regs and lost in the call... |
||
05 April 2022, 11:47 | #139 | ||
Registered User
Join Date: Nov 2018
Location: Germany
Posts: 110
|
Quote:
Once, when I was working on PowerPC AROS, I had to dive deeply into PPC assembly. I wrote nice-looking code which was easy to understand and to follow. Then I took the optimization guides for PPC and improved the performance of the code. It did work better, yet it was harder to read, harder to follow and no longer so nicely written. What helps you while writing in m68k assembly is the (rather sad) fact that the architecture is already very archaic and, until Vampire came out, was not updated. Had it evolved like any other CPU architecture, you would hardly have a chance to write code as effective as what a compiler can produce for you. Quote:
I bet the compiler would copy them to some callee-saved registers and perform the multiplication there later on. |
||
05 April 2022, 11:55 | #140 | |
Registered User
Join Date: Nov 2018
Location: Germany
Posts: 110
|
Quote:
Code:
#include <stdint.h>

int8_t Mul_8 (int8_t a, int8_t b);

int8_t foo()
{
    return Mul_8(2,3)-5;
}

int8_t bar(int a, int b)
{
    return Mul_8(a, b) + 3;
}

int8_t moo(int8_t a, int8_t b)
{
    return Mul_8(a, b);
}
Code:
foo:
    str x30, [sp, -16]!
    mov w1, 3
    mov w0, 2
    bl Mul_8
    sub w0, w0, #5
    ldr x30, [sp], 16
    ret
bar:
    str x30, [sp, -16]!
    bl Mul_8
    add w0, w0, 3
    ldr x30, [sp], 16
    ret
moo:
    b Mul_8 |
|