English Amiga Board


Old 30 September 2022, 03:51   #41
Bruce Abbott
Registered User
 
 
Join Date: Mar 2018
Location: Hastings, New Zealand
Posts: 2,543
Quote:
Originally Posted by meynaf
Then you don't need to encode the instructions manually. Just use opt 0 to turn all optimizations off, either at the command line or in the source itself.
I always turn optimization off because I don't want the assembler doing stuff behind my back. This is particularly important for cases like this where exact encoding is required.

Sometimes the only way to get exactly what you want is to write it in hex or create a macro for the instruction. For example the Barfly assembler 'optimizes' cmp to cmpi when the source is immediate, even with all optimizations turned off! Some compilers write word values into byte operands, which results in the (unused) upper byte being set to $ff when the signed byte value is negative. This becomes a problem if you are trying to create an identical executable from a disassembly, as the assembler may not do the same.
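To make the two points above concrete, here is a minimal Python sketch of the byte encodings involved. It is not taken from Barfly or any other assembler; the helper names (`ext_word`, `cmp_b_imm_d0`, `cmpi_b_imm_d0`) and the `sign_fill` switch are made up for illustration, and only the d0 destination is handled.

```python
import struct

def ext_word(value, sign_fill=False):
    """Extension word carrying a byte immediate in its low 8 bits.

    The CPU ignores the upper byte. sign_fill=True mimics tools that
    sign-extend a negative byte into it ($ff) instead of leaving zero.
    """
    low = value & 0xFF
    high = 0xFF if (sign_fill and low & 0x80) else 0x00
    return (high << 8) | low

def cmp_b_imm_d0(value, sign_fill=False):
    # CMP.B #imm,D0: 1011 ddd 000 eeeeee with ea = 111100 (immediate) -> $B03C
    return struct.pack(">HH", 0xB03C, ext_word(value, sign_fill))

def cmpi_b_imm_d0(value, sign_fill=False):
    # CMPI.B #imm,D0: 0000 1100 00 eeeeee with ea = 000000 (D0) -> $0C00
    return struct.pack(">HH", 0x0C00, ext_word(value, sign_fill))

print(cmp_b_imm_d0(-1).hex())         # b03c00ff
print(cmpi_b_imm_d0(-1).hex())        # 0c0000ff
print(cmpi_b_imm_d0(-1, True).hex())  # 0c00ffff
```

All three byte sequences compare d0.b against -1, but none of them is byte-identical to the others, which is exactly what breaks a binary comparison against the original executable.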
Old 30 September 2022, 08:28   #42
meynaf
son of 68k
 
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,322
Quote:
Originally Posted by Bruce Abbott
I always turn optimization off because I don't want the assembler doing stuff behind my back. This is particularly important for cases like this where exact encoding is required.

Sometimes the only way to get exactly what you want is to write it in hex or create a macro for the instruction. For example the Barfly assembler 'optimizes' cmp to cmpi when the source is immediate, even with all optimizations turned off! Some compilers write word values into byte operands, which results in the (unused) upper byte being set to $ff when the signed byte value is negative. This becomes a problem if you are trying to create an identical executable from a disassembly, as the assembler may not do the same.
Exact encoding is rarely needed. You first want a program that works, don't you?
Assemblers aren't compilers, they don't do important stuff behind your back. If they do, then they are broken.

Strictly identical exe is sometimes just impossible anyway, due to different ordering in reloc tables.

Turning optimization on has the advantage of choosing best options for branch sizes. Doing that by hand is a pita and sometimes close to impossible when using macros.
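As a rough illustration of what the assembler decides for each branch, here is a Python sketch of the 68000 size rule for BRA/Bcc. The function name `bcc_size` is hypothetical, and this only covers a single branch; a real assembler has to repeat the decision over several passes, because shrinking one branch moves every label after it.

```python
def bcc_size(branch_addr, target_addr):
    """Smallest 68000 BRA/Bcc encoding in bytes for the given addresses.

    The displacement is relative to the branch address + 2. An 8-bit
    displacement of zero cannot be used, because a zero byte in the
    opcode word means "16-bit displacement follows".
    """
    disp = target_addr - (branch_addr + 2)
    if disp != 0 and -128 <= disp <= 127:
        return 2   # short form, displacement inside the opcode word
    if -32768 <= disp <= 32767:
        return 4   # word form, one extension word
    raise ValueError("target out of range for a 68000 branch")

print(bcc_size(0x100, 0x180))  # disp = 126 -> 2
print(bcc_size(0x100, 0x182))  # disp = 128 -> 4
print(bcc_size(0x100, 0x102))  # disp = 0   -> 4
```

With macros and conditional assembly the distance between a branch and its label is not visible in the source, which is why picking the size by hand gets difficult.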

Resourcing code isn't that much different. First, reassemble with no opts. Then compare with the original exe and find out what the differences are. With PhxAss very few things can happen. So yeah, that $ff in negative byte immediates has disappeared. It's not a big deal, really. It won't prevent your program from working (and if it does, you know there is a checksum!).

Then simply turning optimizations on can save many kilobytes, and your executable is already better than the original one!
Old 30 September 2022, 23:39   #43
Photon
Moderator
 
 
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,602
As a reflection on the original discussion, you can know a few things that make instructions slower, without having to look up the cycle count for a specific CPU:
  1. Immediate values that can't use a "q" (quick) instruction such as moveq, addq or subq need extension words, which increases instruction length and so execution time. E.g. move.w #imm,d0.
  2. The same can be true for ea offsets, but not often. E.g. move.w d8(An,Rn),d0.
  3. Modifying an address register is always longword-wide, which makes word-size instructions slower than their data register versions, except on the higher Motorola CPUs.
  4. Extra instruction words mean extra memory accesses, some of which may be blocked by the higher-priority parts of the chipset, sometimes for a short time, sometimes for a very long time. This doesn't happen if the instruction words are already in the instruction cache/prefetch buffer.
  5. Instructions performing r/w to RAM can similarly be blocked, but not if the data is already in the data cache. Normally, Chip RAM is not cached.
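
Points 1 and 2 in the list above come down to counting instruction words. A small Python sketch, with hand-encoded 68000 opcodes picked purely as examples (the table and the `length_bytes` helper are illustrative, not from any tool):

```python
# A few 68000 instructions as opcode words, to compare their lengths.
# Note moveq and move.l set all 32 bits of d0; move.w only the low word.
ENCODINGS = {
    "moveq #1,d0":          [0x7001],                  # quick form, no extension
    "move.w #1,d0":         [0x303C, 0x0001],          # one immediate word
    "move.l #1,d0":         [0x203C, 0x0000, 0x0001],  # two immediate words
    "move.w (a0),d0":       [0x3010],                  # plain indirect
    "move.w 0(a0,d1.w),d0": [0x3030, 0x1000],          # d8(An,Rn) extension word
}

def length_bytes(words):
    """Each instruction word is 16 bits, i.e. 2 bytes."""
    return 2 * len(words)

for text, words in ENCODINGS.items():
    hexwords = " ".join(f"{w:04x}" for w in words)
    print(f"{text:22} {length_bytes(words)} bytes  {hexwords}")
```

Every extra word is one more fetch that can be delayed on a busy bus, which is where points 4 and 5 come in.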
Old 30 September 2022, 23:57   #44
Bruce Abbott
Registered User
 
 
Join Date: Mar 2018
Location: Hastings, New Zealand
Posts: 2,543
Quote:
Originally Posted by meynaf
Exact encoding is rarely needed. You first want a program that works, don't you?
Except in cases like this, which are not exactly rare here. As for in general, I'm just saying what I do - not suggesting it's the 'best' way for everyone.

Quote:
Assemblers aren't compilers, they don't do important stuff behind your back. If they do, then they are broken.
So Barfly assembler is broken. And VASM too, because it trims zeros off the end of your code without being told to (and sometimes gets it wrong). Actually most assemblers are broken in one way or another. I use ProAsm, which has several bugs that I had to patch to get correct code generation (worst one was setting the wrong data register in certain 68020 instructions).

Quote:
Strictly identical exe is sometimes just impossible anyway, due to different ordering in reloc tables.
I don't worry about reloc tables because the order doesn't matter. To compare reassembled code to the original I wrote a program that relocates them before comparison. When my disassembler is working properly the results are identical, which is how I check it for accuracy.
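
The idea of relocating before comparing can be sketched in a few lines of Python. This is only a guess at the general approach, not the actual program: it assumes a single code hunk, 32-bit relocations given as a list of byte offsets, and an arbitrary load address; `relocate` and `same_after_relocation` are hypothetical names.

```python
import struct

def relocate(image, reloc_offsets, base):
    """Add `base` to the big-endian longword at each reloc offset."""
    out = bytearray(image)
    for off in reloc_offsets:
        (value,) = struct.unpack_from(">I", out, off)
        struct.pack_into(">I", out, off, (value + base) & 0xFFFFFFFF)
    return bytes(out)

def same_after_relocation(a, a_relocs, b, b_relocs, base=0x00100000):
    """Compare two images after relocating both to the same address."""
    return relocate(a, a_relocs, base) == relocate(b, b_relocs, base)

# jsr $10.l ; jsr $20.l  (two absolute longs that need relocating)
code = bytes.fromhex("4eb900000010" "4eb900000020")
print(same_after_relocation(code, [2, 8], code, [8, 2]))  # True
print(same_after_relocation(code, [2, 8], code, [2]))     # False
```

Because the relocations are applied before the comparison, the order of the table entries no longer matters, while a missing or extra entry still shows up as a difference.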

Quote:
Turning optimization on has the advantage of choosing best options for branch sizes. Doing that by hand is a pita and sometimes close to impossible when using macros.
You are right. Actually asm is a pain in general. But I like to know what size my branches are, as trying to keep them short makes my code tighter. Also the assembler takes longer when it has to modify branch sizes, and sometimes you need code to be a certain size (eg. branch tables).

Of course I always have the option of getting the assembler to do optimization if I am too lazy (rarely) or to make the final release code tighter.

Quote:
Resourcing code isn't that much different.
Resourcing code accurately enough to handle different instruction sizes is not that easy. It's not unusual to find PC relative offsets in data words, which are not detectable without a close examination of the code that uses them. If the code size changes they may point to the wrong place.
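
Here is a made-up Python example of that failure mode. It assumes a jump table of word offsets stored relative to the table's own address (one common layout, but only an assumption here), and a reassembly where the first routine grew by one instruction while the table was kept as raw data words; `table_targets` is a hypothetical helper.

```python
import struct

def table_targets(image, table_addr, count):
    """Resolve `count` signed word offsets stored at table_addr,
    each taken relative to the address of the table itself."""
    offsets = struct.unpack_from(f">{count}h", image, table_addr)
    return [table_addr + off for off in offsets]

# Original: table {4, 6}, routine A = rts at 4, routine B = rts at 6.
original = bytes.fromhex("0004" "0006" "4e75" "4e75")
# Reassembled: routine A is now nop + rts, so routine B moved to 8,
# but the table was emitted as plain data and still says 6.
grown = bytes.fromhex("0004" "0006" "4e71" "4e75" "4e75")

print(table_targets(original, 0, 2))  # [4, 6]
print(table_targets(grown, 0, 2))     # [4, 6], but B really starts at 8
```

Nothing in the data words marks them as offsets, so the only fix is to find the code that uses the table and rewrite the entries as label differences.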

Quote:
First, reassemble with no opts. Then, compare with original exe and find out what the differences are.
This also isn't that easy if the code size changes. Everything after the change is offset by some amount and a straight binary comparison fails from there on.

Quote:
With PhxAss very few things can happen. So yeah, that $ff in negative byte immediates has disappeared. It's not a big deal, really. It won't prevent your program from working (and if it does, you know there is a checksum!).
True provided that the disassembly correctly identified code and data. If it didn't you could end up with 'code' that assembles into incorrect data. While you shouldn't let that happen, carefully inspecting every line for accuracy can be quite time-consuming. Often I just want a 'quick and dirty' disassembly so I can generate labels for debugging.

Quote:
Then simply turning optimizations on can save many kilobytes, and your executable is already better than the original one!
True, though my code doesn't generally have many branches that can be optimized. I find the best optimization is to review the code and ask "is this really the best way to do it?".
Old 01 October 2022, 09:16   #45
meynaf
son of 68k
 
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,322
Quote:
Originally Posted by Bruce Abbott
Except in cases like this, which are not exactly rare here. As for in general, I'm just saying what I do - not suggesting it's the 'best' way for everyone.
True. We have different use cases.


Quote:
Originally Posted by Bruce Abbott
So Barfly assembler is broken. And VASM too, because it trims zeros off the end of your code without being told to (and sometimes gets it wrong). Actually most assemblers are broken in one way or another. I use ProAsm, which has several bugs that I had to patch to get correct code generation (worst one was setting the wrong data register in certain 68020 instructions).
I think VASM has an option to turn that "trim zeroes" off. Something to build kick 1.x compatible exes.


Quote:
Originally Posted by Bruce Abbott
I don't worry about reloc tables because the order doesn't matter. To compare reassembled code to the original I wrote a program that relocates them before comparison. When my disassembler is working properly the results are identical, which is how I check it for accuracy.
Great idea. Too bad it would be more complicated for me, since I'm far from always disassembling Amiga code.


Quote:
Originally Posted by Bruce Abbott
You are right. Actually asm is a pain in general. But I like to know what size my branches are, as trying to keep them short makes my code tighter. Also the assembler takes longer when it has to modify branch sizes, and sometimes you need code to be a certain size (eg. branch tables).
I cannot always know what size my branches are, as I'm often using macros and conditional assembly (a likely story in an include file with features that only get assembled if they are used).
For branch tables you can turn optimizations off and on in the source itself, local to some part of the code (at least PhxAss can).
I see turning opts off as useful only for the initial comparison when resourcing and for faster assembly of large programs, the latter being much less important with uae-jit.


Quote:
Originally Posted by Bruce Abbott
Of course I always have the option of getting the assembler to do optimization if I am too lazy (rarely) or to make the final release code tighter.
It depends on the size. I don't think you'd like to hand-optimize all the branches in a 1M resourced executable...


Quote:
Originally Posted by Bruce Abbott
Resourcing code accurately enough to handle different instruction sizes is not that easy. It's not unusual to find PC relative offsets in data words, which are not detectable without a close examination of the code that uses them. If the code size changes they may point to the wrong place.
Right, but I've never seen the code size change with PhxAss opt 0.


Quote:
Originally Posted by Bruce Abbott
This also isn't that easy if the code size changes. Everything after the change is offset by some amount and a straight binary comparison fails from there on.
I know, but it has never happened to me with optimizations turned off (or if it does, it's a sign I've missed something important).


Quote:
Originally Posted by Bruce Abbott
True provided that the disassembly correctly identified code and data. If it didn't you could end up with 'code' that assembles into incorrect data. While you shouldn't let that happen, carefully inspecting every line for accuracy can be quite time-consuming. Often I just want a 'quick and dirty' disassembly so I can generate labels for debugging.
That depends on what you want to do with the reassembled code.
Usually it's for making big alterations, so you will have to check the code line by line. Actually I do all the code/data separation by hand.
If it's just to generate labels, I fail to see the usefulness - these labels are for the most part number-based and are meaningless.


Quote:
Originally Posted by Bruce Abbott
True, though my code doesn't generally have many branches that can be optimized. I find the best optimization is to review the code and ask "is this really the best way to do it?".
Well, branches aren't the only thing that gets optimized. If, say, you use a structure and access its first member by name(An) (which is supposed to be the recommended way), the assembler will emit 0(An) instead of (An) with optimizations off. Other zero constants may appear as well.
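For reference, this is what the two forms look like at the byte level, as a small Python sketch. The helper names are made up and only move.w with d0 as destination is covered; it is meant to show the size difference, not how PhxAss works internally.

```python
import struct

def move_w_ind_d0(an):
    # move.w (An),d0: source mode 010 -> $3010 | An, no extension word
    return struct.pack(">H", 0x3010 | an)

def move_w_d16_d0(disp, an):
    # move.w d16(An),d0: source mode 101 -> $3028 | An, plus a 16-bit displacement
    return struct.pack(">Hh", 0x3028 | an, disp)

print(move_w_ind_d0(0).hex())     # 3010      (2 bytes)
print(move_w_d16_d0(0, 0).hex())  # 30280000  (4 bytes, same effect)
```

So each unoptimized access to a first structure member costs one extra word that is always zero.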
Old 01 October 2022, 14:20   #46
Karlos
Alien Bleed
 
 
Join Date: Aug 2022
Location: UK
Posts: 4,118
Some really useful info, guys. Manuals are one thing, real-world examples sometimes another. Apologies for the nerd-snipe.
 

