English Amiga Board


Old 04 June 2019, 05:26   #81
Bruce Abbott
Registered User

Join Date: Mar 2018
Location: Hastings, New Zealand
Posts: 180
Quote:
Originally Posted by meynaf View Post
Anyway this is not a very realistic example. This kind of construct usually does not execute the nops it contains, otherwise it would just slow it down - and people who write such horrors are concerned with speed.
I found another example today - in AmigaBASIC!

This construct actually does execute the nop, as a substitute for 'short branch to the next address' (which is invalid). It's true that the code could be a tiny bit faster without it, but compilers rarely produce code that can't be improved.
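
To illustrate why the nop is needed, here is a minimal Python sketch (the function name is made up for this post, it is not from any real tool). In a 68000 BRA/Bcc opcode the low byte holds the 8-bit displacement, and the value $00 means "a 16-bit displacement word follows", so a short branch to the very next address has no encoding:

```python
def encode_bra_short(pc, target):
    """Return the opcode word for a BRA.S located at `pc` that jumps to `target`.

    The displacement is relative to pc + 2 (the address after the opcode word).
    """
    disp = target - (pc + 2)
    if disp == 0:
        # Low byte $00 is reserved: it selects the 16-bit displacement form.
        raise ValueError("BRA.S to the next instruction cannot be encoded")
    if disp % 2 or not -128 <= disp <= 127:
        raise ValueError("displacement is odd or out of range for BRA.S")
    return 0x6000 | (disp & 0xFF)
```

A compiler that fills in a table of fixed-size branch slots therefore has to put something else, such as a nop, in the slot whose target is the next address.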

Quote:
Perhaps your disassembler should not assume anything at all (mine doesn't).
If you want a simple rule to follow there, never ever emit a cnop directive with a disassembler. You can, however, emit the nop or dc.w 0 with a comment if it looks suspicious.
My disassembler provides the option of emitting cnops, but only when producing source for Devpac, because other assemblers can't be trusted to handle them correctly. It never changes nop to cnop, but does try to determine when the nop is executable code (since that may change the interpretation of code before and after it).

Quote:
Using opcodes eating another opcode is unsafe IMO - and a pain to disassemble later, like all that SAS/C generated code which uses CMPI.W #i,D0 as short branch skipping a word.
I agree. Getting the disassembler to handle these cases automatically was a pain.
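
For anyone who hasn't met this trick, a small Python sketch of the idea (an illustration only, not actual SAS/C output; the decoder knows just the three opcodes used here). The word after cmpi.w #imm,d0 is an immediate operand when execution arrives from above, but it is also a valid instruction when something branches straight to it:

```python
CMPI_W_D0 = 0x0C40  # cmpi.w #imm,d0: opcode word followed by one immediate word
RTS = 0x4E75

def trace(words, start):
    """List the instructions seen when decoding `words` from index `start`."""
    out = []
    i = start
    while i < len(words):
        w = words[i]
        if w == CMPI_W_D0:
            out.append(f"cmpi.w #${words[i + 1]:04x},d0")
            i += 2  # the immediate word is consumed as data
        elif (w & 0xF100) == 0x7000:  # moveq #data,dn
            data = w & 0xFF
            if data >= 0x80:
                data -= 0x100  # the 8-bit immediate is sign-extended
            out.append(f"moveq #{data},d{(w >> 9) & 7}")
            i += 1
        elif w == RTS:
            out.append("rts")
            i += 1
        else:
            raise ValueError(f"opcode ${w:04x} is not handled by this sketch")
    return out

# moveq #0,d0 / cmpi.w #$7001,d0 / rts, where $7001 is also moveq #1,d0
words = [0x7000, CMPI_W_D0, 0x7001, RTS]
```

Decoding from index 0 gives moveq #0,d0, the cmpi and rts, so d0 stays 0 (the cmpi only touches the flags). Decoding from index 2 gives moveq #1,d0 and rts. The disassembler has to notice both entry points to show this correctly.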

Compilers can get away with doing this kind of stuff because they don't have to produce code that a human can understand, so any trick that makes it smaller or faster is OK. Compiler writers think they are so smart for saving a few bytes or cycles with such twisted constructs, then blow it all by inserting cnops where they aren't needed!
Old 04 June 2019, 07:42   #82
meynaf
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 46
Posts: 3,474
Quote:
Originally Posted by Bruce Abbott View Post
I found another example today - in AmigaBASIC!

This construct actually does execute the nop, as a substitute for 'short branch to the next address' (which is invalid). It's true that the code could be a tiny bit faster without it, but compilers rarely produce code that can't be improved.
Right, compilers can produce lots of horrors.

But is the nop really important at the end?
A disassembler cannot differentiate code from data in all cases. What can you do if there is a JSR directly followed by data to be used by the called routine? These cases require human interpretation. So do cnops, end of story.


Quote:
Originally Posted by Bruce Abbott View Post
Compilers can get away with doing this kind of stuff because they don't have to produce code that a human can understand, so any trick that makes it smaller or faster is OK. Compiler writers think they are so smart for saving a few bytes or cycles with such twisted constructs, then blow it all by inserting cnops where they aren't needed!
The funniest example I've seen was code where all routines were supposed to be longword aligned but ended up never aligned at all because the linker did not respect that...
Old 05 June 2019, 05:47   #83
Bruce Abbott
Registered User

Join Date: Mar 2018
Location: Hastings, New Zealand
Posts: 180
Quote:
Originally Posted by meynaf View Post
A disassembler cannot differentiate code from data in all cases.
True, but it can try. The better it does, the less human intervention is required.

Quote:
But is the nop really important at the end?
The nop is used as the 'branch' for the highest case value. You could remove it and the code would (probably) still work, because it would then jump directly into the code for that case. However, if you switched the nop for a dc.w 0 it would be fatal, because dc.w 0 equates to ori.b #x,d0, which is a 2-word instruction. The only way to guarantee that the reassembled code is accurate is to leave the nop in, otherwise it's impossible to verify (file compare would show a difference for every byte past the missing nop).
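
A rough Python sketch of the dc.w 0 problem (the helper names are invented for this post and only four opcodes are handled). It walks a word list and records where each instruction starts:

```python
NOP = 0x4E71
RTS = 0x4E75
ORI_B_D0 = 0x0000  # ori.b #imm,d0: opcode word plus one immediate word

def length_in_words(opcode):
    """Instruction length for the few opcodes this sketch knows about."""
    if opcode in (NOP, RTS) or (opcode & 0xF100) == 0x7000:  # nop, rts, moveq
        return 1
    if opcode == ORI_B_D0:
        return 2
    raise ValueError(f"opcode ${opcode:04x} is not handled by this sketch")

def instruction_starts(words):
    """Return the word indices at which the CPU would begin an instruction."""
    starts = []
    i = 0
    while i < len(words):
        starts.append(i)
        i += length_in_words(words[i])
    return starts

with_nop = [NOP, 0x7001, RTS]        # nop / moveq #1,d0 / rts
with_zero = [ORI_B_D0, 0x7001, RTS]  # ori.b #$01,d0 / rts
```

With the nop, instructions start at words 0, 1 and 2. With the zero word, they start at 0 and 2 only: the moveq has become the immediate operand of the ori.b and is never executed.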

Quote:
What can you do if there is a JSR directly followed by data to be used by the called routine? These cases require human interpretation. So do cnops, end of story.
It's not quite the end of the story, because compilers are somewhat predictable. If the disassembler can recognize a particular compiler's code then it may be able to handle such cases. For example, Hisoft BASIC does a JSR followed by a string which is difficult to interpret correctly, but the offset is unique so once it has been identified the rest is easy (much easier than trying to fix up the disassembly manually).
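
Roughly what I mean, as a Python sketch. Note the assumptions: the offset value below is hypothetical (it is not the real Hisoft BASIC offset), and the string layout (NUL-terminated, padded to an even address) is also just assumed for the example:

```python
import struct

JSR_D16_A5 = 0x4EAD            # jsr d16(a5)
STRING_HELPER_OFFSET = 0x012A  # hypothetical offset, for illustration only

def split_inline_string(code, pos):
    """If the marked JSR sits at `pos`, return (string_bytes, next_code_pos).

    Returns None when the bytes at `pos` are not the recognized call.
    """
    if pos + 4 > len(code):
        return None
    opcode, disp = struct.unpack_from(">HH", code, pos)
    if opcode != JSR_D16_A5 or disp != STRING_HELPER_OFFSET:
        return None
    start = pos + 4
    end = code.find(b"\0", start)
    if end < 0:
        return None  # no terminator found, leave it to the human
    next_pos = end + 1
    next_pos += next_pos & 1  # code resumes on an even address
    return code[start:end], next_pos

# jsr $12a(a5) / dc.b "Hi",0 / pad byte / rts
sample = bytes.fromhex("4ead012a") + b"Hi\0\0" + bytes.fromhex("4e75")
```

For the sample this returns (b"Hi", 8), so the disassembler can emit the string as data and resume decoding at the rts.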

Right now I am trying to do a better job of disassembling programs written in Blitz BASIC. Line-A emulation detection is working well enough, but I want to know what the emulated opcodes actually do, and also why code sometimes jumps into the middle of an instruction. I thought I would get some answers by looking at the source code for Blitz BASIC, but man what a mess! I'm starting to think the peculiar things I am finding are just horrid compiler bugs. But even if they are bugs, I still want the disassembler to handle them intelligently.
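
The basic Line-A test itself is trivial: any opcode word whose top four bits are 1010 ($Axxx) raises the Line 1010 emulator exception on a 68000. A minimal Python sketch (function names invented here; a real disassembler would only test words it already believes are code, since data words can match too):

```python
def is_line_a(opcode):
    """True if the 16-bit opcode word falls in the Line-A range $A000-$AFFF."""
    return (opcode & 0xF000) == 0xA000

def line_a_candidates(words):
    """Indices of words that look like Line-A opcodes (naive scan, no code/data tracking)."""
    return [i for i, w in enumerate(words) if is_line_a(w)]
```

The hard part is everything after that: working out how many following words each emulated opcode consumes.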
Old 05 June 2019, 09:11   #84
meynaf
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 46
Posts: 3,474
Quote:
Originally Posted by Bruce Abbott View Post
True, but it can try. The better it does, the less human intervention is required.
In theory yes. In practice, said human cannot trust what the disassembler does and has to scan the whole output in every case if he wants to be safe.


Quote:
Originally Posted by Bruce Abbott View Post
The nop is used as the 'branch' for the highest case value. You could remove it and the code would (probably) still work, because it would then jump directly into the code for that case. However, if you switched the nop for a dc.w 0 it would be fatal, because dc.w 0 equates to ori.b #x,d0, which is a 2-word instruction. The only way to guarantee that the reassembled code is accurate is to leave the nop in, otherwise it's impossible to verify (file compare would show a difference for every byte past the missing nop).
When you see a nop, leave the nop. It's the only safe solution before you can alter the code.


Quote:
Originally Posted by Bruce Abbott View Post
It's not quite the end of the story, because compilers are somewhat predictable. If the disassembler can recognize a particular compiler's code then it may be able to handle such cases. For example, Hisoft BASIC does a JSR followed by a string which is difficult to interpret correctly, but the offset is unique so once it has been identified the rest is easy (much easier than trying to fix up the disassembly manually).

Right now I am trying to do a better job of disassembling programs written in Blitz BASIC. Line-A emulation detection is working well enough, but I want to know what the emulated opcodes actually do, and also why code sometimes jumps into the middle of an instruction. I thought I would get some answers by looking at the source code for Blitz BASIC, but man what a mess! I'm starting to think the peculiar things I am finding are just horrid compiler bugs. But even if they are bugs, I still want the disassembler to handle them intelligently.
Compilers are somewhat predictable, true, but then you have to detect which compiler was used...

Having a disassembler handle particular cases intelligently is one thing; having it detect automatically when it must do so is another.
Making it automatic does not remove much of the burden from the human. It does, however, make the disassembler more and more complex and carries the risk of false positives.

My disassembler contains several options to clean things up, but will do so only if I tell it to. For example, it cannot detect whether JSR $2A(A5) is a normal call or one of these strange switch statements.
Old 11 June 2019, 23:08   #85
ross
Sum, ergo Cogito

Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 49
Posts: 1,710
Quote:
Originally Posted by phx View Post
For all the friends of CNOP: -cnop=0
Hi phx, when -cnop=0 is used, can the last two bytes of the hunk (if unused) also be padded with 0.w?
I can use even or cnop 0,2 as the last directive in the section, but if I forget to, the damn $4e71 appears.
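
For reference, the arithmetic behind cnop offset,alignment as a small Python sketch. This is my own illustration, not vasm's code: the fill word is just a parameter here, and the handling of odd addresses (a single zero byte) is an assumption:

```python
def cnop_fill(pc, offset, align, fill_word=0x4E71):
    """Return the pad bytes that move `pc` to the next address congruent to
    `offset` modulo `align` (as in: cnop offset,align)."""
    pad = (offset - pc) % align
    out = bytearray()
    if pad and (pc & 1):
        out.append(0)  # assumed: a lone zero byte brings an odd pc to an even one
    while len(out) + 2 <= pad:
        out += fill_word.to_bytes(2, "big")
    if len(out) < pad:
        out.append(0)  # odd remainder, only possible with an odd target address
    return bytes(out)
```

So cnop 0,4 at address 6 inserts one word, which is $4e71 by default in this sketch or $0000 when fill_word=0 is passed, and at address 8 it inserts nothing.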
Old 12 June 2019, 13:22   #86
phx
Natteravn

Join Date: Nov 2009
Location: Herford / Germany
Posts: 1,368
Quote:
Originally Posted by ross View Post
when -cnop=0 is used, can the last two bytes of the hunk (if unused) also be padded with 0.w?
No. The hunk-padding has nothing to do with CNOP, but is done by the output module.

Quote:
I can use even or cnop 0,2 as the last directive in the section, but if I forget to, the damn $4e71 appears.
True. That is to protect myself against complaints when your program crashes because you linked a function that is split across two object files.
If this is really so terrible I could still add another option for the output module. Maybe -hunkpad?
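
To make the reasoning concrete, a Python sketch of what such padding amounts to (only an illustration with an invented function name, not the actual output module code). Hunk sizes are counted in longwords, so a code section whose length is not a multiple of 4 needs filling up:

```python
def pad_hunk(data, fill_word=0x4E71):
    """Pad section contents to a multiple of 4 bytes.

    An odd length first gets a zero byte (assumed here), then a missing
    word is filled with `fill_word`.
    """
    out = bytearray(data)
    if len(out) & 1:
        out.append(0)
    if len(out) % 4:
        out += fill_word.to_bytes(2, "big")
    return bytes(out)
```

If a function runs off the end of one section and continues in the next object file, a $4e71 pad is executed harmlessly, while a $0000 pad would decode as ori.b #x,d0 and swallow the first word of the following code.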
Old 12 June 2019, 13:34   #87
ross
Sum, ergo Cogito

Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 49
Posts: 1,710
Quote:
Originally Posted by phx View Post
True. That is to protect myself against complaints when your program crashes because you linked a function that is split across two object files.
If this is really so terrible I could still add another option for the output module. Maybe -hunkpad?
Nah, not terrible.
But -hunkpad would be a nice little addition.

Thanks!
Old 12 June 2019, 14:25   #88
phx
Natteravn

Join Date: Nov 2009
Location: Herford / Germany
Posts: 1,368
Quote:
Originally Posted by ross View Post
But -hunkpad would be a nice little addition.
Done. Check tomorrow's daily snapshot.
Old 13 June 2019, 16:54   #89
ross
Sum, ergo Cogito

Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 49
Posts: 1,710
Quote:
Originally Posted by phx View Post
Done. Check tomorrow's daily snapshot.


A new Win64 binary is in the Zone!
 


All times are GMT +2. The time now is 19:36.

