10 September 2018, 23:27 | #1 |
Registered User
Join Date: May 2018
Location: Delta, Canada
Posts: 192
|
ADDI vs ADD #
It seems that both of these instructions can be used to encode add immediate to a data register.
Code:
ADD.L #0x1234,D1
ADDI.L #0x1234,D1
What is the general view on converting instructions like this? Is the assembler not supposed to do what you ask it to do, rather than change things for you? |
10 September 2018, 23:39 | #2 |
Registered User
Join Date: Sep 2007
Location: Stockholm
Posts: 4,332
|
ADD.L is just a convenience offered by the assembler — given an immediate value, it will output ADDI.L.
|
11 September 2018, 00:44 | #3 |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
|
Actually, it is not a convenience offered by the assembler but a choice made by the assembler.
I second hth313: I want an assembler that does what I tell it to do (compilers are the ones that choose for you). What if, for some reason, I prefer the encoding d2bc over 0681 (ADD.l #imm,d1 and ADDI.l #imm,d1 respectively)? Yes, same length, same cycles, same operation, but it is my choice (I can generate opcodes from these, write self-modifying code, apply EOR patches, use them for encryption...). And some assemblers generate different opcodes for OR/ORI and AND/ANDI but not for other legal combinations. |
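To make the two encodings mentioned above concrete, here is a minimal Python sketch (not from the thread; the function names are made up for illustration) that builds the opcode word for each form from the 68000 bit layout and appends the 32-bit immediate. It only covers the .L size with a data register destination.

```python
# Sketch: two different 68000 encodings that both add a 32-bit immediate
# to a data register. Helper names are hypothetical.

IMMEDIATE_EA = 0b111_100  # effective-address field: mode 111, register 100


def add_l_imm_opcode(dn: int) -> int:
    """Opcode word for ADD.L #imm,Dn (1101 rrr 010 eeeeee)."""
    assert 0 <= dn <= 7
    return 0xD000 | (dn << 9) | (0b010 << 6) | IMMEDIATE_EA


def addi_l_opcode(dn: int) -> int:
    """Opcode word for ADDI.L #imm,Dn (0000 0110 10 eeeeee)."""
    assert 0 <= dn <= 7
    return 0x0600 | (0b10 << 6) | dn  # EA = data register direct (mode 000)


def assemble(opcode: int, imm: int) -> bytes:
    """Opcode word followed by the 32-bit immediate, big-endian."""
    return opcode.to_bytes(2, "big") + (imm & 0xFFFFFFFF).to_bytes(4, "big")


add_bytes = assemble(add_l_imm_opcode(1), 0x1234)
addi_bytes = assemble(addi_l_opcode(1), 0x1234)
print(add_bytes.hex(), addi_bytes.hex())
```

Both byte strings have the same length and differ only in the first word, which is why an assembler can silently swap one for the other, and why someone patching or comparing opcodes would notice.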
11 September 2018, 00:50 | #4 |
Natteravn
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,496
|
No. "ADD #x,Dn" and "ADDI #x,Dn" are two different instructions with different opcodes, although their operation and number of cycles used are identical.
A good assembler will not automatically convert ADD into ADDI, or vice versa, when the selected addressing modes are supported. It should generate what the developer wrote in the source. |
11 September 2018, 00:53 | #5 |
Natteravn
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,496
|
Ross was faster. I got distracted.
I should also mention that it is important that the assembler generates exactly what is written in the source, when you have a reassembler output and want to compare it with the original. |
11 September 2018, 00:57 | #6 | |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
|
Yes, asm coders appreciate this.
I would also tend not to trust an assembler that does what it wants, even in these small cases.
Quote:
|
|
11 September 2018, 04:15 | #7 | |||
Registered User
Join Date: Mar 2018
Location: Hastings, New Zealand
Posts: 2,546
|
Quote:
While developing my own re-assembler I discovered that some compilers produce 'buggy' machine code that cannot be exactly reproduced with standard instructions. Some examples:
- Rubbish in the upper 8 bits of an immediate Byte value.
- Bits set that are ignored on the 68000, but may affect operation of later CPUs.
- CMPI.W #xx,D0 used to skip the next instruction, with a branch into the middle of the CMPI instruction.
- MOVEM.L with an empty register list.
To reproduce these I use DC.W to 'poke' word values into the code, possibly wrapped in a macro which represents the instruction.
Creating an identical copy of the executable file is even harder. The assembler may not be able to reproduce the hunk structure exactly, and the offsets in relocation tables are often ordered differently. This makes verification of disassembled output harder, but even identical reassembled code doesn't prove that the disassembly is accurate.
Quote:
Quote:
|
|||
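The CMPI.W skip trick and the DC.W workaround described in the post above can be sketched roughly as follows. This is only an illustration in Python: the hidden instruction (MOVEQ #1,D0) and the helper name `dc_w` are assumptions, not something taken from a real compiler's output.

```python
# Sketch of the "CMPI.W swallows the next instruction" trick.

CMPI_W_D0 = 0x0C40   # CMPI.W #imm,D0: 0000 1100 01 000000
MOVEQ_1_D0 = 0x7001  # MOVEQ #1,D0, chosen here as the hidden instruction


def dc_w(words):
    """Format 16-bit words as a DC.W line a re-assembler could emit."""
    return "DC.W " + ",".join(f"${w:04X}" for w in words)


code = [CMPI_W_D0, MOVEQ_1_D0]

# Entering at offset 0, the CPU decodes one 4-byte CMPI.W whose immediate
# happens to be $7001, so the MOVEQ is skipped (only the flags change).
# A branch to offset 2 lands on the same word and runs it as MOVEQ #1,D0.
print(dc_w(code))
```

Because the same word is both an immediate and an instruction, a plain `CMPI.W #$7001,D0` in the source loses the branch target in its middle, which is why poking the words with DC.W (or a macro around it) is the safer way to reproduce it.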
11 September 2018, 09:37 | #8 | ||||
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Quote:
Seen that many times. Usually it's $FF with negative values. You can guess the assembler was written in C and treated the data as signed.
Quote:
For this reason you could set up some way to identify 020+ instructions (e.g. my disassembler puts an extra space at the start for these).
Quote:
I don't remember having seen any, but a good assembler should be able to cope with that.
Quote:
But true, the disassembly might reassemble OK and still not be really editable as new source. A few extra things can be annoyances:
- use of a base register to access data (sometimes for a jump table as well)
- C switch statements done with a data table immediately following a jsr (some Mac compilers do this)
- absolute code (what is data? what is a memory address?)
- custom executable formats
- pc-relative stuff (jmp offset table) using a reference into the middle of an instruction
Because so many dirty things can happen, I didn't base my own disassembler on automatic recognition but on a script system. |
||||
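To illustrate why jump offset tables tend to need a hint from the user, here is a small Python sketch. The image contents, the function name and the label format are all hypothetical; the point is only that the table position, entry count and base address are inputs that a script would have to supply.

```python
import struct


def resolve_offset_table(image: bytes, table_offset: int, count: int, base: int):
    """Decode `count` signed big-endian 16-bit offsets and add them to `base`."""
    offsets = struct.unpack_from(f">{count}h", image, table_offset)
    return [base + off for off in offsets]


# Hypothetical 16-byte image: a 3-entry table at offset 0, relative to offset 0.
image = struct.pack(">3h", 6, 10, 14) + bytes(10)

# table_offset, count and base are exactly what the script has to provide;
# nothing in the raw bytes says where the table is or what it is relative to.
targets = resolve_offset_table(image, table_offset=0, count=3, base=0)
labels = {t: f"lbl_{t:04X}" for t in targets}
print(labels)
```

Without that information the same six bytes are just three plain words, and the addresses they point to never get labels.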
11 September 2018, 09:43 | #9 | ||||
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
|
Hi Bruce, I have also noticed the cases you describe; they are probably due to compilers that do not generate intermediate assembler code but emit the opcodes directly.
In particular:
Quote:
I do not remember whether it is expressly forbidden to use the upper 8 bits, since they do not cause problems on any processor of the family. I would accept the form move.b #$1234,d0 ($103c1234), with a strong warning, because I could have code such as:
Code:
.l move.b #$1234,d0
...
cmp.w .l+2(pc),d1
Quote:
Quote:
I can condone this practice, avoiding jumps is always a good thing... but the code becomes unreadable.
Quote:
I can understand that routines are pre-spaced with placeholders and then filled in, but if you really must, at least use another opcode. For all intents and purposes, the instruction is a longword nop. In similar situations I use $0.l (ori.b #0,d0), which packers compress better.
But I would accept (with a warning) the form movem.x <dest>
You could do an interesting and useless quadword nop (dc.w $48f9,0,0,0 -> movem.l $0.l)
[EDIT: interesting... on 68k movem.l <void> $11111111 triggers an exception, so the destination is evaluated]
Congratulations on your disassembler. When it is ready I will certainly try it!
Last edited by ross; 11 September 2018 at 09:54. |
||||
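The byte patterns quoted in the post above can be checked with a few lines of Python. This is just a sketch with made-up helper names; it builds the raw words the same way a DC.W line would, without claiming any assembler accepts the corresponding source forms.

```python
def move_b_imm_d0(ext_word: int) -> bytes:
    """MOVE.B #imm,D0 (opcode $103C) with the full 16-bit extension word kept."""
    return (0x103C).to_bytes(2, "big") + (ext_word & 0xFFFF).to_bytes(2, "big")


def movem_l_to_abs_long(reg_mask: int, address: int) -> bytes:
    """MOVEM.L <reglist>,(xxx).L: opcode $48F9, mask word, 32-bit address."""
    return (
        (0x48F9).to_bytes(2, "big")
        + (reg_mask & 0xFFFF).to_bytes(2, "big")
        + (address & 0xFFFFFFFF).to_bytes(4, "big")
    )


ORI_B_0_D0 = bytes(4)  # ORI.B #0,D0 encodes as the longword $00000000

print(move_b_imm_d0(0x1234).hex())         # upper byte $12 is ignored by the CPU
print(movem_l_to_abs_long(0, 0).hex())     # empty register list: 8-byte "nop"
```

The MOVE.B case shows why the upper byte matters to someone reading the extension word as data: the CPU only uses $34, but a `cmp.w` aimed at that word sees all of $1234.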
11 September 2018, 09:47 | #10 |
move.l #$c0ff33,throat
Join Date: Dec 2005
Location: Berlin/Joymoney
Posts: 6,863
|
|
11 September 2018, 10:08 | #11 |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
|
|
11 September 2018, 11:56 | #12 | |||
Natteravn
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,496
|
Quote:
Most assemblers of the time couldn't do that and converted ADD into ADDI. I remember Devpac, AsmOne, Barfly, SNMA doing it. Devpac even swapped operands in the encoding of the EXG instruction.
Quote:
Code:
movem.l ,-(sp)
Quote:
I agree that you need some kind of script to tell the reassembler about code/data regions, jump-tables, small-data model and addresses in absolute code. No reassembler can ever produce 100% perfect output. It is always a long process of reassembling, checking the output and adapting the script.
Hmm. Indeed, it would be nice if the assembler could encode that - as sick as it is. Noted. |
|||
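The EXG remark above is easy to picture with a short Python sketch. It assumes that "swapped operands" means exchanging the Rx and Ry register fields of the opcode; the thread does not spell this out, and the function name is made up.

```python
def exg_data_regs(rx: int, ry: int) -> int:
    """Opcode word for EXG between two data registers: 1100 xxx 1 01000 yyy."""
    assert 0 <= rx <= 7 and 0 <= ry <= 7
    return 0xC140 | (rx << 9) | ry


as_written = exg_data_regs(0, 1)  # D0 in the Rx field, D1 in the Ry field
swapped = exg_data_regs(1, 0)     # same exchange, register fields reversed
print(f"${as_written:04X} ${swapped:04X}")
```

Both words exchange the same two registers, so the program behaves identically, but a byte-for-byte comparison against the original binary fails.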
11 September 2018, 12:15 | #13 |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
|
11 September 2018, 17:27 | #14 | |
Registered User
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,960
|
Quote:
It seems you are creating your own disassembler. I have only used ReSource 5 or 6 for resourcing, but it has some limitations: it can't handle word-addressing code correctly, can't handle phx's favorite word relocs, has problems with big BSS sections (not enough memory), and can't handle word/longword offsets at odd addresses. That last feature is perhaps not handled by Amiga assemblers either, but I haven't checked this exactly. It was used in some Titus games too, e.g. in the Dark Century music. Btw, for me CMPI.W #xx,D0 could optionally be resourced/fixed as bra.b, which is much easier to understand. |
|
11 September 2018, 22:03 | #15 | |||
Registered User
Join Date: Mar 2018
Location: Hastings, New Zealand
Posts: 2,546
|
Quote:
Quote:
The only problem with my approach is that after many years of tweaking the code to handle different situations it has started to look like spaghetti. I am now concentrating on cleaning it up, which unsurprisingly has made it work better! Small programs now often disassemble accurately without any user input, while larger programs only need a few minutes' work to get right. The GUI is designed with an emphasis on quickly producing viable source code rather than offering a confusing array of menus or having to create complicated scripts.
My focus is on OS-friendly applications rather than games. These are mostly written in C, so my disassembler is 'tuned' to handle typical compiled code. The goal is to produce source code that can be easily modified to fix bugs or add features. That means not necessarily producing identical output, but correctly identifying different data types and formatting them appropriately so that the code doesn't break when modified.
A re-assembler might produce an identical executable without correctly identifying code and data structures. For example a block of code could be incorrectly disassembled as word data and still be executable, or some data might be misidentified as code which then (hopefully) assembles back to the same byte sequence. Branch tables (used in Case statements) and array element offsets may be disassembled as plain words, therefore not labeling the addresses they point to. This becomes a problem if you want to modify the source.
Quote:
Big BSS sections can be a problem because I allocate equal amounts of memory for both the executable and the 'symbol table' (which has an entry for each byte in the executable). This is not a problem if you have plenty of RAM. The largest (in memory) executable size my disassembler can handle is 1MB, so it won't need more than 2MB total (in two contiguous blocks of 1MB each). This may seem wasteful, but has the advantage that the symbol table is always large enough, and no extra memory ever has to be allocated. It also avoids having to search through the symbol table, so disassembly doesn't slow down with large programs (which may have several thousand labels). Since disassembly is fully interactive and occurs 'on the fly', having fast access to entries in the symbol table is essential. |
|||
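The "one symbol-table entry per executable byte" layout described above could look roughly like this in Python. The tag values and helper names are hypothetical, since the post does not describe the real entry format; the sketch only shows the fixed-size table and the search-free lookup.

```python
# Hypothetical per-byte tags; the real tool's encoding is not given in the thread.
UNKNOWN, CODE, DATA = 0, 1, 2
LABEL = 0x80  # flag bit: something refers to this offset

executable = bytes.fromhex("d2bc000012344e75")  # ADD.L #$1234,D1 / RTS
table = bytearray(len(executable))              # one entry per executable byte


def mark(offset: int, length: int, kind: int) -> None:
    """Tag `length` bytes starting at `offset`, preserving any label flag."""
    for i in range(offset, offset + length):
        table[i] = (table[i] & LABEL) | kind


def set_label(offset: int) -> None:
    table[offset] |= LABEL


def has_label(offset: int) -> bool:
    """Direct index, no search: cost does not depend on the number of labels."""
    return bool(table[offset] & LABEL)


mark(0, 6, CODE)  # the 6-byte ADD.L
mark(6, 2, CODE)  # the 2-byte RTS
set_label(6)
print(has_label(6), has_label(0))
```

The trade-off is the one stated in the post: memory use doubles, but the table never has to grow and lookups stay constant-time however many labels exist.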
11 September 2018, 23:03 | #16 |
Registered User
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,960
|
Sorry for my poor English, but word/longword accesses at odd addresses are legal. I don't mean code, only offsets. Something like this:
 even
Start dc.b 15
 dc.w label1-Start
 ....
label1
 ....
Of course it makes little sense on the 68000 CPU because move.w can't be used; instead, two move.b and an lsl are used, if I remember right. It may make sense if someone wants to optimize the size of the data. You can check the source and the assembled version here: http://wt.exotica.org.uk/customs.html (33rd custom module). Because there is no support for word data at odd addresses, I added an extra buffer, and the custom module is larger.
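The byte-wise read of a word offset stored at an odd address, as described above, can be sketched in Python like this. The memory contents are invented for the example (a `dc.b 15` at Start, then the 16-bit offset at odd offset 1, with label1 assumed to sit at offset 6).

```python
def read_word_bytewise(memory: bytes, address: int) -> int:
    """Read a big-endian 16-bit word using two byte reads and a shift."""
    high = memory[address]
    low = memory[address + 1]
    return (high << 8) | low


# Start: dc.b 15 / dc.w label1-Start (at odd offset 1) / ... / label1 at offset 6
memory = bytes([15, 0x00, 0x06, 0xAA, 0xBB, 0xCC, 0x4E, 0x75])
START = 0

offset = read_word_bytewise(memory, START + 1)
label1 = START + offset
print(label1, memory[label1:label1 + 2].hex())
```

This mirrors what the 68000 code has to do by hand, since a single word access at an odd address would raise an address error on that CPU.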
11 September 2018, 23:43 | #17 | |||||
Natteravn
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,496
|
Quote:
The GUI approach is probably more comfortable and easier to use, but GUIs are not really portable.
Quote:
Quote:
Quote:
Quote:
The reassembler which I am mostly using, IRA, must do a complete run after every little change, which may take a lot of time. But, on the other hand, it is portable and I can also run it on extremely fast machines, together with my assembler, which is also portable. |
|||||
12 September 2018, 08:56 | #18 | |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Quote:
My needs are different, and games are more difficult to get right because they can use all sorts of wicked tricks. I also need to disassemble things that are not Amiga executables. The most difficult is absolute code: you don't have the relocs but you need a way to set them up by hand. In the case of Back To The Golden Age I got saved because two versions with different offsets could be compared. But even OS-friendly apps can have complications. Libraries/devices, for example, don't have obvious entry points. Same (and worse) for Delitracker/Eagleplayer plugins. |
|