68K assembler/disassembler syntax

nocash · 22 April 2016, 00:03

What does the 68K assembler/disassembler syntax look like exactly? My current disassembler supports 80x86-style addressing modes, and the official 68K addressing modes.

The problem is that nobody seems to be using that official syntax with expressions like "(imm,An)", or am I wrong there, and some people do use that syntax? If not, then I'll drop that syntax.

More common seem to be expressions like "imm(An)" for 68000 code, but I've no idea what kind of expressions would be used for the extra addressing modes of the later 680xx revisions - could somebody complete the table below?

Code:

  Mode     80x86             68K/Official        68K/Common?
  000rrr   Dn                Dn                  Dn
  001rrr   An                An                  An
  010rrr   [An]              (An)                (An)
  011rrr   [An]+             (An)+               (An)+
  100rrr   -[An]             -(An)               -(An)
  101rrr   [d16+An]          (d16,An)            d16(An)
  110rrr   [d8+An+Xn]        (d8,An,Xn)          ?
  110rrr   [bd+An+Xn]        (bd,An,Xn)          ?
  110rrr   [[bd+An+Xn]+od]   ([bd,An,Xn],od)     ?
  110rrr   [[bd+An]+Xn+od]   ([bd,An],Xn,od)     ?
  111000   [xxx]             (xxx).W             ?
  111001   [xxx]             (xxx).L             ?
  111010   [addr]            (d16,PC)            d16(PC)
  111011   [addr+Xn]         (d8,PC,Xn)          ?
  111011   [addr+Xn]         (bd,PC,Xn)          ?
  111011   [[addr+Xn]+od]    ([bd,PC,Xn],od)     ?
  111011   [[addr]+Xn+od]    ([bd,PC],Xn,od)     ?
  111100   xxx               #<xxx>              ?
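As a rough illustration of the "Mode" column above, here is a minimal Python sketch that maps the 6-bit effective-address field to the official syntax template. The function name `ea_template` is made up for this example, and it only covers the 68000-level modes (the 68020 full-format variants of 110rrr and 111011 would need the extension word to tell apart).

```python
def ea_template(ea):
    """Map a 6-bit EA field (mmmrrr) to its official syntax template.

    Hypothetical helper for illustration; 68000-level modes only.
    """
    mode, reg = (ea >> 3) & 7, ea & 7
    if mode < 7:
        templates = [
            "D{r}", "A{r}", "(A{r})", "(A{r})+", "-(A{r})",
            "(d16,A{r})", "(d8,A{r},Xn)",
        ]
        return templates[mode].format(r=reg)
    # Mode 111: the register field selects the sub-mode.
    special = {
        0: "(xxx).W",
        1: "(xxx).L",
        2: "(d16,PC)",
        3: "(d8,PC,Xn)",
        4: "#<xxx>",
    }
    if reg not in special:
        raise ValueError("reserved EA encoding")
    return special[reg]


print(ea_template(0b010011))  # register-indirect through A3
print(ea_template(0b111100))  # immediate
```

A real disassembler would then substitute the actual displacement, index register, and immediate value for the placeholders.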

Something that isn't officially documented is the "scale" feature, for the "Index". Which, I assume that "Index" means the "Xn" register, and scaling it would look as "Xn*4", for example?
And, the Index can be 16bit, also without officially defined syntax. I guess the above example would then look as "Xn.w*4"?

There seems to be also a MIT syntax (with percent-symbols preceding all register names, like "%d0"), is there anybody using that syntax? Or are there more variants that should be supported in assemblers/disassemblers/debuggers?

Oh, and are there any common pseudo opcodes, like PUSH/POP or whatever?

I haven't looked into directives yet, only spotted something like "dc.b" here or there. Is there some document that lists the most important 68K assembler directives?

matthey · 22 April 2016, 01:53

Quote:

Originally Posted by nocash

What does the 68K assembler/disassembler syntax look like exactly? My current disassembler supports 80x86-style addressing modes, and the official 68K addressing modes.

The problem is that nobody seems to be using that official syntax with expressions like "(imm,An)", or am I wrong there, and some people do use that syntax? If not, then I'll drop that syntax.

I use the Motorola new style (68020) syntax which I prefer but some 68k programmers learned on the 68000 and don't want to change. New 68k programmers often learn from the old programmers and old resources which also often use the old style syntax. It's nice when assemblers and disassemblers support both. Most assemblers and some disassemblers like IRA and ADis (my updated version) support both.

Quote:

Originally Posted by nocash

More common seem to be expressions like "imm(An)" for 68000 code, but I've no idea what kind of expressions would be used for the extra addressing modes of the later 680xx revisions - could somebody complete the table below?

Code:

  Mode     80x86             68K/Official        68K/Common?
  000rrr   Dn                Dn                  Dn
  001rrr   An                An                  An
  010rrr   [An]              (An)                (An)
  011rrr   [An]+             (An)+               (An)+
  100rrr   -[An]             -(An)               -(An)
  101rrr   [d16+An]          (d16,An)            d16(An)
  110rrr   [d8+An+Xn]        (d8,An,Xn)          ?
  110rrr   [bd+An+Xn]        (bd,An,Xn)          ?
  110rrr   [[bd+An+Xn]+od]   ([bd,An,Xn],od)     ?
  110rrr   [[bd+An]+Xn+od]   ([bd,An],Xn,od)     ?
  111000   [xxx]             (xxx).W             ?
  111001   [xxx]             (xxx).L             ?
  111010   [addr]            (d16,PC)            d16(PC)
  111011   [addr+Xn]         (d8,PC,Xn)          ?
  111011   [addr+Xn]         (bd,PC,Xn)          ?
  111011   [[addr+Xn]+od]    ([bd,PC,Xn],od)     ?
  111011   [[addr]+Xn+od]    ([bd,PC],Xn,od)     ?
  111100   xxx               #<xxx>              ?

Something that isn't officially documented is the "scale" feature, for the "Index". Which, I assume that "Index" means the "Xn" register, and scaling it would look as "Xn*4", for example?
And, the Index can be 16bit, also without officially defined syntax. I guess the above example would then look as "Xn.w*4"?

The old style syntax did not have many of the 68020 addressing modes so there is no official way to represent them. Immediate and absolute addressing are the same and the 2 other modes that did exist are:

(d8,An,Xn) or d8(An,Xn)
(d8,PC,Xn) or d8(PC,Xn)

Index register scaling did not exist on the 68000 but is generally accepted in old style syntax. Index registers and all the addressing modes should be documented in the 68000PRM.

https://www.nxp.com/files/archives/d.../M68000PRM.pdf
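For a disassembler that wants to offer both notations, the difference is only where the displacement is placed. The following Python sketch shows one way to print the same operand in new style "(d,An,Xn)" or old style "d(An,Xn)"; the function `format_ea` and its parameters are assumptions made for this example, not taken from any existing tool.

```python
def format_ea(base, disp, index=None, size="w", scale=1, new_style=True):
    """Format a displacement (+ optional index) operand.

    new_style=True  -> (disp,base,index)   Motorola 68020-era syntax
    new_style=False -> disp(base,index)    older 68000-era syntax
    Hypothetical helper for illustration only.
    """
    inner = [base]
    if index is not None:
        idx = f"{index}.{size}"
        if scale != 1:
            idx += f"*{scale}"  # scaling needs a 68020 or later
        inner.append(idx)
    if new_style:
        return "(" + ",".join([str(disp)] + inner) + ")"
    return f"{disp}(" + ",".join(inner) + ")"


print(format_ea("a0", 8))                          # new style
print(format_ea("a0", 8, new_style=False))         # old style
print(format_ea("pc", 4, "d0", "w", 4, new_style=False))
```

Keeping the style as a single flag makes it easy to let the user switch notation at runtime.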

Quote:

Originally Posted by nocash

There seems to be also a MIT syntax (with percent-symbols preceding all register names, like "%d0"), is there anybody using that syntax? Or are there more variants that should be supported in assemblers/disassemblers/debuggers?

MIT syntax is mostly used by compilers like GCC (vbcc and SAS/C do *not* use it). Most 68k programmers avoid using it if possible.

Quote:

Originally Posted by nocash

Oh, and are there any common pseudo opcodes, like PUSH/POP or whatever?

Some 68k assembler programmers use PUSH and POP macros.

Quote:

Originally Posted by nocash

I haven't looked into directives yet, only spotted something like "dc.b" here or there. Is there some document that lists the most important 68K assembler directives?

The vasm assembler supports most directives which the manual documents.

http://sun.hasenbraten.de/vasm/release/vasm.pdf

NorthWay · 22 April 2016, 10:41

MIT syntax is not meant for human consumption...

mark_k · 22 April 2016, 11:07

Read various assembler manuals to see which directives they support. The "official" Amiga Macro Assembler was written by Metacomco and marketed by Commodore in the Amiga's early days, and many assemblers aim to be compatible with that.

nocash · 22 April 2016, 12:49

Okay, I will avoid supporting MIT syntax.

There are old and new 68K syntaxes? I have a "M68000 8-/16-/32-Bit Microprocessors User’s Manual Ninth Edition" from 1993, it doesn't cover 68020 instructions, but the syntax looks the same as in documents that do cover newer instructions.
But well, it's a "Ninth Edition", maybe older versions did use different syntax (?)
Just guessing:
"d16(An)" = old syntax ?
"(d16,An)" = new syntax ?

Quote:

Originally Posted by matthey

The old style syntax did not have many of the 68020 addressing modes so there is no official way to represent them. Immediate and absolute addressing are the same and the 2 other modes that did exist are:
(d8,An,Xn) or d8(An,Xn)
(d8,PC,Xn) or d8(PC,Xn)

Good to know it's spelled "d8(An,Xn)". And absolute would be "(xxx).W" and "(xxx).L". And, immediate "#<xxx>"? The sharp brackets are looking as if they could/should be omitted, unless they are intended to distinguish between "CMPI #imm" and "CMP #<imm>"?

Oh, and I got that "(d16,PC) = d16(PC)" wrong. Official specs do actually say "(d16,PC)". But the more common form seems to be "addr(PC)", with "addr" being a 32bit address, and "(PC)" just hinting that it shall be encoded as relative address, with automatically calculated "d16" displacement.
So, I am quite sure that assemblers would recognize this:
addr(PC)
But when enclosing everything in brackets, would an assembler treat it as...
(addr,PC)
or
(d16,PC) ?

Quote:

Originally Posted by matthey

Index register scaling did not exist on the 68000 but is generally accepted in old style syntax. Index registers and all the addressing modes should be documented in the 68000PRM.
https://www.nxp.com/files/archives/d.../M68000PRM.pdf

Oops, yes. It isn't described in the addressing mode summaries, but the more detailed per-mode descriptions have it documented as "ASSEMBLER SYNTAX: (d8,An,Xn.SIZE*SCALE)".
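To make the "Xn.SIZE*SCALE" part concrete, here is a small Python sketch that pulls those fields out of the brief extension word (bit 15 = D/A, bits 14-12 = register, bit 11 = W/L, bits 10-9 = scale, bit 8 = 0 for the brief format, bits 7-0 = signed d8). The function name `decode_brief_ext` is invented for this example, and the 68020 full format (bit 8 set) is deliberately left out.

```python
def decode_brief_ext(word):
    """Decode a 68k brief extension word into (index_reg, size, scale, d8).

    Hypothetical helper; rejects the 68020 full extension word format.
    """
    if word & 0x0100:
        raise ValueError("full extension word not handled in this sketch")
    reg = ("a" if word & 0x8000 else "d") + str((word >> 12) & 7)
    size = "l" if word & 0x0800 else "w"
    scale = 1 << ((word >> 9) & 3)  # 1, 2, 4 or 8; a plain 68000 has no scaling
    d8 = word & 0xFF
    if d8 >= 0x80:                  # sign-extend the 8-bit displacement
        d8 -= 0x100
    return reg, size, scale, d8


reg, size, scale, d8 = decode_brief_ext(0x0C04)
print(f"({d8},a0,{reg}.{size}*{scale})")
```

The returned tuple can be fed straight into whatever operand formatter the disassembler uses.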

Quote:

Originally Posted by matthey

The vasm assembler supports most directives which the manual documents.
http://sun.hasenbraten.de/vasm/release/vasm.pdf

Ah, fine! Specifically, the directives from the "Mot Syntax Module" chapter for Motorola 68K code, right? I'll probably stick with implementing only the more basic directives like "dc.b" and "ds.b", but good to know which other directives do exist.

nocash · 22 April 2016, 13:29

Quote:

Originally Posted by mark_k

Read various assembler manuals to see which directives they support. The "official" Amiga Macro Assembler was written by Metacomco and marketed by Commodore in the Amiga's early days, and many assemblers aim to be compatible with that.

Good to know! This document http://www.pagetable.com/docs/amigad...dos_manual.pdf contains a chapter about an "AmigaDOS Macro Assembler", it doesn't mention Metacomco, but I guess that it's the same Macro Assembler.
Aside from directives it's also covering the "imm(An)" syntax.
And, surprisingly, "addr(PC)" and "addr(PC,Xn)" are just spelled as "addr" and "addr(Xn)", which looks nicer, but might cause problems when needing to distinguish between absolute and relative addressing. Especially as absolute addresses "(addr).W" and "(addr).L" are just spelled as "addr", too.

Hmmm, the MIT and vasm manuals also mention "ZDn" register operands (=for indicating not to use register Dn). Which is somewhere between nonsense & useful for knowing the exact opcode size/cycles.

matthey · 22 April 2016, 17:43

Quote:

Originally Posted by nocash

There are old and new 68K syntaxes? I have a "M68000 8-/16-/32-Bit Microprocessors User’s Manual Ninth Edition" from 1993, it doesn't cover 68020 instructions, but the syntax looks the same as in documents that do cover newer instructions.
But well, it's a "Ninth Edition", maybe older versions did use different syntax (?)

Just guessing:
"d16(An)" = old syntax ?
"(d16,An)" = new syntax ?

Yes.

Motorola used the old syntax during the 68000/68010 era and updated to the new syntax with the 68020 ISA. Some of the new 68020 addressing modes are less readable with the old syntax so most official documentation was updated. The M68000PRM (Programmer's Reference Manual) I linked to is from Motorola and official documentation. It describes in detail all the new addressing modes and uses the new style syntax. It is a good "newer" 68k reference but has some errors which were never corrected.

Quote:

Originally Posted by nocash

Good to know it's spelled "d8(An,Xn)". And absolute would be "(xxx).W" and "(xxx).L". And, immediate "#<xxx>"? The sharp brackets are looking as if they could/should be omitted, unless they are intended to distinguish between "CMPI #imm" and "CMP #<imm>"?

The brackets mean something is substituted (including the brackets which are omitted). There should be a reference in the documentation as to the meaning of the symbols.

Quote:

Originally Posted by nocash

Oh, and I got that "(d16,PC) = d16(PC)" wrong. Official specs do actually say "(d16,PC)". But the more common form seems to be "addr(PC)", with "addr" being a 32bit address, and "(PC)" just hinting that it shall be encoded as relative address, with automatically calculated "d16" displacement.
So, I am quite sure that assemblers would recognize this:
addr(PC)
But when enclosing everything in brackets, would an assembler treat it as...
(addr,PC)
or
(d16,PC) ?

The treatment is the same for old and new style syntax. I believe (d16,PC) is more accurate as a 16 bit value cannot fully describe an address when the address bus is wider than 16 bits (even the 68008 had a 20 bit address bus). I suppose with a small enough amount of memory the d16 would be large enough for a memory address but I don't like the use of (addr,PC) and I don't recall it appearing in the M68000PRM so it was likely an error or Motorola changed it.

Quote:

Originally Posted by nocash

Ah, fine! Specifically, the directives from the "Mot Syntax Module" chapter for Motorola 68K code, right? I'll probably stick with implementing only the more basic directives like "dc.b" and "ds.b", but good to know which other directives do exist.

The dc.<size> and ds.<size> are the most common directives. Section and CPU directives may be necessary also but a disassembler only uses a few directives.
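On the disassembler side, the one directive that is almost always needed is dc.b for bytes that are not decoded as instructions. A minimal Python sketch of that output step might look like this; `dc_b_lines`, the indentation, and the "$" hex prefix are choices made for this example rather than a fixed convention.

```python
def dc_b_lines(data, per_line=8):
    """Render raw bytes as 'dc.b' source lines, per_line bytes each.

    Hypothetical helper for illustration only.
    """
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        lines.append("    dc.b    " + ",".join(f"${b:02X}" for b in chunk))
    return lines


for line in dc_b_lines(bytes([0x12, 0x34, 0x56, 0x78]), per_line=2):
    print(line)
```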

nocash · 22 April 2016, 21:17

Quote:

Originally Posted by matthey

The brackets mean something is substituted (including the brackets which are omitted).

Yeah, thought so. Just asked because "(xxx).W" didn't have sharp brackets, whilst "#<xxx>" did have them. Might be one of the more subtly confusing errors in that document. I've used the M68000PRM manual a lot in past months, too. It isn't really perfect, but it seems to be the best official document that one could get.

Quote:

Originally Posted by matthey

The treatment is the same for old and new style syntax. I believe (d16,PC) is more accurate... I don't like the use of (addr,PC) and I don't recall it appearing in the M68000PRM so it was likely an error or Motorola changed it.

No. Yes. "d16" is more precise for describing the opcode encoding (in a tech doc about opcodes). But I thought that the daily-life assembler/disassembler syntax would replace the "d16(PC)" by the actual 32bit address, ie. "address(PC)". Somewhat similar as in BRA opcodes.
At least, I've spotted it that way here: http://wandel.ca/homepage/execdis/exec_disassembly.txt (see the various places that contain "(PC)" as operand) - or is that some big exception, and nobody else uses that notation?
Oh, or is one of the assembler directives allowing to select this or that notation?

mark_k · 22 April 2016, 21:24

Assemblers convert label references to offsets automatically when PC-relative addressing is used.

nocash · 22 April 2016, 21:58

Uh, but when is PC-relative addressing being used? Sorry, that question does probably sound stupid. But...

Does one just specify "label" or "(label)" and the assembler does automatically "know" if the address is relative addressable (ie. in cases where "label" is located within the same code segment)? That's about how it's described in the AmigaDOS manual.

Or could/should one specify it as "label(PC)" or "(label,PC)" to indicate that relative addressing is wanted? That's how it's done in the exec_disassembly. Also spotted something similar here: http://eab.abime.net/showthread.php?t=75779 - "move.w _joy_tableX(PC,d0.w),_dx_joy"

mark_k · 22 April 2016, 22:22

Yes the programmer specifies for example

Code:

    move.l   (SomeValue,PC),D0
    ...
SomeValue:
    dc.l $12345678

Some assemblers have an optimisation option/ability which would automatically assemble move.l SomeValue,D0 as the PC-relative move.l (SomeValue,PC),D0 when SomeValue is within 32KB of the instruction.
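The conversion from a label or address to the encoded d16 is simple arithmetic: the displacement is relative to the address of the extension word, which for a plain move.l (d16,PC),D0 is the opcode address plus 2. A Python sketch of that calculation, with an invented function name `pc_disp16` and the range check an assembler would need before choosing the PC-relative form:

```python
def pc_disp16(target, ext_word_addr):
    """Return the 16-bit field for (d16,PC) addressing.

    ext_word_addr is the address of the extension word that holds d16.
    Raises ValueError when the target is out of the signed 16-bit range.
    Hypothetical helper for illustration only.
    """
    disp = target - ext_word_addr
    if not -0x8000 <= disp <= 0x7FFF:
        raise ValueError("target not reachable with a 16-bit PC displacement")
    return disp & 0xFFFF  # two's complement encoding


# Instruction at $1000, so the extension word sits at $1002.
print(hex(pc_disp16(0x1010, 0x1002)))
```

The same arithmetic works whether the target comes from a label or from an absolute value, as long as the address the code will run at is known.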

nocash · 22 April 2016, 22:54

Ah, great, then both variants are working (or working optionally, at least).
The "(SomeValue,PC)" was just what I meant when originally coming up with "(addr,PC)".

Only thing that isn't completely clear to me is what happens if "SomeValue" is not a "label", but rather some absolute immediate address value.
Like assigning it as "SomeValue EQU $FC05B4", and then using "(SomeValue,PC)" as opcode operand.
Or right using "($FC05B4,PC)" as opcode operand.

Not that it would make too much sense to use that kind of code... except maybe for assembling things like kernel patches, then it might be nice if the assembler would convert $FC05B4 into a PC-relative 16bit offset.

NorthWay · 22 April 2016, 23:11

You typically specify it as an absolute address (in reloc form unless you use the ORG directive) or label:

Code:

move.l SOMEVAR,d0

Any assembler worth its salt will optimize it to pc relative if optimizations are turned on and it is within range and in the same section.

Motorola had some misplaced ideas about code re-use and non-self modifying that made them only let you do reads in pc mode.
Now, that matches unix type code segments that are read-only so it is not the worst of ideas, but you could have envisioned a segment with read-only parts and read-write parts. And during the design in pre-1979 not only did it not have an MMU, the world wasn't all unix either.

Photon · 23 April 2016, 00:35

Any Assembler worth its salt will assume you're not an ass and assemble exactly what you write.

This cheat sheet contains tables adapted from my Motorola 68000 Programmer's Manual, with EAs and cycle times. They will fill in the fields marked "?" in your table.
