DS directive - is it zeroed? - Page 4

phx · 21 May 2019, 22:26

Quote:

Originally Posted by ross

I second other coders, I don't like NOP for padding.

Then the -devpac option is for you (and Don).

Quote:

Two reasons: if I need a NOP for debugging I'll insert it by hand, and I don't like this opcode anywhere else; and since it is not zero, it compresses less well.

Noted, but this won't make me change vasm's default. IMHO it is more important to keep a consecutive program flow if the CPU ever ends up executing such padding (for whatever reason). And vasm is not alone here.

Quote:

Just because I'm used to PhxAss and at this very moment totally lazy to read a simple manual, what's the syntax for a corresponding OPT 3?

Indeed, that's a lot of work to determine all the optimization options, because "opt o+" sets more than PhxAss' "opt 3", and one optimization, which is enabled by default, must even be cleared.

Starting in vasm default mode:

Code:

        opt     ol+,op+,oc+,ot+,om+,oj+,o7-

And when all optimizations were disabled (opt o- or -devpac):

Code:

        opt     a+,ol+,op+,oc+,ot+,of+,o2+,o8+,o9+,om+,o5+,o11+,o1+,oj+,og+,o3+,o4+,o10+,o12+

Or just run vasm in PhxAss-compatibility mode (-phxass) and use "opt 3", as usual. The PhxAss OPT directive is not documented, but it works for compatibility reasons.

ross · 21 May 2019, 22:34

Thanks phx

Bruce Abbott · 22 May 2019, 04:38

Quote:

Originally Posted by phx

I know the padding is slightly offtopic, but I still would like to understand why this is a problem. Wouldn't a $4e71 make it easier for you to spot the potential CNOPs from the original source?

No, because the NOP could be executable code used to provide a delay or substitute for another instruction, so you can't assume it's a CNOP.

Inline NOPs can be used to get subsequent code longword aligned for faster operation on a 32 or 64 bit bus, which might justify using an alignment directive which inserts NOPs. But 68k instructions have varying lengths so there are probably not many cases where this alignment would be useful (though some compilers seem to do it all the time anyway, wasting space for no good reason!).

However as far as producing valid code is concerned anything goes. The compiler could just insert random garbage and it wouldn't make any difference so long as it wasn't touched. It only becomes a real problem when trying to create an identical binary from disassembled code or with a different assembler or compiler (particularly important when dealing with legacy code, as we so often are on the Amiga!).

A good assembler should therefore give you full control over what filler word is used, not make assumptions that you are forced to accept.

meynaf · 22 May 2019, 08:03

Quote:

Originally Posted by Bruce Abbott

No, because the NOP could be executable code used to provide a delay or substitute for another instruction, so you can't assume it's a CNOP.

If the NOP is right in the middle of a routine then you should see it easily, because there is no RTS, BRA, or JMP before it to end the previous routine.
Else the mere presence of a label will tell you it's a NOP at startup of a routine rather than some padding.

Quote:

Originally Posted by Bruce Abbott

A good assembler should therefore give you full control over what filler word is used, not make assumptions that you are forced to accept.

If it's about producing an identical binary, you can use simple DC.W for padding. Then when you touch the code (= change it from original) the actual value becomes meaningless and you can even remove it completely (which I do).
I see no problem here.

Antiriad_UK · 22 May 2019, 11:26

Opened a can of worms with this question didn't I

I've been through the source of my intro and found I'd used ds.l/ds.b in a few places in a standard data section. Now I know that I was only getting this zeroed "back in the day" because I was using Devpac, and "by luck" in ASMOne. I changed all these to dcb x,0.

Pure BSS is zeroed on load (used for all my screen buffers) and I was clearing the screen area in each routine anyway, but as a quick check I changed it to a chip data section and filled it with $aa. Lo and behold, I could see glitches where my bitplanes weren't exactly lined up with my copperlist. In each routine I was sharing the screen buffer areas and they worked fine standalone. But when you put all the routines together the glitches appeared. Will be filling my buffers with $aa every so often now to check!

So I had two bugs. One where I expected something to be zeroed that might not be. And one where something was zeroed but, because of that, was hiding a bug.

Cheers all

phx · 22 May 2019, 19:35

For all the friends of CNOP: I just implemented a new command line option to set the 16-bit padding value (still defaults to 0x4e71). So CNOP can also pad with zeros in vasm-standard mode (no -devpac required).

-cnop=0

(Available with tomorrow's snapshot.)
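
As a rough illustration of what a 16-bit padding value means for CNOP, here is a minimal Python sketch. It is not vasm's implementation: the function names are hypothetical, and it assumes the usual meaning of cnop offset,alignment (advance to the next address that is offset bytes past a multiple of alignment) and word-aligned code.

```python
def cnop_padding(addr: int, offset: int, alignment: int) -> int:
    """Number of bytes to skip so that (addr + n) % alignment == offset % alignment."""
    return (offset - addr) % alignment


def cnop_fill(addr: int, offset: int, alignment: int, pad_word: int = 0x4E71) -> bytes:
    """Padding bytes for the gap, built from a 16-bit fill word (big-endian, as on the 68k)."""
    n = cnop_padding(addr, offset, alignment)
    if addr % 2 or n % 2:
        # Assumption of this sketch: odd addresses and odd gaps are out of scope.
        raise ValueError("sketch only handles word-aligned addresses and gaps")
    return pad_word.to_bytes(2, "big") * (n // 2)


# cnop 0,4 at $1002 leaves a 2-byte gap: one NOP word with the default fill,
# one zero word when the fill value is 0.
default_fill = cnop_fill(0x1002, 0, 4)
zero_fill = cnop_fill(0x1002, 0, 4, pad_word=0)
```

Only the fill word differs between the two calls; the size of the gap depends solely on the address and the alignment.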

Bruce Abbott · 22 May 2019, 23:14

Quote:

Originally Posted by meynaf

If it's about producing an identical binary, you can use simple DC.W for padding. Then when you touch the code (= change it from original) the actual value becomes meaningless and you can even remove it completely (which I do).
I see no problem here.

To get an identical binary you have to reproduce the nop. If you modify the binary then you need to know whether the nop is an actual opcode or padding for alignment.

Quote:

If the NOP is right in the middle of a routine then you should see it easily, because there is no RTS, BRA, or JMP before it to end the previous routine.
Else the mere presence of a label will tell you it's a NOP at startup of a routine rather than some padding.

Not always. Take the following code for example:-

Code:

r09648:
 move.w  20(A5),D0
 ext.l   D0
 sub.l   #$00000001,D0
 blt.s   r096a6
 cmpi.l  #$00000004,D0
 bgt.s   r096a6
 asl.l   #1,D0
 jmp     *+4(PC,D0.w)
 bra.s   r0966c
 bra.s   r09688
 bra.s   r0966c
 nop
r0966c:
 move.l  26(A5),-(A7)
 pea     a09cff ; "Minimum=%d."
 move.l  l0ee7e,-(A7)
 jsr     r11a0c

Is the nop for alignment, or is it code? To answer that question you have to figure out what the code above it is doing - in this case it's a computed jump and the nop is one of the destinations. Changing this nop to dc.w 0 would be fatal.
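
The dispatch in the listing can be traced with a small Python model. This is only a sketch of the listing's arithmetic with hypothetical names; the slot texts are copied from the disassembly, and the input is the word read from 20(A5).

```python
# Words that follow the 4-byte jmp, in listing order. The first four slots are
# one word each; byte offset 8 is the label r0966c itself.
SLOTS = [
    "bra.s r0966c",   # offset 0
    "bra.s r09688",   # offset 2
    "bra.s r0966c",   # offset 4
    "nop",            # offset 6
    "r0966c",         # offset 8
]


def landing_slot(value: int):
    """Return the slot the jmp lands on, or None if a range check branches to r096a6."""
    d0 = value - 1            # sub.l #1,D0
    if d0 < 0:                # blt.s r096a6
        return None
    if d0 > 4:                # bgt.s r096a6
        return None
    d0 <<= 1                  # asl.l #1,D0: one word per slot
    return SLOTS[d0 // 2]     # jmp *+4(PC,D0.w): *+4 is the first slot
```

With an input of 4 the jump lands on the nop and falls through into r0966c. A zero word in that slot would instead be decoded together with the following word as a different instruction.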

Granted you don't see this very often. My disassembler assumes that nops are not code if they are found outside a labeled code block, but sometimes there's no label. If there is also 'data' before the nop then it gets tricky (at least if you want an accurate disassembly, rather than having blocks of code shown as data). My disassembler shows it as a nop rather than cnop because it might actually be code.

dc.w 0 can also be tricky to identify. Luckily the instruction ori.b #x,d0 is not often found at the start of a code block, and if the following word's value is >255 then it's almost certainly not code. dc.l 0 is easier because ori.b #0,d0 is a useless instruction.
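
The zero-word heuristic described here might look like this in Python. It is a sketch of the reasoning in this post, not code from any real disassembler; the function name and the returned strings are made up.

```python
def classify_zero_word(word: int, next_word: int) -> str:
    """Guess whether a $0000 word is padding or the start of ori.b #x,d0."""
    if word != 0x0000:
        return "not a zero word"
    if next_word > 0xFF:
        # The immediate of ori.b occupies only the low byte of the extension
        # word, so a value above 255 is unlikely to belong to such an instruction.
        return "almost certainly padding"
    if next_word == 0x0000:
        # Would decode as ori.b #0,d0, which leaves d0 unchanged.
        return "probably padding (dc.l 0)"
    return "ambiguous: could be ori.b #x,d0"
```

The answers are deliberately worded as guesses: none of these cases can be decided from two words alone.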

Don_Adan · 22 May 2019, 23:34

Quote:

Originally Posted by Bruce Abbott

To get an identical binary you have to reproduce the nop. If you modify the binary then you need to know whether the nop is an actual opcode or padding for alignment.

Not always. Take the following code for example:-

Code:

r09648:
 move.w  20(A5),D0
 ext.l   D0
 sub.l   #$00000001,D0
 blt.s   r096a6
 cmpi.l  #$00000004,D0
 bgt.s   r096a6
 asl.l   #1,D0
 jmp     *+4(PC,D0.w)
 bra.s   r0966c
 bra.s   r09688
 bra.s   r0966c
 nop
r0966c:
 move.l  26(A5),-(A7)
 pea     a09cff ; "Minimum=%d."
 move.l  l0ee7e,-(A7)
 jsr     r11a0c

Is the nop for alignment, or is it code? To answer that question you have to figure out what the code above it is doing - in this case it's a computed jump and the nop is one of the destinations. Changing this nop to dc.w 0 would be fatal.

Granted you don't see this very often. My disassembler assumes that nops are not code if they are found outside a labeled code block, but sometimes there's no label. If there is also 'data' before the nop then it gets tricky (at least if you want an accurate disassembly, rather than having blocks of code shown as data). My disassembler shows it as a nop rather than cnop because it might actually be code.

dc.w 0 can also be tricky to identify. Luckily the instruction ori.b #x,d0 is not often found at the start of a code block, and if the following word's value is >255 then it's almost certainly not code. dc.l 0 is easier because ori.b #0,d0 is a useless instruction.

Right. dc.w 0 is not perfect for padding, but it is better than nop. Perhaps the perfect padding would be the $4AFC (ILLEGAL) opcode. Anyway, I remember some code/sources that used cnop 0,8 or cnop 0,16, or even more bytes, padded with Nq (nop), and some data that used cnop 0,2 or even, padded with a $4E byte.
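
The "Nq" is simply how the NOP opcode $4E71 reads as ASCII, which a couple of lines of Python can confirm (the variable names are just for illustration):

```python
NOP = bytes.fromhex("4e71")

# Three NOP words as they would appear in the ASCII column of a hex dump.
run = (NOP * 3).decode("ascii")

# A single pad byte taken from the first half of the NOP opcode.
odd_pad = NOP[:1].decode("ascii")
```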

NorthWay · 23 May 2019, 00:05

Two things
-It should never have been named "cnop". That kinda ties you to the mast on what you expect it to do.
-I agree that $4AFC is a better pick.

ross · 23 May 2019, 00:59

Quote:

Originally Posted by Bruce Abbott

..because ori.b #0,d0 is a useless instruction.

Actually it's useful to me, you should expect to find it in many of my patches.
I use it as a single longword nop instruction to skip code like bcc.w or similar.

ross · 23 May 2019, 01:24

Quote:

Originally Posted by Bruce Abbott

Not always. Take the following code for example:-

This is a further reason not to use nop with the CNOP directive.

In that case nop is used as a code therefore a specific choice of the programmer or compiler.
Nobody would use cnop in that position because it's not padding.
If it were my code I would immediately recognize the situation.

Quote:

Originally Posted by Don_Adan

Right. dc.w 0 is not perfect for padding, but it is better than nop. Perhaps the perfect padding would be the $4AFC (ILLEGAL) opcode.

My order of preference for CNOP: $0, $4AFC, $4e71.

I appreciate the new vasm -cnop=0 option

Leffmann · 23 May 2019, 09:09

Quote:

Originally Posted by ross

I use it as a single longword nop instruction to skip code like bcc.w or similar.

Though nop and ori.b #0, D0 aren't true no-ops, and the latter in particular will change the status flags, so there's a potential for sneaky bugs you would never get with nop or a true no-op like move.l A0, A0.
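
The flag side effect of ori.b #0,D0 can be modelled in a few lines of Python. This is a sketch with made-up names, based on the documented condition codes of ORI (N and Z set from the result, V and C cleared, X unaffected); it models only the CCR and nothing else about the CPU.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CCR:
    x: bool
    n: bool
    z: bool
    v: bool
    c: bool


def after_ori_b_0_d0(ccr: CCR, d0: int) -> CCR:
    """CCR after ori.b #0,D0: the result is the unchanged low byte of D0."""
    low = d0 & 0xFF
    return replace(ccr, n=bool(low & 0x80), z=(low == 0), v=False, c=False)


def after_movea_l_a0_a0(ccr: CCR) -> CCR:
    """MOVEA does not touch the condition codes."""
    return ccr


before = CCR(x=True, n=False, z=False, v=True, c=True)
after = after_ori_b_0_d0(before, 0x12345680)
```

If the patched-out instruction sat between a compare and the branch that tests it, the branch would now see the flags in after instead of the ones in before.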

ross · 23 May 2019, 09:26

Quote:

Originally Posted by Leffmann

Though nop and ori.b #0, D0 aren't true no-ops, and the latter in particular will change the status flags, so there's a potential for sneaky bugs you would never get with nop or a true no-op like move.l A0, A0.

Yep, I use ori.b #0 (or nop) only in specific cases (in-place code patch, if applicable), certainly not in normal programming.
Anyway, movea.l a0,a0 is a two-byte instruction and its opcode is certainly not 0.w,0.w (the reason why I use ori.b #0).

meynaf · 23 May 2019, 09:35

Quote:

Originally Posted by Bruce Abbott

To get an identical binary you have to reproduce the nop. If you modify the binary then you need to know whether the nop is an actual opcode or padding for alignment.

Write "nop" in the source whenever you see a nop. Whether it's padding or not is pretty much unimportant before you know what the code is doing -- and then it's no big deal to change.

Same story when you see two nulls after a string. Is the second of any use or is it padding ? Keep it until you know for sure.

Quote:

Originally Posted by Bruce Abbott

Not always. Take the following code for example:-

(...)

Is the nop for alignment, or is it code? To answer that question you have to figure out what the code above it is doing - in this case it's a computed jump and the nop is one of the destinations. Changing this nop to dc.w 0 would be fatal.

There are always exceptions that can be found here and there

Anyway this is not a very realistic example. This kind of construct usually does not execute the nops it contains, otherwise it would just slow it down - and people who write such horrors are concerned with speed.

Quote:

Originally Posted by Bruce Abbott

Granted you don't see this very often. My disassembler assumes that nops are not code if they are found outside a labeled code block, but sometimes there's no label. If there is also 'data' before the nop then it gets tricky (at least if you want an accurate disassembly, rather than having blocks of code shown as data). My disassembler shows it as a nop rather than cnop because it might actually be code.

dc.w 0 can also be tricky to identify. Luckily the instruction ori.b #x,d0 is not often found at the start of a code block, and if the following word's value is >255 then it's almost certainly not code. dc.l 0 is easier because ori.b #0,d0 is a useless instruction.

Perhaps your disassembler should not assume anything at all (mine doesn't).
If you want a simple rule to follow there, never ever emit a cnop directive with a disassembler. You can, however, emit the nop or dc.w 0 with a comment if it looks suspicious.
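
A minimal Python sketch of that rule, combined with the earlier observation about RTS/BRA/JMP and labels. The function and its parameters are hypothetical and not taken from either disassembler discussed in this thread.

```python
# Words that are commonly used as alignment padding, with the text to emit.
PAD_CANDIDATES = {0x4E71: "nop", 0x0000: "dc.w    0"}


def emit_pad_candidate(word: int, follows_flow_end: bool, has_label: bool):
    """Emit source text for a possible padding word, never a cnop directive.

    follows_flow_end: the previous instruction was an RTS, BRA or JMP.
    has_label: some reference targets this word.
    Returns None for words that are not padding candidates.
    """
    text = PAD_CANDIDATES.get(word)
    if text is None:
        return None
    if follows_flow_end and not has_label:
        # Looks like padding, but only a human can decide; keep the bytes.
        return f"        {text:<16}; possible alignment padding?"
    return f"        {text}"
```

The output always reassembles to the original word, so the binary stays identical whether or not the guess in the comment is right.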

Quote:

Originally Posted by ross

Yep, I know what it does, and I use ori #0 (or nop) only in specific cases (in-place code patch), certainly not in normal programming.
Anyway, movea.l a0,a0 is a two-byte instruction and its opcode is certainly not 0,0 (the reason why I use ori.b).

Using opcodes eating another opcode is unsafe IMO - and a pain to disassemble later, like all that SAS/C generated code which uses CMPI.W #i,D0 as short branch skipping a word.

If what you want is a simple patch, use a true branch (6002).
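
For reference, the byte patterns involved can be written down in Python. This is a sketch with illustrative helper names: $60xx is BRA.S with an 8-bit displacement counted from the end of the opcode word, and $0C40 is CMPI.W #imm,D0, whose immediate word is the "eaten" instruction.

```python
def bra_s_skip(skip_bytes: int) -> bytes:
    """BRA.S over skip_bytes bytes of code that follow the branch."""
    if skip_bytes % 2 or not 2 <= skip_bytes <= 0x7E:
        # Displacement 0 selects the 16-bit form, so it cannot be used here.
        raise ValueError("short forward branch needs an even displacement in 2..126")
    return bytes([0x60, skip_bytes])


def cmpi_w_d0_eating(word: bytes) -> bytes:
    """CMPI.W #imm,D0 with the following instruction word hidden in the immediate."""
    if len(word) != 2:
        raise ValueError("exactly one word can be hidden")
    return bytes.fromhex("0c40") + word


# ori.b #0,d0 is $0000 $0000, a full longword.
ORI_B_0_D0 = bytes(4)
```

Both CMPI.W and ORI.B modify the condition codes, which the branch does not.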

ross · 23 May 2019, 09:47

Quote:

Originally Posted by meynaf

If what you want is a simple patch, use a true branch (6002).

Nahh, it's slow

We are always talking about 'dirty' practices applied in special cases; logically, if there are better alternatives, they should always be used.

ross · 23 May 2019, 10:23

Quote:

Originally Posted by meynaf

Using opcodes eating another opcode is unsafe IMO - and a pain to disassemble later, like all that SAS/C generated code which uses CMPI.W #i,D0 as short branch skipping a word.

Ahh, on first reading I didn't understand your sentence. Now it's clear why you wrote about "opcodes eating another opcode".
I don't use it that way (I edited my post to specify that I insert a full 0.w,0.w and not a single 0.w, so ori.b #0)!
This is a small but significant difference (not that it changes the fact that it's dirty code).
And yes, SAS/C cmpi.w usage is ...

meynaf · 23 May 2019, 10:35

Quote:

Originally Posted by ross

Nahh, it's slow

We are always talking about 'dirty' practices applied in special cases; logically, if there are better alternatives, they should always be used.

Slow maybe, but patches of this kind rarely touch time critical code, do they ?

Quote:

Originally Posted by ross

Ahh, on first reading I didn't understand your sentence. Now it's clear why you wrote about "opcodes eating another opcode".
I don't use it that way (I edited my post to specify that I insert a full 0.w,0.w and not a single 0.w, so ori.b #0)!
This is a small but significant difference (not that it changes the fact that it's dirty code).
And yes, SAS/C cmpi.w usage is ...

I wouldn't recommend doing that either.
Using 4e71 will ensure anyone reading the code will know what was done.
And no, that it compresses a few bytes better in some cases isn't a valid reason for me

ross · 23 May 2019, 10:54

Quote:

Originally Posted by meynaf

Slow maybe, but patches of this kind rarely touch time critical code, do they ?

Quote:

And no, that it compresses a few bytes better in some cases isn't a valid reason for me

Yes, you are probably right, but unfortunately it's stronger than me: I always look for the fastest code that takes up the least space, in any case.

So be patient

meynaf · 23 May 2019, 10:58

Quote:

Originally Posted by ross

Yes, you are probably right, but unfortunately it's stronger than me: I always look for the fastest code that takes up the least space, in any case.

So be patient

I'm not patient

If you want short code there are better ways: disassemble it fully and reassemble with an optimising assembler. You will get a gain in the range of hundreds, if not thousands, of bytes.

ross · 23 May 2019, 11:10

Quote:

Originally Posted by meynaf

I'm not patient

If you want short code there are better ways: disassemble it fully and reassemble with an optimising assembler. You will get a gain in the range of hundreds, if not thousands, of bytes.

Yes, that's what I did in the evenings over the last few days, with a game full of Atari ST code and bugs
(it even contains an emulator for the trap #1 I/O calls..).
Completely disassembled, corrected, optimized and reassembled, it gives me an exe thousands of bytes smaller.
I also do things properly if I want
