68K register management

hop · 03 June 2022, 12:29

Are there any widely adopted, standard best practices for robustly managing register usage in 68K assembly? I'm hoping that the collective knowledge of years of game and demo coding can give me a few tried and tested guidelines to save me headaches down the line.

As I write larger programs with more routines I've found myself debugging a good few times, only to find a called routine is stomping a register that the caller is using to store state.

I add a comment block to each routine listing which registers it uses as inputs and outputs, and which it modifies (uses as scratch internally). For example:

Code:

; ---------------------------------------------------------------------------
; Reads a longword from trackBuffer into D0, loading another track's worth of data if required
;
; D0 <- value
;
; Modifies: D0, D1, D6, D7, A0, A1 
;
readLongIntoD0:
      ...

This is the best approach I've come up with, but I've been caught out when a called function starts using a new register and I make an error manually propagating it up the comments: I miss adding it to the comment on one of the functions that calls the modified routine.

I'm not sure if this approach would scale to a large (game sized) project with dozens of files and hundreds of routines.

I know that OS calls can always trash D0, D1, A0 and A1, so it's up to the caller to preserve those, while the OS guarantees to preserve the others internally. I'm not sure if this is a good approach for games - it sounds a bit heavyweight, but at least it is an approach that would work. Is there a good middle ground, or is it all a bit freeform in practice? Thanks for any tips.

Thomas Richter · 03 June 2022, 14:22

The widely adopted standard is the one you already mentioned: d0-d1/a0-a1 are scratch, everything else is preserved. It's not only the OS convention, it is also the convention adopted by most (if not all) compilers. Performance-wise, I doubt it makes much of a difference in most functions, and for low-level, heavy-duty functions, you can still document something else. Using a register allocation per-function is confusing and error-prone, so don't. As soon as you want to change something, you shoot yourself in the foot.

As a suggestion, one could probably include a6 in the scratch register list if you call OS functions a lot. Or use a5 instead of a4 as base register for data, or...

But, as stated, consistency is most important. Select one convention, stick to it for a project, and for some low-level, heavy duty stuff, use proper documentation and reasoning.
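
To make the convention concrete, here is a minimal sketch of a routine that follows it. It is not from the thread: the routine name CountSetBits and its register choices are made up for illustration, and Devpac/vasm-style syntax is assumed.

```asm
; Sketch only: CountSetBits is a hypothetical routine.
; Convention: d0-d1/a0-a1 are scratch, everything else is preserved.
;
; In:  d0.l = value
; Out: d0.l = number of set bits in the value
CountSetBits:
        move.l  d2,-(sp)        ; d2 is not scratch, so preserve it
        move.l  d0,d1           ; d1 is scratch: working copy, no save needed
        moveq   #0,d0           ; d0 = bit count (the result)
        moveq   #32-1,d2        ; 32 bits to test
.loop:  lsr.l   #1,d1           ; shift the lowest bit out into carry
        bcc.s   .next           ; carry clear: bit was 0
        addq.l  #1,d0           ; carry set: count it
.next:  dbra    d2,.loop
        move.l  (sp)+,d2        ; restore before returning
        rts
```

A caller can keep live state in d2-d7/a2-a6 across the bsr without reading the routine's source, which is the whole point of the convention.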

NorthWay · 03 June 2022, 16:23

I remember that way back, when BYTE used to print selected snippets from BIX, Jez San entered a discussion about this. I can't remember his approach right now, but it might be possible to find it online.

paraj · 03 June 2022, 17:35

For a set of functions working closely together I'll often have an entry function that adheres to the global calling convention, and then keep state/often used constants in fixed registers. For example in a decruncher you might keep the input buffer pointer in a0, output in a1 and the "current byte" in d0 (or whatever).

You can see an example of what I mean in Ross' Ocean Loader: http://eab.abime.net/showthread.php?t=88565 (might not be latest version)
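
A minimal sketch of that pattern, under stated assumptions: this is not taken from the Ocean Loader, the names Unpack and .getByte are hypothetical, and the "unpacking" is just a byte copy so that the register roles stay visible.

```asm
; Sketch only: Unpack and .getByte are hypothetical names.
;
; Public entry point: follows the global convention
; (d0-d1/a0-a1 scratch, everything else preserved).
; In: a0 = input buffer, a1 = output buffer, d0.l = bytes to write (> 0)
Unpack:
        move.l  d2,-(sp)        ; save the one non-scratch register we use
        move.l  d0,d2           ; from here on: d2 = remaining output bytes
.copy:  bsr.s   .getByte        ; fixed roles: a0 = input, d0 = current byte
        move.b  d0,(a1)+        ; fixed role:  a1 = output
        subq.l  #1,d2
        bne.s   .copy
        move.l  (sp)+,d2
        rts

; Internal helper: only ever called from Unpack, so it relies on the
; fixed register roles instead of the global convention.
; In: a0 = input pointer   Out: d0.b = next byte, a0 advanced
.getByte:
        move.b  (a0)+,d0
        rts
```

Only the entry point pays for saving and restoring; the helpers inside share state in registers for free.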

meynaf · 03 June 2022, 18:57

A good practice is to put the same data in the same register wherever you can. Like on-map coordinate pairs in D6-D7, input data in A0, output data in A1, etc. This way it is not only cleaner, but also faster, as the relevant data will often be in the right place without anything having to be moved. This is something a compiler cannot do.
For documenting your functions, don't just list the registers they currently alter, but also the registers they are allowed to alter if needed - so whenever you have to change something, you have a few of them readily available (or you know that none is usable and you have to save something).
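
As an illustration only, a comment block in the style of the first post could carry both lists. The register sets, the "May modify" label and the trackBufferPtr variable are made up here, and the routine body is simplified (it does not reload a track).

```asm
; Sketch only: register sets and trackBufferPtr are hypothetical.
; ---------------------------------------------------------------------------
; Reads a longword from trackBuffer into D0
;
; D0 <- value
;
; Modifies:   D0, A0            (what the current code touches)
; May modify: D0-D1, A0-A1      (what callers must assume is trashed)
;
readLongIntoD0:
        move.l  trackBufferPtr,a0       ; current read position
        move.l  (a0)+,d0                ; fetch the longword
        move.l  a0,trackBufferPtr       ; store the advanced position
        rts

trackBufferPtr: dc.l    0               ; hypothetical read pointer
```

Callers code against the "May modify" line, so the routine can later start using D1 or A1 without any caller comments needing an update.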

a/b · 03 June 2022, 19:09

Quote:

Originally Posted by meynaf

This is something a compiler can not do.

Completely wrong. These days we have AI code generated by AI compilers running on AI hardware and operating on AI cloud AI data. They can do all of that and more, can't you see?
Also, you're selling yourself short. Every time you write an IF statement, holy crap, that's AI. Every time you write a SWITCH statement, wowowoowoww!11! that's *advanced* AI.

/SARCASM_OFF

meynaf · 03 June 2022, 19:14

Quote:

Originally Posted by a/b

Completely wrong. These days we have AI code generated by AI compilers running on AI hardware and operating on AI cloud AI data. They can do all of that and more, can't you see?
Also, you're selling yourself short. Every time you write an IF statement, holy crap, that's AI. Every time you write a SWITCH statement, wowowoowoww!11! that's *advanced* AI.

/SARCASM_OFF

And I suppose this AI is able to generate 68k code too?

alkis · 03 June 2022, 19:46

Well, I don't think it's the compiler that couldn't do it. The compiler is restricted by the language definition.

A C language extension, say:

foo(int x, int y):boo(),bar() {
...
}

denoting that foo is only callable from boo() or bar(), would allow the compiler to allocate registers across function boundaries.

Just a guess!

paraj · 03 June 2022, 20:11

Quote:

Originally Posted by alkis

Well, I don't think it's the compiler that couldn't do it. The compiler is restricted by the language definition.

A C language extension, say:

foo(int x, int y):boo(),bar() {
...
}

denoting that foo is only callable from boo() or bar(), would allow the compiler to allocate registers across function boundaries.

Just a guess!

Whole program optimization would (in principle) allow this without any extensions. But that discussion is at least as old as the concept of a wiki (https://wiki.c2.com/?SufficientlySmartCompiler).

Fact is that even in 2022 this thread is still relevant if you care about trying to do performant stuff for real Amiga HW.

Thomas Richter · 03 June 2022, 20:53

Even in 2022, the 80-20 rule applies, saying that in your typical program, 80% of the running time is spent in 20% of the code. For the 20% of the code that matters, optimize and worry about saving moves. For the remaining 80%, use a clean calling convention and don't worry too much about performance, or that 80% of functions will create 80% of the trouble later.

Compilers can remove register moves by inlining functions, for example. That's not black magic.

hop · 03 June 2022, 23:38

Quote:

Originally Posted by Thomas Richter

Performance-wise, I doubt it makes much of a difference in most functions, and for low-level, heavy-duty functions, you can still document something else. Using a register allocation per-function is confusing and error-prone, so don't. As soon as you want to change something, you shoot yourself in the foot.

You're absolutely right. It seems very easy to fall into the premature optimisation trap when programming at such a low level. I'll try adopting the OS convention for my next application and see how it goes - I think it will be relaxing to be able to focus on one routine in isolation.

hop · 03 June 2022, 23:42

Quote:

Originally Posted by paraj

For a set of functions working closely together I'll often have an entry function that adheres to the global calling convention, and then keep state/often used constants in fixed registers.

I like this idea. I just wrote a bootblock trackdisk hunk loader and it made sense to reserve a set of registers for global state throughout. I didn't think of doing this as part of a bigger program. I'll definitely keep this in mind, thanks.

hop · 03 June 2022, 23:49

Quote:

Originally Posted by meynaf

For documenting your functions, don't just list the registers they currently alter, but also the registers they are allowed to alter if needed - so whenever you have to change something, you have a few of them readily available (or you know that none is usable and you have to save something).

I'm not sure I understand this. Would this mean that a function needs to be aware of everywhere it is called from and I need to look up and logically calculate the set of free registers? Sounds tough.

Auscoder · 04 June 2022, 00:06

I have a macro that I use at each end of a function; this macro is not used for performance builds.

Said macro will push a copy of all registers I need preserved, and compare the state of the registers on function exit. It's OK to push/compare all registers to start with.
In a specific debugging mode the exit macro will compare for changed or unchanged registers, depending on what I am interested in observing. This allows me to analyse register candidates for scratch, or register stomps.

The overhead is adding the macros to each call of interest, but it's not much overhead really. Performance-wise it is expensive, but in a release build the macro injects nothing, so the perf cost is nil. It may be enough to just run this check once a frame in the main loop if your conventions are strict enough. But still, it's always too easy to scrub a register in the hunt for optimisations.
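
The macros themselves were not posted in the thread. As a rough sketch of what such a pair might look like (not Auscoder's actual code): the names REGCHECK_IN and REGCHECK_OUT and the DEBUG symbol are hypothetical, Devpac/vasm-style macro syntax is assumed, and the preserved set is taken to be d2-d7/a2-a6.

```asm
; Sketch only: REGCHECK_IN / REGCHECK_OUT and DEBUG are hypothetical names.
; Traps if any of d2-d7/a2-a6 differs between the two macros.
; The stack pointer must be at the same level at both macros, and the
; condition codes are not preserved by REGCHECK_OUT.

REGCHECK_IN     MACRO
        IFD     DEBUG
        movem.l d2-d7/a2-a6,-(sp)       ; entry snapshot: 11 longwords
        ENDC
        ENDM

REGCHECK_OUT    MACRO
        IFD     DEBUG
        movem.l d2-d7/a2-a6,-(sp)       ; exit snapshot, same register order
        movem.l d0/a0-a1,-(sp)          ; free three registers for the compare
        lea     12(sp),a0               ; a0 -> exit snapshot
        lea     44(a0),a1               ; a1 -> entry snapshot
        moveq   #11-1,d0                ; 11 longwords to compare
.cmp\@: cmpm.l  (a0)+,(a1)+
        dbne    d0,.cmp\@               ; stop early on the first mismatch
        beq.s   .ok\@                   ; Z still set: everything matched
        illegal                         ; a preserved register was stomped
.ok\@:  movem.l (sp)+,d0/a0-a1
        lea     88(sp),sp               ; drop both snapshots
        ENDC
        ENDM

; Usage: bracket the whole routine, outside its own save/restore.
MyRoutine:
        REGCHECK_IN
        moveq   #0,d0                   ; routine body goes here
        REGCHECK_OUT
        rts
```

With DEBUG undefined both macros expand to nothing, which matches the "release build injects nothing" behaviour described above. Trapping with illegal is just one choice; a debug build could equally jump to its own error handler.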

meynaf · 04 June 2022, 09:04

Quote:

Originally Posted by hop

I'm not sure I understand this. Would this mean that a function needs to be aware of everywhere it is called from and I need to look up and logically calculate the set of free registers? Sounds tough.

Well, you have to do this in all cases otherwise you're at risk of a register conflict. The fact a register is really used, or could be used, doesn't change that.
Functions often aren't called from many places so it's usually not a big deal.
And for very common functions called everywhere it's best to alter as few registers as possible (ideally none).

Thomas Richter · 04 June 2022, 14:50

But it can become a big deal as soon as you continue to update your software and don't check or don't remember. Having to keep track, for each function, of which registers it "pollutes" does not scale well. As said, this is all acceptable for heavy-duty functions that are local to a particular source (aka "not XDEF'd" aka "static") and really intended to be "helpers" for some other function, but for functions that are intended to be "general purpose" or "cross module", it is really a bad idea.

As an idea, one should adopt a particular naming convention for such "local stuff" and for global functions that are intended to be used cross-module. It does not matter much which convention you adopt, but some convention is better than nothing. To give you an idea, you could start the name of such helpers with two underscores.
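
A small sketch of how that split could look in a source file. All names here are hypothetical, and the clamp to 0..319 is only there to give the helper something to do.

```asm
; Sketch only: DrawSprite and __clipX are hypothetical names.

        XDEF    DrawSprite      ; exported: d0-d1/a0-a1 scratch, rest preserved

; In: d0.w = x position
DrawSprite:
        move.l  d2,-(sp)        ; the helper trashes d2, so save it here
        bsr.s   __clipX
        move.l  (sp)+,d2
        rts

; Not XDEF'd: the two leading underscores mark a private helper with its
; own documented register contract instead of the global convention.
; In: d0.w = x   Out: d0.w = x clamped to 0..319   Trashes: d2
__clipX:
        move.w  #319,d2
        tst.w   d0
        bpl.s   .notNeg
        moveq   #0,d0           ; negative: clamp to the left edge
.notNeg:
        cmp.w   d2,d0           ; d0 <= 319 ?
        ble.s   .done
        move.w  d2,d0           ; too large: clamp to the right edge
.done:  rts
```

Anything without the prefix can be called from another module on the strength of the convention alone; anything with it needs its comment block read first.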

Most software isn't written once and is then complete. It is typically an iterative process that continues over many years, over multiple versions, and then you need a clean style you remember even after years.

mcgeezer · 04 June 2022, 15:35

Quote:

Originally Posted by Auscoder

I have a macro that I use at each end of a function; this macro is not used for performance builds.

Said macro will push a copy of all registers I need preserved, and compare the state of the registers on function exit. It's OK to push/compare all registers to start with.
In a specific debugging mode the exit macro will compare for changed or unchanged registers, depending on what I am interested in observing. This allows me to analyse register candidates for scratch, or register stomps.

The overhead is adding the macros to each call of interest, but it's not much overhead really. Performance-wise it is expensive, but in a release build the macro injects nothing, so the perf cost is nil. It may be enough to just run this check once a frame in the main loop if your conventions are strict enough. But still, it's always too easy to scrub a register in the hunt for optimisations.

I do this as well, it is super handy for crash purposes and debugging.

hop · 04 June 2022, 16:21

Quote:

Originally Posted by mcgeezer

I do this as well, it is super handy for crash purposes and debugging.

This does sound useful. Is this a pair of macros that bookend the function body? Would it be possible to share?

meynaf · 04 June 2022, 16:28

Quote:

Originally Posted by Thomas Richter

But it can become a big deal as soon as you continue to update your software and don't check or don't remember. Having to keep track, for each function, of which registers it "pollutes" does not scale well. As said, this is all acceptable for heavy-duty functions that are local to a particular source (aka "not XDEF'd" aka "static") and really intended to be "helpers" for some other function, but for functions that are intended to be "general purpose" or "cross module", it is really a bad idea.

Of course. This is what well-documented interfaces are for.
Actually, for cross-module calls I've often chosen not to alter any register at all unless it's an output. My OS wrapper does exactly that. For end-user call-back routines (e.g. a codec in a picture viewer or sound player) all of them may be altered; it's the caller who saves them if needed.
