English Amiga Board


Old 08 September 2021, 19:20   #1
LeCaravage
Registered User
 
Join Date: May 2017
Location: AmigaLand
Posts: 459
How to trace back the stack?

Hi,

I need your help.

My config: A500 KS1.3, Trash'm one 1.6

My program crashes. Trash'm one handles the crash, but the PC ends up somewhere unexpected in memory.
I need to know which instruction is at the origin of this bug. I'd like to use the contents of the SSP.

So my question is: how can I find out where the last RTS before the crash was, using the SSP displayed by the rescue option of Trash'm one?

More generally, I guess the question is how to interpret the stack to find the last RTS.

Thanks,
Old 08 September 2021, 22:51   #2
a/b
Registered User
 
Join Date: Jun 2016
Location: europe
Posts: 1,039
The SSP would be the place to find where the exception occurred, but Asm-One, Trash'm-One and co. process and remove the exception stack frame in order to print the status report (where and why the program crashed, register dump, etc., assuming you are running it from within Trash'm-One).
As for the last RTS, that depends on whether your program is running in user or supervisor mode. In user mode, the USP is the place to check for the return address that the crashing routine would have used with RTS.
If that proves to be a problem, an easy way around it is to put something like "move.l #*,$70000" (where * is the assembler's symbol for the current address) at the start of each routine you think could cause a problem, and then simply look up what's at address $70000 to find the problematic one.
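Once the longword at $70000 has been read back, it still has to be matched to a routine. A minimal host-side sketch of that lookup, assuming a symbol table exported from the assembler; the table, its addresses and the routine names below are all made up for illustration:

```python
from bisect import bisect_right

# Hypothetical symbol table: (start address, routine name). In practice this
# would come from the assembler's symbol listing; these values are invented.
SYMBOLS = sorted([
    (0x00020000, "Init"),
    (0x00020140, "LoadTiles"),
    (0x000203A8, "DrawSprites"),
])

def routine_for(checkpoint, symbols=SYMBOLS):
    """Return the routine whose start address is the closest one at or below checkpoint."""
    starts = [addr for addr, _ in symbols]
    i = bisect_right(starts, checkpoint) - 1
    return symbols[i][1] if i >= 0 else None

# Longword found at $70000 after the crash (big-endian, as the 68000 stores it).
dump = bytes.fromhex("00020140")
checkpoint = int.from_bytes(dump, "big")
print(routine_for(checkpoint))   # LoadTiles
```

The nearest-label-below rule is used because the marker instruction sits at or just after the routine's label, not necessarily exactly on it.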
Old 08 September 2021, 23:13   #3
LeCaravage
Registered User
 
Join Date: May 2017
Location: AmigaLand
Posts: 459
The crash occurs at a memory location where none of my code is present:

The crash is a LINE-F exception somewhere around $00005000. My code has probably made an unwanted jump to this location. If I could find the last RTS executed by my code, I could probably narrow down the location of the original problematic instruction.
My best bet was to use the USP (and not the SSP as I wrote above), but I have no idea how to interpret the values pointed to by A7.

Edit: I checked the address contained in (A7) and it does not point into my code.
Old 08 September 2021, 23:31   #4
DanScott
Lemon. / Core Design
 
Join Date: Mar 2016
Location: Tier 5
Posts: 1,211
More than likely something is corrupting your stack?
Old 08 September 2021, 23:40   #5
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,959
There are many options for tracing the stack return; you can use a/b's method. If you know roughly where the wrong jump lands, you can use another method too: fill the area from $4000 to $6000 with $4AFC (the ILLEGAL opcode) and run your program, or add a fill routine to your program's init part. When the wrong jump occurs, the illegal instruction exception fires right away and you will know the stack return. Keep in mind that buggy routines often execute some (sometimes a lot of) code or trash data without crashing the Amiga, e.g. if your code jumps into an area filled with zero bytes ($00000000), which the 68000 decodes as a harmless ORI.B #0,D0.
Old 09 September 2021, 00:00   #6
mcgeezer
Registered User
 
Join Date: Oct 2017
Location: Sunderland, England
Posts: 2,702
There are probably easier ways to do it, but if the crash is intermittent then I'd stick a macro at the start and end of the suspect routines. The macro simply places a unique identifier in a known location in RAM. When the crash happens, it will identify the problem routine. Then it's just a case of doing a bit of further digging.

Edit: - Just to elaborate on what I do - basically it does what a/b suggested.

I'll typically get a list of all of my function labels like this (you can use whatever means necessary to generate the list as long as the routine names generate a unique hex ID).

Code:
Graeme@HYPERSPIN MINGW64 /c/Development/DevilsTemple
$ find . -type f -name '*.asm' -exec grep '^[a-z]' {} \; | egrep -i -v "dc|equ" | awk '{ print $1 }' | while read line; do A=$(echo "$line" | md5sum -t | cut -c1-8); echo "$A $line"; done
59da7965 mainLoadAudio:
7b3f02ec mainLoadThomas:
31a84b5c mainLoadTiles:
429b4bdd mainLoadEnemies:
cb497ff6 agdInstallAudioPlayer:
28cb9635 agdRemoveAudioPlayer:
98e29e4e agdGetSoundSample:
2b6e09b4 agdPlayModule:
.
.
.

Then I'll set up some macros.

Code:
FUNCENTRY MACRO
	move.l	#\1,$100	; store the ID as an immediate value
	ENDM
Code:
FUNCRETURN MACRO
	move.l	#\1,$104	; store the ID as an immediate value
	ENDM
Then I'll mark each of the suspect routines, on entry and on return, with its ID from the table I generated.

For example:

Code:
mainLoadTiles:  FUNCENTRY $31a84b5c
                 ; ... do stuff ...

                 FUNCRETURN $31a84b5c
                 rts
When the crash occurs, the ID of the last routine entered will be in $100 and the ID of the last routine that returned will be in $104. You can then map them back to the routine names using the lookup generated in step 1.
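The lookup step can also be scripted. A minimal sketch that rebuilds the ID table and decodes a longword read back from $100 or $104; it assumes the IDs are the first 8 hex digits of the MD5 of the label plus the newline that echo appends, which should match the shell pipeline on a system where echo emits a bare LF but has not been checked against the IDs listed above. The function names are hypothetical:

```python
import hashlib

def routine_id(label):
    """First 8 hex digits of the MD5 of the label, as the shell pipeline computes it."""
    # echo appends a newline before md5sum sees the text, so it is hashed too.
    return hashlib.md5((label + "\n").encode("ascii")).hexdigest()[:8]

def build_lookup(labels):
    """Map each 8-digit ID to its label, refusing to continue on a collision."""
    lookup = {}
    for label in labels:
        rid = routine_id(label)
        if rid in lookup and lookup[rid] != label:
            raise ValueError(f"ID collision between {lookup[rid]} and {label}")
        lookup[rid] = label
    return lookup

def decode(lookup, longword):
    """Translate a longword read from $100 or $104 back into a routine label."""
    return lookup.get(f"{longword:08x}", "unknown")

labels = ["mainLoadAudio:", "mainLoadTiles:", "agdPlayModule:"]
lookup = build_lookup(labels)

# Stand-in for the value read from $100 after a crash.
entered = int(routine_id("mainLoadTiles:"), 16)
print(decode(lookup, entered))   # mainLoadTiles:
```

With only 32 bits of the hash kept, a collision between two labels is unlikely but possible, which is why the table builder checks for it.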

I'll point out that I've only had to use this in some pretty extreme circumstances and as a last resort. WinUAE has some pretty amazing debugging functionality that I would recommend you look into first before venturing down this route.

Geezer

Last edited by mcgeezer; 09 September 2021 at 11:52.
Old 09 September 2021, 07:17   #7
meynaf
son of 68k
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
There are several reasons why an instruction stream can go haywire, not only RTS.
If the crash always happens, then I'd just trace the code with a debugger, stepping over subroutine calls instead of stepping into them, until the culprit is found.
Old 09 September 2021, 12:26   #8
phx
Natteravn
 
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,496
There are many ways to debug such a crash, but the original question remains an interesting one: how does an m68k debugger generate a back trace of the stack?

I know there are debuggers which can do that, quite reliably. But it seems to be a kind of black art, because no m68k ABI I know of has a fixed stack frame layout, unlike for example the PPC V.4 ABI, where the fixed layout makes it easy.

I would guess that you advance word by word on the stack and verify if it contains a valid memory pointer (or even a pointer to a location in your current process' code section). Then check if the previous instruction is a BSR or JSR.
Old 09 September 2021, 12:49   #9
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,959
I suspect that your code is doing something like this:

movem.l d0-d4,-(sp)	; saves five registers
....

movem.l (sp)+,d0-d3	; restores only four
rts			; so this pops the saved d4 as the return address
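To see why such a mismatch sends the PC astray, here is a small model of the stack around that routine. It is only a sketch: the return address and register contents are made-up values, and the list stands in for memory with the top of stack at the end.

```python
def pc_after_rts(pushed, popped):
    """Model 'movem.l <pushed regs>,-(sp) ... movem.l (sp)+,<popped regs> / rts'."""
    return_address = 0x00020106                      # pushed by the caller's BSR/JSR
    regs = [0x11111111, 0x22222222, 0x33333333,      # d0, d1, d2
            0x44444444, 0x00005000]                  # d3, d4

    stack = [return_address]                         # last element = top of stack
    # With -(sp), MOVEM stores the highest register first, so d0 ends up on top.
    for value in reversed(regs[:pushed]):
        stack.append(value)
    # With (sp)+, MOVEM reloads starting from the top of the stack.
    for _ in range(popped):
        stack.pop()
    # RTS takes whatever longword is now on top as the new PC.
    return stack.pop()

print(f"${pc_after_rts(5, 5):08X}")   # $00020106, balanced: back to the caller
print(f"${pc_after_rts(5, 4):08X}")   # $00005000, unbalanced: jumps to the old d4
```

In the unbalanced case the real return address is still on the stack, one longword above where SP ends up after the bad RTS.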
Old 09 September 2021, 18:51   #10
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,215
Quote:
Originally Posted by phx
I would guess that you advance word by word on the stack and verify if it contains a valid memory pointer (or even a pointer to a location in your current process' code section). Then check if the previous instruction is a BSR or JSR.
This is pretty much what COP does: it verifies that the longword on the stack points into one of the valid address regions it has recorded, checks whether it is even, and checks whether the instruction just before that address is a JSR or BSR. Of course, this is not perfect. One can "fake" subroutine calls with PEA, or by pushing addresses manually onto the stack, and these will of course be missed. If the stack is swapped, it will also miss the subroutines behind the stack swap.
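The heuristic described above can be sketched as an offline scan over a memory dump. This is not COP's actual code, only an illustration under stated assumptions: one contiguous dump, a single known code range, and plain 68000 encodings of BSR and JSR. All function names and the sample dump are made up.

```python
import struct

def word_at(mem, base, addr):
    """Big-endian 16-bit read from a dump starting at address base; None if outside."""
    off = addr - base
    if off < 0 or off + 2 > len(mem):
        return None
    return struct.unpack_from(">H", mem, off)[0]

def long_at(mem, base, addr):
    """Big-endian 32-bit read from the dump; None if outside."""
    off = addr - base
    if off < 0 or off + 4 > len(mem):
        return None
    return struct.unpack_from(">L", mem, off)[0]

def preceded_by_call(mem, base, ret):
    """True if a BSR or JSR encoding ends exactly at ret."""
    w2 = word_at(mem, base, ret - 2)
    w4 = word_at(mem, base, ret - 4)
    w6 = word_at(mem, base, ret - 6)
    if w2 is not None:
        if (w2 & 0xFFF8) == 0x4E90:                   # JSR (An)
            return True
        if (w2 & 0xFF00) == 0x6100 and (w2 & 0xFF):   # BSR.B, non-zero displacement
            return True
    if w4 is not None:
        if w4 == 0x6100:                              # BSR.W
            return True
        if (w4 & 0xFFF8) in (0x4EA8, 0x4EB0):         # JSR d16(An), JSR d8(An,Xn)
            return True
        if w4 in (0x4EB8, 0x4EBA, 0x4EBB):            # JSR abs.W, d16(PC), d8(PC,Xn)
            return True
    return w6 == 0x4EB9                               # JSR abs.L

def scan_stack(mem, base, sp, stack_top, code_start, code_end):
    """Walk the stack word by word and collect plausible return addresses."""
    hits = []
    for addr in range(sp, stack_top - 3, 2):
        value = long_at(mem, base, addr)
        if value is None or value & 1:                # must be even
            continue
        if code_start <= value < code_end and preceded_by_call(mem, base, value):
            hits.append((addr, value))
    return hits

# Made-up dump of $1000-$11FF: a JSR $1020 at $1000, and a stack at $1100 holding
# an odd value, an even code address with no call before it, and a real return address.
mem = bytearray(0x200)
base = 0x1000
struct.pack_into(">HL", mem, 0x000, 0x4EB9, 0x00001020)
struct.pack_into(">LLL", mem, 0x100, 0x00001003, 0x00001010, 0x00001006)
for addr, ret in scan_stack(mem, base, 0x1100, 0x110C, 0x1000, 0x1100):
    print(f"${addr:08X}: return to ${ret:08X}")   # $00001108: return to $00001006
```

As noted in the post, this can miss return addresses pushed with PEA or by hand, and it can also report false positives when data happens to look like a call.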
Old 09 September 2021, 22:51   #11
LeCaravage
Registered User
 
Join Date: May 2017
Location: AmigaLand
Posts: 459
Strange, today Trash'm one can't handle the bug. Everything freezes and crashes. That means the bug depends on what the memory contains at the place the PC jumps to. Too bad; if I had known it could get worse, I would have used a/b's and mcgeezer's method right away to have a chance of locating the faulty area.

I no longer have any memory information or symbols. So I'll build an exe, load and execute it via Action Replay, and try to investigate using a/b's and mcgeezer's method.

@DanScott & Don_Adan: I checked the stack operations in the source and all seem to be OK.


So if I understand correctly, the stack can be accessed in word and long sizes, but never byte?
Old 09 September 2021, 23:12   #12
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,959
Quote:
Originally Posted by LeCaravage
Strange, today Trash'm one can't handle the bug. Everything freezes and crashes. That means the bug depends on what the memory contains at the place the PC jumps to. Too bad; if I had known it could get worse, I would have used a/b's and mcgeezer's method right away to have a chance of locating the faulty area.

I no longer have any memory information or symbols. So I'll build an exe, load and execute it via Action Replay, and try to investigate using a/b's and mcgeezer's method.

@DanScott & Don_Adan: I checked the stack operations in the source and all seem to be OK.


So if I understand correctly, the stack can be accessed in word and long sizes, but never byte?
If you checked the stack accesses and there is no bug in register handling, then in normal code the PC can jump out of your code only via jmp/jsr instructions, like:
jmp 10(Ax) ; buggy Ax
jsr 16(Ax,Dx.W) ; buggy Ax or Dx values

etc.
This can happen especially when jsr (Ax,Dx.W) or jsr (Ax,Dx.L) is used, because the high word of Dx can be trashed, or Dx.W can be negative when Dx.L is needed.

E.g. if you write jsr (Ax,Dx), it will work as jsr (Ax,Dx.W), because all (?) assemblers treat this Dx as Dx.W.
From memory, I have seen one strange case of a missing .L or .W.

Something like this, if I remember right:

jsr (A0,A1)

Some assemblers treat this as jsr (A0,A1.W), but the correct form was jsr (A0,A1.L).
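The difference between the .W and .L index forms is easy to check with a little arithmetic. A minimal sketch of the effective address calculation for (An,Xn) with zero displacement; the register values are invented for illustration:

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of value as a signed word."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def indexed_target(base, index, size):
    """Effective address of (An,Xn.size) with zero displacement, modulo 2**32."""
    if size == "W":
        offset = sign_extend_16(index)    # only the low word is used, sign-extended
    else:
        offset = index & 0xFFFFFFFF       # the full 32-bit register is used
    return (base + offset) & 0xFFFFFFFF

a0 = 0x00020000    # made-up base address
d0 = 0x00009000    # offset above $7FFF: negative when read as a word

print(hex(indexed_target(a0, d0, "W")))   # 0x19000, lands below the base
print(hex(indexed_target(a0, d0, "L")))   # 0x29000, the intended target

trashed = 0xDEAD0010   # correct low word, garbage in the high word
print(hex(indexed_target(a0, trashed, "W")))   # 0x20010, high word ignored
print(hex(indexed_target(a0, trashed, "L")))   # 0xdeaf0010, far outside the code
```

So a missing .L bites when the offset exceeds $7FFF, while an unintended .L bites when the high word of the index register was never cleared.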

Last edited by Don_Adan; 09 September 2021 at 23:20.
Old 10 September 2021, 02:44   #13
a/b
Registered User
 
Join Date: Jun 2016
Location: europe
Posts: 1,039
It probably means you are trashing important parts of the OS (and the memory is cleared during a full reboot). Did you try testing it in WinUAE, being ready to quickly hit Shift+F12 when things go wrong, so you can examine the memory in the WinUAE debugger while it's still intact?
Do you have "clear DS" enabled in assembler preferences? Sometimes it's easier to find out what's going on if you are dealing with zeroes and null pointers, and not random trash in memory.
And finally, the elimination method. Start commenting out routines and parts of the code until it's working (or vice versa, comment out most of it and then start putting stuff back in).
Old 10 September 2021, 06:30   #14
morbid
Registered User
 
Join Date: Aug 2020
Location: Huddinge
Posts: 24
If your code or the stack is in chip mem, it could be that a broken blitter operation is overwriting either of them.
Old 10 September 2021, 14:08   #15
Photon
Moderator
 
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,602
To trace the stack, you must actually trace execution, because there's no telling how many MOVEMs (etc.) sit between SP and the caller's address when the crash occurs. Does Trash'm One trace?

If not, you can trace yourself in a simple way, by logging the address you're calling just before the call. This could be in the form of a macro that replaces the call with code that logs it (PC address, call address) and then calls.

This list would grow very quickly. But you can use a circular buffer, and you can place it in a part of memory whose address you know and which is not allocated. Then even a soft reset wouldn't prevent you from learning the last known address.
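As a rough model of such a log and of how to read it back from a memory dump, here is a sketch. The layout is an assumption made for the example (a longword call counter followed by a fixed number of caller/target longword pairs, all big-endian); the actual macro could lay it out differently.

```python
import struct

ENTRIES = 4   # hypothetical number of slots in the circular buffer

def log_call(buf, caller, target):
    """Model of the logging macro: store one (caller, target) pair, advance the counter."""
    count = struct.unpack_from(">L", buf, 0)[0]
    struct.pack_into(">LL", buf, 4 + (count % ENTRIES) * 8, caller, target)
    # Wrap-around of the 32-bit counter itself is ignored in this sketch.
    struct.pack_into(">L", buf, 0, (count + 1) & 0xFFFFFFFF)

def decode_log(buf):
    """Return the surviving (caller, target) pairs from a dump, oldest first."""
    count = struct.unpack_from(">L", buf, 0)[0]
    kept = min(count, ENTRIES)
    pairs = []
    for n in range(count - kept, count):
        pairs.append(struct.unpack_from(">LL", buf, 4 + (n % ENTRIES) * 8))
    return pairs

# Six made-up calls into a four-slot buffer: only the last four survive.
buf = bytearray(4 + ENTRIES * 8)
for i in range(6):
    log_call(buf, 0x20000 + i * 0x10, 0x21000 + i * 0x100)

for caller, target in decode_log(buf):
    print(f"${caller:08X} -> ${target:08X}")
```

The last pair printed is the most recent call logged before the crash, which is usually the first place to look.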

If you want subroutine names instead of addresses, though, the log must survive until you are back in the assembler. Temporarily ORG-ing the code to a fixed address while bug-hunting makes the logged addresses meaningful.

"Checkpoints" are a simpler similar alternative to tracing. You can write a single variable or address with the current PC address (or checkpoint name), and after a crash find the "last-survived-to" point in your code.
Old 10 September 2021, 16:24   #16
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,215
That needs a debugger that can debug parts other debuggers cannot reach. Did I mention "COP"?
Old 10 September 2021, 18:40   #17
Rock'n Roll
German Translator
 
Join Date: Aug 2018
Location: Drübeck / Germany
Age: 49
Posts: 183
See this thread: https://eab.abime.net/showthread.php?t=91321

New debugger commands:
...
- rs = show tracked stack frame.
- rss = show tracked supervisor stack frame.
- ts = break when tracked stack frame count decreases. (=tracked stack frame matched executed RTS)
- tsp = break when tracked stack frame count decreases or increases. (RTS/BSR/JSR).
- tse/tsd = enable/disable full stack frame tracking. Can be used when no debugmem debugging is active.

But I don't know how it works or whether it can help here.
 


All times are GMT +2.
