08 September 2021, 19:20 | #1 |
Registered User
Join Date: May 2017
Location: AmigaLand
Posts: 459
|
How to trace back Stack ?
Hi,
I need your help. My config: A500, KS 1.3, Trash'm-One 1.6. My program crashes. Trash'm-One handles the crash, but the PC ends up somewhere unexpected in memory. I need to find the instruction that caused the bug, and I'd like to use the contents of the SSP for that. So my question is: how can I find out where the last RTS was before the crash, using the SSP displayed by Trash'm-One's rescue option? More generally, how do I interpret the stack to find the last RTS? Thanks, |
08 September 2021, 22:51 | #2 |
Registered User
Join Date: Jun 2016
Location: europe
Posts: 1,039
|
The SSP would be the place to find where the exception occurred, but ASM-One/Trash'm-One and co. process and remove the exception stack frame in order to print the status report (where and why the program crashed, a register dump, etc., assuming you are running it from within Trash'm-One).
The last RTS... that depends on whether your program is running in user or supervisor mode. If user, then the USP is the place to check for the return address that the crashed routine would use with RTS. If that proves to be a problem, an easy way around it is to put something like "move.l #*,$70000" (* being the current address) at the start of each routine you think could cause a problem, and then simply look up what's at address $70000 to find the problematic one. |
08 September 2021, 23:13 | #3 |
Registered User
Join Date: May 2017
Location: AmigaLand
Posts: 459
|
The crash occurs at a memory location where my code is not present at all:
The crash is a LINE-F exception somewhere around $00005000. My code has probably made an unwanted jump to this location. If I could find the last RTS executed by my code, I could probably narrow down the location of the original faulty instruction. My best bet was to use the USP (and not the SSP as I wrote above), but I have no idea how to interpret the values A7 points to. Edit: I checked the address contained in (A7) and none of my code is there. |
08 September 2021, 23:31 | #4 |
Lemon. / Core Design
Join Date: Mar 2016
Location: Tier 5
Posts: 1,211
|
More than likely something is corrupting your stack ?
|
08 September 2021, 23:40 | #5 |
Registered User
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,959
|
There are many ways to trace the stack return; you can use a/b's method. If you know the area where the wrong jump lands, you can also use the following method: fill the area from $4000 to $6000 with $4AFC (the ILLEGAL opcode) and run your program, or add a fill routine to the init part of your program. When the wrong jump occurs, the exception fires right where the jump lands, so you will know the return address on the stack. Keep in mind that buggy routines often execute some (sometimes a lot of) trash data as code without crashing the Amiga, for example if your code jumps into an area filled with zero bytes ($00000000).
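A minimal sketch of such a fill routine, assuming the $4000-$6000 range really is unused on your setup (on a running system that memory may belong to somebody else, so check first). The names FILL_START, FILL_END and FillIllegal are made up for illustration.

```asm
; Fill $4000-$5FFF with the ILLEGAL opcode ($4AFC).
; Assumption: nothing else owns this memory.
FILL_START      equ     $4000
FILL_END        equ     $6000

FillIllegal:
        lea     FILL_START,a0
        move.w  #(FILL_END-FILL_START)/2-1,d0   ; word count - 1 for dbf
.loop:  move.w  #$4AFC,(a0)+                    ; one ILLEGAL per word
        dbf     d0,.loop
        rts
```

Call it once from your init code, before the suspect routines run. A stray jump into that area then raises an illegal instruction exception at the landing address instead of running through whatever happened to be there.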
|
09 September 2021, 00:00 | #6 |
Registered User
Join Date: Oct 2017
Location: Sunderland, England
Posts: 2,702
|
There are probably easier ways to do it, but if the crash is intermittent then I'd stick a macro at the start and end of the suspect routines. The macro simply places a unique identifier in a known location in RAM. When the crash happens it will identify the problem routine... then it's a case of just doing a bit of further digging.
Edit: - Just to elaborate on what I do - basically it does what a/b suggested. I'll typically get a list of all of my function labels like this (you can use whatever means necessary to generate the list as long as the routine names generate a unique hex ID). Code:
Graeme@HYPERSPIN MINGW64 /c/Development/DevilsTemple
$ find . -type f -name '*.asm' -exec grep ^[a-z] {} \; |egrep -i -v "dc|equ" |awk '{ print $1 }' | while read line; do A=`echo $line |md5sum -t |cut -c1-8`; echo $A $line ; done
59da7965 mainLoadAudio:
7b3f02ec mainLoadThomas:
31a84b5c mainLoadTiles:
429b4bdd mainLoadEnemies:
cb497ff6 agdInstallAudioPlayer:
28cb9635 agdRemoveAudioPlayer:
98e29e4e agdGetSoundSample:
2b6e09b4 agdPlayModule:
.
.
.
Then I'll set up some macros. Code:
FUNCENTRY MACRO
    move.l #\1,$100 ; ID of the routine last entered (note the #, it is an immediate)
    ENDM
Code:
FUNCRETURN MACRO
    move.l #\1,$104 ; ID of the routine that last returned
    ENDM
for example: Code:
mainLoadTiles:
    FUNCENTRY $31a84b5c
    ; ... do stuff ...
    FUNCRETURN $31a84b5c
    rts
I'll point out that I've only had to use this in some pretty extreme circumstances and as a last resort. WinUAE has some pretty amazing functionality that assists in debugging, and I would recommend you look that up first before venturing down this route. Geezer Last edited by mcgeezer; 09 September 2021 at 11:52. |
09 September 2021, 07:17 | #7 |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
There are several reasons why instruction streams can go haywire, not only RTS.
If the crash always happens, then I'd just trace the code with a debugger, stepping over subroutine calls instead of into them, until the culprit is found. |
09 September 2021, 12:26 | #8 |
Natteravn
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,496
|
There are many ways to debug such a crash, but the original question remains an interesting one: how does an m68k debugger generate a back trace of the stack?
I know there are debuggers which can do that, quite reliably. But it seems to be a kind of black art, because no m68k ABI I know has a fixed stack frame layout, like PPC V.4-ABI for example, which makes it easy. I would guess that you advance word by word on the stack and verify if it contains a valid memory pointer (or even a pointer to a location in your current process' code section). Then check if the previous instruction is a BSR or JSR. |
09 September 2021, 12:49 | #9 |
Registered User
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,959
|
I suspect that your code using something like this:
movem.l d0-d4,-(sp) ; pushes 5 longwords
....
movem.l (sp)+,d0-d3 ; pops only 4, so rts will take the saved d4 as return address
rts |
09 September 2021, 18:51 | #10 |
Registered User
Join Date: Jan 2019
Location: Germany
Posts: 3,215
|
This is pretty much what COP does: it verifies that the longword on the stack points into one of the valid address space regions it has recorded, checks whether it is even, and checks whether the instruction just before it is a JSR or BSR. Of course, this is not perfect. One can "fake" subroutine calls with PEA, or by pushing addresses onto the stack manually, and these will of course be missed. If the stack is swapped, it will also miss the subroutines behind the stack swap.
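For illustration, here is a rough sketch of that kind of heuristic scan in plain 68000 code. It is not how COP is actually implemented. The register interface, the code bounds in a2/a3 and the name ScanStack are all assumptions made up for this example. It will report false positives whenever a data word happens to look like a call opcode, and it ignores calls within the first 6 bytes of the code.

```asm
; a0 = stack pointer at crash time (even)
; a1 = top of stack (scan stops here)
; a2 = start of your code, a3 = end of your code (assumed known)
; a4 = buffer for candidate return addresses, 0-terminated on exit
ScanStack:
        movem.l d0-d1/a0/a4-a5,-(sp)
.next:  move.l  a0,d1
        addq.l  #4,d1
        cmp.l   a1,d1
        bhi     .done           ; fewer than 4 bytes left
        move.l  (a0),d0         ; candidate return address
        btst    #0,d0
        bne     .skip           ; odd: cannot be a return address
        cmp.l   a3,d0
        bhi     .skip           ; above the end of the code
        lea     6(a2),a5
        cmp.l   a5,d0
        bcs     .skip           ; below code start + 6
        move.l  d0,a5
        move.w  -2(a5),d1       ; 2-byte call forms
        move.w  d1,d0
        and.w   #$FFF8,d0
        cmp.w   #$4E90,d0       ; jsr (An)
        beq     .hit
        move.w  d1,d0
        and.w   #$FF00,d0
        cmp.w   #$6100,d0       ; bsr.s is $61xx with xx <> 0
        bne     .try4
        tst.b   d1
        bne     .hit
.try4:  move.w  -4(a5),d1       ; 4-byte call forms
        cmp.w   #$6100,d1       ; bsr.w
        beq     .hit
        cmp.w   #$4EB9,d1       ; jsr abs.l is 6 bytes long, not 4
        beq     .try6
        cmp.w   #$4EA8,d1
        bcs     .try6
        cmp.w   #$4EBB,d1
        bls     .hit            ; jsr d16(An) up to jsr d8(pc,Xn)
.try6:  cmp.w   #$4EB9,-6(a5)   ; jsr abs.l
        bne     .skip
.hit:   move.l  a5,(a4)+        ; looks like a return address
.skip:  addq.l  #2,a0           ; advance word by word
        bra     .next
.done:  clr.l   (a4)            ; terminate the list
        movem.l (sp)+,d0-d1/a0/a4-a5
        rts
```

The first entry in the buffer is the candidate closest to the crash. Treat every entry as a hint only, for the reasons given above.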
|
09 September 2021, 22:51 | #11 |
Registered User
Join Date: May 2017
Location: AmigaLand
Posts: 459
|
Strange, today Trash'm-One can't handle the bug. Everything freezes and crashes. That means the bug depends on what the memory contains where the PC jumps to. Too bad; if I had known it could get worse, I'd have used a/b's and mcgeezer's method right away to have a chance of locating the faulty area.
I no longer have any memory information or symbols. So I'll build an exe, load and run it via Action Replay, and try to investigate using a/b's and mcgeezer's method. @DanScott & Don_Adan: I checked the stack operations in the source and all seems to be OK. So if I understand correctly, the stack can be accessed with word and long sizes, but never byte? |
09 September 2021, 23:12 | #12 | |
Registered User
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,959
|
Quote:
Like
jmp 10(Ax) ; buggy Ax
jsr 16(Ax,Dx.W) ; buggy Ax or Dx values
etc. This can happen especially when jsr (Ax,Dx.W) or jsr (Ax,Dx.L) is used, because the high word of Dx can be trashed, or Dx.W can be negative when Dx.L is needed. For example, if you write jsr (Ax,Dx), it will work as jsr (Ax,Dx.W), because all (?) assemblers treat this Dx as Dx.W. From memory, I have seen one strange case of a missing .L or .W, something like this, if I remember right: jsr (A0,A1). Some assemblers treat this as jsr (A0,A1.W), but the correct form was jsr (A0,A1.L). Last edited by Don_Adan; 09 September 2021 at 23:20. |
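A small sketch of the .W versus .L index difference, using lea instead of jsr so it is safe to step through in a debugger. Base, IndexDemo and the $9000 offset are made up for this example.

```asm
IndexDemo:
        lea     Base(pc),a0
        move.l  #$00009000,d0   ; intended offset: +$9000
        lea     (a0,d0.w),a1    ; d0.w is sign-extended: a1 = Base-$7000
        lea     (a0,d0.l),a2    ; full longword index:   a2 = Base+$9000
        rts

Base:   dc.w    0
```

With jsr in place of lea, the first form would call code $7000 bytes before Base rather than $9000 bytes after it.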
|
10 September 2021, 02:44 | #13 |
Registered User
Join Date: Jun 2016
Location: europe
Posts: 1,039
|
It probably means you are trashing important parts of the OS (and the memory is cleared during a full reboot). Did you try testing it in WinUAE, ready to quickly hit Shift+F12 when things go wrong? That way you can examine the memory in the WinUAE debugger while it's still intact.
Do you have "clear DS" enabled in assembler preferences? Sometimes it's easier to find out what's going on if you are dealing with zeroes and null pointers, and not random trash in memory. And finally, the elimination method. Start commenting out routines and parts of the code until it's working (or vice versa, comment out most of it and then start putting stuff back in). |
10 September 2021, 06:30 | #14 |
Registered User
Join Date: Aug 2020
Location: Huddinge
Posts: 24
|
If your code or the stack is in chip mem,
it could be that you have a broken blitter operation that overwrites either of them. |
10 September 2021, 14:08 | #15 |
Moderator
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,602
|
To trace the stack, you must trace it, because there's no telling how many movems (etc.) are between SP and the caller address when the crash occurs. Does Trash'm-One trace?
If not, you can trace yourself in a simple way, by logging the address you're calling just before the call. This could be in the form of a macro that replaces the call with code that logs it (PC address, call address) and then calls. This list would grow very quickly. But you can have a circular buffer, and you can place it in a part of memory that you know the address to and which is not allocated, and then even a soft reset wouldn't prevent you from learning the last known address. If you want subroutine names instead of addresses, it must survive back to the Assembler, though. ORGanizing the code temporarily while bug-hunting allows the logged addresses to mean something. "Checkpoints" are a simpler similar alternative to tracing. You can write a single variable or address with the current PC address (or checkpoint name), and after a crash find the "last-survived-to" point in your code. |
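A rough sketch of such a logging call macro with a circular buffer. It assumes $70000 (the address borrowed from a/b's post) is memory you know to be free, and that your assembler accepts * as the current address. LOGBASE, LOGSIZE and CALL are names made up for this example.

```asm
LOGBASE equ     $70000          ; assumption: unallocated, known address
LOGSIZE equ     256             ; ring of 32 entries, 8 bytes each

; Layout: word at LOGBASE = offset of the next entry to write,
;         entries (caller address, target address) from LOGBASE+4.
CALL    MACRO
        movem.l d0/a0,-(sp)
        lea     LOGBASE,a0
        move.w  (a0),d0         ; current ring offset
        and.w   #LOGSIZE-8,d0   ; wrap around, keep 8-byte aligned
        move.l  #*,4(a0,d0.w)   ; where the call is made from
        move.l  #\1,8(a0,d0.w)  ; where the call goes
        addq.w  #8,d0
        move.w  d0,(a0)         ; store offset of the next entry
        movem.l (sp)+,d0/a0
        jsr     \1
        ENDM
```

Use it as "CALL mainLoadTiles" in place of the bsr/jsr. After a crash or soft reset, read the word at $70000: the entry just before that offset (wrapping around) is the last call that was logged. Note that the macro changes the condition codes before the call and adds a fair number of bytes per call site, so use it only while bug-hunting.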
10 September 2021, 16:24 | #16 |
Registered User
Join Date: Jan 2019
Location: Germany
Posts: 3,215
|
That needs a debugger that can debug parts other debuggers cannot reach. Did I mention "COP"?
|
10 September 2021, 18:40 | #17 |
German Translator
Join Date: Aug 2018
Location: Drübeck / Germany
Age: 49
Posts: 183
|
here https://eab.abime.net/showthread.php?t=91321
New debugger commands: ...
- rs = show tracked stack frame.
- rss = show tracked supervisor stack frame.
- ts = break when tracked stack frame count decreases (= tracked stack frame matched executed RTS).
- tsp = break when tracked stack frame count decreases or increases (RTS/BSR/JSR).
- tse/tsd = enable/disable full stack frame tracking. Can be used when no debugmem debugging is active.
but I don't know how it works or how it might help... |