English Amiga Board

crabfists 20 April 2008 23:29

Disassembling games for fun
 
Hello,

I was thinking about taking up a little project by disassembling an Amiga game to reverse engineer it, partly for fun, and partly to perhaps create cross platform code to run the original game on other platforms (yeah, I know this is technically pointless considering emulators exist but don't forget it's mainly for fun and to learn more about the Amiga ;)). Has anybody done this before and can they offer any advice on the best way to do it?

I did try and do this a couple of years ago but I got stuck trying to come up with a good way of setting it up. What I would like to do is use IDA Pro to disassemble the code so I can add comments and annotate and label routines and variables etc and hopefully use WinUAE to step through it to help figure out what is going on.

I am probably about to make a fool of myself with some of my assumptions, but what the heck. I am assuming that...

- it will be better to disassemble the memory after the exe is loaded into ram rather than to disassemble the exe itself?
- I will need to load the exe into a fixed memory location if I want the addresses of subroutines/data etc in WinUAE to correspond with the addresses of the IDA disassembly?
- if I can work out how to load the exe into a fixed memory location every time I run it I can use WinUAE to set static breakpoints (i.e. the same addresses each time I run the game) to examine/disable particular routines. Last time I had a go at this the exe would get loaded into a different memory location every time, making it impossible to know where particular routines were in memory.

Some ideas and further questions...

I wonder if the best way to do it is to use a memory snapshot from WinUAE and then disassemble that?
I would love it if I could use IDA to modify the code and then reassemble it.

I partially disassembled a Spectrum game using IDA Pro and I found it really fun. It was quite easy to set up because it was just a snapshot of the Spectrum's 48k memory I was disassembling and there was no operating system to get in the way and complicate things.

If anybody can offer me any pointers on the best way to take apart an Amiga game then I would really appreciate it!

Thanks

mark_k 21 April 2008 00:57

Quote:

Originally Posted by crabfists (Post 408921)
I was thinking about taking up a little project by disassembling an Amiga game to reverse engineer it, partly for fun, and partly to perhaps create cross platform code to run the original game on other platforms (yeah, I know this is technically pointless considering emulators exist but don't forget it's mainly for fun and to learn more about the Amiga ;)). Has anybody done this before and can they offer any advice on the best way to do it?

I have done that for Emerald Mine (pretty much complete disassembly with comments, meaningful label names etc.). Quite interesting, and found a few bugs. :) Also did it to a lesser degree with Carrier Command, but that's a much larger program. For those and various other programs I have disassembled in the past I used ReSource.


Quote:

Originally Posted by crabfists (Post 408921)
- it will be better to disassemble the memory after the exe is loaded into ram rather than to disassemble the exe itself?

If the program is a normal AmigaDOS executable, generally speaking it is preferable to load the executable itself into the disassembler. The disassembler can use information in reloc32 and symbol hunks to improve the disassembly, and hunk information would be preserved. (At least ReSource can do that, not sure about IDA Pro but it does apparently support the Amiga load file format.)

If the executable has multiple hunks, they could get loaded into memory anywhere, not necessarily in contiguous locations. (In fact the addresses of successive hunks are *never* contiguous.) Plus if the program does anything with the segment list, that will be broken if you ever re-assemble it into one hunk.

Of course some games kill the OS and load to a fixed location anyway, some as low as $0400. In that case it's best to create an empty 512KB file (if the game only uses 512KB memory), and overlay the game code in that, in the correct place. Then disassemble the 512KB file and you can put labels where variables are stored etc. [If using ReSource, note that ReSource has a bug where it does not recognise absolute word addresses as pointing within the area being disassembled. It is possible to work around that however.]
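A minimal Python sketch of the "empty 512KB file with the game overlaid" idea, for preparing the file on a PC before loading it into the disassembler. The function name, the chunk list and the four example bytes at $0400 are all made up for illustration; a real game would supply its own data and load addresses.

```python
def build_memory_image(chunks, size=512 * 1024):
    """Return a zero-filled image with each (address, data) chunk copied in."""
    image = bytearray(size)
    for address, data in chunks:
        end = address + len(data)
        if address < 0 or end > size:
            raise ValueError(f"chunk at ${address:X} does not fit in {size} bytes")
        image[address:end] = data
    return bytes(image)


# Hypothetical example: four bytes of game code (NOP, RTS) that load at $0400.
image = build_memory_image([(0x0400, bytes.fromhex("4e714e75"))])
```

Writing `image` to disk gives a file whose offsets equal the absolute addresses the game uses, so labels in the disassembly line up with what a debugger shows.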


Quote:

Originally Posted by crabfists (Post 408921)
- if I can work out how to load the exe into a fixed memory location every time I run it I can use WinUAE to set static breakpoints (i.e. the same addresses each time I run the game) to examine/disable particular routines. Last time I had a go at this the exe would get loaded into a different memory location every time, making it impossible to know where particular routines were in memory.

That's not necessarily a problem. Can you load the game, then as soon as it has loaded freeze/snapshot the state of the emulated Amiga? Then whenever you want to set up breakpoints, work from the snapshot so the hunk addresses are always the same.


Quote:

Originally Posted by crabfists (Post 408921)
I would love it if I could use IDA to modify the code and then reassemble it.

You may well be able to. Try using ReSource though; ReSource is definitely capable of creating output that can be re-assembled with minimal editing, and if the game uses any OS routines (Exec, DOS, etc.), ReSource has built-in symbol definitions to make the disassembly much easier to read; e.g. JSR (-$228,A6) -> JSR (_LVOOpenLibrary,A6) etc.
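To illustrate the kind of substitution meant here, a small Python sketch that rewrites library call offsets as symbols. It is not how ReSource works internally. The table holds only a handful of exec.library offsets, and the function assumes A6 holds ExecBase at the call site, which a real tool would have to establish rather than assume. The name `symbolise` is made up.

```python
import re

# A few exec.library vector offsets (decimal); far from a complete table.
EXEC_LVO = {
    -552: "_LVOOpenLibrary",
    -414: "_LVOCloseLibrary",
    -198: "_LVOAllocMem",
    -210: "_LVOFreeMem",
    -132: "_LVOForbid",
    -138: "_LVOPermit",
}

_JSR = re.compile(r"JSR\s+\((-?)\$([0-9A-Fa-f]+),A6\)")


def symbolise(line, table=EXEC_LVO):
    """Replace 'JSR (-$xxx,A6)' with a symbolic name when the offset is known."""
    def repl(match):
        offset = int(match.group(2), 16)
        if match.group(1):
            offset = -offset
        name = table.get(offset)
        # Unknown offsets are left exactly as they were.
        return f"JSR ({name},A6)" if name else match.group(0)
    return _JSR.sub(repl, line)


print(symbolise("JSR (-$228,A6)"))
```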

If you go the "load to a fixed address and disassemble memory" route, you'd need to spend time fixing up the disassembly to have the same hunk structure as the original. As I mentioned above, you lose all RELOC32 and symbol hunk information that way.

crabfists 21 April 2008 18:14

Thanks for your reply. It's really helpful. I'm encouraged (and a bit surprised :)) to find somebody else interested in this sort of thing.

Quote:

If the program is a normal AmigaDOS executable, generally speaking it is preferable to load the executable itself into the disassembler. The disassembler can use information in reloc32 and symbol hunks to improve the disassembly, and hunk information would be preserved. (At least ReSource can do that, not sure about IDA Pro but it does apparently support the Amiga load file format.)

Please excuse my lack of knowledge but in what way will keeping the hunks intact improve the disassembly? I take your word for it that it's worth keeping the hunks intact but I suppose I don't understand how it will help. Can you give examples of what will be better? Sorry if this is a really stupid question.

Quote:

That's not necessarily a problem. Can you load the game, then as soon as it has loaded freeze/snapshot the state of the emulated Amiga? Then whenever you want to set up breakpoints, work from the snapshot so the hunk addresses are always the same.

If I do this will I still be able to work out where certain routines and variables are in the snapshot of memory in relation to the disassembly? If you are saying the exe loader can put the hunks anywhere in RAM then how will I know from looking at the address of a routine in the disassembly where it is in the snapshot? Or do you mean the base address of all the hunks can be anywhere in RAM and the hunks will be arranged the same in relation to the base address or can each hunk be in a different location each time?

Maybe I'm getting the wrong end of the stick here but how does the disassembler unpack the hunks into its address space and does it use the same algorithm as the exe loader? Will it put the hunks in exactly the same locations as the exe loader? Or is it just the base address that can change and where the hunks are located in relation to this base address will be the same for the disassembly and the memory snapshot?

Quote:

You may well be able to. Try using ReSource though; ReSource is definitely capable of creating output that can be re-assembled with minimal editing, and if the game uses any OS routines (Exec, DOS, etc.), ReSource has built-in symbol definitions to make the disassembly much easier to read; e.g. JSR (-$228,A6) -> JSR (_LVOOpenLibrary,A6) etc.

I think IDA can do the OS routine lookup too. Well, according to this page.

Quote:

If you go the "load to a fixed address and disassemble memory" route, you'd need to spend time fixing up the disassembly to have the same hunk structure as the original. As I mentioned above, you lose all RELOC32 and symbol hunk information that way.

Ok. Thinking about it, I think trying to work out how to get the exe loaded into a fixed address might be a bit too much for me at the moment. I remember last time I looked at it it wasn't as straightforward as I thought.

mark_k 23 April 2008 16:25

Quote:

Originally Posted by crabfists (Post 409041)
Please excuse my lack of knowledge but in what way will keeping the hunks intact improve the disassembly? I take your word for it that it's worth keeping the hunks intact but I suppose I don't understand how it will help. Can you give examples of what will be better? Sorry if this is a really stupid question.

There are several reasons. Firstly, if working from an AmigaDOS load file (as opposed to a memory dump), the disassembler can use RELOC32 information to improve the disassembly. Say there is this code in the program:
Code:

  LEA  (label).L,A0
  ...
label:
  dc.b "somestring"
  ...

In the load file, the reference to the address label is stored as an entry in the reloc32 hunk. (The AmigaDOS LoadSeg routine uses the reloc32 entries to fix up absolute references when hunks are loaded into memory.) So the disassembler knows that the longword after the LEA opcode points to an address. In other words, the data type of that longword is definitely not bytes or words.
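The fix-up step described above can be sketched in Python. This is an illustration, not LoadSeg's actual code: it takes the list of offsets from one reloc32 entry and adds the target hunk's load address to the longword at each offset. The example bytes encode the LEA from the listing above with the string at offset 8, and the base address $20000 is an arbitrary made-up value.

```python
import struct


def apply_reloc32(hunk_data, offsets, target_base):
    """Add target_base to the big-endian longword at each relocation offset."""
    data = bytearray(hunk_data)
    for off in offsets:
        (value,) = struct.unpack_from(">I", data, off)
        struct.pack_into(">I", data, off, (value + target_base) & 0xFFFFFFFF)
    return bytes(data)


# LEA (label).L,A0 ; RTS ; label: dc.b "somestring"
# In the file the operand holds the hunk-relative offset of label (8).
code = bytes.fromhex("41f9" "00000008" "4e75") + b"somestring"

# One relocation at offset 2, pretending the hunk was loaded at $20000.
patched = apply_reloc32(code, [2], 0x20000)
print(patched[2:6].hex())
```

The same offset list is what tells a disassembler that the longword at offset 2 is an address rather than arbitrary data.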


Also a program can have different hunk types: code, data, BSS, or combined code+BSS, data+BSS. Hunks can be set to load into any memory, or chip memory. Typically a hunk that loads into chip memory would contain data like sprite images, sound samples etc.; data which needs to be accessed by the custom chips. BSS is effectively uninitialised data. So if a program has a 100KB BSS hunk, LoadSeg allocates 100KB of memory when it is loaded. There is not 100KB "wasted" in the executable, just the length to allocate.
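As a rough Python sketch of where the size and memory-type information lives, the function below reads the hunk size table from the start of a load file. It assumes the usual header layout (the $3F3 marker, an empty resident library name list, table size, first and last hunk numbers, then one size longword per hunk with the memory flags in the top two bits). It deliberately rejects anything fancier. The header only gives sizes and flags; whether a hunk is code, data or BSS is stated later in the file. The example header is made up.

```python
import struct

HUNK_HEADER = 0x000003F3
MEM_NAMES = {0: "any", 1: "chip", 2: "fast"}


def hunk_sizes(load_file):
    """Return [(size_in_bytes, memory_type), ...] from a load file header."""
    magic, names_end, _table_size, first, last = struct.unpack_from(">5I", load_file, 0)
    if magic != HUNK_HEADER:
        raise ValueError("not an AmigaDOS load file")
    if names_end != 0:
        raise ValueError("resident library names are not handled in this sketch")
    count = last - first + 1
    result = []
    for raw in struct.unpack_from(f">{count}I", load_file, 20):
        flags = raw >> 30
        if flags == 3:
            raise ValueError("extended memory attributes are not handled in this sketch")
        # Sizes are stored in longwords; convert to bytes.
        result.append(((raw & 0x3FFFFFFF) * 4, MEM_NAMES[flags]))
    return result


# Hypothetical header: a 16-byte hunk for any memory, then 100KB in chip memory.
header = struct.pack(">7I", HUNK_HEADER, 0, 2, 0, 1, 4, (1 << 30) | 25600)
sizes = hunk_sizes(header)
print(sizes)
```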

If you created a simple dump with the data for each hunk appended one after another, then to actually run that code the whole dump would have to load into chip memory if any one of the hunks does.


Amiga load files can contain symbol hunks, which give meaningful names to routines and variables. The disassembler can use them to make the initial disassembly much easier to understand. (However most commercial games don't have any symbol hunks.)


Quote:

Originally Posted by crabfists (Post 409041)
If I do this will I still be able to work out where certain routines and variables are in the snapshot of memory in relation to the disassembly? If you are saying the exe loader can put the hunks anywhere in RAM then how will I know from looking at the address of a routine in the disassembly where it is in the snapshot? Or do you mean the base address of all the hunks can be anywhere in RAM and the hunks will be arranged the same in relation to the base address or can each hunk be in a different location each time?

Each hunk would load to a different location each time. But if you take a snapshot after loading, then when working from that snapshot the hunk locations are fixed. You can follow the segment list to find where each hunk is in the snapshot memory. (The longwords before the actual data of each hunk contain the hunk length in longwords and the address of the next hunk.)

So if you're looking in the disassembly at code at offset $1234 in hunk 3, you would add $1234 to the base address of hunk 3 to find that code in the snapshot memory.
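A hedged Python sketch of following the segment list in a memory dump taken from the snapshot. It assumes the link to the next hunk is stored as a BCPL pointer (the byte address divided by 4), which is how AmigaDOS segment lists are usually documented, and that the hunk contents start right after the link longword. It ignores the length longword. The function names and the tiny fake memory image are made up for the example.

```python
import struct


def walk_seglist(memory, first_bptr):
    """Return the byte address where each hunk's contents start."""
    bases, seen = [], set()
    bptr = first_bptr
    while bptr and bptr not in seen:      # stop on 0 or on a corrupt loop
        seen.add(bptr)
        link_addr = bptr * 4              # BCPL pointer -> byte address
        bases.append(link_addr + 4)       # contents follow the link longword
        (bptr,) = struct.unpack_from(">I", memory, link_addr)
    return bases


def snapshot_address(bases, hunk, offset):
    """Map (hunk number, offset within hunk) to an address in the snapshot."""
    return bases[hunk] + offset


# Fake 256-byte memory: hunk 0 link at $10 pointing to hunk 1 link at $80.
memory = bytearray(0x100)
struct.pack_into(">I", memory, 0x10, 0x80 // 4)
bases = walk_seglist(memory, 0x10 // 4)
print([hex(b) for b in bases], hex(snapshot_address(bases, 1, 0x12)))
```

With the real base addresses in hand, the "offset $1234 in hunk 3" case is just `snapshot_address(bases, 3, 0x1234)`.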


Quote:

Originally Posted by crabfists (Post 409041)
Maybe I'm getting the wrong end of the stick here but how does the disassembler unpack the hunks into its address space and does it use the same algorithm as the exe loader? Will it put the hunks in exactly the same locations as the exe loader? Or is it just the base address that can change and where the hunks are located in relation to this base address will be the same for the disassembly and the memory snapshot?

The disassembler obviously needs to load the data from the hunks in the executable. How it does that is really an internal implementation detail of the disassembler; it doesn't matter to you. The actual location each hunk is loaded to is irrelevant.

I don't know how ReSource does it, but it is presented to the user as the first hunk starting at disassembly offset 0, and each successive hunk immediately following.
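If the disassembler really does present the hunks back to back from offset 0, converting a flat disassembly offset into a hunk number and an offset within that hunk is simple arithmetic. The Python sketch below assumes no padding between hunks, which is only a guess based on the description above, and uses made-up hunk sizes.

```python
from bisect import bisect_right
from itertools import accumulate


def hunk_starts(sizes):
    """Flat offset at which each hunk begins when hunks are laid end to end."""
    return [0] + list(accumulate(sizes))[:-1]


def locate(flat_offset, sizes):
    """Return (hunk number, offset within hunk) for a flat disassembly offset."""
    if flat_offset < 0 or flat_offset >= sum(sizes):
        raise ValueError("offset is outside the disassembly")
    starts = hunk_starts(sizes)
    hunk = bisect_right(starts, flat_offset) - 1
    return hunk, flat_offset - starts[hunk]


# Hypothetical hunk sizes in bytes.
sizes = [0x100, 0x40, 0x2000]
print(locate(0x150, sizes))
```

The pair returned by `locate` is what you would then add to the hunk's base address in the snapshot.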


Quote:

Originally Posted by crabfists (Post 409041)
I think IDA can do the OS routine lookup too. Well, according to this page.

That disassembly doesn't show any Amiga OS symbols/values. I'm pretty sure IDA Pro doesn't have any Amiga OS-specific values built in. If you ever disassemble a program that uses a lot of Amiga OS routines (Exec, DOS, Intuition etc.) being able to easily replace constants with symbols is really useful. Reading the ReSource documentation (available with older versions of the ReSource demo) should give you some idea of what I'm talking about. Details on where to get that are in this thread.


Quote:

Originally Posted by crabfists (Post 409041)
Ok. Thinking about it, I think trying to work out how to get the exe loaded into a fixed address might be a bit too much for me at the moment. I remember last time I looked at it it wasn't as straightforward as I thought.

I have seen at least one program that could relocate executables to a specific address. It was probably mainly intended for OS-killing demo writers though! Actually running from that specific address is tricky or impossible usually.

Minuous 24 April 2008 08:50

If you do some annotations of disassembled games, it would be useful to upload them somewhere to avoid duplication of effort. Similar to what I did when I disassembled the WBVirus display hack. I can provide webspace for hosting such disassemblies if necessary.

Hungry Horace 24 April 2008 09:32

i'm curious to know what game will be chosen ;)

BippyM 24 April 2008 18:23

just don't do rainbow islands lol

Hungry Horace 24 April 2008 18:37

Quote:

Originally Posted by bippym (Post 409705)
just don't do rainbow islands lol


cummon bip, -full- disassembly of RI might lead to adding the extra levels!

kriz 24 April 2008 22:46

It's really cool stuff!!!

SkippyAR 25 April 2008 16:09

Back in the early 90s I used the Action Replay Amiga 3 cartridge to aid both 68000 coding and disassembling on an a500plus. Was great!

There is a guy called "Krypt" who is/was heavily into re-coding Amiga games namely from OCS/ECS to work on the AGA models.

Krypt BBS - UK 021 789 xxxx (sorry if this is not allowed)

I fell across him by accident looking for *fixes* to get my favourite games running on an accelerated a1200HD. IE: Warzone AGA, Turbo Lotus II AGA, Strider II AGA, etc

He basically fixes, optimises code for compatibility to the AGA set, and they work great!

Skippy.

BippyM 25 April 2008 16:15

Is he still about? When you consider the phone number is incorrect now (it should be 0121), I doubt he is active ;)

I have temp removed some digits of the number as it may belong to some unfortunate soul now who has nothing to do with amiga :)

Good to see new members here, recounting your memories :D

SkippyAR 25 April 2008 16:29

@bippym,

LOL, yeah, I did wonder that. Some poor person starts getting the phone freaked by ppl trying to modem dial. I think 0121 WAS london.

Found this old BBS ad:
http://textfiles.fisher.hu/bbs/ADS/thekrypt.add

This post on EAB:
http://eab.abime.net/archive/index.php/t-864.html

Bitworld:
http://bitworld.bitfellas.org/demo.php?id=21248

FOLKS this BBS number is probably 99.9% DEAD!

He was popular in the early to mid 90s.

PS: I'm an Old School Classic Amiga User.

Skippy.

BippyM 25 April 2008 16:30

hehehe 0121 is Birmingham :)

I know because I have been ringing it constantly for the past few days (My dad lived there)

Galahad/FLT 25 April 2008 20:28

Quote:

Originally Posted by SkippyAR (Post 409845)
Back in the early 90s I used the Action Replay Amiga 3 cartridge to aid both 68000 coding and disassembling on an a500plus. Was great!

There is a guy called "Krypt" who is/was heavily into re-coding Amiga games namely from OCS/ECS to work on the AGA models.

Krypt BBS - UK 021 789 xxxx (sorry if this is not allowed)

I fell across him by accident looking for *fixes* to get my favourite games running on an accelerated a1200HD. IE: Warzone AGA, Turbo Lotus II AGA, Strider II AGA, etc

He basically fixes, optimises code for compatibility to the AGA set, and they work great!

Skippy.

It wasn't Krypt, it was N.O.M.A.D. that was doing the AGA fixes that were uploaded to the Krypt BBS.

Photon 25 April 2008 20:55

Dunno much about disassembling, but if you have an object file you could load that into ReSource, I think. I started on SoundMaster (?) in 1990 or something so I could look at the sample loop. I think I set up some keyboard shortcuts to set "mode" (instructions/data) and went through it line by line, switching modes whenever I seemed to get trash code.

An idea for a game could be a 3D game that scales 'OK' with CPU speed. Or a game that is easy to make new levels. I think Gravity-Force by Stephan Wenzler would be awesome with nicer gfx and new multiplayer or mission levels! And I think he just draws the levels as 1-bitplane images in DPaint with some trees and platforms strewn on top. Or any Graftgold game, like Paradroid 90 \o/

crabfists 25 April 2008 22:46

Thanks again Mark for posting this knowledge.

Quote:

Originally Posted by mark_k (Post 409455)
Also a program can have different hunk types: code, data, BSS, or combined code+BSS, data+BSS. Hunks can be set to load into any memory, or chip memory. Typically a hunk that loads into chip memory would contain data like sprite images, sound samples etc.; data which needs to be accessed by the custom chips. BSS is effectively uninitialised data. So if a program has a 100KB BSS hunk, LoadSeg allocates 100KB of memory when it is loaded. There is not 100KB "wasted" in the executable, just the length to allocate.

Thanks. That's a good explanation.

Quote:

Originally Posted by mark_k (Post 409455)
That disassembly doesn't show any Amiga OS symbols/values. I'm pretty sure IDA Pro doesn't have any Amiga OS-specific values built in. If you ever disassemble a program that uses a lot of Amiga OS routines (Exec, DOS, Intuition etc.) being able to easily replace constants with symbols is really useful. Reading the ReSource documentation (available with older versions of the ReSource demo) should give you some idea of what I'm talking about. Details on where to get that are in this thread.

My mistake. IDA doesn't know about OS calls but it definitely knows about the Amiga hunk format as I loaded in an AmigaDOS exe the other day and it recognised it as an Amiga hunk file. Last time I looked at ReSource I found it had a steep learning curve and I didn't find it that easy to use. Saying that, I didn't read the docs in detail so that's my fault for being too lazy. :)

Ok, so leaving behind the AmigaDOS hunk format, could I ask you if you know what happens on a game that doesn't use an AmigaDOS disk but uses a trackloader instead? Were games that used a trackloader pretty much on their own when it came to getting the executable data off the disk and into memory? I imagine they didn't use any OS calls like LoadSeg etc? So, I am presuming they would have to write their own routine which did something similar to LoadSeg but in a simpler way? I guess at the simplest level you could load the executable code from the disk to a fixed address and write the code to run from that fixed address, so no jumps or addresses would need to be fixed up. But then how would they do something like a BSS hunk? I suppose it's not really relevant if you are working from fixed addresses, as you can just reference the 'BSS' block by its hardcoded address.

So, taking a stab in the dark, at the simplest level does a trackloader do something like this:
  • load some executable data from disk into a fixed address
  • load any game data from disk into a fixed address
  • jump to the address where we loaded the executable data
Again, sorry if these questions are a bit dumb. I've got plenty of C++ knowledge on how to make games and have worked on many platforms (Dreamcast, PS2, GBA, Xbox and 360) but I missed out on the Amiga and making games in assembler so I am trying to fill some gaps in my knowledge.

@Minuous
Yeah, it would be good to have a site with a nice collection of Amiga game disassemblies. If I get anywhere I will let you know and send over any listings.

@HungryHorace
This might sound silly but I'd rather not say what game it is I am looking at just in case nothing comes of it (quite likely knowing me ;)).

@Photon
I like the idea of enhancing games to run better than they used to too. Have you seen the Project Tempest Jaguar emulator? It runs Tempest 2000 but at 60fps instead of the 10fps or so it should run at.

BippyM 25 April 2008 23:01

Quote:

Originally Posted by crabfists (Post 409903)
Thanks again Mark for posting this knowledge.
  • load some executable data from disk into a fixed address
  • load any game data from disk into a fixed address
  • jump to the address where we loaded the executable data

The most basic would, yeah!

Some will use AllocMem and then put a basic loader into memory which will then load the game into RAM before jumping to the fixed address..

With the Amiga though there are so many different ways of doing it.. the bootblock is always the first port of call though!
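Since the bootblock is where a look at a trackloader game usually starts, here is a small Python sketch for checking that a 1024-byte bootblock pulled from a disk image has a valid checksum before disassembling it. It assumes the standard layout with the checksum in the second longword, and uses the usual end-around-carry sum over all the other longwords, inverted at the end. The function names and the nearly empty example block are made up.

```python
import struct


def bootblock_checksum(block):
    """Compute the checksum a 1024-byte bootblock should carry at offset 4."""
    if len(block) != 1024:
        raise ValueError("a bootblock is 1024 bytes")
    total = 0
    for index, (longword,) in enumerate(struct.iter_unpack(">I", block)):
        if index == 1:                    # skip the stored checksum itself
            continue
        total += longword
        if total > 0xFFFFFFFF:            # wrap the carry back in
            total = (total & 0xFFFFFFFF) + 1
    return ~total & 0xFFFFFFFF


def has_valid_checksum(block):
    (stored,) = struct.unpack_from(">I", block, 4)
    return stored == bootblock_checksum(block)


# Made-up example: a bootblock containing only the 'DOS\0' identifier.
block = bytearray(1024)
block[0:4] = b"DOS\0"
checksum = bootblock_checksum(bytes(block))
struct.pack_into(">I", block, 4, checksum)
print(hex(checksum), has_valid_checksum(bytes(block)))
```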

crabfists 26 April 2008 20:46

Does anybody know if a manual exists for ReSource 6? I've got the one from v3.06 demo and I'm trying to work through the tutorial but it's very difficult to follow as v6 seems to use completely different menus and commands.

I've got the v6.06 from Codetapper's site but that doesn't include a manual.

Thanks

Photon 30 April 2008 23:50

Ooh, I know the perfect game ! :)

F/A-18 Interceptor!

Give us more missions please! ACE game, that.

mark_k 01 May 2008 01:41

Quote:

Originally Posted by crabfists (Post 410080)
Does anybody know if a manual exists for ReSource 6? I've got the one from v3.06 demo and I'm trying to work through the tutorial but it's very difficult to follow as v6 seems to use completely different menus and commands.

I've got the v6.06 from Codetapper's site but that doesn't include a manual.

Thanks

[I have original ReSource 4.x, 5.x and 6.01, 6.06 packages.] From memory, ReSource version 6 came with the same manual as version 5, which was not hugely different from versions 3 & 4. I don't think anyone has scanned the ReSource version 5 manual, but that might be worthwhile to do.

However, one of the main differences between older versions and v6 is the Amiga OS symbols/equates. Earlier versions had all the different categories (e.g. custom chip register names, NewWindow flag values etc.) as menu items. Version 6 uses a hierarchical selection method implemented using GadTools gadgets. You click a button at the bottom of the screen to bring up the symbols window.

ReSource v6 has an extensive hypertext-like help system built in. Press Help and select any menu item for documentation on that item (navigate the "links" using the cursor keys). That is really useful. Also, you can use the "ShowKeys" utility to display the default key bindings. And when in ReSource itself, you can show which key combination corresponds to any menu item by pressing Ctrl-Shift-Alt-F1 then selecting the menu item. (That's from memory, I don't have my Amiga on right now to double-check.)

Once you get used to the common key bindings it can be quite a smooth process to disassemble an OS-legal program. Even for non-OS-legal ones, the custom chip register definitions and e.g. DMACON and INTENA bit names can really help to figure out what code is doing.

In comparison, I'm sure IDA Pro is much more intelligent in tracing the flow of code execution and making an initial guess about which parts are code and which are data. If someone writes an IDA plugin to add the ability to convert values to symbols as can be done using ReSource, it could become a better option for disassembling Amiga code. As it stands though, using ReSource you can easily convert e.g.
Code:

  MOVE.W #$8380,($96,A0)

to
Code:

  MOVE.W #(DMAF_SETCLR!DMAF_COPPER!DMAF_RASTER!DMAF_MASTER),(DMACON,A0)

which makes figuring out what the code is doing much easier.
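The constant-to-symbol conversion for DMACON can be sketched in a few lines of Python. This only illustrates the idea and is not ReSource's implementation. The bit names follow the DMAF_ names used above, and `decode_dmacon` is a made-up function name.

```python
# DMACON bit masks, SETCLR first and then in ascending bit order.
DMACON_BITS = [
    (0x8000, "DMAF_SETCLR"),
    (0x0001, "DMAF_AUD0"),
    (0x0002, "DMAF_AUD1"),
    (0x0004, "DMAF_AUD2"),
    (0x0008, "DMAF_AUD3"),
    (0x0010, "DMAF_DISK"),
    (0x0020, "DMAF_SPRITE"),
    (0x0040, "DMAF_BLITTER"),
    (0x0080, "DMAF_COPPER"),
    (0x0100, "DMAF_RASTER"),
    (0x0200, "DMAF_MASTER"),
    (0x0400, "DMAF_BLITHOG"),
]


def decode_dmacon(value):
    """Turn a value written to DMACON into a '!'-joined list of bit names."""
    names = []
    remaining = value & 0xFFFF
    for mask, name in DMACON_BITS:
        if remaining & mask:
            names.append(name)
            remaining &= ~mask
    if remaining:                          # bits this table does not name
        names.append(f"${remaining:04X}")
    return "!".join(names) if names else "0"


print(decode_dmacon(0x8380))
```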


All times are GMT +2.
