07 September 2021, 08:38 | #1 |
Registered User
Join Date: Sep 2019
Location: Sydney
Posts: 357
|
Dynamic linking / Relocation / vlink
I'm trying to do a form of dynamic linking without any OS support and it is giving me a headache.
My code is a mix of ASM and C. I'm using VASM, GCC 10 and VLINK. The main executable runs and loads a number of external code modules. The main exe calls functions in the module, which in turn call functions that reside in the main exe. Similarly there are a few bits of shared data between the two, although it is functions that are causing me the headache. ASM modules are generally not an issue, so this is really about C.
All the code is linked using vlink's -brawseg option, and I use the relocation data that it outputs to properly relocate the code (on both the exe and each loaded module) before running it.
Up until now I've been using function pointers to provide the interface between modules. So when a module is initialised, the main exe passes in a table of all its function pointers that the module then stores. The module then passes back a table of its function pointers. This works fine but it does add an indirection to the code for every function called. So instead of: Code:
    jsr function
I have: Code:
    move.l function,a0
    jsr (a0)
I looked at inserting fake symbols into vlink. For example if the module C code has: Code:
    extern void FunctionInMain( void );
    void ModuleFunction( void )
    {
        FunctionInMain();
    }
I then experimented with a bodge, which did at least show partial success. I declare a BSS section in the module with dummy entries for the symbols that come from main, e.g. Code:
    Section __SymbolsToResolveToMainExe,BSS
    _FunctionInMain::
        ds.l 1
This works for data symbols, and for some function symbols. However, for other functions I get errors from vlink that look like this: Code:
    Error 32: Target rawseg: Unsupported relocation type R_PC (offset=0, size=32, mask=ffffffffffffffff) at .text+0x3b98
I'm also suspicious (but not sure) that the functions causing the problem may be ones with a single parameter, and so are using registers to pass parameters rather than the stack. GCC has no way of changing the calling convention for m68k that I can find. There may also be other reasons that this approach will never be reliable: compiler optimisations or linker complexities I'm not familiar with.
I've probably lost 95% of readers by now, but I hope it makes sense to some. It is quite possible that there is a much easier way of doing this. Any help would be appreciated! |
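For readers following along, here is a minimal C sketch of the function-pointer table exchange described in the post above. All names (MainApi, ModuleApi, Module_Init and so on) are hypothetical and only illustrate the pattern; the real interface in the poster's project is not shown in the thread.

```c
/* Hypothetical interface tables; every name here is illustrative only. */
typedef struct MainApi {
    int (*Add)(int a, int b);   /* implemented in the main exe */
} MainApi;

typedef struct ModuleApi {
    int (*Compute)(int x);      /* implemented in the module */
} ModuleApi;

/* ---- main exe side ---- */
static int Main_Add(int a, int b) { return a + b; }
static const MainApi g_mainApi = { Main_Add };

const MainApi *Main_GetApi(void) { return &g_mainApi; }

/* ---- module side ---- */
static const MainApi *s_main;   /* table stored when the module is initialised */

static int Module_Compute(int x)
{
    /* Indirect call through the stored table: this is the extra
     * "load pointer, then jsr (a0)" step the post wants to avoid. */
    return s_main->Add(x, 1);
}

static const ModuleApi s_moduleApi = { Module_Compute };

/* Called once by the main exe after it has relocated the module. */
const ModuleApi *Module_Init(const MainApi *mainApi)
{
    s_main = mainApi;
    return &s_moduleApi;
}
```

The main exe would call Module_Init(Main_GetApi()) once and keep the returned table. Every cross-module call then goes through a pointer load, which is the overhead the rest of the thread tries to remove.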
07 September 2021, 17:41 | #2 |
Registered User
Join Date: Jun 2016
Location: europe
Posts: 1,039
|
When you mentioned a BSS section the first thing on my mind was: are you making sure it's not merged with any other section? Since this involves C and linking, and not raw asm, I don't have a clear picture of what is being produced and what it looks like.
That specific error looks like the generated code is trying to call your fake external function PC-relative and not with 32-bit absolute addressing. As if a symbol is both external/relocatable due to -q, and local since it's being accessed PC-relative for some reason. Maybe merged sections? Just an idea... |
07 September 2021, 20:03 | #3 | |||||||
Natteravn
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,496
|
Like a/b I'm not sure what you are currently generating.
I understand your prerequisite is that the resulting code must be raw binaries, with a simple 32-bit address relocation table appended, like vlink's "rawseg" output format produces? Do you have your own linker script? Quote:
Quote:
Quote:
Quote:
-Dor in the linker script? Quote:
Quote:
So I would guess what happened here is: Your compiler thinks all functions are in the same code section, and it would usually be no problem to call them PC-relative, saving a reloc-table entry. Unless you define the function symbol in the bss section... There may be a compiler option to do absolute calls. Otherwise you have a problem, because the reloc tables generated for the rawseg format only support a single reloc type: 32-bit absolute address relocations. Quote:
|
|||||||
08 September 2021, 07:20 | #4 |
Registered User
Join Date: Sep 2019
Location: Sydney
Posts: 357
|
Thanks for the replies.
The BSS section was the 'bodge' I was referring to. If I got it working, my next step was going to be trying to remove the dummy symbols from the module binaries to save space (because they are just an artifact of tricking the linker into outputting relocation addresses; the addresses themselves are never accessed after the patching process). But this is getting ahead of myself. I also tried a DATA section with no luck. I have now had some luck with using a TEXT section, which I'll talk about in a minute.
Phx: the main module and the external modules all have a fixed function interface decided before anything is compiled, so yes, you got it when you said that they have a fixed order in the BSS section. So the module knows the 'index' of every function inside the main exe and can look up its address from a table. The dummy symbols are created by generating an ASM file that contains a table of them all. This was the only way that I found to get VLINK to output references to them in its relocation output (-brawseg -q). Without that information I wouldn't know the locations that need patching.
I looked at various GCC options including adding things like this to the function declarations: Code:
    __attribute__((section("__FunctionSymbolsToResolveToMainExe")))
This is a good step forward, but I found that whilst the majority of patched functions work without problem, I get a few that crash. These seem to correspond with the same errors VLINK gave originally. On further examination, GCC is outputting 'jsr _DummyFunctionToBePatched' for the functions that work, but 'jra _DummyFunctionToBePatched' for those that don't work. I should mention at this point that I'm targeting 68020 CPUs, although I'd like all this to be compatible with 68000 too.
From disassembling, I can see the binary has 'jsr $abs_address' for the working functions, and 'bra.l $rel_address' for the non-working functions. My knowledge of all the different branch instructions is less than perfect, as is my knowledge of the exact process between compiler output and linker output, but my guess as to what is happening here is that the compiler emits 'jra', which lets the linker choose the 'best' branch instruction. If the branch destination is within range, the linker is free to use bra.l, which is not compatible with the patching that I'm doing.
Continuing this speculation, the next things I could look into are:
1. Force GCC to output jsr and not jra (I don't think this option exists).
2. Force VLINK to use jsr for certain symbols (I don't think this exists either).
3. Trick VLINK into using jsr by modifying the linker script to place the symbols out of range of the main program.
4. Patch the outputted binary to change bra.l to jsr.
I'm mid-way through trying stuff so the above is just a brain dump of where I am right now... Thanks again for the help. |
08 September 2021, 07:36 | #5 |
Registered User
Join Date: Sep 2019
Location: Sydney
Posts: 357
|
On further inspection, the bra.l instructions are (understandably) absent from the VLINK relocation output, which limits my options.
|
08 September 2021, 08:05 | #6 |
Registered User
Join Date: Sep 2019
Location: Sydney
Posts: 357
|
I'm thinking that instead of trying to trick VLINK into outputting the symbols I want in the relocation output, I'd be better off just reading the binary output, looking for all branches, checking if they go to one of the symbols I want (by parsing the VLINK map file output) and then patching the instructions.
But... what happens if there is a short branch and I need to change it to an absolute jmp? I'd have to replace 4 bytes with 6 bytes, which would likely cause a heap of new problems as everything after it would be offset. |
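A minimal sketch of the one case where in-place patching is size-neutral: on the 68020, bra.l ($60FF plus a 32-bit displacement) and jmp abs.l ($4EF9 plus a 32-bit address) are both 6 bytes, as are bsr.l ($61FF) and jsr abs.l ($4EB9). The function name and the assumption that the image is linked at address 0 are mine, not from the thread. Short and word-sized branches are deliberately left alone, because they cannot be widened without shifting everything after them.

```c
#include <stddef.h>
#include <stdint.h>

/* Read/write a big-endian 32-bit value, as stored in 68k code. */
static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

static void wr32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Hypothetical helper: rewrites a bra.l/bsr.l at byte offset `off` into
 * jmp/jsr abs.l with the same target. Assumes the image is linked at
 * address 0, so the new operand is an image-relative address that still
 * needs a 32-bit reloc entry at off + 2.
 * Returns 1 if rewritten, 0 if there is no long branch at `off`. */
int LongBranchToAbsolute(uint8_t *image, size_t size, size_t off)
{
    if (off > size || size - off < 6)
        return 0;

    uint16_t op = (uint16_t)((image[off] << 8) | image[off + 1]);
    uint16_t newOp;
    if (op == 0x60FFu)
        newOp = 0x4EF9u;            /* bra.l -> jmp abs.l */
    else if (op == 0x61FFu)
        newOp = 0x4EB9u;            /* bsr.l -> jsr abs.l */
    else
        return 0;

    /* The displacement is relative to the instruction address + 2;
     * unsigned wrap-around handles backward branches. */
    uint32_t target = (uint32_t)off + 2u + rd32(image + off + 2);

    image[off]     = (uint8_t)(newOp >> 8);
    image[off + 1] = (uint8_t)(newOp & 0xFFu);
    wr32(image + off + 2, target);
    return 1;
}
```

Finding the offsets to pass in is the hard part. Scanning raw bytes for $60FF can hit data or the middle of another instruction, so this is only safe when the branch locations are known from some other source.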
08 September 2021, 09:19 | #7 | ||
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Quote:
Quote:
Your specific problem is a lot easier to handle in asm than in C. I would trick the asm with macros that generate a place holder of the right size for the function call and send data to a separate section for the list of places to patch. Is this possible in C, I do not know. |
||
08 September 2021, 09:54 | #8 |
Registered User
Join Date: Jun 2016
Location: europe
Posts: 1,039
|
If you are thinking about doing an extra processing step on your own, how about defining each function you want to call as an absolute address in some specific range that has ~0% chance of representing something else (e.g. FN1 EQU $123456xx, XDEF FN1, etc. in an asm module that you then link with each external code module) instead of defining them in a BSS section?
That way each call is guaranteed to be "jsr $123456xx". You then preprocess the file and look for $4eb9 followed by $123456xx, and you can replace the addresses with proper offsets and extend the module's reloc table so you have all you need in your main exe. No BSS section to remove for extra space, all calls are 2+4 bytes, all reloc entries are in the reloc table, ... As Meynaf noted, these kinds of things are not that hard to do in asm, but C and the linker... You have to play by their rules. Maybe phx has more ideas how to do this in a C-friendly way. |
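A minimal sketch of the preprocessing step suggested above, assuming the low byte of $123456xx is used as an index into a table of main-exe offsets. The function name, the table layout and the word-aligned scan are my assumptions; the thread does not specify them.

```c
#include <stddef.h>
#include <stdint.h>

#define PLACEHOLDER_BASE 0x12345600u   /* the $123456xx range from the post */
#define PLACEHOLDER_MASK 0xFFFFFF00u
#define OP_JSR_ABS_L     0x4EB9u       /* jsr (xxx).l */

static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

static void wr32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Hypothetical helper: scans `image` on word boundaries for
 * "jsr $123456xx", replaces each operand with mainOffsets[xx] and stores
 * the operand's byte offset in relocOut, so the caller can append it to
 * the module's reloc table.
 * Returns the number of calls patched, or -1 if an index is out of range
 * or relocOut is full. */
int PatchPlaceholderCalls(uint8_t *image, size_t size,
                          const uint32_t *mainOffsets, size_t funcCount,
                          uint32_t *relocOut, size_t relocMax)
{
    int patched = 0;

    for (size_t off = 0; off + 6 <= size; off += 2) {
        uint16_t op = (uint16_t)((image[off] << 8) | image[off + 1]);
        if (op != OP_JSR_ABS_L)
            continue;

        uint32_t operand = rd32(image + off + 2);
        if ((operand & PLACEHOLDER_MASK) != PLACEHOLDER_BASE)
            continue;

        uint32_t index = operand & 0xFFu;
        if (index >= funcCount || (size_t)patched >= relocMax)
            return -1;

        wr32(image + off + 2, mainOffsets[index]);
        relocOut[patched++] = (uint32_t)(off + 2);
        off += 4;   /* skip the operand that was just rewritten */
    }
    return patched;
}
```

The scan only handles jsr. If the compiler emits a tail call, the same check would also be needed for jmp abs.l ($4EF9).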
08 September 2021, 19:51 | #9 |
Natteravn
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,496
|
I tried to reproduce the pc-relative function call problem with gcc 2.95.3 on my A3000 and with the latest gcc I got, gcc 5.4.0, configured as a m68k cross-compiler on my NetBSD server. Both always emit JSR for function calls (no matter if with or without -m68020 or optimisation). I guess this has something to do with the compiler configuration, when you build it (gcc -v ?).
So there are only two solutions left:
If you can go for the first solution (maybe even useful for the second with some modifications) I can present a nicer way to deal with your dynamic function pointers: You need a special linker script and a dummy-object with all your function calls. For example dummy.s, assembles into dummy.o: Code:
    frank@altair cat dummy.s
            section dynamic,code
            xdef    alpha
            xdef    beta
    alpha:  ds.b    1
    beta:   ds.b    1
    frank@altair vasmm68k_mot -quiet -Fvobj -o dummy.o dummy.s
The section name will be important for the linker script: Code:
    PHDRS {
      main PT_LOAD;
      dyn PT_LOAD;
    }
    SECTIONS {
      . = 0;
      .text: { *(CODE) } :main
      .data: { *(DATA) }
      .sdata: {
        _LinkerDB = . + 0x7ffe;
        _SDA_BASE_ = . + 0x7ffe;
        *(.sdata __MERGED)
      }
      .bss (NOLOAD): { *(BSS) }
      . = 0x12345678;
      .dynamic (NOLOAD): { *(dynamic) } :dyn
    }
Feel free to adapt the linker script to your needs. You will certainly want to use some gcc section names instead (.text, .data, .bss, etc.) or additionally. And here is an example main module, calling these external functions, and an internal one. I assemble with -no-opt, so the relocation for the internal call is not optimised away. Code:
    frank@altair cat main.s
            code
    main:
            jsr     alpha
            jsr     beta
            jsr     gamma
            rts
    gamma:
            moveq   #1,d0
            rts
    frank@altair vasmm68k_mot -quiet -Fvobj -no-opt -o main.o main.s
    vlink -brawseg -o tst -q -T ldscript main.o dummy.o
and you will get the following files: Code:
    -rwxr-xr-x 1 frank wheel 20 Aug 19 18:42 tst
    -rw-r--r-- 1 frank wheel 24 Aug 19 18:42 tst.main
    -rw-r--r-- 1 frank wheel 12 Aug 19 18:42 tst.main.reldyn
    -rw-r--r-- 1 frank wheel  8 Aug 19 18:42 tst.main.relmain
Code:
    frank@altair hexdump -C tst.main.reldyn
    00000000  00 00 00 02 00 00 00 02  00 00 00 08              |............|
    frank@altair hexdump -C tst.main
    00000000  4e b9 00 00 00 00 4e b9  00 00 00 01 4e b9 00 00  |N.....N.....N...|
    00000010  00 14 4e 75 70 01 4e 75                           |..Nup.Nu| |
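A minimal sketch of how a loader might consume tst.main.reldyn, going only by the hexdump above. It assumes the file is a big-endian 32-bit entry count followed by that many 32-bit offsets into tst.main, and that the longword at each offset holds the symbol's byte offset within the dynamic section (0 for alpha, 1 for beta, since each was declared with ds.b 1), which is used here as a table index. Both the file layout and the function name are my reading of the example, not documented behaviour.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

static void wr32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Hypothetical loader step: for every entry in the .reldyn data, read the
 * index stored in the image and overwrite it with the real address of
 * that function in the main exe.
 * Assumed layout of `rel`: count (32-bit BE), then `count` offsets.
 * Returns the number of patched entries, or -1 on malformed input
 * (the image may be partially patched in that case). */
int ApplyDynRelocs(uint8_t *image, size_t imageSize,
                   const uint8_t *rel, size_t relSize,
                   const uint32_t *mainAddrs, size_t addrCount)
{
    if (relSize < 4)
        return -1;

    uint32_t count = rd32(rel);
    if ((relSize - 4) / 4 < count)
        return -1;

    for (uint32_t i = 0; i < count; i++) {
        uint32_t off = rd32(rel + 4 + 4 * (size_t)i);
        if (off > imageSize || imageSize - off < 4)
            return -1;

        uint32_t index = rd32(image + off);
        if (index >= addrCount)
            return -1;

        wr32(image + off, mainAddrs[index]);
    }
    return (int)count;
}
```

With the bytes from the hexdump, the operands at offsets 2 and 8 would be replaced by the addresses of alpha and beta, while the internal jsr gamma at offset 14 is left for the ordinary relocation pass (tst.main.relmain).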
09 September 2021, 00:42 | #10 |
Registered User
Join Date: Sep 2019
Location: Sydney
Posts: 357
|
If you want to see the jbsr emitted by GCC, you can do it using compiler explorer.
Go here https://franke.ms/cex/
Paste this code in the left panel: Code:
    extern void Func( int i );
    void main( void )
    {
        Func( 0 );
    }
Change the compiler command line to Code:
    -O3 -m68020
Observe the ASM output. I've tried all sorts to make it emit jsr, with no luck. I do notice that GCC 6.5 (which I think is Beppo's build of GCC for custom optimisations for Amiga) outputs jsr. However, even if I wasn't reluctant to use an older version of GCC, I did try this version in the past and couldn't get my program to run. |
09 September 2021, 02:05 | #11 |
Registered User
Join Date: Sep 2019
Location: Sydney
Posts: 357
|
Also here is my gcc -v
Code:
    Using built-in specs.
    COLLECT_GCC=m68k-amiga-elf-gcc
    COLLECT_LTO_WRAPPER=q:/projects/mark2/trunk/runtime/backbonexamiga/amigacompiler/win/bin/opt/bin/../libexec/gcc/m68k-amiga-elf/10.1.0/lto-wrapper.exe
    Target: m68k-amiga-elf
    Configured with: ../gcc-10.1.0/configure --target=m68k-amiga-elf --disable-nls --enable-languages=c,c++ --enable-lto --prefix=/mnt/c/amiga-mingw/opt --disable-libssp --disable-gcov --disable-multilib --disable-threads --with-cpu=68000 --disable-libsanitizer --disable-libada --disable-libgomp --disable-libvtv --disable-nls --disable-clocale --host=x86_64-w64-mingw32 --enable-static
    Thread model: single
    Supported LTO compression algorithms: zlib
    gcc version 10.1.0 (GCC) |
09 September 2021, 05:36 | #12 |
Registered User
Join Date: Sep 2019
Location: Sydney
Posts: 357
|
Earlier I'd misunderstood how the jra and jbsr instructions work. I thought they were emitted by the compiler for the linker to decide on the instruction to use, but the reality (obvious now) is that GCC outputs jbsr/jra pseudo-ops, and then GAS (the GNU assembler) decides what to replace them with.
This link provides a table of the pseudo-ops that GCC emits and the valid 68k instructions that GAS may replace them with. Looking at GAS, it has an option to 'turn jbsr into jsr', but not 'turn jra into jmp', so no help there.
But there is an easy solution. Instead of having GCC compile directly to object files, I compile to ASM first, then modify the ASM file, then use GAS to assemble the final object file. To modify the ASM, I do a find and replace of 'jra x' with 'jmp x' and 'jbsr x' with 'jsr x', where x must appear in the list of external symbols I already have. And like magic, it works! Whether this is truly robust, and whether the data symbols are safe from this kind of thing, is something I'm not sure of yet.
Next step is to look more at how the sections are generated (your example linker script will be very useful Phx, thanks), to see if I can clean that up. |
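A minimal sketch of the per-line rewrite described above (jra to jmp, jbsr to jsr, only for known external symbols). The function name and the simple "mnemonic, whitespace, symbol" line shape are my assumptions; real compiler output may contain labels, comments or operands this does not handle, and the poster's actual script is not shown in the thread.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies `line` to `out`, replacing the mnemonic
 * "jra" with "jmp" or "jbsr" with "jsr" when the operand is exactly one
 * of the names in `externs`. All other lines are copied unchanged.
 * Returns 1 if the line was rewritten, 0 otherwise. */
int RewriteCallLine(const char *line,
                    const char *const *externs, size_t nExterns,
                    char *out, size_t outSize)
{
    const char *p = line;
    while (*p == ' ' || *p == '\t')
        p++;

    /* Isolate the mnemonic. */
    const char *mn = p;
    while (*p && !isspace((unsigned char)*p))
        p++;
    size_t mnLen = (size_t)(p - mn);

    const char *repl = NULL;
    if (mnLen == 3 && strncmp(mn, "jra", 3) == 0)
        repl = "jmp";
    else if (mnLen == 4 && strncmp(mn, "jbsr", 4) == 0)
        repl = "jsr";

    if (repl != NULL) {
        const char *afterMn = p;   /* separator and operand, kept verbatim */
        while (*p == ' ' || *p == '\t')
            p++;

        /* Isolate the operand and compare it against the extern list. */
        const char *sym = p;
        while (*p && !isspace((unsigned char)*p))
            p++;
        size_t symLen = (size_t)(p - sym);

        for (size_t i = 0; i < nExterns; i++) {
            if (strlen(externs[i]) == symLen &&
                strncmp(sym, externs[i], symLen) == 0) {
                snprintf(out, outSize, "%.*s%s%s",
                         (int)(mn - line), line, repl, afterMn);
                return 1;
            }
        }
    }

    snprintf(out, outSize, "%s", line);
    return 0;
}
```

Matching the whole operand rather than a prefix keeps a symbol such as _FunctionInMain2 from being rewritten when only _FunctionInMain is in the list.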
09 September 2021, 09:16 | #13 |
Registered User
Join Date: Sep 2019
Location: Sydney
Posts: 357
|
When I use the linker script you suggested I get:
Code:
    Warning 22: Attributes of section .dynamic were changed from r-x- in Linker Script LinkerScript.txt to rwx- in ModuleFunctionsASM.o.
It runs, and I'm getting a slightly smaller binary output for the modules (I guess it trims the storage for the dummy symbols with the NOLOAD command?), but is there a way to either disable or fix that warning? |
09 September 2021, 13:03 | #14 | ||||||
Natteravn
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,496
|
Quote:
I can still only suggest asking the responsible person about it. Using register arguments by default is also a strange decision, IMHO. Quote:
Quote:
Quote:
Quote:
Quote:
-nowarn=22. |
||||||