English Amiga Board


Old 07 September 2021, 08:38   #1
Muzza
Registered User
 
Join Date: Sep 2019
Location: Sydney
Posts: 357
Dynamic linking / Relocation / vlink

I'm trying to do a form of dynamic linking without any OS support and it is giving me a headache.

My code is a mix of ASM and C. I'm using VASM, GCC 10 and VLINK.


The main executable runs and loads a number of external code modules. The main exe calls functions in the module, that in turn call functions that reside in the main exe.
Similarly there are a few bits of shared data between the two, although it is functions that are causing me the headache.
ASM modules are generally not an issue, so this is really about C.


All the code is linked using vlink's -brawseg option, and I use the relocation data that it outputs to properly relocate the code (on both the exe and each loaded module) before running it.
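For readers following along, the relocation step described here can be sketched roughly as below. This is only an illustration, not the actual loader from this project: the function name `relocate_image` is made up, and the table layout (a 32-bit count followed by that many 32-bit offsets, all big-endian) is an assumption based on the hexdump phx posts later in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 32-bit value (m68k byte order) from a byte buffer. */
static uint32_t rd32be(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Write a big-endian 32-bit value into a byte buffer. */
static void wr32be(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/*
 * Apply 32-bit absolute relocations to a loaded image (hypothetical helper).
 * ASSUMED table layout: one 32-bit count, then that many 32-bit offsets
 * into the image. Each referenced longword gets load_addr added to it.
 * Returns the number of relocations applied, or -1 if the table is
 * truncated or an offset falls outside the image (earlier entries may
 * already have been applied in that case).
 */
static long relocate_image(uint8_t *image, size_t image_size,
                           const uint8_t *table, size_t table_size,
                           uint32_t load_addr)
{
    assert(image != NULL && table != NULL);
    if (table_size < 4)
        return -1;
    uint32_t count = rd32be(table);
    if ((table_size - 4) / 4 < count)
        return -1;
    for (uint32_t i = 0; i < count; i++) {
        uint32_t off = rd32be(table + 4 + 4 * (size_t)i);
        if (image_size < 4 || off > image_size - 4)
            return -1;
        wr32be(image + off, rd32be(image + off) + load_addr);
    }
    return (long)count;
}
```

On the Amiga itself the byte-wise helpers would not be needed, since the CPU is big-endian already; they are only there so the sketch also runs on a host machine.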

Up until now I've been using function pointers to provide the interface between modules. So when a module is initialised, the main exe passes in a table of all its function pointers that the module then stores. The module then passes back a table of its function pointers. This works fine but it does add an indirection to the code for every function called.

So instead of:
Code:
jsr function
It has to do:
Code:
move.l function,a0
jsr (a0)
What I really want to do is to generate jsr function, but then patch it as part of the relocation work to point to the real function address in the main exe.
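To make the current function-pointer scheme concrete, here is a minimal C sketch of the table exchange. All names (`MainInterface`, `ModuleInterface`, `Module_Init`, `AddScore`, `Update`) are hypothetical and only stand in for whatever the real interface contains.

```c
#include <assert.h>
#include <stddef.h>

/* Functions the main exe exposes to a module (hypothetical names). */
typedef struct MainInterface {
    int (*AddScore)(int points);
} MainInterface;

/* Functions a module hands back to the main exe (hypothetical names). */
typedef struct ModuleInterface {
    int (*Update)(int frame);
} ModuleInterface;

/* ---- main exe side ---- */
static int g_score;

static int Main_AddScore(int points)
{
    g_score += points;
    return g_score;
}

static const MainInterface g_mainInterface = { Main_AddScore };

/* ---- module side ---- */
static const MainInterface *g_main;   /* table stored at init time */

static int Module_Update(int frame)
{
    /* Indirect call: the pointer is loaded from the table first,
       which is the extra "move.l function,a0 / jsr (a0)" step. */
    return g_main->AddScore(frame * 10);
}

static const ModuleInterface g_moduleInterface = { Module_Update };

/* Module entry point: receives main's table and returns its own. */
static const ModuleInterface *Module_Init(const MainInterface *mainFuncs)
{
    assert(mainFuncs != NULL);
    g_main = mainFuncs;
    return &g_moduleInterface;
}
```

The main exe would call `Module_Init(&g_mainInterface)` once after loading and relocating the module, then call through the returned table.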

I looked at inserting fake symbols into vlink. For example if the module C code has:
Code:
extern void FunctionInMain( void );
void ModuleFunction( void )
{
  FunctionInMain();
}
Then you can create a dummy symbol _FunctionInMain whilst linking the module. The problem is that it does not appear in the relocation output, so I've no way of finding the locations that need to be patched.


I then experimented with a bodge, which did at least show partial success.
I declare a BSS section in the module with dummy entries for the symbols that come from main. e.g.
Code:
Section __SymbolsToResolveToMainExe,BSS
_FunctionInMain:: ds.l 1
This allows the module to link, and vlink will output an address for the symbol, and include it in its relocation output. I can parse these and then record the addresses that need to be patched so that whenever the module code references its own version of _FunctionInMain, I change the address to the main exe's version of _FunctionInMain.

This works for data symbols, and for some function symbols. For other functions, however, I get errors from vlink that look like this:
Code:
Error 32: Target rawseg: Unsupported relocation type R_PC (offset=0, size=32, mask=ffffffffffffffff) at .text+0x3b98
I think this is to do with vlink being told to keep all relocations, the -q option, but I don't have enough knowledge of linkers to fully understand what is happening here. If I turn off the -q option, it will at least link without error, but I don't get the relocation data output, which defeats the purpose.

I'm also suspicious (but not sure) that the functions causing the problem may be ones with a single parameter, and so are using registers to pass parameters rather than the stack. GCC has no way of changing calling convention for m68k that I can find. There may also be other reasons that this approach will never be reliable - compiler optimisations or linker complexities I'm not familiar with.

I've probably lost 95% of readers by now, but I hope it makes sense to some. It is quite possible that there is a much easier way of doing this. Any help would be appreciated!
Old 07 September 2021, 17:41   #2
a/b
Registered User
 
Join Date: Jun 2016
Location: europe
Posts: 1,039
When you mentioned a BSS section the first thing on my mind was: are you making sure it's not merged with any other section? Since this involves C and linking, and not raw asm, I don't have a clear picture of what is being produced and what it looks like.
That specific error looks like the generated code is trying to call your fake external function as PC relative and not with 32-bit absolute addressing. As if a symbol is both external/relocatable due to -q, and local since it's being accessed as PC relative for some reason. Maybe merged sections?
Just an idea...
Old 07 September 2021, 20:03   #3
phx
Natteravn
 
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,496
Like a/b I'm not sure what you are currently generating.
I understand your prerequisite is that the resulting code must be raw binaries, with a simple 32-bit address relocation table appended, like vlink's "rawseg" output format produces? Do you have your own linker script?

Quote:
Originally Posted by Muzza View Post
I'm trying to do a form of dynamic linking without any OS support and it is giving me a headache.
Usually dynamic linking is symbolic. The run-time link editor looks for unresolved references to shared objects by their name, and loads the shared objects (or maps them into your address space). There are definitely no symbol names in the rawseg output.

Quote:
The main executable runs and loads a number of external code modules. The main exe calls functions in the module, that in turn call functions that reside in the main exe.
For simplification: do all these external modules always have the same function interface?

Quote:
Up until now I've been using function pointers to provide the interface between modules. So when a module is initialised, the main exe passes in a table of all its function pointers that the module then stores. The module then passes back a table of its function pointers. This works fine but it does add an indirection to the code for every function called.
Then you might be surprised to hear that it works similarly with dynamic linking of ELF shared objects under Unix. The PLT (procedure linkage table) does that indirection here.

Quote:
I looked at inserting fake symbols into vlink. For example if the module C code has:
Code:
extern void FunctionInMain( void );
void ModuleFunction( void )
{
  FunctionInMain();
}
Then you can create a dummy symbol _FunctionInMain whilst linking the module. The problem is that it does not appear in the relocation output, so I've no way of finding the locations that need to be patched.
How did you create these dummy symbols? With -D or in the linker script?

Quote:
I can parse these and then record the addresses that need to be patched so that whenever the module code references its own version of _FunctionInMain, I change the address to the main exe's version of _FunctionInMain.
Ok... I think I get the idea. You know which pointer to the main module you are currently dealing with, because they have a fixed order in the BSS section?

Quote:
Code:
  Error 32: Target rawseg: Unsupported relocation type R_PC (offset=0, size=32, mask=ffffffffffffffff) at .text+0x3b98
I think this is to do with vlink being told to keep all relocations, the -q option, but I don't have enough knowledge of linkers to fully understand what is happening here.
You definitely want to keep a relocation table, so -q is correct. R_PC is a pc-relative relocation type, which can only occur between different sections. PC-relative references in the same section can be resolved and will disappear.

So I would guess what happened here is: Your compiler thinks all functions are in the same code section, and it would usually be no problem to call them PC-relative, saving a reloc-table entry. Unless you define the function symbol in the bss section...

There may be a compiler option to do absolute calls. Otherwise you have a problem, because the reloc tables generated for the rawseg format only support a single reloc type: 32-bit absolute address relocations.
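The difference between the two reference types can be shown with a little arithmetic. The sketch below is only an illustration (the function names are made up): an absolute operand is just the target address, while a 68020 `bsr.l`/`bra.l` operand is the distance from the opcode address plus 2, so it depends on where both the instruction and the target end up.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Operand stored after "jsr (xxx).l": simply the target address.
 * It changes whenever the target moves, so it needs a reloc entry.
 */
static uint32_t abs_operand(uint32_t target_addr)
{
    return target_addr;
}

/*
 * Operand stored after "bsr.l" / "bra.l" (68020+): displacement from the
 * address of the opcode word plus 2. If instruction and target move by
 * the same amount (same section), the value stays the same and no reloc
 * entry is needed. If only one of them moves, it must be recomputed.
 */
static int32_t pcrel_operand(uint32_t insn_addr, uint32_t target_addr)
{
    assert((insn_addr & 1u) == 0);   /* m68k instructions are word aligned */
    return (int32_t)(target_addr - (insn_addr + 2u));
}
```

With a call at 0x1000 and a target at 0x1100 the displacement is 0xfe, and it stays 0xfe if both are loaded 0x50000 higher. Once the target lives in a different module, a table that can only say "add the load address to this longword" has no way to express the fixup, which would explain the error.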

Quote:
Any help would be appreciated!
If you want, you can also contact me directly by email and send me some example files. (I'm the author of vlink.)
Old 08 September 2021, 07:20   #4
Muzza
Thanks for the replies.

The BSS section was the 'bodge' I was referring to. If I got it working, my next step was going to be trying to remove the dummy symbols from the module binaries to save space (because they are just an artifact of tricking the linker into outputting relocation addresses, the addresses themselves are never accessed after the patching process). But this is getting ahead of myself. I also tried a DATA section with no luck. I have now had some luck with using a TEXT section, which I'll talk about in a minute.

Phx: the main module and the external modules all have a fixed function interface decided before anything is compiled, so yes, you got it when you said that they have a fixed order in the BSS section. So the module knows the 'index' of every function inside the main exe and can look up its address from a table.

The dummy symbols are created by generating an ASM file that contains a table of them all. This was the only way that I found to get VLINK to output references to them in its relocation output (-brawseg -q). Without that information I wouldn't know the locations that need patching.

I looked at various GCC options including adding things like this to the function declarations:
Code:
__attribute__((section("__FunctionSymbolsToResolveToMainExe")))
But none of this made a difference. I have however managed to get VLINK to link the modules without error. I separated the data pointers and the function pointers in the dummy symbol table, kept the data in a BSS section, but moved the function pointers to a TEXT section.

This is a good step forward, but I found that whilst the majority of patched functions work without problem, I get a few that crash. These seem to correspond with the same errors VLINK gave originally. On further examination, GCC is outputting 'jsr _DummyFunctionToBePatched' for the functions that work, but 'jra _DummyFunctionToBePatched' for those that don't work. I should mention at this point I'm targeting 68020 CPUs, although I'd like all this to be compatible with 68000 too.

From disassembling, I can see the binary has 'jsr $abs_address' for the working functions, and 'bra.l $rel_address' for the non-working functions.

My knowledge of all the different branch instructions is less than perfect, as is the exact process between compiler output and linker output, but my guess as to what is happening here is the compiler emits 'jra', which lets the linker choose the 'best' branch instruction. If the branch destination is within range, the linker is free to use bra.l, which is not compatible with the patching that I'm doing.

Continuing this speculation, the next things I could look into are:
1. Force GCC to output jsr and not jra (I don't think this option exists).
2. Force VLINK to use jsr for certain symbols (I don't think this exists either).
3. Trick VLINK into using jsr by modifying the linker script to place the symbols out of range of the main program.
4. Patch the outputted binary to change bra.l to jsr.

I'm mid-way through trying stuff so the above is just a brain dump of where I am right now...
Thanks again for the help.
Old 08 September 2021, 07:36   #5
Muzza
On further inspection, the bra.l instructions are (understandably) absent from the VLINK relocation output, which limits my options.
Old 08 September 2021, 08:05   #6
Muzza
Thinking that instead of trying to trick VLINK into outputting the symbols I want in the relocation output, I'd be better off just reading the binary output, looking for all branches, checking if they go to one of the symbols I want (by parsing the VLINK map file output) and then patching the instructions.
But... what happens if there is a short branch and I need to change it to an absolute jmp? I'd have to replace 4 bytes with 6 bytes, which would likely cause a heap of new problems as everything after it would be offset.
Old 08 September 2021, 09:19   #7
meynaf
son of 68k
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
Quote:
Originally Posted by Muzza View Post
Thinking that instead of trying to trick VLINK into outputting the symbols I want in the relocation output, I'd be better off just reading the binary output, looking for all branches, checking if they go to one of the symbols I want (by parsing the VLINK map file output) and then patching the instructions.
Be aware that if you do that, you may end up with false positives, where data or instruction operands happen to look like branch instructions.


Quote:
Originally Posted by Muzza View Post
But... what happens if there is a short branch and I need to change it to an absolute jmp? I'd have to replace 4 bytes with 6 bytes, which would likely cause a heap of new problems as everything after it would be offset.
That's opening Pandora's box. All addresses inside change, so you have to reinterpret every instruction. Not sure it is even possible to do in an automatic way. If someone successfully did that, I'd love to get my hands on it!

Your specific problem is a lot easier to handle in asm than in C. I would trick the asm with macros that generate a place holder of the right size for the function call and send data to a separate section for the list of places to patch. Is this possible in C, I do not know.
Old 08 September 2021, 09:54   #8
a/b
If you are thinking about doing an extra processing step on your own, how about defining each function you want to call as an absolute address in some specific range that has ~0% chance of representing something else (e.g. FN1 EQU $123456xx, XDEF FN1, etc. in an asm module that you then link with each external code module) instead of defining them in a BSS section?
That way each call is guaranteed to be "jsr $123456xx", then you preprocess the file and look for $4eb9 followed by $123456xx, and you can replace the addresses with proper offsets and extend the module's reloc table so you have all you need in your main exe. No BSS section to remove for extra space, all calls are 2+4 bytes, all reloc entries are in the reloc table, ...
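A rough sketch of that preprocessing step, written as a host-side tool in C. Everything here is illustrative: the function name `patch_marker_calls`, the `targets` array, and the choice to use the low byte of $123456xx as a function index are assumptions layered on top of the idea, not an existing tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OP_JSR_ABS_L 0x4EB9u       /* jsr (xxx).l */
#define MARKER_BASE  0x12345600u   /* FNn EQU $123456nn (assumed scheme) */
#define MARKER_MASK  0xFFFFFF00u

static uint16_t rd16be(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t rd32be(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

static void wr32be(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/*
 * Scan 'code' on word boundaries for "jsr $123456nn", replace the operand
 * with targets[nn], and record the operand's offset in reloc_out so it can
 * be appended to the module's reloc table.
 * Returns the number of patched calls, or -1 if reloc_out is full or an
 * index is beyond num_targets.
 */
static long patch_marker_calls(uint8_t *code, size_t size,
                               const uint32_t *targets, size_t num_targets,
                               uint32_t *reloc_out, size_t reloc_cap)
{
    assert(code != NULL && targets != NULL && reloc_out != NULL);
    long n = 0;
    for (size_t off = 0; off + 6 <= size; off += 2) {
        if (rd16be(code + off) != OP_JSR_ABS_L)
            continue;
        uint32_t operand = rd32be(code + off + 2);
        if ((operand & MARKER_MASK) != MARKER_BASE)
            continue;
        uint32_t index = operand & ~MARKER_MASK;
        if (index >= num_targets || (size_t)n >= reloc_cap)
            return -1;
        wr32be(code + off + 2, targets[index]);
        reloc_out[n++] = (uint32_t)(off + 2);
        off += 4;   /* skip the operand; the loop adds the final 2 */
    }
    return n;
}
```

Requiring both the opcode and the marker prefix keeps the chance of a false positive low, but since the scan does not decode instructions it cannot rule one out completely.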

As Meynaf noted, these kinds of things are not that hard to do in asm, but c and linker... You have to play by their rules. Maybe phx has more ideas how to do this in a c-friendly way.
Old 08 September 2021, 19:51   #9
phx
I tried to reproduce the pc-relative function call problem with gcc 2.95.3 on my A3000 and with the latest gcc I got, gcc 5.4.0, configured as a m68k cross-compiler on my NetBSD server. Both always emit JSR for function calls (no matter if with or without -m68020 or optimisation). I guess this has something to do with the compiler configuration, when you build it (gcc -v ?).

So there are only two solutions left:
  • Talk with the person responsible for your compiler and ask him how to guarantee absolute function calls.
  • Implement a dummy object for indirection, linked to the code section. It would include all external function labels with an absolute JMP to the real function. This object might also be easier to patch and implement some dynamic linking at runtime.
A third solution would be that I hack vlink and add another table for pc-relative relocations, but it would unnecessarily complicate things.
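To illustrate the second option, here is a hedged C sketch of how a loader might fill such an indirection object at run time. The helper name `build_jump_table` and the 6-byte stub layout are assumptions; each stub is one absolute `jmp (xxx).l` (opcode 0x4EF9 followed by the 32-bit target).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OP_JMP_ABS_L 0x4EF9u   /* jmp (xxx).l */
#define STUB_SIZE    6u        /* 2-byte opcode + 4-byte address */

/*
 * Fill 'stubs' with one "jmp target.l" per entry (hypothetical helper).
 * The module's own code calls the stub labels, which sit at fixed places
 * inside the module, so only this table has to be rewritten once the real
 * addresses in the main exe are known.
 * Returns the number of bytes written, or 0 if the buffer is too small.
 */
static size_t build_jump_table(uint8_t *stubs, size_t stubs_size,
                               const uint32_t *targets, size_t count)
{
    assert(stubs != NULL && targets != NULL);
    if (count > stubs_size / STUB_SIZE)
        return 0;
    for (size_t i = 0; i < count; i++) {
        uint8_t *p = stubs + i * STUB_SIZE;
        p[0] = (uint8_t)(OP_JMP_ABS_L >> 8);
        p[1] = (uint8_t)(OP_JMP_ABS_L & 0xFFu);
        p[2] = (uint8_t)(targets[i] >> 24);
        p[3] = (uint8_t)(targets[i] >> 16);
        p[4] = (uint8_t)(targets[i] >> 8);
        p[5] = (uint8_t)targets[i];
    }
    return count * STUB_SIZE;
}
```

This keeps an indirection (a jsr to the stub, then a jmp), so it trades a little speed for simplicity. On a 68020 or later the instruction cache would also need flushing after the stubs are written.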


If you can go for the first solution (maybe even useful for the second with some modifications) I can present a nicer way to deal with your dynamic function pointers:
You need a special linker script and a dummy-object with all your function calls. For example dummy.s, assembles into dummy.o:
Code:
frank@altair cat dummy.s 
        section dynamic,code
        xdef    alpha
        xdef    beta
alpha:  ds.b    1
beta:   ds.b    1
frank@altair vasmm68k_mot -quiet -Fvobj -o dummy.o dummy.s
This dummy module defines the functions alpha and beta of the external module. The ds.b makes sure all labels get section-offsets which are easy to identify: 0, 1, etc.

The section name will be important for the linker script:
Code:
PHDRS {
        main PT_LOAD;
        dyn PT_LOAD;
}

SECTIONS {
        . = 0;
        .text: {
                *(CODE)
        } :main
        .data: { *(DATA) }
        .sdata: {
                _LinkerDB = . + 0x7ffe;
                _SDA_BASE_ = . + 0x7ffe;
                *(.sdata __MERGED)
        }
        .bss (NOLOAD): { *(BSS) }

        . = 0x12345678;
        .dynamic (NOLOAD): {
                *(dynamic)
        } :dyn
}
It defines two segments, main and dyn. Their addresses (0 and 0x12345678) are irrelevant as you will relocate them anyway. Just to separate them. When running vlink with -q it will generate separate relocation tables for main and dyn, which will be handy. The .dynamic section is also marked as NOLOAD to make sure no binary is written (otherwise you could also ignore its output).
Feel free to adapt the linker script to your needs. You will certainly want to use some gcc section names instead (.text, .data, .bss, etc.) or additionally.

And here is an example main module, calling these external functions, and an internal one. I assemble with -no-opt, so the relocation for the internal call is not optimised away.
Code:
frank@altair cat main.s 
        code

main:   jsr     alpha
        jsr     beta
        jsr     gamma
        rts

gamma:  moveq   #1,d0
        rts
frank@altair vasmm68k_mot -quiet -Fvobj -no-opt -o main.o main.s
Finally link your main module with the dummy functions:
Code:
vlink -brawseg -o tst -q -T ldscript main.o dummy.o

and you will get the following files:
Code:
-rwxr-xr-x  1 frank  wheel  20 Aug 19 18:42 tst
-rw-r--r--  1 frank  wheel  24 Aug 19 18:42 tst.main
-rw-r--r--  1 frank  wheel  12 Aug 19 18:42 tst.main.reldyn
-rw-r--r--  1 frank  wheel   8 Aug 19 18:42 tst.main.relmain
As usual your binary is in tst.main and the internal relocation table in tst.main.relmain. The "dynamic" relocations are nicely separated in tst.main.reldyn and you can even identify their target by the section offset in the binary (here 0x00000000 and 0x00000001 following the JSR opcode 0x4eb9):
Code:
frank@altair hexdump -C tst.main.reldyn 
00000000  00 00 00 02 00 00 00 02  00 00 00 08              |............|
frank@altair hexdump -C tst.main        
00000000  4e b9 00 00 00 00 4e b9  00 00 00 01 4e b9 00 00  |N.....N.....N...|
00000010  00 14 4e 75 70 01 4e 75                           |..Nup.Nu|
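As a sketch of how a loader could consume these two files: the code below reads tst.main.reldyn as a 32-bit count followed by 32-bit offsets (an interpretation of the hexdump above, so treat the layout as an assumption), takes the longword found at each offset as a function index, and replaces it with the real address from a table. The names `resolve_dynamic` and `targets` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t rd32be(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

static void wr32be(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/*
 * Resolve the "dynamic" references of a module (hypothetical helper).
 * Each reloc offset points at a 32-bit operand that holds the dummy
 * symbol's section offset (0, 1, ...), used here as an index into
 * 'targets', the table of real function addresses.
 * ASSUMED reldyn layout: 32-bit count, then 32-bit offsets, big-endian.
 * Returns the number of references resolved, or -1 on malformed input.
 */
static long resolve_dynamic(uint8_t *image, size_t image_size,
                            const uint8_t *reldyn, size_t reldyn_size,
                            const uint32_t *targets, size_t num_targets)
{
    assert(image != NULL && reldyn != NULL && targets != NULL);
    if (reldyn_size < 4)
        return -1;
    uint32_t count = rd32be(reldyn);
    if ((reldyn_size - 4) / 4 < count)
        return -1;
    for (uint32_t i = 0; i < count; i++) {
        uint32_t off = rd32be(reldyn + 4 + 4 * (size_t)i);
        if (image_size < 4 || off > image_size - 4)
            return -1;
        uint32_t index = rd32be(image + off);
        if (index >= num_targets)
            return -1;
        wr32be(image + off, targets[index]);
    }
    return (long)count;
}
```

Fed with the 24 bytes of tst.main and the 12 bytes of tst.main.reldyn shown above, this would rewrite the operands at offsets 2 and 8 and leave the internal `jsr gamma` at offset 12 for the normal relocation pass.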
Old 09 September 2021, 00:42   #10
Muzza
If you want to see the jbsr emitted by GCC, you can do it using compiler explorer.
Go here https://franke.ms/cex/
Paste this code in the left panel:
Code:
extern void Func( int i );
void main( void )
{
    Func( 0 );
}
Change the compiler at the top of the right panel to AMIGA GCC 10.2.1b
Change the compiler command line to
Code:
-O3 -m68020

Observe the ASM output.

I've tried all sorts to make it emit jsr, with no luck. I do notice that GCC 6.5 (which I think is Beppo's build of GCC for custom optimisations for Amiga) outputs jsr. However, even if I wasn't reluctant to use an older version of GCC, I did try this version in the past and couldn't get my program to run.
Old 09 September 2021, 02:05   #11
Muzza
Also here is my gcc -v
Code:
Using built-in specs.
COLLECT_GCC=m68k-amiga-elf-gcc
COLLECT_LTO_WRAPPER=q:/projects/mark2/trunk/runtime/backbonexamiga/amigacompiler/win/bin/opt/bin/../libexec/gcc/m68k-amiga-elf/10.1.0/lto-wrapper.exe
Target: m68k-amiga-elf
Configured with: ../gcc-10.1.0/configure --target=m68k-amiga-elf --disable-nls --enable-languages=c,c++ --enable-lto --prefix=/mnt/c/amiga-mingw/opt --disable-libssp --disable-gcov --disable-multilib --disable-threads --with-cpu=68000 --disable-libsanitizer --disable-libada --disable-libgomp --disable-libvtv --disable-nls --disable-clocale --host=x86_64-w64-mingw32 --enable-static
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 10.1.0 (GCC)
Old 09 September 2021, 05:36   #12
Muzza
Earlier I'd misunderstood how jra and jbsr instructions work. I thought they were emitted by the compiler for the linker to decide on the instruction to use, but the reality (obvious now) is that GCC outputs jbsr/jra pseudo-ops, and then GAS (the GNU assembler) decides on what to replace it with.

This link provides a table of the pseudo-ops that GCC emits and the valid 68k instructions that GAS may replace them with.

Looking at GAS, it has an option called 'turn jbsr into jsr', but not 'turn jra into jmp', so no help there. But there is an easy solution. Instead of GCC compiling direct to object files, I compile to ASM first, then modify the ASM file, then use GAS to assemble the final object file.

To modify the ASM, I do a find and replace of 'jra x' to 'jmp x' and 'jbsr x' to 'jsr x', where x must appear in the list of external symbols I already have.

And like magic, it works!
Whether this is truly robust, and whether the data symbols are safe from this kind of thing, is something I'm not sure of yet.
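A minimal sketch of that find-and-replace step, one line at a time. It is not the actual script used here: the names `rewrite_line` and `is_external` are made up, and it assumes the compiler writes the mnemonic in lower case followed by whitespace and a bare symbol name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* True if sym[0..len) is one of the external symbols to be patched. */
static int is_external(const char *sym, size_t len,
                       const char *const *externs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (strlen(externs[i]) == len && strncmp(externs[i], sym, len) == 0)
            return 1;
    }
    return 0;
}

/*
 * Rewrite one line of compiler assembly output into 'out'.
 * "jbsr sym" becomes "jsr sym" and "jra sym" becomes "jmp sym", but only
 * when sym is in the externs list; every other line is copied unchanged
 * (so local branches such as "jra .L5" are left for the assembler).
 * Returns 1 if the line was rewritten, 0 otherwise.
 */
static int rewrite_line(const char *line, char *out, size_t out_size,
                        const char *const *externs, size_t n)
{
    assert(line != NULL && out != NULL && out_size > 0);
    const char *p = line;
    while (*p == ' ' || *p == '\t')
        p++;

    const char *repl = NULL;
    size_t oplen = 0;
    if (strncmp(p, "jbsr", 4) == 0) {
        repl = "jsr";
        oplen = 4;
    } else if (strncmp(p, "jra", 3) == 0) {
        repl = "jmp";
        oplen = 3;
    }

    if (repl != NULL && (p[oplen] == ' ' || p[oplen] == '\t')) {
        const char *sym = p + oplen;
        while (*sym == ' ' || *sym == '\t')
            sym++;
        size_t len = strcspn(sym, " \t\r\n");
        if (is_external(sym, len, externs, n)) {
            /* Keep the original indentation and everything after the mnemonic. */
            snprintf(out, out_size, "%.*s%s%s",
                     (int)(p - line), line, repl, p + oplen);
            return 1;
        }
    }
    snprintf(out, out_size, "%s", line);
    return 0;
}
```

Matching the whole symbol rather than a prefix matters, otherwise a local function whose name merely starts with an external name would be rewritten too.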

Next step is to look more at how the sections are generated, to see if I can clean that up (your example linker script will be very useful Phx, thanks).
Old 09 September 2021, 09:16   #13
Muzza
When I use the linker script you suggested I get:
Code:
Warning 22: Attributes of section .dynamic were changed from r-x- in Linker Script LinkerScript.txt to rwx- in ModuleFunctionsASM.o.
The file in question contains a CODE section with the ds.b's as per your example, and a BSS section with the dummy data references.

It runs, and I'm getting a slightly smaller binary output for the modules (I guess it trims the storage for the dummy symbols with the NOLOAD command?), but is there a way to either disable or fix that warning?
Old 09 September 2021, 13:03   #14
phx
Quote:
Originally Posted by Muzza View Post
If you want to see the jbsr emitted by GCC, you can do it using compiler explorer.
Yes. It is a jbsr. I see the same with gcc 2.95.3, but the assembler always makes a jsr from it, even when generating 020+ code. Maybe there is not a compiler-option but an assembler-option to prevent it? I would only expect a bsr.l when specifying a PIC option (-fpic?) or using the small code model.

I can still only suggest asking the responsible person about it. Using register arguments by default is also a strange decision, IMHO.

Quote:
Originally Posted by Muzza View Post
Earlier I'd misunderstood how jra and jbsr instructions work. I thought they were emitted by the compiler for the linker to decide on the instruction to use, but the reality (obvious now) is that GCC outputs jbsr/jra pseudo-ops, and then GAS (the GNU assembler) decides on what to replace it with.
Interesting. I wasn't sure about that either. For a moment I thought that gas would always insert a BSR.L (020+), which has the same size as a JSR, so the linker could theoretically replace it.

Quote:
Originally Posted by Muzza View Post
When I use the linker script you suggested I get:
Code:
Warning 22: Attributes of section .dynamic were changed from r-x- in Linker Script LinkerScript.txt to rwx- in ModuleFunctionsASM.o.
The file in question contains a CODE section with the ds.b's as per your example, and a BSS section with the dummy data references.
Which name does the BSS section have? It sounds like you instructed the linker to merge it with your code section. The default for code sections is non-writable, so you get a warning when merging it with data or bss sections. You can check with -M what happened.

Quote:
It runs,
Yes. It's just a warning. And writable/non-writable/executable flags have no meaning for AmigaOS anyway.

Quote:
and I'm getting a slightly smaller binary output for the modules (I guess it trims the storage for the dummy symbols with the NOLOAD command?)
Correct.

Quote:
but is there a way to either disable or fix that warning?
Find the reason for that merge. Or use -nowarn=22.