
I would like to call ARM/ARM64 assembly code from C++. The assembly contains a syscall and a relocation against an external function. The specific ARM architecture is not that important here; I just want to understand how to solve the problem conceptually.

I have the following syscall wrapper in assembly (output from objdump -d), which is called inside a shared library:

 198:   d28009e8    mov x8, #0x4f                   // #79
 19c:   d4000001    svc #0x0
 1a0:   b140041f    cmn x0, #0x1, lsl #12
 1a4:   da809400    cneg    x0, x0, hi
 1a8:   54000008    b.hi    0 <__set_errno_internal>
 1ac:   d65f03c0    ret

This code invokes the fstatat64 syscall and sets errno through the external __set_errno_internal function. readelf -r shows the following relocation for __set_errno_internal:

00000000000001a8 R_AARCH64_CONDBR19  __set_errno_internal

I want to call this code from C++, so I converted it to a buffer:

  unsigned char machine_code[] __attribute__((section(".text"))) =
        "\xe8\x09\x80\xd2"   // mov  x8, #0x4f
        "\x01\x00\x00\xd4"   // svc  #0x0
        "\x1f\x04\x40\xb1"   // cmn  x0, #0x1, lsl #12
        "\x00\x94\x80\xda"   // cneg x0, x0, hi
        "\x08\x00\x00\x54"   // Here we have mentioned relocation
        "\xc0\x03\x5f\xd6";  // ret

EDIT: An important detail: I chose a buffer (rather than inline assembly, etc.) because I want to run extra processing on it before it is executed as machine code, for example a decryption function applied to the string literal as a software protection mechanism (but that's not important here).

Afterwards, the buffer can be cast to a function pointer and called directly to execute the machine code. The obvious problem is the relocation: it is not fixed up automatically, so I have to fix it manually. But I can't do that at run time, because the .text section is read-only and executable.

Although I have almost full control over the source code, I must not turn off stack protection or other features to make that section writable (don't ask why). So it seems the relocation fix has to be performed at link time somehow. As far as I know, once the linker has resolved relocations, a shared library contains relative offsets for similar external function calls, and the binary *.so file should contain correct offsets (with no run-time relocation work needed), so fixing up the machine_code buffer during linking should be possible.

I'm using a manually built Clang 7 compiler and have full control over LLVM passes, so I thought it might be possible to write some kind of LLVM pass that executes at link time. However, it looks like ld is invoked at the end, so maybe LLVM passes will not help here (I'm not an expert on this).

Other ideas would be appreciated as well. As you can see, the problem is pretty complicated. Do you have any directions or ideas for how to solve it? Thanks!

  • Why can't you have that __set_errno_internal done on the C side after your function returns? Alternatively you could pass in the function address as an argument. Commented Jul 5, 2019 at 14:27
  • OK, I will have to think about this. Currently I have a lot of those syscall wrappers written in ASM, so I thought it would be great if no changes were needed for them. But I have to test this out first and see if it works. Commented Jul 5, 2019 at 14:51
  • Why would you write it as a buffer and not use inline assembler? Instead of taking the opcodes (hex numbers), take the assembler text and convert it to an inline assembler macro. Your normal build process will link this as per normal. See: GCC pre-process as assembler. Commented Jul 5, 2019 at 15:39
  • @Jester Based on your idea I removed the relocation by deleting the cmn, cneg and b.hi instructions. What is nice is that it's a very simple modification to the ASM syscall wrapper. This way the code generally returns 0 (or a positive value for other syscalls that return handles) on success, or a negative errno value on error. After that I can write a C++ wrapper around it to set errno manually if a negative value is returned from the assembly code. Thanks! Commented Jul 5, 2019 at 17:52
  • You can run encryption on any code. Just use a linker script or an attribute and put the code in its own input section (.text.encrypt). You can define variables marking the start and end of this section and run the encrypt/decrypt step online or offline. Some ideas from storing a CRC in an ELF can be reused for encryption. I don't see where you are going to get a decryption key that will stop an attacker, though; but this is equivalent to the array, without any limitations on relocations. Commented Jul 5, 2019 at 18:55
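
The approach settled on in the comments above (drop cmn, cneg and b.hi from the wrapper, return the raw kernel value, and set errno on the C++ side) could be sketched roughly as follows. The helper name syscall_result_to_errno is hypothetical, and the sketch assumes the Linux convention that errors come back as values in the range -4095 to -1, which is the same range the removed cmn/b.hi pair tested for.

```cpp
#include <cerrno>

// Hypothetical helper (not part of the original wrappers): converts a raw
// Linux syscall return value into the libc convention of returning -1 and
// setting errno. Assumes errors are reported as values in [-4095, -1].
long syscall_result_to_errno(long raw) {
    if (raw < 0 && raw >= -4095) {
        errno = static_cast<int>(-raw);  // e.g. raw == -2 becomes errno == ENOENT
        return -1;
    }
    return raw;  // success: 0, a handle, a byte count, ...
}
```

With this in place, the machine-code buffer no longer references any external symbol, so there is nothing left to relocate; the result of calling the buffer is simply passed through the helper.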

1 Answer

There's already a working, packaged mechanism to handle relocations. It's called dlsym(). While it doesn't directly give you a function pointer, all major C++ compilers support using reinterpret_cast to convert the result of dlsym to any ordinary function pointer. (Member functions are another issue altogether, but that's not relevant here.)
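
A minimal sketch of that lookup, under some assumptions: strlen stands in for whatever exported function is actually wanted, the names strlen_fn and lookup_strlen are made up for illustration, and on some platforms the program has to be linked with -ldl.

```cpp
#include <dlfcn.h>
#include <cstddef>

// Function pointer type matching the symbol we expect to find.
using strlen_fn = std::size_t (*)(const char*);

// Hypothetical helper: looks up "strlen" among the symbols visible to the
// running program. Returns nullptr if the lookup fails.
strlen_fn lookup_strlen() {
    // A null filename yields a handle for the main program; symbol lookup
    // through it also searches the libraries the program was loaded with.
    void* self = dlopen(nullptr, RTLD_NOW);
    if (self == nullptr) {
        return nullptr;
    }
    void* sym = dlsym(self, "strlen");
    // Converting void* to a function pointer is the cast the answer refers to.
    return reinterpret_cast<strlen_fn>(sym);
}
```

The returned pointer can then be passed into the machine code as an ordinary argument, which is the alternative suggested in the first comment on the question. A symbol built with -fvisibility=hidden would not be found this way unless it is explicitly exported.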


8 Comments

It's a nice idea, although I use -fvisibility=hidden so it will not work by default. I could probably make an exception and make that function visible.
Another problem with this is, as I mentioned: "Obviously there is a problem with relocation, it's not fixed automatically and I have to fix it manually. But during run-time I can't do it because .text section is read-only & executable." I assume that you intended to call dlsym() at run time and patch the relocation.
@jozols: Obviously you call dlsym at runtime, but there's no need for manual relocation handling. That's already handled by the ELF loader (the OS). dlsym returns a pointer either to the relocated function or to a trampoline.
Yes, but I have to call it as part of my assembly/machine code ("\x08\x00\x00\x54" // Here we have mentioned relocation).
@jozols: We seem to not be communicating. That's trying to do a manual relocation, which I recommend not doing. The b.hi conditional branch is a relative jump to __set_errno_internal. Since you neither know where __set_errno_internal is relocated, nor do you know where the branch instruction itself will be, there's not even a guarantee that the difference between them fits in the 19-bit immediate (about ±1 MiB) allowed for AArch64 conditional branches (!)
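
To make the range concern concrete, here is a rough sketch of what resolving the R_AARCH64_CONDBR19 relocation from the question involves. The function name patch_condbr19 is hypothetical. It assumes the standard B.cond encoding, where the branch offset divided by 4 is stored as a signed 19-bit value in bits 23..5 of the instruction word, and it reports failure when the target is misaligned or farther than 1 MiB away.

```cpp
#include <cstdint>

// Hypothetical helper: writes the branch offset into an AArch64 B.cond
// instruction word. `delta` is the distance in bytes from the branch
// instruction to its target. Returns false (leaving `out` untouched) if
// the target is not 4-byte aligned or lies outside the +/- 1 MiB range.
bool patch_condbr19(std::uint32_t insn, std::int64_t delta, std::uint32_t& out) {
    const std::int64_t kRange = std::int64_t{1} << 20;  // 1 MiB
    if (delta % 4 != 0 || delta < -kRange || delta >= kRange) {
        return false;
    }
    // Keep the low 19 bits of the word offset (two's complement for
    // backward branches).
    const std::uint32_t imm19 =
        static_cast<std::uint32_t>(delta >> 2) & 0x7FFFFu;
    const std::uint32_t kImmMask = 0x7FFFFu << 5;  // bits [23:5]
    out = (insn & ~kImmMask) | (imm19 << 5);
    return true;
}
```

For example, patching the unresolved word 0x54000008 (b.hi with a zero offset) for a target 8 bytes ahead gives 0x54000048. A static linker does this arithmetic when it knows both addresses; for a byte array it cannot, because the array contents carry no relocation record for it to process.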