57

I am trying to embed binary blobs into an exe file. I am using mingw gcc.

I make the object file like this:

ld -r -b binary -o binary.o input.txt

I then look at the objdump output to get the symbols:

objdump -x binary.o

And it gives symbols named:

_binary_input_txt_start
_binary_input_txt_end
_binary_input_txt_size

I then try and access them in my C program:

#include <stdlib.h>
#include <stdio.h>

extern char _binary_input_txt_start[];

int main (int argc, char *argv[])
{
    char *p;
    p = _binary_input_txt_start;

    return 0;
}

Then I compile like this:

gcc -o test.exe test.c binary.o

But I always get:

undefined reference to _binary_input_txt_start

Does anyone know what I am doing wrong?

4
  • 8
    By the way, I was unaware of this method of pulling arbitrary data into an executable - nice. Commented Apr 13, 2010 at 5:48
  • What does this method offer that's not offered by .rc files? Commented Oct 20, 2011 at 9:36
  • 1
    @rubenvb Easier access to content. It does not need calls to any resource APIs. Commented Mar 15, 2012 at 9:16
  • also github.com/graphitemaster/incbin Commented Feb 27, 2021 at 5:16

4 Answers

40

In your C program remove the leading underscore:

#include <stdlib.h>
#include <stdio.h>

extern char binary_input_txt_start[];

int main (int argc, char *argv[])
{
    char *p;
    p = binary_input_txt_start;

    return 0;
}

C compilers often (always?) seem to prepend an underscore to extern names. I'm not entirely sure why that is. I assume there's some truth to this Wikipedia article's claim that:

It was common practice for C compilers to prepend a leading underscore to all external scope program identifiers to avert clashes with contributions from runtime language support

But it strikes me that if underscores were prepended to all externs, then you're not really partitioning the namespace very much. Anyway, that's a question for another day, and the fact is that the underscores do get added.
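One way to see which prefix a given toolchain applies is the `__USER_LABEL_PREFIX__` macro, which GCC and Clang predefine as the token prepended to C identifiers at the object-file level. The following is only a sketch: `user_label_prefix` and `link_name` are hypothetical helper names, and the empty fallback for other compilers is an assumption, not something the answer states.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* GCC and Clang predefine __USER_LABEL_PREFIX__ as a bare token (not a
   string). For other compilers we assume an empty prefix; that fallback
   is a guess, not a documented guarantee. */
#ifndef __USER_LABEL_PREFIX__
#define __USER_LABEL_PREFIX__
#endif

/* Two-step stringization so the macro is expanded before being quoted. */
#define STRINGIZE_(x) #x
#define STRINGIZE(x) STRINGIZE_(x)

/* Returns the prefix as a string: "_" on targets that decorate extern
   names, "" on targets that do not. */
static const char *user_label_prefix(void)
{
    return STRINGIZE(__USER_LABEL_PREFIX__);
}

/* Writes the object-file name the compiler would use for c_identifier
   into out. Returns 0 on success, -1 if out is too small. */
static int link_name(const char *c_identifier, char *out, size_t out_size)
{
    int n;

    assert(c_identifier != NULL && out != NULL);
    n = snprintf(out, out_size, "%s%s", user_label_prefix(), c_identifier);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

If `link_name("binary_input_txt_start", ...)` produces the same name that objdump reports for binary.o, the declaration without the underscore is the one that should link. If it produces the name unchanged, the C declaration needs to spell the symbol exactly as objdump shows it.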

8 Comments

Wow... thanks a lot. This was driving me mad. I knew it must have been something simple. I have just debugged it and noticed that it was changing to __binary_input_txt_start
@myforwik: just in case you're interested, I've posted a question asking why C does this: stackoverflow.com/questions/2627511/…
@Michael: The article's claim is true. The runtimes were written in assembler, which was free to use names without underscores prepended and could thereby be assured not to clash with any symbols defined in the C code, and conversely the C code had no way to access the symbols from the asm runtime code.
Does anyone know how much data can be embedded that way?
@aditya: perhaps there's a difference in that detail that depends on the target? Windows toolchains have a tendency to automatically add underscores to external names when targeting Win32 x86. I wouldn't be surprised if that doesn't happen for other targets (even Win32 x64).
9

From ld man page:

--leading-underscore

--no-leading-underscore

For most targets default symbol-prefix is an underscore and is defined in target's description. By this option it is possible to disable/enable the default underscore symbol-prefix.

so

ld -r -b binary -o binary.o input.txt --leading-underscore

should be the solution.

6

I tested it in Linux (Ubuntu 10.10).

  1. Resource file:
    input.txt

  2. gcc (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5 [generates ELF executable, for Linux]
    Generates symbol _binary__input_txt_start.
    Accepts symbol _binary__input_txt_start (with the leading underscore).

  3. i586-mingw32msvc-gcc (GCC) 4.2.1-sjlj (mingw32-2) [generates PE executable, for Windows]
    Generates symbol _binary__input_txt_start.
    Accepts symbol binary__input_txt_start (without the leading underscore).
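Because the spelling accepted in C differs between these toolchains, one option is GCC's asm label extension (also supported by Clang), which names the object-file symbol verbatim so the compiler neither adds nor removes an underscore. The sketch below makes several assumptions: the symbol names are the ones objdump reported in the question, `HAVE_EMBEDDED_BLOB` is a hypothetical macro to define only when binary.o is really linked in, and `blob_length` and `blob_to_string` are illustrative helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* HAVE_EMBEDDED_BLOB is a hypothetical macro: define it only when binary.o
   is linked in, otherwise these declarations are skipped entirely. */
#ifdef HAVE_EMBEDDED_BLOB
/* The asm label gives the exact object-file name, taken here from the
   objdump output in the question. */
extern const char blob_start[] __asm__("_binary_input_txt_start");
extern const char blob_end[]   __asm__("_binary_input_txt_end");
#endif

/* Number of bytes between the start and end markers. */
static size_t blob_length(const char *start, const char *end)
{
    assert(start != NULL && end != NULL && end >= start);
    return (size_t)(end - start);
}

/* Copies the blob into out and NUL-terminates it, because the embedded
   bytes carry no terminator of their own. Output is truncated to
   out_size - 1 bytes; returns the number of bytes copied. */
static size_t blob_to_string(const char *start, const char *end,
                             char *out, size_t out_size)
{
    size_t n = blob_length(start, end);

    assert(out != NULL && out_size > 0);
    if (n > out_size - 1)
        n = out_size - 1;
    memcpy(out, start, n);
    out[n] = '\0';
    return n;
}
```

With binary.o linked and `-DHAVE_EMBEDDED_BLOB` on the command line, a call such as `blob_to_string(blob_start, blob_end, buf, sizeof buf)` would give a printable copy of input.txt. The helpers take plain pointers so they can also be exercised with an ordinary array when the object file is not available.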

1 Comment

Using tdm-gcc 4.8.1, I must refer to the variables using the underscore.
0

Apparently this feature is not present in OSX's ld, so you have to do it totally differently with a custom gcc flag that they added, and you can't reference the data directly, but must do some runtime initialization to get the address.

So it might be more portable to make yourself an assembler source file which includes the binary at build time, a la this answer.
