15

I have a binary file that I want to embed directly into my source code, so that it is compiled into the .exe itself instead of being read from a file. That way the data is already in memory when I launch the program.

How do I do this?

The only idea I had was to encode the binary data as base64, put it in a string variable, and decode it back to raw binary data at run time, but that is a roundabout method that causes pointless memory allocation. I would also like the data stored in the .exe to be as compact as the original data.

Edit: The reason I thought of using base64 was that I also wanted to keep the source code files as small as possible.

4
  • As long as you put this resource in a separate source file, I see no reason offhand for source size to be part of the concern. Make it easy to use and obvious what's going on first, and let the compiler worry about reading a few extra characters. Commented Apr 19, 2011 at 14:56
  • Well, it's just my preference really. Sure, it doesn't matter, but I like compact. Commented Apr 19, 2011 at 15:01
  • Since you like compact: stackoverflow.com/a/52843063/6846474 I wrote a tool that compiles header files with a list of resource paths directly to object files or static libraries. Commented Oct 16, 2018 at 19:56
  • For GCC: stackoverflow.com/questions/4158900/… Commented Feb 8, 2019 at 22:25

4 Answers

11

The easiest and most portable way would be to write a small program which converts the data to a C++ source, then compile that and link it into your program. This generated file might look something like:

unsigned char rawData[] =
{
    0x12, 0x34, // ...
};
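
A minimal sketch of the conversion step such a generator might perform. The function name to_cpp_array and the twelve-values-per-line layout are assumptions made for illustration, not part of the answer; reading the input file and writing the result to disk are left out.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: renders `data` as a C++ array definition called
// `name`, twelve values per line. Empty input is not handled, because an
// array with an empty initializer list is not valid C++.
std::string to_cpp_array(const std::vector<unsigned char>& data,
                         const std::string& name)
{
    std::string out = "unsigned char " + name + "[] =\n{\n";
    char buf[8];
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (i % 12 == 0)
            out += "    ";  // indent the start of each row
        std::snprintf(buf, sizeof buf, "0x%02x,",
                      static_cast<unsigned>(data[i]));
        out += buf;
        // End the row after twelve values or after the last byte.
        out += (i % 12 == 11 || i + 1 == data.size()) ? "\n" : " ";
    }
    out += "};\n";
    return out;
}
```

Calling to_cpp_array({0x12, 0x34}, "rawData") produces the same layout as the snippet above. In the program itself, sizeof(rawData) then gives the number of embedded bytes.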

11 Comments

I had to do this for firmware updates on a system that does not support file operations, and we just copied the raw data into an array as in this answer.
What is the most compact way of doing this in my source code? I could save space by dropping the 0x prefix and using decimal values, but are there other ways? I have seen code like Y\377\322\217^\377\321\227l\377\340\262\220\377, but I don't understand how that works, and it causes some compiler warnings for some reason, yet it works.
@Rookie: the \nnn notation uses octal to specify the value of each character. \377 is the same as 0xff.
Yes, but what do the odd letters do in that octal data? For example, there are Y, ^, l, etc. There are many odd characters whose logic I don't understand.
@Rookie Presumably, not all of the characters are octal escapes. Personally, I wouldn't worry too much about the size of the source code file; if you run into size problems, it will be because the total table is too big for the compiler, and that will be after tokenization, and won't depend on the size of the input file.
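
To illustrate the point about mixed content, here is a small sketch that uses the first five bytes of the data quoted above. The names sample and sample_size are made up for the example. Printable characters stand for their own character code, and each \nnn escape supplies one byte in octal.

```cpp
#include <cstddef>

// First five bytes of the quoted data: 'Y', \377, \322, \217, '^'.
// 'Y' and '^' are ordinary characters; the escapes are 0xff, 0xd2, 0x8f.
const unsigned char sample[] = "Y\377\322\217^";

// A string literal appends a terminating zero byte, so subtract it.
const std::size_t sample_size = sizeof(sample) - 1;

static_assert(sizeof(sample) == 6, "five data bytes plus the terminator");
```

An octal escape reads at most three digits, so the ^ that follows \217 is always a separate byte.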
6

There are tools for this, a typical name is "bin2c". The first search result is this page.

You need to make a char array, and preferably also make it static const.

In C:

Some care is needed, since C has no char-typed literal (a character constant such as 'a' has type int), and because the signedness of C's char datatype is generally up to the implementation.

You might want to use a format such as

static const unsigned char my_data[] = { (unsigned char) 0xfeu, (unsigned char) 0xabu, /* ... */ };

Note that each literal has the 'u' suffix, which makes it an unsigned int, and is then cast to unsigned char.

Since this question was for C++, where you can have a char-typed literal, you might consider using a format such as this, instead:

static const char my_data[] = { '\xfe', '\xab', /* ... */ };

Since this is just an array of char, you could just as well use ordinary string literal syntax. Embedding zero-bytes should be fine, as long as you don't try to treat it as a string:

static const char my_data[] = "\xfe\xab ...";

This is the most compact solution. In fact, you could probably use that for C, too.
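
Two caveats apply to the string-literal form. A \x escape consumes every hex digit that follows it, and the array gets one extra terminating zero byte. The sketch below shows one way to deal with both; the name my_data_size and the byte values after the first two are assumptions chosen for illustration.

```cpp
#include <cstddef>

// Bytes 0xfe, 0xab, 0x00, then the letter 'b'. Ending the literal after
// "\x00" and starting a new one keeps the escape from swallowing the 'b'
// as another hex digit. Adjacent literals are concatenated by the compiler.
static const char my_data[] = "\xfe\xab\x00" "b";

// sizeof counts the terminating zero byte that the literal appends.
static const std::size_t my_data_size = sizeof(my_data) - 1;
```

Because the data contains an embedded zero byte, the length has to come from sizeof rather than strlen.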

3 Comments

Does \xff equal 0xff, which equals 255? With commas it is the same size, but decimal values can also be 0,0,0,0 or 11,11,11,11, so they are 1 to 2 bytes smaller in some cases, whereas hex is always 4 bytes. I think I'll go with decimals, if those are all the options here?
Could you also explain this data: Y\377\322\217^\377\321\227l\377\340\262\220\377, where Y, ^ and l appear among the octal values? What is the logic with those?
The point in avoiding a literal like 0 was (for me) to be type-clean; the type of 1 is int. I guess the compiler will typically do bounds-checking when initializing, so it should be safe, but still. I'm not sure where the data you quote in the second comment comes from, but probably the generator decided that the byte-value was representable as a printable character and used that for brevity.
4

You can use resource files (.rc). Sometimes they are bad, but for Windows-based applications that's the usual way.

Comments

0

Why base64? Just store the file as it is in one char*.

5 Comments

I was thinking of using base64 because I also want to minimize the space used in my source code.
@Rookie, how is tripling the amount of source code "optimizing" it?
What do you mean, tripling? base64 packs the data into the source code better than 0xff,0xff,0xff and similar methods. See below:
orig: this is a testing text!!
base64: dGhpcyBpcyBhIHRlc3RpbmcgdGV4dCEh
hexstr: 7468697320697320612074657374696E6720746578742121
decarr: 116,104,105,115,32,105,115,32,97,32,116,101,115,116,105,110,103,32,116,101,120,116,33,33
@Rookie, yes, but you don't have to escape printable ASCII characters. You can simply say const char *data="this is a testing text!!";
That was just an example of how much space each encoding would take, with the original line showing the original data length in plain view. I can't paste binary data here... read the title again.
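
The size comparison in this thread can be reproduced with a few lines. This is only a sketch for counting source characters; the function names are made up for illustration, and the base64 figure assumes standard padding.

```cpp
#include <cstddef>
#include <string>

// Characters of base64 text (with padding) needed for n bytes.
std::size_t base64_chars(std::size_t n) { return 4 * ((n + 2) / 3); }

// Characters of a bare hex string: two digits per byte.
std::size_t hex_chars(std::size_t n) { return 2 * n; }

// Characters of a comma-separated decimal list with no spaces.
std::size_t decimal_list_chars(const std::string& bytes)
{
    std::size_t total = bytes.empty() ? 0 : bytes.size() - 1;  // commas
    for (unsigned char b : bytes)
        total += b >= 100 ? 3 : b >= 10 ? 2 : 1;  // digits per value
    return total;
}
```

For the 24-byte sample text these give 32, 48 and 88 characters respectively. Of the three, only the decimal list can be used as an array initializer directly; the base64 and hex strings still have to be decoded at run time.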
