2

I have a question about interpreting strings as packed binary data in C++. In Python, I can use the struct module. Is there a module or some other way in C++ to interpret strings as packed binary data without embedding Python?

  • A string is a sequence of contiguous characters (bytes, basically). How much more packed do you wish to get? Commented Mar 28, 2012 at 8:40
  • So, given a byte array, you want to be able to treat the array as a struct? You could just use a cast. Commented Mar 28, 2012 at 8:46
  • In C++, for binary data, you would typically use a vector (rather than a string) and the unsigned char type to represent a byte (avoiding signedness issues). Thus a typical "buffer" would be of type std::vector<unsigned char>, rather than std::string... note that in C++03 the string storage need not be contiguous. Commented Mar 28, 2012 at 8:47
  • @MatthieuM. C++03 didn't require contiguity, but the C style array pointed to by the return value of std::string::data() must be contiguous. And the reason C++11 added the contiguous requirement was in recognition of existing practice---there were in fact no implementations which weren't contiguous (and where &s[0] didn't result in the same values as s.data()). Commented Mar 28, 2012 at 9:02
  • @JamesKanze: data is for vector, so I believe you are talking about c_str. The problem with c_str is that it is char const* and sometimes you'd like to modify the characters (to_upper ?). Commented Mar 28, 2012 at 9:05

4 Answers

1

As already mentioned, it is better to treat this as an array of bytes (chars or unsigned chars), possibly held in a std::vector, rather than as a string. A C-style string is null-terminated, so what happens if a byte of the binary data has the value zero?

You can either cast a pointer within the array to a pointer to your struct, or copy the data over a struct:

#include <cstring>   // std::memcpy

#pragma pack( push )
#pragma pack( 1 )

struct myData
{
    int data1;
    int data2;
    // and whatever
};

#pragma pack( pop )


char* dataStream = GetTheStreamSomehow();

    //cast the whole array (not strictly portable: relies on the
    //compiler tolerating the aliasing and any misaligned access)
myData* ptr = reinterpret_cast<myData*>( dataStream );
    //cast from a known position within the array
myData* ptr2 = reinterpret_cast<myData*>( &(dataStream[index]) );

    //copy the array into a struct (the portable option)
myData data;
std::memcpy( &data, dataStream, sizeof(myData) );

If you were to have the data stream in a vector, the [] operator would still work. The pragma pack declarations ensure the struct is single byte aligned - researching this is left as an exercise for the reader. :-)
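
For what it's worth, the memcpy variant can be wrapped in a small helper that also works at an arbitrary offset in a std::vector<unsigned char>. This is only a sketch: unpack_at is a made-up name, it assumes C++11, and it reads the bytes in host byte order, so it does not deal with endianness.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical helper (not from the answer above): copy sizeof(T) bytes
// starting at `offset` out of the buffer and return them as a T.
// The bytes are interpreted in host byte order.
template <typename T>
T unpack_at(const std::vector<unsigned char>& buffer, std::size_t offset)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable to be filled with memcpy");
    // Refuse to read past the end of the buffer.
    assert(offset <= buffer.size() && sizeof(T) <= buffer.size() - offset);

    T value;
    std::memcpy(&value, buffer.data() + offset, sizeof(T));
    return value;
}
```

Something like unpack_at<int>(buffer, 3) then reads an int starting at byte 3. Because the bytes are copied rather than accessed through a cast pointer, the offset does not need to be suitably aligned for T.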

1

Basically, you don't need to interpret anything. In C++, strings are packed binary data; you can interpret them as text, but you're not required to. Just be aware that the underlying type of a string, in C++, is char, which can be either signed (range [-128,127] on all machines I've heard of) or unsigned (usually [0,255], but I'm aware of machines where it is [0,511]).

To pass the raw data in a string to a C program, use std::string::data() and std::string::size(). Otherwise, you can access it using iterators or indexing, much as you would with std::vector<char> (which may express the intent better).
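
A minimal sketch of that idea, assuming you want the bytes as unsigned values to sidestep the signedness of char (to_bytes is a hypothetical name, not a standard function):

```cpp
#include <string>
#include <vector>

// Hypothetical helper: copy the raw bytes of a std::string into a
// vector of unsigned char. Embedded zero bytes are preserved because
// the length comes from size(), not from a terminating null.
std::vector<unsigned char> to_bytes(const std::string& s)
{
    const unsigned char* first =
        reinterpret_cast<const unsigned char*>(s.data());
    return std::vector<unsigned char>(first, first + s.size());
}
```

Note that a string holding binary data has to be built with an explicit length, for example std::string("\x01\x00\xFF", 3); the constructor taking only a const char* stops at the first zero byte.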

1

A string in C++ has a method called c_str ( http://www.cplusplus.com/reference/string/string/c_str/ ).

c_str returns the binary data held in a string in the form of a pointer to a (const) array of characters. You can cast these chars to anything you wish and read them as an array of numbers.

2 Comments

You can, although this usually lies somewhere in-between implementation-defined and undefined behaviour.
@OliCharlesworth The results of the conversion will be implementation defined, since whether plain char is signed or not, and how many bits it contains, is implementation defined. Converting a char to an integral type large enough to contain the value (which will be all integral types if char is signed) is well defined, as is converting it to any unsigned integral type or floating point type.
0

Even though it might be closer to pickling in Python, Boost.Serialization may be the closest to what you want to achieve.

Otherwise you might want to do it by hand. It is not that hard to write reader/writer classes that convert primitives/classes to a packed binary format. I would do it by shifting bytes, to avoid host endianness issues.
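
As a rough sketch of the byte-shifting approach, here is what writing and reading a 32-bit unsigned value in little-endian order could look like (roughly what struct.pack('<I', ...) and struct.unpack('<I', ...) do in Python). The function names are made up for illustration and a C++11 compiler is assumed for <cstdint>:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append `value` to `out` as 4 bytes, least significant byte first.
// Only shifts and masks are used, so the result does not depend on
// the byte order of the host.
void write_u32_le(std::vector<unsigned char>& out, std::uint32_t value)
{
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<unsigned char>((value >> shift) & 0xFFu));
}

// Read 4 bytes starting at `offset` as a little-endian 32-bit value.
std::uint32_t read_u32_le(const std::vector<unsigned char>& in,
                          std::size_t offset)
{
    // Refuse to read past the end of the buffer.
    assert(offset <= in.size() && in.size() - offset >= 4);

    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i)
        value |= static_cast<std::uint32_t>(in[offset + i]) << (8 * i);
    return value;
}
```

A big-endian variant would only differ in the order in which the shifts are applied.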

1 Comment

Thanks, guys, for the very quick answers. For clarification, I want to rewrite my application from Python to C++ (devpda.net/rsceditor, RSCEditor, for editing EXE/DLL and RSC files), and I have problems with some Python functions, for example: • the built-in map() • struct.unpack • struct.pack • slicing arrays by index (example: array[5:15]). I don't know how I can write these in C++.
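
For the slicing and map() parts, the standard library probably already covers it: a slice is just an iterator range, and map() corresponds roughly to std::transform. A small sketch, with slice and map_ints as hypothetical helper names and C++11 assumed:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Rough equivalent of Python's array[first:last]: copy the half-open
// range [first, last). Unlike Python, out-of-range indices are not
// clamped here; they trip the assertion instead.
std::vector<unsigned char> slice(const std::vector<unsigned char>& v,
                                 std::size_t first, std::size_t last)
{
    assert(first <= last && last <= v.size());
    return std::vector<unsigned char>(
        v.begin() + static_cast<std::ptrdiff_t>(first),
        v.begin() + static_cast<std::ptrdiff_t>(last));
}

// Rough equivalent of Python's map(f, seq) for a vector of int:
// apply f to every element and collect the results.
template <typename F>
std::vector<int> map_ints(const std::vector<int>& in, F f)
{
    std::vector<int> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(), f);
    return out;
}
```

So array[5:15] becomes slice(array, 5, 15), and map(f, values) becomes map_ints(values, f), where f can be a function, a functor or a lambda.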
