I have a question about interpreting strings as packed binary data in C++. In Python, I can use the struct module. Is there a module or some other way in C++ to interpret strings as packed binary data without embedding Python?
4 Answers
As already mentioned, it is better to consider this an array of bytes (chars, or unsigned chars), possibly held in a std::vector, rather than a string. A C-style string is null terminated, so what happens if a byte of the binary data has the value zero?
You can either cast a pointer within the array to a pointer to your struct, or copy the data over a struct:
#include <cstring> // for memcpy
#pragma pack( push )
#pragma pack( 1 )
struct myData
{
int data1;
int data2;
// and whatever
};
#pragma pack ( pop )
char* dataStream = GetTheStreamSomehow();
// cast the whole array (commonly works, but may violate strict aliasing rules)
myData* ptr = reinterpret_cast<myData*>( dataStream );
// cast from a known position within the array (index is a byte offset)
myData* ptr2 = reinterpret_cast<myData*>( &(dataStream[index]) );
// copy the array into a struct (the safer, portable option)
myData data;
memcpy( &data, dataStream, sizeof(myData) );
If you were to have the data stream in a vector, the [] operator would still work. The pragma pack declarations ensure the struct is single-byte aligned, with no padding between members - researching this is left as an exercise for the reader. :-)
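To make the vector case concrete, here is a minimal sketch of the memcpy approach with a bounds check. The Record layout and the readRecord name are made up for illustration, and #pragma pack is a compiler extension (supported by GCC, Clang and MSVC) rather than standard C++. The field values you get back depend on the byte order of the host.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

#pragma pack(push, 1)
struct Record            // hypothetical layout, for illustration only
{
    std::uint32_t id;    // 4 bytes
    std::uint16_t flags; // 2 bytes, no padding because of pack(1)
};
#pragma pack(pop)

// Copy sizeof(Record) bytes starting at `offset` into a Record.
// Throws std::out_of_range if the buffer is too short.
Record readRecord(const std::vector<unsigned char>& buffer, std::size_t offset)
{
    if (offset > buffer.size() || buffer.size() - offset < sizeof(Record))
        throw std::out_of_range("buffer too short for Record");

    Record r;
    std::memcpy(&r, buffer.data() + offset, sizeof(Record));
    return r;
}
```

Calling readRecord(buffer, 0) reads the first record, and readRecord(buffer, n * sizeof(Record)) reads the n-th one if the records are stored back to back.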
Comments
Basically, you don't need to interpret anything. In C++, strings are
packed binary data; you can interpret them as text, but you're not
required to. Just be aware that the underlying type of a string, in
C++, is char, which can be either signed (range [-128,127] on all
machines I've heard of) or unsigned (usually [0,255], but I'm aware of
machines where it is [0,511]).
To pass the raw data in a string to a C program, use
std::string::data() and std::string::size(). Otherwise, you can
access it using iterators or indexing much as you would with
std::vector<char> (which may express the intent better).
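As a rough illustration of the point about data() and size(), the sketch below puts bytes with embedded zeros into a std::string and hands them to a function taking a pointer and a length. sumBytes and makePayload are made-up names; sumBytes simply stands in for whatever C-style function you need to call.

```cpp
#include <cstddef>
#include <string>

// Stand-in for any C-style API that takes (pointer, length).
unsigned long sumBytes(const char* bytes, std::size_t length)
{
    unsigned long total = 0;
    for (std::size_t i = 0; i < length; ++i)
        total += static_cast<unsigned char>(bytes[i]); // avoid surprises when char is signed
    return total;
}

// Build a string holding binary data; the (pointer, length) constructor
// keeps the embedded zero bytes instead of stopping at the first one.
std::string makePayload()
{
    const char raw[] = {'\x01', '\x00', '\xFF', '\x00', '\x02'};
    return std::string(raw, sizeof raw);
}
```

With std::string p = makePayload(), p.size() is 5 and sumBytes(p.data(), p.size()) sees all five bytes, whereas anything that treats p.c_str() as a C string will stop at the first zero byte.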
Comments
A string in C++ has a method called c_str ( http://www.cplusplus.com/reference/string/string/c_str/ ).
c_str returns the binary data held in the string in the form of an array of characters (a const char*). You can cast these chars to anything you wish and read them as an array of numbers.
2 Comments
char to an integral type large enough to contain the value (which will be all integral types if char is signed) is well defined, as is converting it to any unsigned integral type or floating point type.
Even though it might be closer to pickling in Python, Boost.Serialization may be closest to what you want to achieve.
Otherwise you might want to do it by hand. It is not that hard to write reader/writer classes that convert primitives/classes to a packed binary format. I would do it by shifting bytes to avoid host endianness issues.
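For what it's worth, a minimal sketch of the byte-shifting idea for a single 32-bit value might look like the following. The function names are made up, and little-endian is just one possible choice for the packed format (it is roughly what the "<I" format does in Python's struct module). Because the bytes are produced and consumed with shifts, the result does not depend on the byte order of the host.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Append a 32-bit value to the buffer, least significant byte first.
void writeU32LE(std::vector<unsigned char>& out, std::uint32_t value)
{
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<unsigned char>((value >> shift) & 0xFFu));
}

// Read a 32-bit little-endian value starting at `offset`.
// The caller must make sure that offset + 4 <= in.size().
std::uint32_t readU32LE(const std::vector<unsigned char>& in, std::size_t offset)
{
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i)
        value |= static_cast<std::uint32_t>(in[offset + i]) << (8 * i);
    return value;
}
```

Other widths and signed types follow the same pattern, and a reader/writer class would mostly wrap these helpers together with a current position in the buffer.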
struct? You could just use a cast.
vector (rather than a string) and the unsigned char type to represent a byte (avoiding signedness issues). Thus a typical "buffer" would be of type std::vector<unsigned char>, rather than std::string... note that in C++03 the string storage need not be contiguous.
std::string::data() must be contiguous. And the reason C++11 added the contiguous requirement was in recognition of existing practice: there were in fact no implementations which weren't contiguous (and where &s[0] didn't result in the same values as s.data()).
data is for vector, so I believe you are talking about c_str. The problem with c_str is that it is char const* and sometimes you'd like to modify the characters (to_upper?).