Convert byte array to unsigned int using pointers

Question

char* f = (char*)malloc(4 * sizeof(char));
f[0] = 0;
f[1] = 0;
f[2] = 0;
f[3] = 1;
unsigned int j = *f;
printf("%u\n", j);

So the memory looks like this: 0000 0000 0000 0000 0000 0000 0000 0001

The program outputs 0. How do I make it output a uint value of the entire 32 bits?

Because you are only displaying the converted 0000 from the 1st byte? – πάντα ῥεῖ, commented Nov 19, 2016 at 0:05

Edward Strange · Accepted Answer · 2016-11-19 00:08:55Z

3

Because of an implicit conversion. The expression *f has type char, so it reads only the first element of the array, and that single value is then converted to unsigned int. You'll get no diagnostic for this. So what you are doing is dereferencing the first element of your char array, which is 0, and assigning it to an unsigned int, which likewise ends up being 0.

What you want to do is technically undefined behavior but generally works. You want to do this:

unsigned int j = *reinterpret_cast<unsigned int*>(f);

At this point you'll be dealing with undefined behavior and with the endianness of the platform. You probably do not have the value you want recorded in your byte stream. You're treading in territory that requires intimate knowledge of your compiler and your target architecture.
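To make the endianness point concrete, here is a minimal sketch of a runtime byte-order probe. The function name host_is_little_endian is made up for illustration, and the sketch assumes unsigned int is wider than one byte.

```c
#include <string.h>

/* Hypothetical helper: returns 1 on a little-endian host, 0 otherwise.
   Assumes sizeof(unsigned int) > 1. */
int host_is_little_endian(void)
{
    unsigned int probe = 1u;
    unsigned char first;

    /* Copy the lowest-addressed byte of probe; on a little-endian host
       that byte holds the least significant bits, i.e. the value 1. */
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

On a little-endian host with a 32-bit unsigned int, the bytes 00 00 00 01 from the question read back as 16777216 (0x01000000) rather than 1.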

answered Nov 19, 2016 at 0:08


πάντα ῥεῖ · Accepted Answer · 2016-11-19 00:10:34Z

3

Provided your platform supports 32-bit integers, you can do the following to get the kind of conversion you want:

char* f = (char*)malloc(4 * sizeof(char));
f[0] = 0;
f[1] = 0;
f[2] = 0;
f[3] = 1;

uint32_t j;
memcpy(&j,f,sizeof(j));
printf("%u\n", j);

Be aware of endianness in the integer representation.
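The memcpy approach can be wrapped in a small function. This is only a sketch: the name read_u32_host is hypothetical, and the result depends on the host's byte order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies 4 bytes into a uint32_t, interpreting them
   in the host's own byte order. No pointer cast is involved, so there are
   no alignment or aliasing concerns. */
uint32_t read_u32_host(const void *bytes)
{
    uint32_t value;
    memcpy(&value, bytes, sizeof value);
    return value;
}
```

For the bytes 00 00 00 01 this returns 1 on a big-endian host and 16777216 on a little-endian one. When printing a uint32_t, the PRIu32 macro from <inttypes.h> matches the type more reliably than %u.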

answered Nov 19, 2016 at 0:10


3 Comments

Nathan Over a year ago

Would this ever produce different results to using "reinterpret_cast"?

πάντα ῥεῖ Over a year ago

@Nathaniel No. The result would be the same. Did you expect a different result?

Nathan Over a year ago

Cool, I just wanted to make sure that I understood the code correctly.

Vivek · Accepted Answer · 2016-11-20 22:19:43Z

2

To ensure that your code works on both little-endian and big-endian systems, you could do the following:

char f[4] = {0,0,0,1};
int32_t j = *((int32_t *)f);
j=ntohl(j);
printf("%d", j);

This will print 1 on both little-endian and big-endian systems. Without ntohl, 1 will be printed only on big-endian systems.

The code works because the values in f are laid out the same way as on a big-endian system. Since network byte order is also big-endian, ntohl converts j correctly: if the host is big-endian, j remains unchanged; if the host is little-endian, the bytes of j are reversed.
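The same idea can be sketched without the pointer cast by copying the bytes with memcpy before calling ntohl. The helper name read_u32_network is hypothetical, and the sketch assumes a POSIX system that provides <arpa/inet.h>.

```c
#include <arpa/inet.h>  /* ntohl(); POSIX header, an assumption of this sketch */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: treats the 4 bytes as a big-endian (network order)
   value and returns it in host order. */
uint32_t read_u32_network(const unsigned char *bytes)
{
    uint32_t raw;
    memcpy(&raw, bytes, sizeof raw);  /* bytes copied in memory order */
    return ntohl(raw);                /* network order -> host order */
}
```

With the bytes 00 00 00 01 this returns 1 regardless of the host's byte order.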

edited Nov 20, 2016 at 22:19

answered Nov 19, 2016 at 0:43


2 Comments

Nathan Over a year ago

Since "ntohl" is only on the print command, wouldn't the actual value of the uint still be incorrect?

Vivek Over a year ago

Yes, assigning j the ntohl value is the better way to go.

2501 · Accepted Answer · 2016-11-19 08:53:18Z

1

What happens in the line:

unsigned int j = *f;

is simply assigning the first element of f to the integer j. It is equivalent to:

unsigned int j = f[0];

and since f[0] is 0 it is really just assigning a 0 to the integer:

unsigned int j = 0;

You will have to convert the elements of f.

Reinterpretation will always cause undefined behavior. The following example shows such usage and it is always incorrect:

unsigned int j = *( unsigned int* )f;

Undefined behavior may produce any result, even apparently correct ones. Even if such code appears to produce correct results when you run it for the first time, this isn't proof that the program is defined. The program is still undefined, and may produce incorrect results at any time.

There is no such thing as "technically undefined behavior" that "generally works"; a program is either undefined or it is not. Relying on such statements is dangerous and irresponsible.

Luckily we don't have to rely on such bad code.

All you need to do is choose the representation of the integer that will be stored in f, and then convert it. It appears you want to store it in big-endian order, with at most 8 bits per element. This doesn't mean that the machine must be big-endian, only the representation of the integer you're encoding in f. The representation of integers on the machine is not important, as this method is completely portable.

This means the most significant byte will appear first. The most significant byte is f[0], and the least significant byte is f[3].

We will need an integer type capable of storing at least 32 bits, and unsigned long does this.

Type char is used for storing characters, not integers. An unsigned integer type like unsigned char should be used instead.

Then only the conversion from the big-endian encoding in f remains to be done:

unsigned char encoded[4] = { 0 , 0 , 0 , 1 };
unsigned long value = 0;
value = value | ( ( ( unsigned long )encoded[0] & 0xFF ) << 24 );
value = value | ( ( ( unsigned long )encoded[1] & 0xFF ) << 16 );
value = value | ( ( ( unsigned long )encoded[2] & 0xFF ) << 8 );
value = value | ( ( ( unsigned long )encoded[3] & 0xFF ) << 0 );
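The shifts above can be packaged as a pair of functions. This is a sketch only; the names decode_be32 and encode_be32 are made up for illustration.

```c
/* Hypothetical helper: decodes 4 big-endian bytes into an unsigned long.
   The & 0xFF masks keep only the low 8 bits of each element, as in the
   snippet above. */
unsigned long decode_be32(const unsigned char in[4])
{
    return (((unsigned long)in[0] & 0xFF) << 24) |
           (((unsigned long)in[1] & 0xFF) << 16) |
           (((unsigned long)in[2] & 0xFF) << 8)  |
            ((unsigned long)in[3] & 0xFF);
}

/* Hypothetical inverse: stores the low 32 bits of value, most significant
   byte first. */
void encode_be32(unsigned long value, unsigned char out[4])
{
    out[0] = (unsigned char)((value >> 24) & 0xFF);
    out[1] = (unsigned char)((value >> 16) & 0xFF);
    out[2] = (unsigned char)((value >> 8) & 0xFF);
    out[3] = (unsigned char)(value & 0xFF);
}
```

Calling decode_be32 on { 0, 0, 0, 1 } yields 1 on any host, because the byte order is fixed by the shifts rather than by the machine.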

edited Nov 19, 2016 at 8:53

answered Nov 19, 2016 at 8:47


7 Comments

Nathan Over a year ago

Surely all those bitshift operations will be slower than the memcopy or the reinterpret_cast though? I wanted to avoid calculating the uint, I just wanted to read the uint straight from the memory. I usually program in memory managed languages, now that I'm learning C, I was hoping to take advantage of all that control over the memory.

2501 Over a year ago

@Nathaniel This code will be faster than memcpy on a modern machine. Using the cast is incorrect (and won't be faster anyway).

Nathan Over a year ago

I tested it by timing it with this code: imgur.com/a/qMot0 Recast is the fastest, memcopy takes twice as long, and this bitwise code takes slightly longer than memcopy

2501 Over a year ago

@Nathaniel Your test is invalid because it causes undefined behavior. But a test isn't needed. It can be seen from the generated assembly that my version performs fewer instructions.

Nathan Over a year ago

compiling using "x86-64 gcc 6.2" (godbolt.org) says that if recast takes X instructions, memcopy takes (X-1) instructions, and the bitwise code takes (X+13) instructions.


user3629249 · Accepted Answer · 2016-11-20 07:31:45Z

Regarding the posted code:

char* f = (char*)malloc(4 * sizeof(char));
f[0] = 0;
f[1] = 0;
f[2] = 0;
f[3] = 1;
unsigned int j = *f;
printf("%u\n", j);

In C, the return type of malloc() is void*, which can be assigned to any other object pointer, so the cast just clutters the code and can cause problems when the code is maintained.
The C standard defines sizeof(char) as 1, so multiplying by it has no effect on the value passed to malloc().
The size of an int is not necessarily 4 (think of microprocessors or 64-bit architectures).
The function calloc() presets all the bytes to 0x00.
Which byte should be set to 0x01 depends on the endianness of the underlying architecture.

Let's assume, for now, that your computer has a little-endian architecture (i.e. Intel or similar).

Then the code should look similar to the following:

#include <stdio.h>  // printf(), perror()
#include <stdlib.h> // calloc(), exit(), EXIT_FAILURE

int main( void )
{
    char *f = calloc( 1, sizeof(unsigned int) );
    if( !f )
    {
        perror( "calloc failed" );
        exit( EXIT_FAILURE );
    }

    // implied else, calloc successful

    // f[sizeof(unsigned int)-1] = 0x01; // if big Endian
    f[0] = 0x01;   // assume little Endian/Intel x86 or similar
    unsigned int j = *(unsigned int*)f;
    printf("%u\n", j);
}

On a little-endian machine, the compiled and linked program outputs 1.
