
Let's take for example the following two 1-byte variables:

uint8_t x1 = 0x00;
uint8_t x2 = 0xFF;

When printing the bitwise complement, the result is printed as a 4-byte value:

printf("%02X -> %02X; %02X -> %02X\n", x1, ~x1, x2, ~x2);
00 -> FFFFFFFF; FF -> FFFFFF00

I know this can be "solved" using casting or masking:

printf("%02X -> %02X; %02X -> %02X\n", x1, (uint8_t) ~x1, x2, (uint8_t) ~x2);
00 -> FF; FF -> 00
printf("%02X -> %02X; %02X -> %02X\n", x1, ~x1&0xFF, x2, ~x2&0xFF);
00 -> FF; FF -> 00

But why the non-intuitive behavior in the first place?

2 Comments

  • Because %X is for unsigned int. And no, uint8_t is not a 2-byte variable. Commented Dec 20, 2017 at 19:02
  • Look up "integer promotions in C" to learn what is going on. Commented Dec 20, 2017 at 19:06

2 Answers


Many computer processors have a “word” size for most of their operations. E.g., on a 32-bit machine, there may be an instruction that loads 32 bits, an instruction that stores 32 bits, an instruction that adds one 32-bit number to another, and so on.

On these processors, it may be a nuisance to work with other sizes. There may be no instruction for multiplying a 16-bit number by another 16-bit number. C grew up on these machines. It was designed so that int (or unsigned int) was “whatever size is good for the machine you are running on” and char or short were fine for storing things in memory, but, once they were loaded from memory into processor registers, C worked with them like they were int.

This simplified the development of early C compilers. The compiler did not have to implement your complement by doing a 32-bit complement instruction followed by an AND instruction to remove the unwanted high bits. It only did a plain 32-bit complement.
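
As an illustration (a minimal sketch, reusing x1 from the question and assuming a typical platform where int is 32 bits), the promotion can be observed directly with sizeof:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint8_t x1 = 0x00;

    /* The operand of ~ is promoted to int before the complement is taken,
       so the result has the width of int, not of uint8_t. */
    printf("sizeof x1    = %zu\n", sizeof x1);    /* 1 */
    printf("sizeof (~x1) = %zu\n", sizeof(~x1));  /* typically 4 */
    return 0;
}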

We could develop languages differently today, but C is burdened with this legacy.


1 Comment

Even modern processors are not able to work on words of arbitrary sizes. AFAIK ARM has 16-bit multiplication (which is impossible to use from C), but has no 8-bit multiplication. The same goes for 16- and 8-bit addition, subtraction, and others. Everything has to be promoted to 32-bit values. Intel is the only architecture that deals with different sizes of values. So this is not a legacy; it's how processors are usually built.

When you apply the ~ operator to x1 and x2, the values are first subject to integer promotions because uint8_t is smaller than an int. The operator is then applied to the promoted value.

So ~x1 is really ~0x00000000 (i.e. 0xFFFFFFFF) and ~x2 is really ~0x000000FF (i.e. 0xFFFFFF00), assuming a 32-bit int. That's why you get the values you're getting.
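
A small self-contained check of those values (a sketch assuming a 32-bit two's-complement int, reusing the variables from the question):

#include <assert.h>
#include <stdint.h>

int main(void)
{
    uint8_t x1 = 0x00;
    uint8_t x2 = 0xFF;

    /* Both operands are promoted to int, so the complement covers all 32 bits. */
    assert(~x1 == -1);    /* bit pattern 0xFFFFFFFF */
    assert(~x2 == -256);  /* bit pattern 0xFFFFFF00 */

    /* Masking (or casting) back to 8 bits recovers the expected byte values. */
    assert((~x1 & 0xFF) == 0xFF);
    assert((~x2 & 0xFF) == 0x00);
    return 0;
}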

Also, the %x format specifier expects an unsigned int argument and prints it as such.

You need to use %hhx for the format specifier. That signifies an unsigned char argument.

printf("%02hhX -> %02hhX; %02hhX -> %02hhX\n", x1, ~x1, x2, ~x2);

6 Comments

hhX prints 00 -> FFFF; FF -> FF00
Are you sure you're using hhX and not hX?
@chux It actually might not help in this case. My implementation defined PRIx8 as "x".
@dbush, you're right, apparently hhX on my system yields a too many arguments for format [-Wformat-extra-args] warning
PRIx8 as "x" is OK for the implementation. That is no problem when it prints a uint8_t. This answer is trying to do something different with printf("%02hhX\n", some_int_with_negative_value); which is UB, I think. Hmmmm.