To my knowledge, there isn't much of a difference between an unsigned char and a signed char, besides a signed char ranging from -128 to 127 and an unsigned char ranging from 0 to 255. I'm trying to learn C++ and I've been wondering about this for a while. Also, I was a Lua programmer for a long time, so I'm used to unsigned chars. I'm wondering what the difference is and when you'd rather use a signed char over an unsigned char.

Thanks.

  • Well, if you are doing bit manipulations, unsigned types are preferred. Commented Mar 7, 2017 at 20:18
  • Remember that it's implementation-defined whether plain char is signed or not. So you would use signed char (or unsigned char) if you want a small integer in its range. But I would rather recommend int8_t and uint8_t instead (if available) for generic small integers. Commented Mar 7, 2017 at 20:19
  • Oh, I see. Thanks to both of you for taking the time to reply. Commented Mar 7, 2017 at 20:24
  • A signed char typically ranges from -128 to 127 because two's-complement spends the "extra" value on the negative side. Before C++11, the guaranteed range was only -127 to 127, because it allowed for 8-bit sign-magnitude representations, even though most implementations were two's-complement. Commented Mar 7, 2017 at 20:36
  • unsigned char is C's legacy way of saying "byte". signed char is a small signed integer. One application is mini-normals for 3D geometry, where 3 floats would be too memory-hungry. Commented Mar 7, 2017 at 21:15

1 Answer

As @SomeProgrammerDude explained and you already knew, you specify signed or unsigned explicitly when you wish to use small integral values. On an AMD-64 architecture (which is the most widely used architecture for 64-bit general-purpose CPUs, and is probably the one you have on your laptop), a signed char takes up 1 byte and ranges from -128 to 127, while an unsigned char also takes up 1 byte but ranges from 0 to 255.
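If you want to check these ranges on your own toolchain rather than take them on faith, something like the following should do. It is only a minimal sketch: the helper name `describe_range` and the constant `plain_char_is_signed` are made up for illustration, and the result for plain `char` depends on the implementation, as pointed out in the comments under the question.

```cpp
#include <limits>
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical helper (not part of the original answer): formats the range
// of a small integral type T as "[min, max]". The limits are widened to
// long long so that char types are printed as numbers, not as characters.
// Intended for the char types and other types that fit in long long.
template <typename T>
std::string describe_range() {
    static_assert(std::is_integral<T>::value, "T must be an integral type");
    std::ostringstream out;
    out << '[' << static_cast<long long>(std::numeric_limits<T>::min())
        << ", " << static_cast<long long>(std::numeric_limits<T>::max())
        << ']';
    return out.str();
}

// Whether plain char behaves like signed char or like unsigned char is
// implementation-defined; this constant reports what the compiler chose.
constexpr bool plain_char_is_signed = std::is_signed<char>::value;
```

On a typical AMD-64 build with 8-bit bytes, `describe_range<signed char>()` gives `[-128, 127]` and `describe_range<unsigned char>()` gives `[0, 255]`.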

I would like to push this a little further by showing how using signed or unsigned integral types (such as char, short, int, ...) impacts the final program, and how they are actually implemented. I'll use the example of char, but the principle is identical for other integral types.

Assume we have this small program:

// main.cpp

#include <iostream>

int main() {
    signed char   sc = (signed char)255; // equivalent to sc = -1
    unsigned char uc = 255;

    bool signedComp   = (sc <= 5);
    bool unsignedComp = (uc <= 5);

    return 0;
}

If we have a look at the assembly (the code that is very close to what your CPU actually does), we can observe the difference. Here is the most relevant part of the assembly code:

movb    $-1, -4(%rbp) # sc = -1
movb    $-1, -3(%rbp) # uc = -1 (equivalent to uc = 255)

cmpb    $5, -4(%rbp)  # compare sc and 5, ...
setle   %al           # ... see whether sc was lower or equal (signed comparison), ...
movb    %al, -2(%rbp) # ... and set the boolean result into signedComp.

cmpb    $5, -3(%rbp)  # compare uc and 5, ...
setbe   %al           # ... see whether uc was below or equal (unsigned comparison), ...
movb    %al, -1(%rbp) # ... and set the boolean result into unsignedComp.

(If you are curious and want to generate the assembly yourself, run g++ -S main.cpp -o main.s -O0 and look at the part of the main.s file that follows the main: label.)

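In memory (specifically, on the stack), both sc and uc take up 1 byte. In fact, sc and uc hold exactly the same bit pattern, 0xFF (all eight bits set), which reads as -1 when interpreted as signed and as 255 when interpreted as unsigned. It's the way the comparison is done (setle versus setbe) that makes sc and uc different.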
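The same observation can be made from C++ itself, without reading assembly. The following is a rough sketch under the assumption of a two's-complement machine with 8-bit bytes (as on AMD-64); the helper names `same_byte`, `signed_le_5` and `unsigned_le_5` are invented for this illustration.

```cpp
#include <cstring>

// Hypothetical helper: true when the two one-byte objects are stored
// with identical bits in memory.
inline bool same_byte(signed char sc, unsigned char uc) {
    return std::memcmp(&sc, &uc, 1) == 0;
}

// In the C++ source, both operands are promoted to int before the
// comparison, so sc keeps its signed value (e.g. -1) ...
inline bool signed_le_5(signed char sc) {
    return sc <= 5;
}

// ... while uc keeps its unsigned value (e.g. 255).
inline bool unsigned_le_5(unsigned char uc) {
    return uc <= 5;
}
```

With sc = -1 and uc = 255 on such a machine, `same_byte(sc, uc)` is true, `signed_le_5(sc)` is true, and `unsigned_le_5(uc)` is false: same byte, different interpretation.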

Thus, this:

there isn't much of a difference between an unsigned char and a signed char

... is ironically 100% true.

The lesson to learn from this is that the numbers programmers work with are just conceptual. In the end, it's all about how you work with the 1s and 0s.

6 Comments

A signed char must cover at least the range from -128 to 127, but it can be (and sometimes is) larger. Similarly, an unsigned char must cover at least the range from 0 to 255, but it can be larger. Thus the comment that assigning 255 to a variable of type signed char is "equivalent to sc = -1" may be correct for most systems, but it is not required, and there are systems (with larger char types) for which it is not true.
@PeteBecker «In the end, it's all about how you work with the 1s and 0s.» QED. Thus it is proven.
@PeteBecker Yes, there are platforms where they aren’t equivalent, but the answer specifically says it’s for “AMD-64 architecture”. It’s also true for most other architectures you see, such as ARM.
@DanielH I specified "for AMD-64" after @PeteBecker's comment. By the way, I'm currently a CS engineering student, and for our first-year project, we had to program an ATMega324PA microcontroller using avr-gcc, a tweak of gcc/g++ for microcontrollers. On this compiler (or at least, for that specific microcontroller), an int's size was 2 bytes. But if you're making big software that won't ever fit on a different architecture anyway, I don't see the downside of using unsigned char rather than uint8_t (though unsigned char is a bit verbose). I guess it's a matter of habit and viewpoint.
@martinkunev -- regardless of how a signed char is represented, the language definition requires that it be capable of representing all integral values from -127 to 127 inclusive (my earlier comment incorrectly says that the lower bound is -128). Larger ranges are allowed. That's a little hard to find, but it comes from the specifications of SCHAR_MIN and SCHAR_MAX in <limits.h>.
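For anyone who wants to see what their own implementation provides, the limits mentioned in the last comment can be inspected at compile time. This is just a sketch: the function name `has_extra_negative_value` is hypothetical, and the `static_assert` lines only restate the minimum guarantees described in that comment.

```cpp
#include <climits>

// Minimum guarantees described in the comment above: these should hold
// on every conforming implementation.
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");
static_assert(SCHAR_MIN <= -127, "signed char reaches at least -127");
static_assert(SCHAR_MAX >= 127, "signed char reaches at least 127");
static_assert(UCHAR_MAX >= 255, "unsigned char reaches at least 255");

// Hypothetical helper: true when signed char has one more negative value
// than positive values (e.g. -128..127), as on two's-complement systems.
constexpr bool has_extra_negative_value() {
    return SCHAR_MIN == -SCHAR_MAX - 1;
}
```

On AMD-64 with gcc, `has_extra_negative_value()` should be true, since SCHAR_MIN is -128 and SCHAR_MAX is 127 there.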