-4

I'm working on code examples that demonstrate how you can "shoot yourself in the foot" using pointers in C++.

It's easy to create code that crashes. But now I'm trying to write code that would change the value of a constant, and it's not working.

Here's the sample code:

#include <iostream>
using namespace std;

int main()
{
    int first = 1;
    int second = 2;
    const int the_answer = 42;
    int third = 3;
    int fourth = 4;

    cout << "First  : " << &first << " -> " << first << endl;
    cout << "Second : " << &second << " -> " << second << endl;
    cout << "TheAns : " << &the_answer << " -> " << the_answer << endl;
    cout << "Third  : " << &third << " -> " << third << endl;
    cout << "Fourth : " << &fourth << " -> " << fourth << endl << endl;

    for (int *pc = &second; pc < &third; pc++) {
        *pc = 33;
        cout << pc << "->" << * pc << endl;
    }

    cout << "First  : " << &first << " -> " << first << endl;
    cout << "Second : " << &second << " -> " << second << endl;
    cout << "TheAns : " << &the_answer << " -> " << the_answer << endl;
    cout << "Third  : " << &third << " -> " << third << endl;
    cout << "Fourth : " << &fourth << " -> " << fourth << endl << endl;

    return 0;
}

I can see in the output that the contents at the address of the constant (0x56F40FF574) get overwritten:

First  : 00000056F40FF534 -> 1
Second : 00000056F40FF554 -> 2
TheAns : 00000056F40FF574 -> 42
Third  : 00000056F40FF594 -> 3
Fourth : 00000056F40FF5B4 -> 4

00000056F40FF554->33
00000056F40FF558->33
00000056F40FF55C->33
00000056F40FF560->33
00000056F40FF564->33
00000056F40FF568->33
00000056F40FF56C->33
00000056F40FF570->33
00000056F40FF574->33       <---
00000056F40FF578->33
00000056F40FF57C->33
00000056F40FF580->33
00000056F40FF584->33
00000056F40FF588->33
00000056F40FF58C->33
00000056F40FF590->33
First  : 00000056F40FF534 -> 1
Second : 00000056F40FF554 -> 33
TheAns : 00000056F40FF574 -> 42
Third  : 00000056F40FF594 -> 3
Fourth : 00000056F40FF5B4 -> 4

I stepped through the code with the debugger, and I saw the value of the constant the_answer change in the "Locals" window. But then cout displays the original value.

  • Changing a const object is undefined behavior, anything can happen. In particular, the compiler is free to assume a const object never changes and optimize as such. Besides that, your pointer comparison (pc < &third) is also undefined behavior. Commented Nov 11, 2024 at 4:16
  • Your code is making several assumptions, and NONE of them are guaranteed to be true. There is NO GUARANTEE that first, second, the_answer, third, and fourth are located near each other in memory, but your code ASSUMES they are located consecutively. You are incorrectly assuming that you can iterate using pointers to modify those variables - whereas the loop has undefined behaviour. You are assuming incorrectly that a variable marked const exists in memory - but there is no guarantee of that. Commented Nov 11, 2024 at 6:22
  • I also suggest you read the as-if rule. Nothing stops the compiler from assuming "you have a const variable, thus ignore any changes to it" when the object code is produced. Commented Nov 11, 2024 at 7:09
  • @PaulMcKenzie Programs with undefined behaviour are specifically exempt from the as-if rule (or was that your point?). Commented Nov 11, 2024 at 7:16
  • Surely the fact that you can write code that apparently changes the value of a const variable but then that variable has not changed proves that you have 'shot yourself in the foot'. Commented Nov 11, 2024 at 7:18

3 Answers

3

As you can see in cppreference:

Modifying a const object through a non-const access path ... results in undefined behavior.

(emphasis is mine)

When you invoke undefined behavior, the standard gives no guarantee regarding the behavior of the program. It could crash, give an invalid result, or even give the expected result, but you cannot rely on any of these.

Also note that the compiler is allowed to assume undefined behavior does not happen, and e.g. optimize away access to the const object (and emit the value it was initialized with when you attempt to access it).
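
To make the cppreference rule concrete, here is a minimal sketch of the well-defined counterpart: what matters is whether the object itself was declared const, not whether the pointer used to reach it is const. The helper names overwrite and overwrite_mutable_object are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Hypothetical helper (not from the question): writes through a
// pointer-to-const by casting the const away.
// Precondition: *p must refer to an object that was NOT declared const.
inline void overwrite(const int* p, int value) {
    *const_cast<int*>(p) = value;
}

// Well-defined: the object is mutable, only the access path is const.
inline int overwrite_mutable_object() {
    int counter = 1;
    const int* view = &counter;  // const view of a non-const object
    overwrite(view, 33);         // fine, counter itself is not const
    assert(counter == 33);
    return counter;
}

// By contrast, given `const int the_answer = 42;`, calling
// overwrite(&the_answer, 33) would be undefined behavior, so it is
// deliberately not done here.
```

Calling overwrite_mutable_object() returns 33. The same cast applied to an object that was declared const compiles just as happily, which is exactly why it is a footgun.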

Another issue is that your local variables are not guaranteed to be laid out sequentially in memory. In particular, iterating over the addresses as you do in the for loop invokes undefined behavior as well, and the result of comparing pc < &third is not something you can rely on.


2

Your code has multiple issues, and what you observe is the result of undefined behavior. Unless you appreciate the importance of undefined behavior and carefully avoid it, your C++ code will not do what you naively expect it to do. C++ is not just glorified assembly (nor is C). It has its own very strict rules.

that demonstrate how you can "shoot yourself in the foot" using pointers in C++.

Yes, you have just demonstrated some footguns in C++, although not your expected ones.

Let's check what the C++ standard has to say.

In dcl.type.cv:

Any attempt to modify ([expr.ass], [expr.post.incr], [expr.pre.incr]) a const object ([basic.type.qualifier]) during its lifetime ([basic.life]) results in undefined behavior.

followed by some examples. In your case, even if your *pc=33 does an assignment to the object the_answer (actually it does not, and we'll soon see why), it is undefined behavior because you are attempting to modify a const object within its lifetime.

In any case, a compiler is free to assume that the value of the object the_answer never changes because it is a const object, and is free to optimize

cout << "TheAns : " << &the_answer << " -> " << the_answer << endl;

to

cout << "TheAns : " << &the_answer << " -> " << 42 << endl;

If this optimization is performed (it is a well-known technique among compiler implementers, called constant propagation), it is quite natural that you can observe the memory at &the_answer change while the output is still 42.
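
One way to see that the compiler already knows the value without reading memory is that a const int initialized with a constant can be used where a compile-time constant is required. The following is only a sketch; the function name answer_slots is made up for this example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical demo: the_answer is usable in constant expressions, so the
// compiler can substitute 42 wherever its value is read.
inline std::size_t answer_slots() {
    const int the_answer = 42;

    // Checked at compile time; no memory read of the_answer is involved.
    static_assert(the_answer == 42, "the value is known at compile time");

    // Array bounds must be compile-time constants in standard C++.
    int slots[the_answer] = {};
    return sizeof(slots) / sizeof(slots[0]);
}
```

Here answer_slots() returns 42. None of this depends on what bytes happen to sit at &the_answer at run time, which matches the output seen in the question.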

Now that we have clarified this part, let's check other undefined behaviors in your code.

In C++, you cannot compare two arbitrary pointers using the built-in relational operators and get a reliable result. You may expect them to simply compare the addresses, but that is not what the standard says. Let's check expr.rel:

The result of comparing unequal pointers to objects is defined in terms of a partial order consistent with the following rules:

  • If two pointers point to different elements of the same array, or to subobjects thereof, the pointer to the element with the higher subscript is required to compare greater.
  • If two pointers point to different non-static data members of the same object, or to subobjects of such members, recursively, the pointer to the later declared member is required to compare greater provided neither member is a subobject of zero size and their class is not a union.
  • Otherwise, neither pointer is required to compare greater than the other.

If two operands p and q compare equal ([expr.eq]), p<=q and p>=q both yield true and p<q and p>q both yield false. Otherwise, if a pointer to object p compares greater than a pointer q, p>=q, p>q, q<=p, and q<p all yield true and p<=q, p<q, q>=p, and q>p all yield false. Otherwise, the result of each of the operators is unspecified.

In particular, that means an implementation can give you the result that both pc>&third and pc<&third are true, or that both are false. This is not undefined behavior, only unspecified. If you really mean to compare the addresses, you may use reinterpret_cast<std::uintptr_t>(p) < reinterpret_cast<std::uintptr_t>(q), which makes the result implementation-defined (note that the standard does not guarantee that the reinterpret_casts give you the addresses either). Or you can use std::less{}(p, q), which at least guarantees a total order (so std::less{}(p, q) and std::less{}(q, p) can never both be true).
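
Both alternatives to the built-in operators can be sketched as follows. The function names are hypothetical, and std::less<const int*> is spelled out instead of std::less{} only so that the sketch also compiles before C++17.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// std::less on pointers yields a strict total order, even when the
// pointers refer to unrelated objects.
inline bool ordered_before(const int* p, const int* q) {
    return std::less<const int*>{}(p, q);
}

// Comparing the integer representations: the pointer-to-integer mapping is
// implementation-defined, and std::uintptr_t is formally an optional type.
inline bool address_before(const int* p, const int* q) {
    return reinterpret_cast<std::uintptr_t>(p) <
           reinterpret_cast<std::uintptr_t>(q);
}

// For two distinct objects, exactly one of the two orderings must hold.
inline bool exactly_one_order_holds() {
    int third = 3;
    int fourth = 4;
    const bool forward = ordered_before(&third, &fourth);
    const bool backward = ordered_before(&fourth, &third);
    assert(!ordered_before(&third, &third));  // irreflexive
    return forward != backward;
}
```

Which of the two locals comes first is still up to the implementation; the only guarantee used here is that the order is consistent.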

Now let's further assume your compiler does the expected thing and compares the addresses. Then we run into undefined behavior related to pointer arithmetic.

The expression pc++ is equivalent to pc = pc + 1 here, so let's check the validity of that. expr.add says:

When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P.

  • If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value.
  • Otherwise, if P points to a (possibly-hypothetical) array element i of an array object x with n elements ([dcl.array]), the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i+j of x if 0≤i+j≤n and the expression P - J points to the (possibly-hypothetical) array element i−j of x if 0≤i−j≤n.
  • Otherwise, the behavior is undefined.

That means that, except for adding 0 to nullptr, pointer arithmetic is only valid within a single array (including its past-the-end pointer). For this purpose, an object that is not an array element is treated as the sole element of an array of size one. Since pc originally points to that single element, after pc++ it becomes the past-the-end pointer of the hypothetical array. A second pc++ would already be undefined, because it would step beyond the past-the-end position. The pointer arithmetic is fine so far.
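
The valid range of pointer arithmetic can be illustrated with a short sketch. The helper sum_range and the two demo functions are names invented for this example; the one-past-the-end pointer is formed and compared, but never dereferenced.

```cpp
#include <cassert>

// Sums the half-open range [first, last). The caller must pass pointers
// into (or one past the end of) the same array.
inline int sum_range(const int* first, const int* last) {
    int total = 0;
    for (const int* p = first; p != last; ++p) {
        total += *p;
    }
    return total;
}

// A single object behaves like an array of one element:
// &second + 1 is a valid past-the-end pointer.
inline int single_object_demo() {
    int second = 2;
    return sum_range(&second, &second + 1);
}

// Within a real array, every position from values to values + 4 is valid.
inline int array_demo() {
    int values[4] = {1, 2, 3, 4};
    return sum_range(values, values + 4);
}
```

single_object_demo() returns 2 and array_demo() returns 10. Forming &second + 2, on the other hand, would already be undefined behavior, even without dereferencing it.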

However, expr.unary.op states,

The unary * operator performs indirection. Its operand shall be a prvalue of type “pointer to T”, where T is an object or function type. The operator yields an lvalue of type T. If the operand points to an object or function, the result denotes that object or function; otherwise, the behavior is undefined except as specified in [expr.typeid].

and basic.compound has this to say:

A value of a pointer type that is a pointer to or past the end of an object represents the address of the first byte in memory ([intro.memory]) occupied by the object or the first byte in memory after the end of the storage occupied by the object, respectively.

[Note 2: A pointer past the end of an object ([expr.add]) is not considered to point to an unrelated object of the object's type, even if the unrelated object is located at that address. — end note]

In particular, Note 2 rules out the possibility of using *pc to refer to the object the_answer even if it happens to be at the same address (which is, again, not guaranteed at all; see footnote 1). So this makes your *pc = 33 (after incrementing pc) undefined behavior by itself, regardless of whether the_answer is const or not, and regardless of whether printing &the_answer and pc shows you the same address.

Since your code has at least two instances of undefined behavior, there is no guarantee about the observed behavior of your program. While most answers merely state that modifying a const object is UB (which it is), this answer explains why *pc=33 is not even a valid attempt to modify the_answer (*pc does not refer to the_answer after pc is incremented).
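
If the goal is to show a pointer loop running over neighbouring values, a well-defined variant keeps all the values in one array, so that both the increments and the comparison stay inside a single object. This is a sketch under that assumption, not a fix for the original program: the function name overwrite_middle is hypothetical, and no element is const.

```cpp
#include <array>
#include <cassert>

// Overwrites elements 1 and 2 of a five-element array with 33.
// All pointers stay within the same array, so pc++ and pc < stop are
// both well-defined.
inline std::array<int, 5> overwrite_middle() {
    std::array<int, 5> values{{1, 2, 42, 3, 4}};
    int* const stop = values.data() + 3;  // plays the role of &third
    for (int* pc = values.data() + 1; pc < stop; ++pc) {
        *pc = 33;
    }
    return values;
}
```

The returned array holds 1, 33, 33, 3, 4: the element that started out as 42 really is overwritten, because here it is an ordinary int that the loop is allowed to reach.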


1. As @Peter comments, the_answer may not even exist in memory when it is never odr-used. In your case, std::cout << &the_answer counts as an odr-use of the_answer, so it is guaranteed to be in memory, but << the_answer is not an odr-use and the compiler is free not to emit any memory read instructions.

2 Comments

It seems that the only thing this code proves is that no matter what, the answer remains 42, lol.
Thank you for the long and detailed answer. I can see you put a lot of thought into it. I hope you get more upvotes.
1

There are three different categories of const objects.

  1. Immutable objects that live in ROM, or in RAM marked as read-only. Attempting to alter them is undefined behavior.
  2. Objects in ordinary RAM. The memory at that location can physically be overwritten, but attempting to alter the object is still undefined behavior.
  3. const members of a non-const class object. These can be altered only by replacing the object they are inside of.

Examples:

const int c1{1};
struct A { const int c3{3}; };

int main(){
    const int c2{2};
    A a;
}

c1 may be stored in memory that isn't writable. In that case it is truly immutable and attempts to alter it will fail. Altering it is also undefined behavior.

c2 is typically stored on the stack. The compiler assumes it will never change. Attempting to alter it is undefined behavior.

a.c3 cannot be altered directly. It can only be changed by replacing the object a that contains c3, for instance with std::construct_at(&a, A{42}); (C++20). A more general solution is a user-defined copy assignment operator for classes that contain const or reference members.
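
The replacement idea for a.c3 can be sketched with placement new, which is the mechanism std::construct_at wraps in C++20. This is a sketch under a few assumptions: the function name replace_and_read is hypothetical, A is trivially destructible so no explicit destructor call is needed, and the new object is read through the pointer returned by new so that the code does not rely on the C++20 rules for reusing the old name.

```cpp
#include <cassert>
#include <new>

struct A { const int c3{3}; };

// Replaces the whole object `a` with a new A whose const member is 42,
// then reads the member from the newly created object.
inline int replace_and_read() {
    A a;                 // non-const object with a const member
    assert(a.c3 == 3);

    // Reuse a's storage for a brand-new A. This ends the old object's
    // lifetime and starts a new one; c3 is never assigned to.
    A* fresh = ::new (static_cast<void*>(&a)) A{42};
    return fresh->c3;
}
```

replace_and_read() returns 42. This only works because a itself is not const; doing the same to the storage of a const complete object such as c1 or c2 would again be undefined behavior.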
