In modern C++ (at least C++ 2011 and later) neither is undefined behavior. And even neither is implementation defined or unspecified. (All three terms are different things.)
These two lines are both well defined (but they do different things).
When you have pointers p1 and p2 to scalar types then
*++p1 = *p2++;
is equivalent to
p1 = p1 + 1;
*p1 = *p2;
p2 = p2 + 1;
(^^^this is also true for C++ 1998/2003)
and
++*p1 = *p2++;
is equivalent to
*p1 = *p1 + 1;
*p1 = *p2;
p2 = p2 + 1;
(^^^maybe also in C++ 1998/2003 or maybe not - as explained below)
Obviously in case 2 incrementing value and then assigning to it (thus overwriting just incremented value) is pointless - but there may be similar examples that make sense (e.g. += instead of =).
BUT like many people point out - just don't write the code that looks ambiguous or unreasonably complex. Write the code that is clear to you and supposed to be clear to the readers.
Old C++ 1998/2003 case for second expression is a strange matter:
At first after reading the description of prefix increment operator:
ISO/IEC 14882-2003 5.3.2:
- The operand of prefix ++ is modified by adding 1, or set to true if it
is bool (this use is deprecated). The operand shall be a modifiable
lvalue. The type of the operand shall be an arithmetic type or a
pointer to a completely-defined object type. The value is the new
value of the operand; it is an lvalue. If x is not of type bool, the
expression ++x is equivalent to x+=1.
I personally have a strong feeling that everything is perfectly defined and obvious and the same as above for C++ 2011 and later.
At least in the sense that every reasonable C++ implementation will behave in exact same well defined way (including old ones).
Why it should be otherwise if we always intuitively rely on a general rule that in any simple operator evaluation within a complex expression we evaluate its operands first and after that apply the operator to the values of those operands. Right? Breaking this intuitive expectation would be extremely stupid for any programming language.
So for the full expression ++*p1 = *p2++; we have operands: 1 - ++*p1 evaluated as already incremented lvalue (as defined in the above quote from C++ 2003) and 2 - *p2++ that is an rvalue stored at pointer p2 before its increment. It doesn't look ambiguous at all. Of course in this case - no reason to increment a value you are overwriting anyway BUT if there was double increment instead - ++(++*p1); OR other kind of assignment like +=/-=/&=/*=/etc instead of simple assignment THAT would not be unreasonable at all.
Unfortunately all the intuition and logic is messed up by this:
ISO/IEC 14882-2003 - 5 Expressions:
- Except where noted, the order of evaluation of operands
of individual operators and subexpressions of individual
expressions, and the order in which side effects
take place, is unspecified. Between the previous
and next sequence point a scalar object shall have its
stored value modified at most once by the evaluation
of an expression. Furthermore, the prior value shall be
accessed only to determine the value to be stored.
The requirements of this paragraph shall be met for each
allowable ordering of the subexpressions of a full
expression; otherwise the behavior is undefined.
[Example:
i = v[i++]; // the behavior is unspecified
i = 7, i++, i++; // i becomes 9
i = ++i + 1; // the behavior is unspecified
i = i + 1; // the value of i is incremented
—end example]
So this wording if interpreted in a paranoid way seems to imply that modification of a value stored in a specific location more than once without intervening sequence point is explicitly forbidden by this rule and the last sentence declares that failing to comply with every requirement is Undefined Behavior. AND our expression seems to modify the same location more that once (?) with no sequence point until the full expression evaluated. (This arbitrary and unreasonable limitation is reinforced further by example 3 - i = ++i + 1; though it says // the behavior is unspecified - not undefined as in the wording before - which only adds more confusion.)
BUT on the other hand... If we ignore the example 3. (Maybe i = ++i + 1; is a typo and there should have been postfix increment instead - i = i++ + 1;? Who knows... Anyway examples are not part of formal specification.) If we interpret this wording in the most permissive way - we can see that in each allowed order of evaluation of subexpressions of the whole expression - preincrement ++*p1 must be evaluated to an LVALUE (which is something that allows further modification) BEFORE applying assignment operator so the only valid final value at that location is the one that is stored with assignment operator. ALSO NOTE that conforming C++ implementation have no obligation to actually modify that location more than once and may instead store only final result - that is both reasonable optimization allowed by the standard and may be actual demand of this article.
Which one of those interpretations is correct? Paranoid or permissive? Universally applicable logic or some suspicious and ambiguous words in a document almost nobody really ever read? Blue pill or Red pill?
Who knows... It looks like a gray area that requires less ambiguous explanation.
If we interpret the quote from C++ 2003 standard above in a paranoid way then it looks like this code may be Undefined Behavior:
#include <iostream>
#define INC(x) (++(x))
int main()
{
int a = 5;
INC(INC(a));
std::cout << a;
return 0;
}
while this code is perfectly legitimate and well defined:
#include <iostream>
template<class T> T& INC(T& x) // sequence point after evaluation of the arguments
{ // and before execution of the function body
return ++x;
}
int main()
{
int a = 5;
INC(INC(a));
std::cout << a;
return 0;
}
Really?
All this looks very much like a defect of the old C++ standard.
Fortunately this has been addressed in newer C++ standards (starting with C++ 2011) as there is no such concept as sequence point anymore. Instead there is a relation - something sequenced before something. And of course the natural guarantee that evaluation of the argument expressions of any operator is sequenced before evaluation of the result of the operator is there.
ISO/IEC 14882-2011 - 1.9 Program execution
Sequenced before is an asymmetric, transitive, pair-wise relation between evaluations executed by a single thread (1.10), which induces
a partial order among those evaluations. Given any two evaluations A
and B, if A is sequenced before B, then the execution of A shall
precede the execution of B. If A is not sequenced before B and B is
not sequenced before A, then A and B are unsequenced. [ Note: The
execution of unsequenced evaluations can overlap. — end note ]
Evaluations A and B are indeterminately sequenced when either A is
sequenced before B or B is sequenced before A, but it is unspecified
which. [ Note: Indeterminately sequenced evaluations cannot overlap,
but either could be executed first. — end note ]
Every value computation and side effect associated with a full-expression is sequenced before every value computation and side
effect associated with the next full-expression to be evaluated.
Except where noted, evaluations of operands of individual operators
and of subexpressions of individual expressions are unsequenced. [
Note: In an expression that is evaluated more than once during the
execution of a program, unsequenced and indeterminately sequenced
evaluations of its subexpressions need not be performed consistently
in different evaluations. — end note ] The value computations of the
operands of an operator are sequenced before the value computation of
the result of the operator. If a side effect on a scalar object is
unsequenced relative to either anotherside effect on the same scalar
object or a value computation using the value of the same scalar
object, the behavior is undefined.
[ Example:
void f(int, int);
void g(int i, int* v) {
i = v[i++]; // the behavior is undefined
i = 7, i++, i++; // i becomes 9
i = i++ + 1; // the behavior is undefined
i = i + 1; // the value of i is incremented
f(i = -1, i = -1); // the behavior is undefined
}
— end example ]
(Also NOTE how C++ 2003 prefix increment example i = ++i + 1; is replaced by postfix increment example i = i++ + 1; in this C++ 2011 quote. :) )
++,*and=in one horribly complex and hard-to-read line...std::string a = "Hello"; std::string b = "World"; a = b;