18

Consider the following code:

#include <iostream>

struct Data
{
    int x, y;
};

Data fill(Data& data)
{
    data.x=3;
    data.y=6;
    return data;
}

int main()
{
    Data d=fill(d);
    std::cout << "x=" << d.x << ", y=" << d.y << "\n";
}

Here d is copy-initialized from the return value of fill(), but fill() writes to d itself before returning its result. What I'm concerned about is that d is non-trivially used before being initialized, and use of uninitialized variables in some(all?) cases leads to undefined behavior.

So is this code valid, or does it have undefined behavior? If it's valid, will the behavior become undefined once Data stops being POD or in some other case?

10
  • 7
    This is uggly, who do you do this ?` Commented Nov 11, 2015 at 11:20
  • 3
    Why not just give Data a proper constructor? Also, see here: stackoverflow.com/questions/4782757/… Commented Nov 11, 2015 at 11:27
  • 2
    Related question: Is passing a C++ object into its own constructor legal? this is not a constructor but the arguments there are all similar for this case as well. Commented Nov 11, 2015 at 12:31
  • 2
    @MichaelWalz I agree this is ugly, but it's a much simplified version of a thing I only recently understood how to implement properly. So now it's just a matter of curiosity whether it was valid. Commented Nov 11, 2015 at 13:04
  • 2
    @ShafikYaghmour that's an exact phrase among those in bounty reasons "this question didn't receive enough attention" or something like this ;) You can consider offering a bounty if you really feel so. Commented Nov 16, 2015 at 13:05

3 Answers 3

14

This does not seem like valid code. It is similar to the case outlined in the question: Is passing a C++ object into its own constructor legal?, although in that case the code was valid. The mechanics are not identical but the base reasoning can at least get us started.

We start with defect report 363 which asks:

And if so, what is the semantics of the self-initialization of UDT? For example

 #include <stdio.h>

 struct A {
        A()           { printf("A::A() %p\n",            this);     }
        A(const A& a) { printf("A::A(const A&) %p %p\n", this, &a); }
        ~A()          { printf("A::~A() %p\n",           this);     }
 };

 int main()
 {
  A a=a;
 }

can be compiled and prints:

A::A(const A&) 0253FDD8 0253FDD8
A::~A() 0253FDD8

and the proposed resolution was:

3.8 [basic.life] paragraph 6 indicates that the references here are valid. It's permitted to take the address of a class object before it is fully initialized, and it's permitted to pass it as an argument to a reference parameter as long as the reference can bind directly. [...]

So although d is not fully initialized we can pass it as a reference.

Where we start to get into trouble is here:

data.x=3;

The draft C++ standard section 3.8(The same section and paragraph the defect report quotes) says (emphasis mine):

Similarly, before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any glvalue that refers to the original object may be used but only in limited ways. For an object under construction or destruction, see 12.7. Otherwise, such a glvalue refers to allocated storage (3.7.4.2), and using the properties of the glvalue that do not depend on its value is well-defined. The program has undefined behavior if:

  • an lvalue-to-rvalue conversion (4.1) is applied to such a glvalue,

  • the glvalue is used to access a non-static data member or call a non-static member function of the object, or

  • the glvalue is bound to a reference to a virtual base class (8.5.3), or

  • the glvalue is used as the operand of a dynamic_cast (5.2.7) or as the operand of typeid.

So what does access mean? That was clarified with defect report 1531 which defines access as:

access

to read or modify the value of an object

So fill accesses a non-static data member and hence we have undefined behavior.

This also agrees with section 12.7 which says:

[...]To form a pointer to (or access the value of) a direct non-static member of an object obj, the construction of obj shall have started and its destruction shall not have completed, otherwise the computation of the pointer value (or accessing the member value) results in undefined behavior.

Since you are using a copy anyway you might as well create an instance of Data inside of fill and initialize that. The you avoid having to pass d.

As pointed out by T.C. it is important to explicitly quote the details on when lifetime starts. From section 3.8:

The lifetime of an object is a runtime property of the object. An object is said to have non-trivial initialization if it is of a class or aggregate type and it or one of its members is initialized by a constructor other than a trivial default constructor. [ Note: initialization by a trivial copy/move constructor is non-trivial initialization. — end note ] The lifetime of an object of type T begins when:

  • storage with the proper alignment and size for type T is obtained, and

  • if the object has non-trivial initialization, its initialization is complete.

The initialization is non-trivial since we are initializing via the copy constructor.

Sign up to request clarification or add additional context in comments.

7 Comments

This is missing a quote from 3.8/1 on when d's lifetime starts.
@T.C. Fair point, I kind of left it implied by reference to the defect report but I should make it explicit. I won't be able to edit it in for a while though.
@M.M it is unclear to me, I think they both apply. The defect report with a similar situation cites 3.8 but 3.8 cites 12.7 but it seems like they are saying the same thing in a different way. It seems like some fixing is needed in the wording.
@M.M As I see it, the object isn't under construction until its constructor begins execution. The access here happens before that, while evaluating arguments to be passed to the constructor.
I concur with your conclusion. I would also point out that if x and y were of some non-POD type with an assignment operator that depended on the existing value, then calling that assignment operator (as you do in fill) is bound to fail (because x and y haven't been constructed yet). Given it can't work with complex classes, I'd feel very nervous about using it with POD classes. (Even if the standard were to allow it, this sort of edge case is where compiler bugs are likely to lurk.)
|
2

I don't see a problem. Accessing the uninitialized integer members is valid, because you're accessing for the purpose of writing. Reading them would cause UB.

2 Comments

Could you comment on why Shafik's answer is wrong, as it seems to contradict your answer?
@Ruslan: Omitting Standardese, this->x = 3 would be legal in a constructor of Data, even if x is not initialized by the initializer list. This hinges on x not having a (default) constructor, as it's a plain int, because for members with a constructor will have been initialized by that point. Fundamentally, "lifetime" is a fuzzy problem for POD's. *(int*) malloc(sizeof(int)) = 5 has to be valid for C compatibility, which shows that storage is sufficient. This mirrors Shaik's last quote from 3.8. Data::x has the right storage and trivial initialization, so the write is OK.
-7

I think it is valid ( crazy, but valid ).

This would be both legal and logically acceptable :

Data d ;

d = fill( d ) ;

and the fact is that this form is the same :

Data d = fill( d ) ;

As far as the logical structure of the language is concerned those two versions are equivalent.

So it's legal and logically correct for the language.

However, as we normally expect people to initialize variables to a default when we created them ( for safety ), it is bad programming practice.

It is interesting that g++ -Wall compiles this code without a blurp.

11 Comments

Those forms are not the same, and they are not equivalent "as far as the logical structure of the language is concerned". One is initialization and one is assignment. Also, an answer to a language-lawyer question should include references to the language standard to support the argument presented.
Well, you're mistaken.
Because the C++ standard says that it is initialization. This is really basic stuff. Here is an intro, read the C++ standard section 8.5.4 for a full description. You're out of your depth here.
"in fact the variable d had to be initialized ( even it it was with garbage ) before that functions result is produced." - no it didn't. That's the whole point of this question, fill(d) uses d before its initialization even starts. See Shafik's answer for supporting evidence.
BTW if you're going to say things like "at least have the common courtesy to say why [the downvote]", then you should accept explanations. Probably people don't explain their downvotes because they don't want to get into lame "arguments" like this where you just insist you were right all along and don't listen.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.