38

I'd like to replace the attributes of a dataclass instance, analogous to namedtuple._replace(), i.e. making an altered copy of the original object:

from dataclasses import dataclass
from collections import namedtuple

U = namedtuple("U", "x")

@dataclass
class V:
    x: int

u = U(x=1)
u_ = u._replace(x=-1)
v = V(x=1)

print(u)
print(u_)
print(v)

This returns:

U(x=1)
U(x=-1)
V(x=1)

How can I mimic this functionality in dataclass objects?

5 Answers

62

The dataclasses module has a helper function for field replacement on instances, dataclasses.replace (see the docs):

from dataclasses import replace

Usage differs from collections.namedtuple, where the functionality is provided by a method on the generated type. (Side note: namedtuple._replace is documented, public API. The author called the leading underscore in its name a "regret"; see the link at the end of this answer.)

>>> from dataclasses import dataclass, replace
>>> @dataclass
... class V:
...     x: int
...     y: int
...     
>>> v = V(1, 2)
>>> v_ = replace(v, y=42)
>>> v
V(x=1, y=2)
>>> v_
V(x=1, y=42)

For more background on the design, see the PyCon 2018 talk "Dataclasses: The code generator to end all code generators". It discusses the replace API in depth, covers other design differences between namedtuple and dataclasses, and shows some performance comparisons.
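As a minimal sketch of how replace behaves (the class name Point and its fields are made up for illustration): replace builds a new instance by calling __init__, so it also works on frozen dataclasses and leaves the original untouched. A keyword that is not a field is forwarded to __init__ and rejected there with a TypeError.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Point:  # hypothetical example class, not from the question
    x: int
    y: int


p = Point(1, 2)
q = replace(p, y=42)  # new instance; p itself is not modified

print(p)       # Point(x=1, y=2)
print(q)       # Point(x=1, y=42)
print(q is p)  # False

error = None
try:
    replace(p, z=0)  # z is not a field of Point
except TypeError as exc:
    error = exc
    print("rejected:", exc)
```

Because the new object goes through __init__, any __post_init__ logic on the class runs again for the copy.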


8 Comments

It seems like someone discovered issues with init and post-init hooks in dataclasses, and instead of revisiting the design and resolving complexity, they chose to solve it just by adding complexity. The real story is that if you are leveraging dataclasses in some way where they aren't treated as completely logic-free containers, you're using them wrong and you need a different tool. deepcopy of a dataclass, for example, should have absolutely zero risk of doing anything besides simplistic deepcopy of each member attribute, so there is no least surprise issue for the user.
replace is pretty useful when having (pseudo-)immutable objects, such as frozen dataclasses. They are very common in functional programming where you don't mutate the original object, but instead return a new object with all fields equal except the ones you replace.
I sort of want to have replace as a method rather than a function, because calling it as a function seems to embed the assumption that something is a dataclass in the calling code.
@wim and @hugovdberg coming back to this a long time later. Wim, validation logic like that ought never be internal to the container itself. Make a helper function, validate_triangle - don't bloat a simple record object with responsibilities for self processing. If you need it for some (dubious) OO reason, then use a class. Almost any place would be better for that validation logic than as part of obtuse instance creation of something any user of such an object is likely to assume is just a basic data record. It reminds me of @property, which is also frequently a bad choice & overused.
@ely That is how you see it, but I don't read a reason. For me, dataclasses are a powerful and beautiful way to avoid boilerplate code around initialization, access and contracts of attributes.
1

Using replace alone keeps references to the original instance's mutable objects, so the two dataclass instances will share state.

So try something like this:

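import dataclasses
from copy import deepcopy

@dataclasses.dataclass(frozen=True)
class MyDataClass:
    mutable_object: list
    val: int

    def copy(self, **changes):
        # Deep-copy first so the new instance does not share mutable fields.
        return dataclasses.replace(deepcopy(self), **changes)

data = MyDataClass([], 1)
data2 = data.copy(val=2)
# Compare identity: two empty lists are equal, so a != check would fail here.
assert data.mutable_object is not data2.mutable_object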
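To make the shared-state point concrete, here is a small sketch (the Basket class and its fields are hypothetical). replace passes the existing field values to __init__ unchanged, so a list field on the copy is the same list object as on the original. This only matters when a field holds a mutable object that is later modified in place.

```python
from copy import deepcopy
from dataclasses import dataclass, replace


@dataclass
class Basket:  # hypothetical example class
    items: list
    owner: str


a = Basket(["apple"], "ann")
b = replace(a, owner="bob")   # shallow: b.items is the very same list as a.items

a.items.append("pear")        # mutating through a is visible through b
print(b.items)                # ['apple', 'pear']

c = replace(deepcopy(a), owner="cy")  # deep copy first, then replace
a.items.append("fig")
print(c.items)                # ['apple', 'pear'] (unaffected by the later append)
```

If every field is immutable (ints, strings, tuples of immutables), plain replace is enough and the deepcopy only adds cost.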

1 Comment

Can you explain a bit more what you mean by the reference pointer to a mutable object? When is this important?
0

I know the question is about dataclass, but if you're using attr.s instead then you can use attr.evolve instead of dataclasses.replace:

import attr

@attr.s(frozen=True)
class Foo:
    x = attr.ib()
    y = attr.ib()

foo = Foo(1, 2)
bar = attr.evolve(foo, y=3)

Comments

-1

dataclass is just syntactic sugar for the automatic creation of a special __init__ method and a host of other "boilerplate" methods based on type-annotated attributes.

Once the class is created, it is like any other, and its attributes can be overwritten and instances can be copied, e.g.

import copy

v_ = copy.deepcopy(v)
v_.x = -1

Depending on what the attributes are, you may only require copy.copy.
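Two caveats with the copy-then-assign approach, sketched below with made-up classes (Rect and FrozenV are assumptions for illustration). First, copying does not call __init__ or __post_init__, so a field derived in __post_init__ goes stale after the assignment. Second, assignment on a frozen dataclass raises FrozenInstanceError. dataclasses.replace avoids both because it constructs a fresh instance.

```python
import copy
from dataclasses import FrozenInstanceError, dataclass, field, replace


@dataclass
class Rect:  # hypothetical class with a derived field
    w: int
    h: int
    area: int = field(init=False)

    def __post_init__(self):
        self.area = self.w * self.h


r = Rect(2, 3)

stale = copy.deepcopy(r)
stale.w = 10                 # area is NOT recomputed
print(stale.area)            # 6

fresh = replace(r, w=10)     # __init__ and __post_init__ run again
print(fresh.area)            # 30


@dataclass(frozen=True)
class FrozenV:  # hypothetical frozen counterpart of V
    x: int


f = FrozenV(1)
g = copy.copy(f)
assignment_failed = False
try:
    g.x = -1                 # frozen instances reject attribute assignment
except FrozenInstanceError:
    assignment_failed = True

h = replace(f, x=-1)         # works: builds a new frozen instance
print(assignment_failed, h)  # True FrozenV(x=-1)
```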

4 Comments

–1 It is incorrect to use copy/deepcopy for field replacement on dataclasses. In some complex use cases (e.g. __init__/__post_init__ hooks), data may not be handled correctly. The better way is to use the dataclasses.replace() function.
@wim revisiting this a bit later I think my disagreement about replace is even stronger after having dealt with this feature in production systems for a while. I added some comments to your answer for a different take. I totally respect your POV is different, but I wanted to highlight a dissenting opinion because some users may feel like I do, and it could inform them on ways to use convention based restrictions of dataclass that allow for avoiding the bad code smell of replace.
The suggested approach of making a copy and then setting attributes does not work at all in the case of frozen dataclasses, which are pretty common when you want hashable instances that can be stored inside sets or used as dictionary keys.
A frozen dataclass in Python is just a fundamentally confused concept. It could still have mutable attributes like lists and so on. Using such a thing for dict keys is a hugely bad idea.
-2
import dataclasses
from dataclasses import dataclass

@dataclass()
class Point:
    # Use dataclasses.field(), not the Field class, and pass either default
    # or default_factory, never both (that raises ValueError).
    x: float = dataclasses.field(default=0.0, repr=True, init=True, hash=True, compare=True,
                                 metadata={'x_axis': "X Axis", 'ext_name': "Point X Axis"})
    y: float = dataclasses.field(default=0.0, repr=True, init=True, hash=True, compare=True,
                                 metadata={'y_axis': "Y Axis", 'ext_name': "Point Y Axis"})

Point1 = Point(13.5, 455.25)
Point2 = dataclasses.replace(Point1, y=255.25)

print(Point1, Point2)

1 Comment

Welcome to StackOverflow! Can you add some text to your answer to explain how it solves the problem, and maybe also point out how it adds to the other answers already provided?
