Better to save method result as class variable or return this variable and use as input in another method?

Question

I don't have any formal training in programming, but I routinely come across this question when I am making classes and running individual methods of that class in sequence. Which is better: saving results as class variables, or returning them and using them as inputs to subsequent method calls? For example, here is a class where the variables are returned and used as inputs:

import pandas as pd

class ProcessData:

    def __init__(self):
        pass

    def get_data(self, path):
        # Read data.csv from the given directory
        data = pd.read_csv(f"{path}/data.csv")
        return data

    def clean_data(self, data):
        # Note: both calls modify the caller's DataFrame in place
        data.set_index("timestamp", inplace=True)
        data.drop_duplicates(inplace=True)
        return data

def main():
    processor = ProcessData()
    temp = processor.get_data("path/to/data")
    processed_data = processor.clean_data(temp)

And here is an example where the results are saved/used to update the class variable:

import pandas as pd

class ProcessData:

    def __init__(self):
        self.data = None

    def get_data(self, path):
        # Read data.csv from the given directory and keep it on the instance
        data = pd.read_csv(f"{path}/data.csv")
        self.data = data

    def clean_data(self):
        # Requires get_data() to have been called first; self.data is None until then
        self.data.set_index("timestamp", inplace=True)
        self.data.drop_duplicates(inplace=True)

def main():
    processor = ProcessData()
    processor.get_data("path/to/data")
    processor.clean_data()

I have a suspicion that the latter method is better, but I could also see instances where the former might have its advantages. I am sure the answer to my question is "it depends", but I am curious in general, what are the best practices?

I'd go with the latter unless you have really good reasons for needing the former. The latter makes the code much easier to understand and maintain because you can easily track the "flow" of data and see when and where those values are expected to change. There are some cases where you may need to save the result of a function for performance reasons - in those cases you can always return the result from the function and save it to a member variable at the call site.
– 0x5453, Commented Aug 29, 2022 at 15:20
You are correct, it depends. I use both, but I lean towards passing arguments. It makes the methods more pure, like functions, and easier to reason about.
– Sergio Tulentsev, Commented Aug 29, 2022 at 15:21
Each piece of state is one more thing to keep track of and one more thing that can produce a bug. In general, the less state each part of the program has to take into account, the better. Passing state between functions by saving them as instance variables also tends to introduce non-obvious dependencies. In your second example, the caller of the class needs to know that get_data must be called before clean_data. In your first example, clean_data takes data as an argument, so the dependencies are obvious just from the function signatures.
– Samwise, Commented Aug 29, 2022 at 15:21
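
The ordering dependency described in the comment above can be shown with a minimal sketch. It is not from the original thread: plain lists stand in for the pandas DataFrame so it runs without dependencies, and `StatefulProcessor` and the module-level `clean_data` are hypothetical names chosen for illustration.

```python
class StatefulProcessor:
    """Second style from the question: results live on the instance."""

    def __init__(self):
        self.data = None

    def get_data(self, rows):
        # Stand-in for pd.read_csv (assumption): store a copy of the rows.
        self.data = list(rows)

    def clean_data(self):
        # Silently relies on get_data() having been called first.
        self.data = sorted(set(self.data))


def clean_data(rows):
    """First style: the input is an explicit argument."""
    return sorted(set(rows))


processor = StatefulProcessor()
try:
    processor.clean_data()  # get_data() was never called, so self.data is None
except TypeError as exc:
    error = exc  # set(None) raises TypeError

# The function version cannot be called without its input.
cleaned = clean_data([3, 1, 3, 2])
```

In the stateful version the mistake only surfaces at run time, whereas the signature of the plain function already tells the caller what it needs.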
Your first example doesn't need to be a class and could be implemented with plain functions. But there you have to keep track of the data. The second example misses a trick where you should be passing the path to the __init__() method to read and set the self.data member. Then you can have multiple other methods that process the data as required.
– quamrana, Commented Aug 29, 2022 at 15:24
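
One possible reading of the last comment, sketched under assumptions: the path goes to `__init__()`, which loads the data once, so no method ever sees `self.data` as `None`. The standard-library `csv` module is swapped in for pandas here so the example is self-contained, and the temporary file and its contents are made up for the demonstration.

```python
import csv
import tempfile
from pathlib import Path


class ProcessData:
    def __init__(self, path):
        # Load once at construction; csv.DictReader stands in for pd.read_csv.
        with open(Path(path) / "data.csv", newline="") as f:
            self.data = list(csv.DictReader(f))

    def clean_data(self):
        # Drop exact duplicate rows, keeping the first occurrence of each.
        seen = set()
        unique = []
        for row in self.data:
            key = tuple(row.items())
            if key not in seen:
                seen.add(key)
                unique.append(row)
        self.data = unique
        return self.data


# Usage with a throwaway directory containing a small data.csv
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "data.csv").write_text("timestamp,value\n1,a\n1,a\n2,b\n")
    processor = ProcessData(tmp)
    cleaned = processor.clean_data()
```

With this layout the "call get_data first" rule disappears, because an instance cannot exist without its data.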

SargeATM · Accepted Answer · 2022-08-29 16:35:06Z


Sketch the class based on usage, then create it

Instead of inventing classes to make your high level coding easier, tap your heels together and write the high-level code as if the classes already existed. Then create the classes with the methods and behavior that exactly fits what you need.
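
As a rough illustration of this workflow (a hypothetical sketch, not code from the answer), the call site is written first and the class is then shaped to fit it. The names `Dataset`, `deduplicated`, and `sorted_by` are invented for the example, and rows are plain dictionaries rather than a DataFrame.

```python
# Step 1: write the high-level code first, as if Dataset already existed.
def main(raw_rows):
    dataset = Dataset(raw_rows)
    return dataset.deduplicated().sorted_by("timestamp").rows


# Step 2: create the class with exactly the methods the call site asked for.
class Dataset:
    def __init__(self, rows):
        self.rows = list(rows)

    def deduplicated(self):
        # Keep the first occurrence of each row, preserving order.
        unique = []
        for row in self.rows:
            if row not in unique:
                unique.append(row)
        return Dataset(unique)

    def sorted_by(self, key):
        # Return a new Dataset instead of mutating this one.
        return Dataset(sorted(self.rows, key=lambda row: row[key]))


result = main([
    {"timestamp": 2, "value": "b"},
    {"timestamp": 1, "value": "a"},
    {"timestamp": 2, "value": "b"},
])
```

Here the usage happened to favor returning new objects over stored state; a different call site could just as well lead to the other design.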

PEP AS AN EXAMPLE

If you look at several PEPs, you'll notice that the rationale or motivation is given before the details. The rationale and motivation show how the new Python feature is going to solve a problem and how it is going to be used, sometimes with code examples.

Example from PEP 289 – Generator Expressions:

Generator expressions are especially useful with functions like sum(), min(), and max() that reduce an iterable input to a single value:
max(len(line)  for line in file  if line.strip())
Generator expressions also address some examples of functionals coded with lambda:
reduce(lambda s, a: s + a.myattr, data, 0)
reduce(lambda s, a: s + a[3], data, 0)
These simplify to:
sum(a.myattr for a in data)
sum(a[3] for a in data)

The methodology I gave above amounts to describing the motivation and rationale for a class in terms of its use, because you write the code that is actually going to use the class first.

edited Aug 29, 2022 at 16:35

answered Aug 29, 2022 at 15:25


4 Comments

SargeATM Over a year ago

Wow, those are the fastest votes, up or down, I have ever received on Stack Overflow. Either people feel my answer is completely wrong or too vague to be helpful. Constructive feedback would be helpful. Thanks.

Fruity Fritz Over a year ago

I think I see what you mean. So your answer is essentially an "it depends" because I should be creating classes, functions, etc. after "sketching" out the structure based on what the particular project calls for -- is that correct?

SargeATM Over a year ago

@FruityFritz that's exactly what I mean.

SargeATM Over a year ago

@FruityFritz I updated the heading to use your terminology. It more closely represents what I'm describing.
