Let's say I have a dataset ('test.csv') like so:

Name,Fruit,Price
John,Apple,1.00
Steve,Apple,1.00
John,Mango,2.00
Adam,Apple,1.00
Steve,Banana,1.00

Although there are several easier ways to do this, I would like to organize this information as a class in Python. Ideally, an instance of the class would look like:

{'name': 'John', 'Fruits': ['Apple','Mango'], 'Price':[1.00, 2.00]}

My approach to loading the dataset into a class is to store each instance in a list.

class org(object):
    def __init__(self, name, fruit, price):
        self.name = name
        self.fruit = [fruit]
        self.price = [price]

# Module level, not inside the class body: org must be fully defined before it is used.
classes = []
with open('test.csv') as f:
    for line in f:
        if not 'Name' in line:  # skip the header row
            linesp = line.rstrip().split(',')
            name = linesp[0]
            fruit = linesp[1]
            price = linesp[2]
            inst = org(name, fruit, price)
            classes.append(inst)
for c in classes:
    print(c.__dict__)

  1. In this case, how do I know if 'John' already exists as an instance?

  2. If so, how do I update 'John'? With a classmethod?

@classmethod
def update(cls, value):
    cls.fruit.append(fruit)
  • It's not exactly what you're asking, but it can be adapted to your needs: "Creating a singleton". In your case you don't exactly want "only one instance" (a singleton), but you do want "only one instance of each type", so see if those ideas help. Commented Jun 15, 2018 at 6:07
  • Are you open to using a dictionary instead of a list? Commented Jun 15, 2018 at 6:32
  • You really need to fix your indentation. The code you posted is extremely difficult to read, and not valid Python. Commented Jun 15, 2018 at 6:50

1 Answer

There's no need for anything special to update your instances. Your class's attributes are public, so just access them directly to update them.

If you insist on using a list as your instance container, you could do something like this:

classes = []
with open('test.csv') as f:
    for line in f:
        if not 'Name' in line:
            name,fruit,price=line.rstrip().split(',')
            exists = [inst for inst in classes if inst.name == name]
            if exists:
                exists[0].fruit.append(fruit)
                exists[0].price.append(price)
            else:
                classes.append(org(name,fruit,price))
for c in classes:
    print (c.__dict__)

However, I recommend using a dict keyed by name instead, because it makes looking up and accessing the instances easier:

classes = {}
with open('test.csv') as f:
    for line in f:
        if not 'Name' in line:
            name,fruit,price=line.rstrip().split(',')
            if name in classes:
                classes.get(name).fruit.append(fruit)
                classes.get(name).price.append(price)
            else:
                classes.update({name: org(name,fruit,price)})

for c in classes.values():
    print (c.__dict__)

Both solutions will give you the same thing:

{'name': 'John', 'fruit': ['Apple', 'Mango'], 'price': ['1.00', '2.00']}
{'name': 'Steve', 'fruit': ['Apple', 'Banana'], 'price': ['1.00', '1.00']}
{'name': 'Adam', 'fruit': ['Apple'], 'price': ['1.00']}
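
Note that the prices above are strings, since nothing converts the text read from the file, whereas the question asked for numbers. A minimal sketch of one way to get floats, assuming the standard-library csv module is acceptable: csv.DictReader consumes the header row as field names, so the substring test on 'Name' is no longer needed. The SAMPLE string and the load helper are illustrative stand-ins (not part of the original code) so that the snippet runs without test.csv on disk.

```python
import csv
import io

# Illustrative stand-in for open('test.csv'); same rows as the sample dataset.
SAMPLE = """Name,Fruit,Price
John,Apple,1.00
Steve,Apple,1.00
John,Mango,2.00
Adam,Apple,1.00
Steve,Banana,1.00
"""

class org(object):
    def __init__(self, name, fruit, price):
        self.name = name
        self.fruit = [fruit]
        self.price = [price]

def load(f):
    """Group rows by name, converting each price to float (hypothetical helper)."""
    classes = {}
    for row in csv.DictReader(f):  # the header row becomes the dict keys
        name, fruit, price = row['Name'], row['Fruit'], float(row['Price'])
        if name in classes:
            inst = classes[name]
            inst.fruit.append(fruit)
            inst.price.append(price)
        else:
            classes[name] = org(name, fruit, price)
    return classes

classes = load(io.StringIO(SAMPLE))
for c in classes.values():
    print(c.__dict__)
```

With a real file, load(open('test.csv', newline='')) inside a with block should behave the same way; John's prices then come out as [1.0, 2.0] rather than ['1.00', '2.00'].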

For the sake of completeness: what @MadPhysicist probably means in the comments below by a clunky way to update the dict is that I use the dict's methods instead of accessing the items by subscription, like this:

# update existing instance in the dict
classes[name].fruit.append(fruit)

# add new instance to the dict
classes[name] = org(name, fruit, price)

I personally just find that somewhat ugly, hence I tend to use the methods :)

5 Comments

The idea to use a dict is correct. +1 for that. The way you update the dict is very clunky.
@MadPhysicist Care to elaborate why you consider it clunky? :)
If you're going to check for containment, use direct indexing instead of get to get the item. Also, it probably won't matter here, but it's much more efficient to retrieve the reference once and do two things to it than to look it up every time you want to do something to it. You may want to take a look at dict.setdefault, but you'll need a constructor to make an empty instance.
If you're going for efficiency, using a defaultdict in the first place, instead of dict.setdefault, is supposed to be faster. But, as you said, it comes with the downside that the constructor would need changes to accept instantiation without arguments. If you wanted to keep the class the way it is, I'd argue that doing exists = classes.get(name), updating exists if it is not None and otherwise adding a new instance to the dict would be better, because you keep the maximum number of lookups at 1.
Also, why classes.update({name: org(name,fruit,price)}) instead of just classes[name] = org(...)?
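
The single-lookup variant described in the comments above could be sketched roughly as follows. This is only an illustration under stated assumptions: rows is a hypothetical stand-in for the lines parsed from test.csv, and org is the class from the question, unchanged.

```python
class org(object):
    def __init__(self, name, fruit, price):
        self.name = name
        self.fruit = [fruit]
        self.price = [price]

# Hypothetical stand-in for the rows parsed from test.csv.
rows = [
    ('John', 'Apple', '1.00'),
    ('Steve', 'Apple', '1.00'),
    ('John', 'Mango', '2.00'),
    ('Adam', 'Apple', '1.00'),
    ('Steve', 'Banana', '1.00'),
]

classes = {}
for name, fruit, price in rows:
    inst = classes.get(name)      # one lookup; None if the name is new
    if inst is None:
        classes[name] = org(name, fruit, price)  # plain subscription to add
    else:
        inst.fruit.append(fruit)  # reuse the same reference for both updates
        inst.price.append(price)

for c in classes.values():
    print(c.__dict__)
```

Testing against None rather than relying on truthiness keeps the check correct even if the class later defines __len__ or __bool__.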
