I am new to Python and need some help with a string I have that looks like this:

string='Starters\nSalad with Greens 14.00\nSalad Goat Cheese 12.75\nMains\nPizza 12.75\nPasta 12.75\n'

and need to transform it into a table that looks more like this:

Category   Dish                Price
Starters   Salad with Greens   14.00
Starters   Salad Goat Cheese   12.75
Mains      Pizza               12.75
Mains      Pasta               12.75

What would be the best way to achieve this?

I tried string.rsplit(" ", 2) but couldn't figure out how to apply it to each line, and I have no idea how to repeat the category headers in a separate column. Any help will be much appreciated.

Thanks in advance!

  • Where does this string come from? Seems like an X/Y problem. Commented Jan 14, 2018 at 20:36
  • It's ambiguous how you want to treat categories, since they do not appear with each dish. Do you want to assume all entries are Starters until Mains appears? Commented Jan 14, 2018 at 20:38
  • I assume you are creating this string yourself. Instead of a string, you should store these values in a list; then you can format the table with the tabulate library. See "How to create a table in python?" Commented Jan 14, 2018 at 20:41

4 Answers

You first have to decide how to tell a category from an item. I assume every item ends with its price, so this code checks whether the line contains a dot; a regular expression would probably be more robust.

s = 'Starters\nSalad with Greens 14.00\nSalad Goat Cheese 12.75\nMains\nPizza 12.75\nPasta 12.75'
items = s.split('\n')
# ['Starters', 'Salad with Greens 14.00', 'Salad Goat Cheese 12.75', 'Mains', 'Pizza 12.75', 'Pasta 12.75']

category = ''
menu = {}
for item in items:
    print(item)
    if '.' in item:
        menu[category].append(item)
    else:
        category = item
        menu[category] = []
print(menu)

# {'Starters': ['Salad with Greens 14.00', 'Salad Goat Cheese 12.75'], 'Mains': ['Pizza 12.75', 'Pasta 12.75']}

UPD: You may replace

if '.' in item:

with

if re.search(r"\d\.\d\d$", item):

(after adding import re). This matches lines that end with a price such as 1.11, which is useful if a category name contains an abbreviation with a dot.
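
To get from the dictionary to the Category / Dish / Price rows the question asks for, the same idea can be extended a little. This is only a sketch: the function name `parse_menu` and the pattern `PRICE_LINE` are made up for illustration, and it assumes that every dish line ends with a price with two decimals and that every other non-blank line is a category.

```python
import re

# Assumption: a dish line ends with a price such as "14.00";
# any other non-blank line is treated as a category heading.
PRICE_LINE = re.compile(r"^(?P<dish>.+?)\s+(?P<price>\d+\.\d{2})$")

def parse_menu(text):
    """Return a list of (category, dish, price) tuples."""
    rows = []
    category = ""  # stays empty if a dish appears before any heading
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. after a trailing newline
        match = PRICE_LINE.match(line)
        if match:
            rows.append((category, match.group("dish"), match.group("price")))
        else:
            category = line  # remember the heading for the dishes that follow
    return rows

s = 'Starters\nSalad with Greens 14.00\nSalad Goat Cheese 12.75\nMains\nPizza 12.75\nPasta 12.75\n'
for row in parse_menu(s):
    print("\t".join(row))
```

Because nothing is hardcoded, a new heading such as a dessert section would be picked up without changing the code.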

1 Comment

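You can also check item.isalpha(): if it returns False, the line contains a price. Note that this only works for single-word category names, because a space also makes isalpha() return False.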

Not that I would use it in a production environment but for the sake of academic challenge:

import re

string = """Starters
Salad with Greens 14.00
Salad Goat Cheese 12.75
Mains
Pizza 12.75
Pasta 12.75"""

rx = re.compile(r'^(Starters|Mains)', re.MULTILINE)

result = "\n".join(["{}\t{}".format(category, line)
                for parts in [[part.strip() for part in rx.split(string) if part]]
                for category, dish in zip(parts[0::2], parts[1::2])
                for line in dish.split("\n")])
print(result)

This yields

Starters    Salad with Greens 14.00
Starters    Salad Goat Cheese 12.75
Mains   Pizza 12.75
Mains   Pasta 12.75
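
The comprehension is dense, so it may help to look at its first step in isolation. Because the pattern contains a capturing group, `re.split` keeps the matched category names in its result, which is what makes the later `zip` of even and odd elements possible. A small sketch with a shortened sample string (the variable names are illustrative only):

```python
import re

text = "Starters\nSalad 14.00\nMains\nPizza 12.75"

# The parentheses form a capturing group, so re.split keeps each matched
# category name in the result list instead of discarding it.
rx = re.compile(r"^(Starters|Mains)", re.MULTILINE)
parts = rx.split(text)
print(parts)
# ['', 'Starters', '\nSalad 14.00\n', 'Mains', '\nPizza 12.75']

# Drop the empty leading piece, trim whitespace, then pair each category
# (even positions) with the block of dishes that follows it (odd positions).
cleaned = [p.strip() for p in parts if p]
pairs = list(zip(cleaned[0::2], cleaned[1::2]))
print(pairs)
# [('Starters', 'Salad 14.00'), ('Mains', 'Pizza 12.75')]
```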

You can use a class-based solution in Python 3 with operator overloading to get more convenient access to the data:

import re
import itertools
class MealPlan:
    def __init__(self, string, headers):
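        self.headers = headers
        # Alternating groups of lines: [category], [dishes], [category], [dishes], ...
        self.grouped_data = [d for c, d in [(a, list(b)) for a, b in itertools.groupby(string.split('\n'), key=lambda x:x in ['Starters', 'Mains'])]]
        # Pair each category name with the dish lines that follow it
        self.final_grouped_data = list(map(lambda x:[x[0][0], x[-1]], [self.grouped_data[i:i+2] for i in range(0, len(self.grouped_data), 2)]))
        # Split each dish line at the whitespace before the price
        self.final_data = [[[a, *list(filter(None, re.split(r'\s(?=\d)', i)))] for i in b] for a, b in self.final_grouped_data]
        # Drop rows that came from blank lines (they hold only the category)
        self.final_data = [list(filter(lambda x:len(x) > 1, i)) for i in self.final_data]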
    def __getattr__(self, column):
        if column not in self.headers:
            raise KeyError("'{}' not found".format(column))
        transposed = [dict(zip(self.headers, i)) for i in itertools.chain.from_iterable(self.final_data)]
        yield from map(lambda x:x[column], transposed)
    def __getitem__(self, row):
         new_grouped_data = {a:dict(zip(self.headers[1:], zip(*[i[1:] for i in list(b)]))) for a, b in itertools.groupby(list(itertools.chain(*self.final_data)), key=lambda x:x[0])}
         return new_grouped_data[row]
    def __repr__(self):
         return ' '.join(self.headers)+'\n'+'\n'.join('\n'.join(' '.join(c) for c in i) for i in self.final_data)

string='Starters\nSalad with Greens 14.00\nSalad Goat Cheese 12.75\nMains\nPizza 12.75\nPasta 12.75\n' 
meal = MealPlan(string, ['Category', 'Dish', 'Price'])
print(meal)
print([i for i in meal.Category])
print(meal['Starters'])

Output:

Category Dish Price
Starters Salad with Greens 14.00
Starters Salad Goat Cheese 12.75
Mains Pizza 12.75
Mains Pasta 12.75
['Starters', 'Starters', 'Mains', 'Mains']
{'Dish': ('Salad with Greens', 'Salad Goat Cheese'), 'Price': ('14.00', '12.75')}

Try this. Note: it assumes 'Starters' is listed before 'Mains'.

category = 'Starters'
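for item in string.splitlines():
    if item == 'Mains': category = 'Mains'
    # Skip blank lines (e.g. from a trailing newline) and the category headings
    if not item or item in ('Starters', 'Mains'): continue

    # Split once from the right: everything before the last space is the dish
    dish, price = item.rsplit(' ', 1)
    print('{} {} {}'.format(category, dish, price))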
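
The loop prints the fields separated by single spaces, so the columns do not line up. If aligned output is wanted, one possible sketch is to pad each column with `str.ljust`. The helper name `format_table` is hypothetical, and the rows are assumed to be (category, dish, price) tuples of strings that have already been parsed.

```python
def format_table(rows, headers=("Category", "Dish", "Price")):
    """Return the rows as text with left-aligned, space-padded columns."""
    table = [tuple(headers), *rows]
    # Width of each column = length of its longest cell, header included
    widths = [max(len(row[i]) for row in table) for i in range(len(headers))]
    lines = [
        "   ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in table
    ]
    return "\n".join(lines)

# Rows taken from the table in the question
rows = [
    ("Starters", "Salad with Greens", "14.00"),
    ("Starters", "Salad Goat Cheese", "12.75"),
    ("Mains", "Pizza", "12.75"),
    ("Mains", "Pasta", "12.75"),
]
print(format_table(rows))
```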

2 Comments

Thanks! That is exactly what I was trying to achieve!
This is not the best answer, since you would have to change the code to add a dessert category. @Diman's answer is more general, IMHO.
