I have a string that I am getting from a command line application. It has the following structure:

-- section1 --
item11|value11
item12|value12
item13

-- section2 --
item21|value21
item22

What I would like is to parse this into a dict so that I can easily access the values with:

d['section1']['item11']

I have already solved it for the case where there are no sections and every key has a value, but I get errors otherwise. I have tried a couple of things, but it is getting complicated and nothing seems to work. This is what I have now:

s="""
item11|value11
item12|value12
item21|value21
"""
d = {}
for l in s.split('\n'):
    print(l, l.split('|'))
    if l != '':
        d[l.split('|')[0]] = l.split('|')[1]

Can somebody help me extend this for the section case and when no values are present?

  • What is the question exactly? Commented Jan 12, 2015 at 19:18
  • This works fine in this case. What do you expect it to do? Commented Jan 12, 2015 at 19:19
  • ...parsing the text file into nested dicts for sections? Commented Jan 12, 2015 at 19:19
  • Can I assume section headers will always appear? If not, what do you want to do in that case: just set them (keys, values) as root elements? Commented Jan 12, 2015 at 19:22
  • yes, there are always section headers. thanks. Commented Jan 12, 2015 at 19:23

2 Answers

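Seems like a perfect fit for the ConfigParser class in the standard library's configparser module (Python 3), where s is the string from the question:

import re
from configparser import ConfigParser

d = ConfigParser(delimiters=('|',), allow_no_value=True)
d.SECTCRE = re.compile(r"-- *(?P<header>[^]]+?) *--")  # sections regex
d.read_string(s)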

Now you have an object that you can access like a dictionary:

>>> d['section1']['item11']
'value11'
>>> print(d['section2']['item22'])   # no value case
None
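
As a self-contained sketch of the approach above, assuming Python 3 and the sample input from the question, the snippets can be combined and the result copied into plain nested dicts. The names `parser` and `d` are only illustrative, and copying into a dict is an extra step not in the original answer:

```python
import re
from configparser import ConfigParser

# Sample input copied from the question.
s = """
-- section1 --
item11|value11
item12|value12
item13

-- section2 --
item21|value21
item22
"""

# "|" separates keys from values; keys without a value are allowed.
parser = ConfigParser(delimiters=("|",), allow_no_value=True)
# Recognise "-- name --" as a section header instead of "[name]".
parser.SECTCRE = re.compile(r"-- *(?P<header>[^]]+?) *--")
parser.read_string(s)

# Optional: copy the parser contents into ordinary nested dicts.
d = {name: dict(parser[name]) for name in parser.sections()}
```

Note that ConfigParser lowercases keys by default, which makes no difference for the sample data but would for mixed-case item names.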
1 Comment

Damn, this is freaking awesome! Kilometers above my regex+iteration manual approach.
Regular expressions are a good fit for this:

import re


def parse(data):
    lines = data.split("\n") #split input into lines
    result = {}
    current_header = ""

    for line in lines:
        if line: #if the line isn't empty
            #tries to match anything between double dashes:
            match = re.match(r"^-- (.*) --$", line)
            if match: #true when the above pattern matches
                #grabs the part inside parentheses:
                current_header = match.group(1)
            else:
                #key = text before "|", value = text after it (None if there is no "|"):
                key, sep, value = line.partition("|")
                if not sep:
                    value = None
                #tries to get the section, defaults to empty section:
                section = result.get(current_header, {})
                section[key] = value #adds data to section
                result[current_header] = section #updates section into result
    return result #done.

print(parse("""
-- section1 --
item1|value1
item2|value2
-- section2 --
item1|valueA
item2|valueB"""))
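
For the exact input in the question, which includes keys without a value, a minimal Python 3 sketch of the same loop might look like the following. The name `parse_sections` is hypothetical, and mapping a missing value to `None` is an assumption, since the question does not say what it expects there. Lines that appear before any header are skipped, relying on the asker's comment that section headers are always present:

```python
import re

# Matches a header line such as "-- section1 --" and captures the name.
HEADER = re.compile(r"^--\s*(.*?)\s*--$")


def parse_sections(text):
    """Parse '-- name --' sections of 'key|value' lines into nested dicts."""
    result = {}
    section = None  # dict of the section currently being filled
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        match = HEADER.match(line)
        if match:
            # Start a new section, or reuse it if the header repeats.
            section = result.setdefault(match.group(1), {})
        elif section is not None:
            # partition never raises: sep is "" when there is no "|".
            key, sep, value = line.partition("|")
            section[key] = value if sep else None  # assumption: None for "no value"
    return result


sample = """
-- section1 --
item11|value11
item12|value12
item13

-- section2 --
item21|value21
item22
"""

parsed = parse_sections(sample)
```

Using `str.partition` rather than `split` avoids the unpacking error on lines such as `item13`, and it also keeps any additional `|` characters inside the value.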
