I have some configuration data in the format below. What is the best way to parse this data in Python? I looked at the csv module and, briefly, this module, but couldn't figure out how to use them. The existing parser is hacked together in Perl.

|------------+-----------------+--------|
| ColHead1   | Col_______Head2 | CH3    |
|------------+-----------------+--------|
| abcdefg000 | *               | somev1 |
| abcdefg001 | *               | somev2 |
| abcdefg002 | *               |        |
| abcdefg003 | *               |        |
| abcdefg004 | *               |        |
| abcdefg005 | *               |        |
| abcdefg006 | *               |        |
| abcdefg007 | *               |        |
| abcdefg008 | *               |        |
| abcdefg009 | *               |        |
| abcdefg010 | *               |        |
|------------+-----------------+--------|

3 Answers

You can try something like this:

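def parse(ascii_table):
    header = []
    data = []
    # filter(None, ...) drops blank lines
    for line in filter(None, ascii_table.split('\n')):
        # skip the |----+----| separator rows
        if '-+-' in line:
            continue
        # split on whitespace and drop the '|' delimiters;
        # a list (not a lazy filter object) is needed for len() on Python 3
        cells = [x for x in line.split() if x != '|']
        if not header:
            header = cells
            continue
        # empty cells vanish in the split, so pad the row to the header width
        row = [''] * len(header)
        for i, cell in enumerate(cells):
            row[i] = cell
        data.append(row)
    return header, data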
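
One caveat with the whitespace split above, shown as a small sketch: padding works when the empty cells are at the end of a row (as in the sample table), but an empty cell in the middle would shift later values to the left. The helper name `split_row` and the second input row are made up for illustration; that row does not appear in the question's data.

```python
def split_row(row, n_columns):
    """Whitespace-split one table row, drop '|' tokens, pad to n_columns."""
    tokens = [t for t in row.split() if t != '|']
    return tokens + [''] * (n_columns - len(tokens))

# Row taken from the sample table: the empty cell is last, so padding is right.
trailing_empty = split_row("| abcdefg002 | *               |        |", 3)

# Hypothetical row with an empty middle cell: 'somev1' slides into column 2.
middle_empty = split_row("| abcdefg002 |                 | somev1 |", 3)

print(trailing_empty)
print(middle_empty)
```

If empty cells can occur anywhere, splitting on '|' rather than on whitespace avoids the shift.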

Here is another (similar) way to do it if the table is in a file:

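with open(filepath) as f:
    for line in f:
        # drop the trailing newline first, otherwise the last '|' is not stripped
        line = line.strip()
        # skip blank lines, separator rows and the header row
        if not line or '-+-' in line or 'Head' in line:
            continue
        # strip '|' off the ends then split on '|'
        c1, c2, c3 = line.strip('|').split('|')
        print('Col1: {}\tCol2: {}\tCol3: {}'.format(c1, c2, c3))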

Or, if it is in a string variable:

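for line in ascii_table.split('\n'):
    line = line.strip()
    # skip blank lines, separator rows and the header row
    if not line or '-+-' in line or 'Head' in line:
        continue
    # strip '|' off the ends then split on '|'
    c1, c2, c3 = line.strip('|').split('|')
    print('Col1: {}\tCol2: {}\tCol3: {}'.format(c1, c2, c3))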
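
Since the question mentions the csv module: the same split-on-'|' idea can probably be expressed with `csv.reader` and `delimiter='|'`. This is only a sketch under a few assumptions: every row starts and ends with '|', separator rows contain '-+-', and cells never contain a literal '|'. The function name `read_pipe_table` is made up here, and the sample is a shortened copy of the table from the question.

```python
import csv

def read_pipe_table(lines):
    """Return (header, rows) from an iterable of table lines."""
    # Drop blank lines and |----+----| separator rows before csv sees them.
    content = (ln for ln in lines if ln.strip() and '-+-' not in ln)
    reader = csv.reader(content, delimiter='|', quoting=csv.QUOTE_NONE)
    # The leading and trailing '|' produce an extra field at each end,
    # so keep fields[1:-1] and strip the padding around each cell.
    table = [[cell.strip() for cell in fields[1:-1]] for fields in reader]
    if not table:
        return [], []
    return table[0], table[1:]

sample = """\
|------------+-----------------+--------|
| ColHead1   | Col_______Head2 | CH3    |
|------------+-----------------+--------|
| abcdefg000 | *               | somev1 |
| abcdefg002 | *               |        |
|------------+-----------------+--------|
"""

header, rows = read_pipe_table(sample.splitlines())
print(header)
print(rows)
```

Because `csv.reader` accepts any iterable of strings, an open file object can be passed instead of `sample.splitlines()`.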

Following the example from @mguijarr, the code below keeps empty cells in the matrix and strips the surrounding spaces from each cell.

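def parseAsciiTable(ascii_table: str):
  header = []
  data = []
  for line in ascii_table.split('\n'):
    line = line.strip()
    # skip blank lines and the |----+----| separator rows
    if not line or '-+-' in line: continue
    # drop the outer '|' borders, then split on the inner ones;
    # splitting on '|' (not on whitespace) keeps empty cells in place
    cells = line.strip('|').split('|')
    striped_cells = [c.strip() for c in cells]
    if not header:
      header = striped_cells
      continue
    data.append(striped_cells)

  return header, data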
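
If rows keyed by column name are more convenient than a header plus a matrix, the same approach can be adapted to build one dict per row. A minimal sketch, assuming rows are wrapped in '|' and separator rows contain '-+-'; the name `parse_table_as_dicts` is made up for illustration.

```python
def parse_table_as_dicts(ascii_table):
    """Return one {column name: cell} dict per data row."""
    header = None
    records = []
    for line in ascii_table.splitlines():
        line = line.strip()
        # Skip blank lines and |----+----| separator rows.
        if not line or '-+-' in line:
            continue
        # Drop the outer borders, split on the inner '|', trim each cell.
        cells = [c.strip() for c in line.strip('|').split('|')]
        if header is None:
            header = cells  # the first content row holds the column names
            continue
        records.append(dict(zip(header, cells)))
    return records

sample = """\
|------------+-----------------+--------|
| ColHead1   | Col_______Head2 | CH3    |
|------------+-----------------+--------|
| abcdefg000 | *               | somev1 |
| abcdefg002 | *               |        |
|------------+-----------------+--------|
"""

records = parse_table_as_dicts(sample)
print(records[0]['CH3'])
```

Note that `zip` silently truncates a row that has more or fewer cells than the header, so malformed rows would need a separate check.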
