3

I can't seem to find any docs on the parse() method for strings. Is there a good reference? I want to parse the following:

frame 0 rows {3 2 3 3 3 3 2 3 2 3 3 3 2 3 2 3 4 3 3 4 3 2 2 3 3 3 2 2 2 2 3 3 3 2 3 2 3 3 3 3 4 3 4 3 3 3 3 4 3 2 3 3 3 3 2 2 2 4 4 3 3 3 3 3 4 4 4 3 2 4 3 4 3 3 3 4 3 3 4 3 3 4 4 3 3 3 4 4 3 4 3 3 3 3 3 4} columns {2 3 2 3 3 3 4 3 3 2 3 2 2 2 3 2 3 3 2 2 2 3 3 3 3 2 3 3 3 2 3 3 2 2 2 3 3 4 3 3 3 3 3 3 3 3 2 3 3 3 3 4 3 2 3 2 3 3 3 3 3 2 2 3 3 3 3 2 3 3 3 3 3 3 3 3 3 4 3 3 3 3 3 4 3 3 4 3 4 4 4 3 4 4 4 4 4 4 3 3 4 4 3 4 4 4 4 3 3 3 4 4 3 4 4 3 3 4 3 5 5 5 5 4 5 4 4 4}

into two lists of int.

1
  • That's not what string.Formatter.parse() is designed for (it's used internally as part of string formatting with format()). Take a look at the re regular expressions library instead. Commented Mar 14, 2011 at 2:09

5 Answers

5

Python strings' parse() won't help you here (it has a very obscure use). In this case, I'd do it the obvious way: with regexes! If s is your string above:

import re
lists = [
    [int(i) for i in match.split()]
    for match in re.findall(r'{(.*?)}', s)
]

print(lists)
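
If it helps to keep track of which list is which, the same regex idea can also capture the word in front of each brace group. This is only a minimal sketch, assuming every group is preceded by a single label such as rows or columns; the function name parse_groups and the shortened sample string are made up for illustration:

```python
import re

def parse_groups(text):
    """Map each label that precedes a {...} group to its list of ints."""
    # (\w+) captures the label, ([^}]*) captures everything up to the closing brace.
    pairs = re.findall(r'(\w+)\s*\{([^}]*)\}', text)
    return {label: [int(tok) for tok in body.split()] for label, body in pairs}

# Shortened, hypothetical sample in the same shape as the question's string.
sample = "frame 0 rows {3 2 3} columns {2 3 2 3}"
groups = parse_groups(sample)
print(groups["rows"])     # [3, 2, 3]
print(groups["columns"])  # [2, 3, 2, 3]
```

The "frame 0" part is skipped because it isn't followed by a brace group, so only the two labelled lists end up in the dict.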

1
>>> a="frame 0 rows {3 2 3 3 3 3 2 3 2 3 3 3 2 3 2 3 4 3 3 4 3 2 2 3 3 3 2 2 2 2 3 3 3 2 3 2 3 3 3 3 4 3 4 3 3 3 3 4 3 2 3 3 3 3 2 2 2 4 4 3 3 3 3 3 4 4 4 3 2 4 3 4 3 3 3 4 3 3 4 3 3 4 4 3 3 3 4 4 3 4 3 3 3 3 3 4} columns {2 3 2 3 3 3 4 3 3 2 3 2 2 2 3 2 3 3 2 2 2 3 3 3 3 2 3 3 3 2 3 3 2 2 2 3 3 4 3 3 3 3 3 3 3 3 2 3 3 3 3 4 3 2 3 2 3 3 3 3 3 2 2 3 3 3 3 2 3 3 3 3 3 3 3 3 3 4 3 3 3 3 3 4 3 3 4 3 4 4 4 3 4 4 4 4 4 4 3 3 4 4 3 4 4 4 4 3 3 3 4 4 3 4 4 3 3 4 3 5 5 5 5 4 5 4 4 4}"

>>> import ast
>>> import re
>>> for match in re.finditer(r"\{([\d ]+)\}", a):
...     integers = match.group(1)
...     # "3 2 3" becomes "3,2,3", which literal_eval reads as a tuple of ints
...     l = ast.literal_eval(integers.replace(" ", ","))
...     print(l)


(3, 2, 3, 3, 3, 3, 2, 3, 2, 3, 3, 3, 2, 3, 2, 3, 4, 3, 3, 4, 3, 2, 2, 3, 3, 3, 2, 2, 2, 2, 3, 3, 3, 2, 3, 2, 3, 3, 3, 3, 4, 3, 4, 3, 3, 3, 3, 4, 3, 2, 3, 3, 3, 3, 2, 2, 2, 4, 4, 3, 3, 3, 3, 3, 4, 4, 4, 3, 2, 4, 3, 4, 3, 3, 3, 4, 3, 3, 4, 3, 3, 4, 4, 3, 3, 3, 4, 4, 3, 4, 3, 3, 3, 3, 3, 4)
(2, 3, 2, 3, 3, 3, 4, 3, 3, 2, 3, 2, 2, 2, 3, 2, 3, 3, 2, 2, 2, 3, 3, 3, 3, 2, 3, 3, 3, 2, 3, 3, 2, 2, 2, 3, 3, 4, 3, 3, 3, 3, 3, 3, 3, 3, 2, 3, 3, 3, 3, 4, 3, 2, 3, 2, 3, 3, 3, 3, 3, 2, 2, 3, 3, 3, 3, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 3, 3, 3, 3, 3, 4, 3, 3, 4, 3, 4, 4, 4, 3, 4, 4, 4, 4, 4, 4, 3, 3, 4, 4, 3, 4, 4, 4, 4, 3, 3, 3, 4, 4, 3, 4, 4, 3, 3, 4, 3, 5, 5, 5, 5, 4, 5, 4, 4, 4)

I have never heard of a parse method that actually parses a string in the way you ask. However, parsing that string is not that hard; the session above shows one way to do it.

3 Comments

Why are you using ast here?
@Senthil: There is no good reason to use it in this particular case. Just calling int() like in Jesse's answer is probably more efficient and the right way to do it.
@funktu: Thanks for the answer anyway, as I am still not familiar with ast, and it's good to see some examples of its use.
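
For anyone else unfamiliar with ast, here is a small sketch comparing the two approaches discussed in these comments. ast.literal_eval only evaluates Python literals (numbers, strings, tuples, lists, dicts and so on), which makes it a safe alternative to eval(), but for plain space-separated integers, calling int() on each token does the same job more directly. The sample string is a shortened stand-in for the real data:

```python
import ast

text = "3 2 3 4"

# literal_eval approach: turn "3 2 3 4" into "3,2,3,4", which is a tuple literal.
via_ast = ast.literal_eval(text.replace(" ", ","))

# int() approach: split on whitespace and convert each token.
via_int = [int(tok) for tok in text.split()]

print(via_ast)  # (3, 2, 3, 4)
print(via_int)  # [3, 2, 3, 4]
print(list(via_ast) == via_int)  # True

# literal_eval rejects anything that is not a plain literal.
try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError as exc:
    print("rejected:", type(exc).__name__)
```

One caveat with the replace(" ", ",") trick: a group holding a single number would come back as an int rather than a one-element tuple, and doubled spaces would produce an invalid literal. The split-and-int() version handles both cases.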
1

For such nicely structured data, pyparsing may be more than you need, but it makes for a good tutorial example:

from pyparsing import *

s = "frame 0 rows {3 2 3 3 3 3 2 3 2 3 3 3 2 3 2 3 4 3 3 4 3 2 2 3 3 3 2 2 2 2 3 3 3 2 3 2 3 3 3 3 4 3 4 3 3 3 3 4 3 2 3 3 3 3 2 2 2 4 4 3 3 3 3 3 4 4 4 3 2 4 3 4 3 3 3 4 3 3 4 3 3 4 4 3 3 3 4 4 3 4 3 3 3 3 3 4} columns {2 3 2 3 3 3 4 3 3 2 3 2 2 2 3 2 3 3 2 2 2 3 3 3 3 2 3 3 3 2 3 3 2 2 2 3 3 4 3 3 3 3 3 3 3 3 2 3 3 3 3 4 3 2 3 2 3 3 3 3 3 2 2 3 3 3 3 2 3 3 3 3 3 3 3 3 3 4 3 3 3 3 3 4 3 3 4 3 4 4 4 3 4 4 4 4 4 4 3 3 4 4 3 4 4 4 4 3 3 3 4 4 3 4 4 3 3 4 3 5 5 5 5 4 5 4 4 4}"

LBRACE,RBRACE = map(Suppress,"{}")
integer = Word(nums).setParseAction(lambda t:int(t[0]))

line = ("frame" + integer("frame") + 
        "rows" + LBRACE + ZeroOrMore(integer)("rows") + RBRACE + 
        "columns" + LBRACE + ZeroOrMore(integer)("columns") + RBRACE )

data = line.parseString(s)
print(data.frame)
print(data.rows[:10])
print(data.columns[:10])

prints:

0
[3, 2, 3, 3, 3, 3, 2, 3, 2, 3]
[2, 3, 2, 3, 3, 3, 4, 3, 3, 2]


0

Type pydoc -p 5000 at the command line, then go to http://localhost:5000/string.html#Formatter-parse
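
To make it concrete what that method does: string.Formatter.parse() splits a format string (the kind passed to format()) into literal text and replacement fields; it is not a general-purpose data parser. A small sketch, using a made-up format string:

```python
import string

# Hypothetical format string, chosen only to show both kinds of field.
fmt = "frame {0} rows {rows!r:>10}"

# Each item is a tuple (literal_text, field_name, format_spec, conversion).
for piece in string.Formatter().parse(fmt):
    print(piece)
# ('frame ', '0', '', None)
# (' rows ', 'rows', '>10', 'r')
```

So the braces in the question's data would be treated as replacement fields, which is why this method isn't the right tool for extracting the numbers.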


0
m_string = "frame 0 rows {3 2 3 3 3 3 2 3 2 3 3 3 2 3 2 3 4 3 3 4 3 2 2 3 3 3 2 2 2 2 3 3 3 2 3 2 3 3 3 3 4 3 4 3 3 3 3 4 3 2 3 3 3 3 2 2 2 4 4 3 3 3 3 3 4 4 4 3 2 4 3 4 3 3 3 4 3 3 4 3 3 4 4 3 3 3 4 4 3 4 3 3 3 3 3 4} columns {2 3 2 3 3 3 4 3 3 2 3 2 2 2 3 2 3 3 2 2 2 3 3 3 3 2 3 3 3 2 3 3 2 2 2 3 3 4 3 3 3 3 3 3 3 3 2 3 3 3 3 4 3 2 3 2 3 3 3 3 3 2 2 3 3 3 3 2 3 3 3 3 3 3 3 3 3 4 3 3 3 3 3 4 3 3 4 3 4 4 4 3 4 4 4 4 4 4 3 3 4 4 3 4 4 4 4 3 3 3 4 4 3 4 4 3 3 4 3 5 5 5 5 4 5 4 4 4}"

import re

# Raw string for the pattern; split() with no argument also tolerates repeated spaces.
print([[int(i) for i in x.split()] for x in re.findall(r"\{([\d ]+)\}", m_string)])

results in:

[[3, 2, 3, 3, 3, 3, 2, 3, 2, 3, 3, 3, 2, 3, 2, 3, 4, 3, 3, 4, 3, 2, 2, 3, 3, 3, 2, 2, 2, 2, 3, 3, 3, 2, 3, 2, 3, 3, 3, 3, 4, 3, 4, 3, 3, 3, 3, 4, 3, 2, 3, 3, 3, 3, 2, 2, 2, 4, 4, 3, 3, 3, 3, 3, 4, 4, 4, 3, 2, 4, 3, 4, 3, 3, 3, 4, 3, 3, 4, 3, 3, 4, 4, 3, 3, 3, 4, 4, 3, 4, 3, 3, 3, 3, 3, 4], [2, 3, 2, 3, 3, 3, 4, 3, 3, 2, 3, 2, 2, 2, 3, 2, 3, 3, 2, 2, 2, 3, 3, 3, 3, 2, 3, 3, 3, 2, 3, 3, 2, 2, 2, 3, 3, 4, 3, 3, 3, 3, 3, 3, 3, 3, 2, 3, 3, 3, 3, 4, 3, 2, 3, 2, 3, 3, 3, 3, 3, 2, 2, 3, 3, 3, 3, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 3, 3, 3, 3, 3, 4, 3, 3, 4, 3, 4, 4, 4, 3, 4, 4, 4, 4, 4, 4, 3, 3, 4, 4, 3, 4, 4, 4, 4, 3, 3, 3, 4, 4, 3, 4, 4, 3, 3, 4, 3, 5, 5, 5, 5, 4, 5, 4, 4, 4]]
