Find keyword arguments from a c-style python string and text

Question

How can I find the keyword arguments that were passed to a C-style Python format string?

Given:

bill eats apple
'%(name)s eats %(fruit)s'

Should get

{ 'name': 'bill', 'fruit' : 'apple'}

OK, so there is nothing direct? I would have to use re and the like.
– garg10may, Commented Oct 4, 2017 at 6:28
I don't understand your question. Do you mean you want to try to only use modules that ship with Python? If so, why? Would you consider it acceptable to copy/paste the code from the parse module? (If not, why not? If so, why wouldn't you just use the module?)
– user94559, Commented Oct 4, 2017 at 6:30
I am very happy with the module you suggested. At first it seemed a simple task to me, so I wondered whether it is really that hard and whether I couldn't come up with a line or two of code myself. As I understand it, it's not that simple or direct, given that strings can grow complex.
– garg10may, Commented Oct 4, 2017 at 6:36
I am not able to do it with the parse module either; it only seems to work with {}-style parameters.
– garg10may, Commented Oct 4, 2017 at 6:59

tobias_k · Accepted Answer · 2017-10-04 10:04:21Z

1

If you do not want to use parse, you can convert your pattern string to a regular expression using named groups and then use re.match and match.groupdict to get the mapping.

>>> import re
>>> text = "bill eats apple"
>>> a = "%(name)s eats %(fruit)s"
>>> p = re.sub(r"%\((\w+)\)s", r"(?P<\1>\w+)", a)
>>> p
'(?P<name>\\w+) eats (?P<fruit>\\w+)'
>>> re.match(p, text).groupdict()
{'fruit': 'apple', 'name': 'bill'}

Note that \w+ will only match a single word. To allow for values made of several words, you might instead use e.g. [^)]+, which matches any run of characters other than ).

>>> text = "billy bob bobbins eats a juicy apple"
>>> p = re.sub(r"%\((\w+)\)s", r"(?P<\1>[^)]+)", a)
>>> re.match(p, text).groupdict()
{'fruit': 'a juicy apple', 'name': 'billy bob bobbins'}
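The substitution above treats the rest of the template as regex source, so literal characters such as . or ? in the template would be read as regex syntax. Below is a minimal sketch of a helper that escapes the literal parts first. The names template_to_regex and extract, and the choice of a non-greedy .+? for each field, are assumptions made for illustration and are not part of the answer above. It only handles %(name)s placeholders, and a template that repeats a placeholder name would need extra handling.

```python
import re

# Matches %(name)s placeholders only; other conversions (%d, %(n)d, ...) are not handled.
_PLACEHOLDER = re.compile(r"%\((\w+)\)s")


def template_to_regex(template):
    """Build a compiled regex with one named group per %(name)s placeholder."""
    parts = []
    pos = 0
    for m in _PLACEHOLDER.finditer(template):
        # Escape the literal text between placeholders so it matches verbatim.
        parts.append(re.escape(template[pos:m.start()]))
        # Non-greedy field: stops at the first place the following literal text fits.
        parts.append("(?P<%s>.+?)" % m.group(1))
        pos = m.end()
    parts.append(re.escape(template[pos:]))
    return re.compile("".join(parts))


def extract(template, text):
    """Return the placeholder values as a dict, or None if text does not fit."""
    m = template_to_regex(template).fullmatch(text)
    return m.groupdict() if m else None


print(extract("%(name)s eats %(fruit)s", "billy bob bobbins eats a juicy apple"))
```

Using fullmatch rather than match means the whole text has to fit the template, which is what lets the last non-greedy field run to the end of the string.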

edited Oct 4, 2017 at 10:04

answered Oct 4, 2017 at 8:53

tobias_k


4 Comments

garg10may Over a year ago

This doesn't work when the name has a space in it.

tobias_k Over a year ago

@garg10may That's right, in this case you will have to replace \w+ with something more complex, maybe [^)]+ or similar.

garg10may Over a year ago

that gives sre_constants.error: unbalanced parenthesis

garg10may Over a year ago

Thanks, I'm not sure how it works, but your regex works like magic. Using parse would have required changing everything; since I am using Django translations, that would be more cumbersome. Replacing `%` prior to using parse would also not be possible, since the translations would then stop working.

Arount · Accepted Answer · 2017-10-04 08:29:13Z

1

First, there is no function or package in Python that allows you to do that with old-style (aka C-style) string formatting. A good reference about reversing C-style string formatting.

The best you can have is a giant regex pattern and, as you know, it's really not a perfect solution.

That said,

As @smarx said in the comments, you can use parse, which is well suited for that. But, from the linked docs:

parse() is the opposite of format()

That means you need to use format() instead of %. That is a good thing, because % is Python's old-style string formatting, whereas format() is the new style and generally the one to prefer in Python 3. (format() works in both Python 2.7 and 3; % is also still supported in both.)

Here is an example with format():

import parse  # third-party package: pip install parse
print(parse.parse('{name} eats {fruit}', 'bill eats apple'))
<Result () {'fruit': 'apple', 'name': 'bill'}>

If you are not comfortable with format(), I advise you to take a look at pyformat.org, a really good guide.
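For reference, here is a minimal side-by-side of the two formatting styles, using the strings from the question. The variable names are only illustrative.

```python
values = {"name": "bill", "fruit": "apple"}

# Old (C-style) formatting: the % operator with a mapping on the right.
old_style = "%(name)s eats %(fruit)s" % values

# New-style formatting: str.format with keyword arguments.
new_style = "{name} eats {fruit}".format(**values)

print(old_style)
print(new_style)
```

Both lines print the same sentence; only the placeholder syntax in the template differs.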

edited Oct 4, 2017 at 8:29

answered Oct 4, 2017 at 8:11

Arount


3 Comments

garg10may Over a year ago

But my strings don't use format(); that would mean changing the existing texts.

Arount Over a year ago

I updated my answer. From what I saw, you have two options: use format(), or use a homemade regex pattern and hope it works in most cases. Again, even if that means changing the texts, I advise you to use format().

tobias_k Over a year ago

@garg10may Can't you just replace %( with { and )s with } prior to using parse?
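A rough sketch of that replacement, done at runtime on the string after it has been looked up, so the stored (e.g. translated) texts might not need to change. The helper name percent_to_braces is hypothetical, and the sketch assumes the template contains only %(name)s placeholders and no literal { or } characters.

```python
import re


def percent_to_braces(template):
    """Rewrite %(name)s placeholders as {name}; all other text is left unchanged."""
    return re.sub(r"%\((\w+)\)s", r"{\1}", template)


converted = percent_to_braces("%(name)s eats %(fruit)s")
print(converted)
# The converted string could then be passed to parse.parse(converted, text),
# assuming the third-party parse package is installed.
```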
