How can I find the keyword arguments that were passed to a C-style (%-formatted) Python string?

Given the formatted text and the format string:

bill eats apple
'%(name)s eats %(fruit)s'

I should get:

{ 'name': 'bill', 'fruit' : 'apple'}
  • You may want to give the parse module a try. Commented Oct 4, 2017 at 6:26
  • OK, so there is nothing direct? I would have to do it myself with re. Commented Oct 4, 2017 at 6:28
  • I don't understand your question. Do you mean you want to try to only use modules that ship with Python? If so, why? Would you consider it acceptable to copy/paste the code from the parse module? (If not, why not? If so, why wouldn't you just use the module?) Commented Oct 4, 2017 at 6:30
  • I am very happy with the module you suggested. At first it seemed like a simple task, so I wondered whether it was really that hard and whether I could write a line or two of code myself. As I understand it, it's not that simple or direct, given that strings can grow complex. Commented Oct 4, 2017 at 6:36
  • I am not able to do it with the parse module either; it only seems to work with {}-style parameters. Commented Oct 4, 2017 at 6:59

2 Answers

If you do not want to use parse, you can convert your pattern string to a regular expression using named groups and then use re.match and match.groupdict to get the mapping.

>>> import re
>>> text = "bill eats apple"
>>> a = "%(name)s eats %(fruit)s"
>>> p = re.sub(r"%\((\w+)\)s", r"(?P<\1>\w+)", a)
>>> p
'(?P<name>\\w+) eats (?P<fruit>\\w+)'
>>> re.match(p, text).groupdict()
{'fruit': 'apple', 'name': 'bill'}

Note that \w+ will only match a single word. To allow values that contain spaces, you might instead use e.g. [^)]+, which matches any run of characters that does not contain a ).

>>> text = "billy bob bobbins eats a juicy apple"
>>> p = re.sub(r"%\((\w+)\)s", r"(?P<\1>[^)]+)", a)
>>> re.match(p, text).groupdict()
{'fruit': 'a juicy apple', 'name': 'billy bob bobbins'}
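
The same idea can be wrapped in a small helper. The sketch below is one possible variation, not the exact code above: `reverse_format` is a hypothetical name, the literal text of the template is passed through re.escape so characters such as . or ? match themselves, each placeholder becomes a non-greedy .+? group instead of [^)]+, and re.fullmatch requires the whole text to fit. It assumes only plain %(key)s placeholders, each key used once.

```python
import re

# Matches a plain %(key)s placeholder and captures the key.
PLACEHOLDER = re.compile(r"%\((\w+)\)s")

def reverse_format(template, text):
    """Return the mapping that could have produced `text`, or None if it does not fit."""
    parts = []
    pos = 0
    for m in PLACEHOLDER.finditer(template):
        # Literal text between placeholders must match itself exactly.
        parts.append(re.escape(template[pos:m.start()]))
        # Each placeholder becomes a non-greedy named group.
        parts.append("(?P<%s>.+?)" % m.group(1))
        pos = m.end()
    parts.append(re.escape(template[pos:]))
    match = re.fullmatch("".join(parts), text)
    return match.groupdict() if match else None

print(reverse_format("%(name)s eats %(fruit)s", "billy bob bobbins eats a juicy apple"))
```

Because the groups are non-greedy, the first occurrence of the literal text decides where a value ends, so a name that itself contains " eats " would be split early.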

4 Comments

This doesn't work when the name has a space in it.
@garg10may That's right, in this case you will have to replace \w+ with something more complex, maybe [^)]+ or similar.
that gives sre_constants.error: unbalanced parenthesis
Thanks, I'm not sure how it works, but your regex works like magic. Using parse would have required changing everything; since I am using Django translations, that would be more cumbersome. Replacing the % placeholders before using parse would also not be possible, because then the translations would no longer work.

First, there is no function or package in Python that allows you to do that with old-style (a.k.a. C-style) string formatting. A good reference about reversing C-style string formatting.

The best you can have is a giant regex pattern, and as you know, that is not a perfect solution.
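
One reason no pattern can be perfect is that formatting discards information: different arguments can produce exactly the same text, so the original values cannot always be recovered. A small illustration (the template and values are made up for this example):

```python
template = "%(first)s %(second)s"

# Two different sets of arguments...
left = template % {"first": "a b", "second": "c"}
right = template % {"first": "a", "second": "b c"}

# ...yield the same string, so reversing it is ambiguous.
print(left == right)
```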


That said, as @smarx said in the comments, you can use parse, which is well suited to this. But note, from the linked documentation:

parse() is the opposite of format()

That means you need to use format() instead of %. That is a good thing, because % is Python's old-style string formatting, whereas format() is the newer style and the one generally preferred in Python 3 (format() works in both Python 2.7 and 3; % formatting also still works in both).

Here is an example with format():

>>> import parse
>>> print(parse.parse('{name} eats {fruit}', 'bill eats apple'))
<Result () {'fruit': 'apple', 'name': 'bill'}>

If you are not comfortable with format(), I advise you to take a look at pyformat.org, a really good guide.

3 Comments

But my strings don't use format(); that would mean changing existing texts.
I updated my answer. From what I saw, you have two solutions: 1. use format(), or 2. use a homemade regex pattern and hope it works in most cases. Again, even if that means changing texts, I advise you to use format().
@garg10may Can't you just replace %( with { and )s with } prior to using parse?
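
The replacement suggested in the last comment could be sketched roughly as follows. `percent_to_brace` is a hypothetical helper name; the sketch assumes only plain %(key)s placeholders and doubles any literal braces so that they are not read as fields.

```python
import re

def percent_to_brace(template):
    """Rewrite '%(key)s' placeholders as '{key}' placeholders."""
    # Double literal braces first so they survive as plain text.
    escaped = template.replace("{", "{{").replace("}", "}}")
    # Only the plain 's' conversion is handled in this sketch.
    return re.sub(r"%\((\w+)\)s", r"{\1}", escaped)

converted = percent_to_brace("%(name)s eats %(fruit)s")
print(converted)  # {name} eats {fruit}
```

The converted string can then be passed to parse.parse in place of the original template, leaving the stored %-style strings untouched.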
