I'm trying to match a recursive pattern in Python, but I'm missing something and getting errors.

I want to achieve:

var = [a-z]+
digit = [0-9]+
op=+,-,*,/
E->var|digit|op E E

For example:

"+ x 1"

"* x * y + x y"

This is my code:

import re
term="[a-z]+|[0-9]+"
op=[+-/*]
exp="("+term+"|("+op+" "+term+" "+term+")|(?R))

So when I run re.match(exp,"+ x 1"), I get:

"sre_constants.error: unexpected end of pattern"

Can anyone help me with this problem?

  • The re module doesn't have a recursion feature. If you need it, install and use the regex module. pypi.python.org/pypi/regex Commented Feb 9, 2016 at 0:34
  • I did the substitution and there is not really any regex problem. Not sure, but should +")|(?R)) be terminated with a double quote? As for the recursion, I don't see any nesting. Are you just testing recursion for its own sake? Commented Feb 9, 2016 at 1:54
  • Hi sin, how can I do that with substitution? A simple example would help. Thanks! Commented Feb 9, 2016 at 3:18
  • Maybe (?:[-+/*]*\s*(?:[a-z]+|[0-9]+)\s*|\s*(?:[-+/*]\s+(?:[a-z]+|[0-9]+)\s+(?:[a-z]+|[0-9]+))+)+ will work for you without recursion? Commented Feb 9, 2016 at 18:19
  • But it also matches "+x x x 1" and "+ + x 1", which do not fit the pattern. Commented Feb 9, 2016 at 19:58

1 Answer

I suggest using a non-recursive regex with built-in re:

^[+/*-] ?(?:[a-z]+|[0-9]+)(?: ?(?:[a-z]+|[0-9]+))?(?: [+/*-] ?(?:[a-z]+|[0-9]+)(?: ?(?:[a-z]+|[0-9]+))?)*$

The pattern follows a specific scheme: ^{block}(?: {block})*$, where {block} is {op} ?{term}(?: ?{term})?.

Details

  • ^ - start of string
  • [+/*-] ?(?:[a-z]+|[0-9]+)(?: ?(?:[a-z]+|[0-9]+))? - block part:
    • [+/*-] - op: +, /, * or -
    •  ? (a space followed by ?) - an optional space
    • (?:[a-z]+|[0-9]+) - term part: 1+ lowercase letters or 1+ digits
    • (?: ?(?:[a-z]+|[0-9]+))? - an optional sequence of:
      •  ? (a space followed by ?) - an optional space
      • (?:[a-z]+|[0-9]+) - "term" pattern
  • (?:  - start of the non-capturing group, followed by a literal space
    • [+/*-] ?(?:[a-z]+|[0-9]+)(?: ?(?:[a-z]+|[0-9]+))? - the "block" pattern
  • )* - end of the non-capturing group, repeat 0 or more times
  • $ - end of string.
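
To see how the block part carves up an input, it can be run on its own with re.findall. This is only an illustrative sketch: split_blocks is a hypothetical helper name, and the two sample strings are the ones from the question.

```python
import re

# Building pieces of the pattern, as described in the details above.
op = r'[+/*-]'
term = r'(?:[a-z]+|[0-9]+)'
block = rf'{op} ?{term}(?: ?{term})?'

def split_blocks(text):
    """Return every substring of text that matches one block."""
    # The pattern has no capturing groups, so findall returns whole matches.
    return re.findall(block, text)

print(split_blocks("+ x 1"))          # ['+ x 1']
print(split_blocks("* x * y + x y"))  # ['* x', '* y', '+ x y']
```

Each block is one operator followed by one or two terms; the full pattern just requires the whole string to be a space-separated run of such blocks.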

In Python:

op = r'[+/*-]'
term = r'(?:[a-z]+|[0-9]+)'
block = rf'{op} ?{term}(?: ?{term})?'
pattern = rf'^{block}(?: {block})*$'
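
A quick way to check the pattern is to run it over the examples from the question and the counterexamples raised in the comments. This is a minimal sketch; is_valid is a hypothetical helper name, and the last two inputs are extra cases I added to show where the pattern and the grammar differ.

```python
import re

op = r'[+/*-]'
term = r'(?:[a-z]+|[0-9]+)'
block = rf'{op} ?{term}(?: ?{term})?'
pattern = re.compile(rf'^{block}(?: {block})*$')

def is_valid(text):
    """True if the whole string is a space-separated run of blocks."""
    return pattern.match(text) is not None

for sample in ["+ x 1", "* x * y + x y", "+ + x 1", "+x x x 1", "+ x", "x"]:
    print(repr(sample), is_valid(sample))
```

Both examples from the question are accepted, and "+ + x 1" and "+x x x 1" are rejected. Note that the pattern is an approximation of the grammar: it accepts "+ x" (an operator with a single operand) and rejects a bare "x".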

Because the pattern is anchored with ^ and $, either re.match or re.search can be used with it to validate a string.
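
If you need strict conformance to the grammar from the question (E -> var | digit | op E E, with arbitrary nesting), a regex is not the only option: a few lines of recursive code can follow the grammar directly. The following is a sketch, not part of the regex approach above. It assumes tokens are separated by whitespace, as in the question's examples, and the names parse_expr and is_expr are hypothetical.

```python
import re

OPS = {'+', '-', '*', '/'}
OPERAND = re.compile(r'[a-z]+|[0-9]+')  # var or digit from the question

def parse_expr(tokens, pos):
    """Parse one E starting at tokens[pos].

    Returns the index just past the expression, or None on failure.
    """
    if pos >= len(tokens):
        return None
    tok = tokens[pos]
    if tok in OPS:
        # E -> op E E: parse the two operand expressions in sequence.
        after_first = parse_expr(tokens, pos + 1)
        if after_first is None:
            return None
        return parse_expr(tokens, after_first)
    if OPERAND.fullmatch(tok):
        # E -> var | digit
        return pos + 1
    return None

def is_expr(text):
    """True if text is exactly one expression of the grammar."""
    tokens = text.split()  # assumption: whitespace-separated tokens
    return parse_expr(tokens, 0) == len(tokens)

for sample in ["+ x 1", "* x * y + x y", "+ + x 1", "+x x x 1", "x", "+ x"]:
    print(repr(sample), is_expr(sample))
```

With this check, "+ x 1", "* x * y + x y" and a bare "x" are accepted, while "+ + x 1", "+x x x 1" and "+ x" are rejected.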
