I am having trouble writing a regular expression for an if-then-else statement. For example, I have the following string (all characters are on a single line):

if(%codizione1%) then istruzione1; istruzione1.2; else if(%condizione2%) then istruzione2; istruzione 2.1; istruzione 2.2; else if(%condizione3%) then istruzione3; end if; end if; end if;

I tried to write this regex, but it didn't work:

if\\s?\\(\\%(.+?)\\%\\)\\s?then (.*?) ((?:else?))*(.*)end if;

In the example I want to capture:

  • group(0): the whole string
  • group(1): the condition (%codizione1%)
  • group(2): the instructions between the then and the else (istruzione1; istruzione1.2;)
  • group(3): the rest of the instructions after group(2) (else if(%condizione2%) then istruzione2; istruzione 2.1; istruzione 2.2; else if(%condizione3%) then istruzione3; end if; end if;)

The else must be optional, so I can also have something like:

if(%a=b%) then c=d; end if;

Please help me. Thank you guys.

  • Is the regex you wrote a Java string or did you write standard regex syntax, i.e. does your string look like "if\\s?..." or more like "if\\\\s?..."? And why don't you use a real parser instead? Programming languages normally aren't regular so regular expressions are a poor fit. Commented Jul 10, 2017 at 14:33

1 Answer

Your regex is almost correct (assuming you meant if\s?... and what you posted is the content of a Java string where backslashes would be escaped). Try the following one:

if\s*\(%(.+?)%\)\s*then\s+(.*?)\s+(?:else\s+(.*)\s*)?end if;

As you can see I made a few minor changes:

  • \s* instead of \s? to allow more than one space between if and ( etc.
  • \s+ instead of single spaces to allow for more than one (but at least one)
  • % instead of \% since % doesn't need to be escaped (see Andreas' comment)
  • (?:else\s+(.*)\s*)? instead of ((?:else?))*(.*) because the entire block is optional, starting with else. Your version stated that else is optional and can be found multiple times (using *) and be followed by anything (.*); note also that else? only makes the final e optional. But what you want - at least as I understood it - is everything between an optional else and end if;, hence (?:else\s+(.*)\s*)? (about the last \s* see Andreas' other comment).
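
To see how the groups come out in practice, here is a minimal Java sketch of the expression above. The class name IfThenElseDemo and the helper parse are made up for illustration; only java.util.regex is used. In a Java string literal every backslash of the regex has to be doubled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IfThenElseDemo {
    // The regex from this answer as a Java string literal (backslashes doubled).
    static final Pattern IF_STATEMENT = Pattern.compile(
        "if\\s*\\(%(.+?)%\\)\\s*then\\s+(.*?)\\s+(?:else\\s+(.*)\\s*)?end if;");

    // Hypothetical helper: returns {condition, thenPart, elsePart},
    // or null if the whole input does not match.
    // elsePart is null when the statement has no else branch.
    static String[] parse(String input) {
        Matcher m = IF_STATEMENT.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] simple = parse("if(%a=b%) then c=d; end if;");
        System.out.println(simple[0]); // a=b
        System.out.println(simple[1]); // c=d;
        System.out.println(simple[2]); // null (no else branch)

        String[] nested = parse("if(%x%) then a; b; else if(%y%) then c; end if; end if;");
        System.out.println(nested[1]);             // a; b;
        System.out.println("[" + nested[2] + "]"); // [if(%y%) then c; end if; ]
    }
}
```

Note that group(1) does not include the % signs, because the capturing parentheses sit inside them, and that group(3) starts after the else rather than with it.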

A final note though: be aware that, depending on what is allowed in the statements, this expression might not work in all situations. As I already stated in my comment, programming languages are normally not regular, and thus there are limits on what you can do with regular expressions. As an example, you'd probably run into problems if the condition or one of the instructions contained a string with "else ".
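
The "else " caveat can be reproduced with a small Java sketch. The input string and the names ElseInsideStringDemo and split are invented for this illustration; the pattern is the one from this answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ElseInsideStringDemo {
    static final Pattern IF_STATEMENT = Pattern.compile(
        "if\\s*\\(%(.+?)%\\)\\s*then\\s+(.*?)\\s+(?:else\\s+(.*)\\s*)?end if;");

    // Hypothetical helper: returns {thenPart, elsePart}, or null on no match.
    static String[] split(String input) {
        Matcher m = IF_STATEMENT.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        // There is no else branch here, only the text "else " inside a string literal.
        String[] parts = split("if(%a=b%) then c=\"x else y\"; end if;");
        System.out.println("[" + parts[0] + "]"); // [c="x]
        System.out.println("[" + parts[1] + "]"); // [y"; ]
    }
}
```

The regex has no notion of quoting, so it cuts the instruction in half at the first " else " it finds. Handling such input reliably is where a real parser becomes the better tool.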


5 Comments

Why are you escaping %?
@Andreas you're right, I just copied it from the original expression and didn't think much about it.
Works nice, see regex101.com, though be aware that if you use matches() and then reapply on group 3, that group 3 ends with a space, so you either need to trim() it, or add an optional space-match pattern at the end.
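
A rough Java sketch of the reapply-on-group-3 idea from the comment above, using trim() before each matches() call. The class NestedIfWalker and the method conditions are hypothetical names; the loop assumes every else branch is itself another if statement and simply stops otherwise.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NestedIfWalker {
    static final Pattern IF_STATEMENT = Pattern.compile(
        "if\\s*\\(%(.+?)%\\)\\s*then\\s+(.*?)\\s+(?:else\\s+(.*)\\s*)?end if;");

    // Collects the condition of each if in an "else if" chain.
    static List<String> conditions(String input) {
        List<String> result = new ArrayList<>();
        String rest = input;
        while (rest != null) {
            // group(3) ends with a space, so trim before calling matches().
            Matcher m = IF_STATEMENT.matcher(rest.trim());
            if (!m.matches()) {
                break; // the else branch is not another if statement
            }
            result.add(m.group(1));
            rest = m.group(3); // null when there is no else branch
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "if(%codizione1%) then istruzione1; istruzione1.2; "
            + "else if(%condizione2%) then istruzione2; istruzione 2.1; istruzione 2.2; "
            + "else if(%condizione3%) then istruzione3; end if; end if; end if;";
        System.out.println(conditions(input)); // [codizione1, condizione2, condizione3]
    }
}
```

Without the trim() call, the second matches() would fail because of the trailing space in group 3.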
Thank you Thomas you saved my life. I love you! <3
@SuperVarcoBros Please take a look at What should I do when someone answers my question?
