
I have some obfuscated code that calls functions, like this:

getAny([["text with symbols \"()[],.;\" and maybe 'ImVerySeriousFn'"], ...]);
setAny([["other text with \"()[],.;\""], ...]);...

The arguments contain arbitrary text. The function calls follow each other without line breaks.

How can I extract the arguments of getAny, setAny and the other functions using a set of regular expressions?

I need this result:

regex1 result: [["text with symbols \"()[],.;\" and maybe 'ImVerySeriousFn'"], ...]
regex2 result: [["other text with \"()[],.;\""], ...]
...

I tried writing regex1 as:

getAny\((.*)\)

but the match also contains the setAny call:

 [["text with symbols \"()[],.;\" and maybe 'ImVerySeriousFn'"], ...]);setAny([["other text with \"()[],.;\""], ...]

When I tried:

getAny\((.*?)\)

the match cuts the argument string short:

[["text with symbols \"(

I can't split on ; or ); because the text in the arguments can contain ; or );

Is it perhaps impossible to do this with a regex?

2
  • A parser is probably a better way to do this, but how about this?: [gs]etAny\((.*)\) Commented Jul 18, 2014 at 14:46
  • Unfortunately, the functions follow each other without a new line, so [gs]etAny\((.*)\) does not work. Commented Jul 21, 2014 at 7:15

3 Answers

2

Your regex needs to be \((.*?)\); since your code is obfuscated (and presumably all on one line).

Note that this will fail if one of your arguments contains ); inside of it.
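To make this caveat concrete, here is a minimal Python sketch. The input string is hypothetical, and the quote-aware pattern is not part of the original answer. It assumes that all strings are double-quoted with backslash escapes, and that no parentheses appear outside of strings inside the argument list.

```python
import re

# Hypothetical input: the first argument contains ");" inside its string.
code = 'getAny([["a); b"]]);setAny([["c"]]);'

# The lazy pattern stops at the first ");" it meets, even though
# that one sits inside a string literal.
lazy = re.compile(r'\((.*?)\);')
lazy_result = lazy.findall(code)
print(lazy_result)

# Quote-aware sketch (an assumption, not the answer's regex): consume each
# double-quoted string, escapes included, as one unit; outside strings,
# accept any character except a quote or a closing parenthesis.
quote_aware = re.compile(r'(\w+)\(((?:"(?:[^"\\]|\\.)*"|[^")])*)\)')
aware_result = quote_aware.findall(code)
for name, args in aware_result:
    print(name, args)
```

The second pattern would still break on nested calls or on single-quoted strings that contain ), which is where a real parser becomes the safer choice.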

Explanation (From Regex101.com):

/\((.*?)\);/g
  \( matches the character ( literally
    1st Capturing group (.*?)
      .*? matches any character (except newline)
      Quantifier: Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
  \) matches the character ) literally
  ; matches the character ; literally
  g modifier: global. All matches (don't return on first match)

The main problem with your regex is that you never specified ; to end the match. Because .* is greedy (it grabs as much as possible unless followed by ?), it ran on to the last ) it could find.
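The difference between the three variants can be checked with a short Python sketch. The input line is a hypothetical one modelled on the question, not the asker's real data.

```python
import re

# Hypothetical one-line input modelled on the question.
code = r'getAny([["text \"()[],.;\" here"]]);setAny([["other \"()[],.;\""]]);'

# Greedy: .* runs to the end of the line and backtracks to the last ")".
greedy = re.search(r'getAny\((.*)\)', code).group(1)

# Lazy without a terminator: stops at the very first ")" in the string.
lazy = re.search(r'getAny\((.*?)\)', code).group(1)

# Lazy, but required to end at ");": one capture per call.
anchored = re.findall(r'\((.*?)\);', code)

print(greedy)
print(lazy)
print(anchored)
```

Here the greedy capture swallows the setAny call, the unterminated lazy capture ends inside the first string, and the anchored version returns the two argument lists separately.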

Demo


14 Comments

You're assuming that the comma will be always there. I think the OP added the punctuation symbols to indicate that there could be "special" characters in between, and that shouldn't cause the regex to fail, and not as an argument delimiter.
@AmalMurali Fixed. Also, if you look at his input, commas are also being used as argument delimiters.
To rephrase: I think (?:.+,)*? is useless here since .+ at the end is what is doing all the matching.
Thanks for your answer -- despite the commentary below, it doesn't offer enough explanation of what it does and why it solves the OP's problem. Can you explain it better please?
@EngineerDollery Sorry for spamming your inbox with comments, but after reading over both of the links, I don't see how this doesn't meet the guidelines. According to the "How to Answer", I think I can solve this by adding some context for the link, such as "Here is a demo of the regex in action on Regex101.com", but I don't know if that will satisfy the requirements that I apparently missed.
0

I don't know if I understand your question, but if I do, you could use a group and list the allowed characters in it.

Your regex could be: \( ( ) " [ ],\.; a-zA-Z \)

The outer brackets enclose the group.


0

If I understand your pattern correctly, your function argument will always start with [[" and end with "]].

Regex:

/getAny\((\[\[".*?[^\\]"\]\])\);/

Demo: http://regex101.com/r/jC3vX5/2

Note the lazy .*? and the [^\\], which makes sure the closing quote is not an escaped one.
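As a rough check, the pattern can be run in Python against a hypothetical input whose string contains both escaped quotes and a ); sequence. The test string is an assumption, not the asker's data.

```python
import re

# Hypothetical input: escaped quotes and ");" inside the first argument.
code = r'getAny([["say \"hi\"); now"]]);setAny([["x"]]);'

# Same pattern as above, written as a Python raw string.
pattern = re.compile(r'getAny\((\[\[".*?[^\\]"\]\])\);')

m = pattern.search(code)
captured = m.group(1)
print(captured)
```

The match only ends at a quote that is not preceded by a backslash and is followed by ]]);, so the ); inside the string is skipped. Two limits of this sketch: an empty string argument such as [[""]] has no character for [^\\] to consume, and a string ending in an escaped backslash would be misread, so neither may match as intended.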

