
I am trying to extract function notation from a string in JavaScript (so I can't use lookbehind). Here are some examples:

  • f(2) should match f(
  • f(10)g(8) should match f( and g(
  • f(2+g(3)) should match f( and g(
  • f(2+g(sqrt(10))) should match f( and g(
  • f(g(2)) should match f( and g(

Right now I am using

/\b[a-z]\([^x]/g

because I don't want to match a run of letters (such as sqrt), only a single letter followed by a parenthesis. The problem I am having is with the last one in the list (nested functions). ( does not seem to be something \b catches, so the inner g( doesn't match.

My current plan is to add a space after every ( using something like

input = input.replace(/\((\S)/g, '( $1');

This splits the nested functions so that \b comes into play [the input becomes f( g( 3))], but before I started messing with the input string, I thought I would ask here whether there is a better way to do it. Regex is obviously not something I am super strong with, but I am trying to learn, so an explanation with the answer would be appreciated (though I will take any pointers that I can google myself too; I am not entirely sure what to search for here).

  • Why do you have the [^x]? Commented Jan 28, 2016 at 23:42
  • /\b[a-z]\(/g Based on the rules, does this work? Commented Jan 28, 2016 at 23:44
  • I am saving things like f(x) for definitions and shorthand for expressions. It means that I will never be able to have a function like x(x) but I am ok with that. Commented Jan 28, 2016 at 23:45
  • Another way: don't try to do it with a regex, and use a JS parser to extract a graph of function calls... Commented Jan 28, 2016 at 23:46

2 Answers

1

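The point here is that [^x] is a negated character class, so it still has to match a character: it consumes the symbol after (, and that prevents overlapping matches. In f(g(2)) the first match is f(g, so the g is already consumed by the time the regex could try g(. To check that the next character is not x without consuming it, use a negative lookahead: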

\b[a-z]\((?!x)
         ^^^^^

See regex demo
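
As a rough illustration of the difference, the sketch below runs the pattern from the question and the lookahead version against the nested example. The variable names (`consuming`, `lookahead`, `input`) are only for this illustration and do not come from the question.

```javascript
// Pattern from the question: [^x] must consume one character after "(".
const consuming = /\b[a-z]\([^x]/g;
// Lookahead version: (?!x) only checks the next character, consuming nothing.
const lookahead = /\b[a-z]\((?!x)/g;

const input = "f(g(2))";

// The first match swallows the "g", so "g(" can no longer be found.
console.log(input.match(consuming)); // [ 'f(g' ]

// The match ends right after "(", so the search resumes at "g".
console.log(input.match(lookahead)); // [ 'f(', 'g(' ]

// Multi-letter names such as sqrt are still skipped, because there is
// no word boundary between "r" and "t".
console.log("f(2+g(sqrt(10)))".match(lookahead)); // [ 'f(', 'g(' ]
```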

Perhaps you want to fail the match only if x is the only thing inside f() or g():

\b[a-z]\((?!x\))
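
A quick sketch of how this stricter variant might behave, assuming inputs like the ones in the question (the name `notLoneX` is just for this illustration):

```javascript
// Fails only when the argument list is exactly "x", as in f(x).
const notLoneX = /\b[a-z]\((?!x\))/g;

console.log("f(x)".match(notLoneX));    // null
console.log("f(x+1)".match(notLoneX));  // [ 'f(' ]
console.log("f(g(x))".match(notLoneX)); // [ 'f(' ]  (g(x) is skipped)
```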

From Regular-expressions.info:

Negative lookahead is indispensable if you want to match something not followed by something else. When explaining character classes, this tutorial explained why you cannot use a negated character class to match a q not followed by a u. Negative lookahead provides the solution: q(?!u). The negative lookahead construct is the pair of parentheses, with the opening parenthesis followed by a question mark and an exclamation point.
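
The q/u case from the quote can be tried directly in JavaScript. This is only a small sketch; the test strings and variable names are made up for illustration:

```javascript
const negatedClass = /q[^u]/;        // needs some character after q
const negativeLookahead = /q(?!u)/;  // only requires that no u follows

// At the end of the string there is no character for [^u] to consume.
console.log(negatedClass.test("Iraq"));      // false
console.log(negativeLookahead.test("Iraq")); // true

// When a character does follow, the negated class includes it in the match.
console.log(JSON.stringify("Iraq is".match(negatedClass)[0]));      // "q "
console.log(JSON.stringify("Iraq is".match(negativeLookahead)[0])); // "q"
```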


2 Comments

Ahhh, that makes sense, I need to read up on consumption, I am obviously not processing it correctly.
Appreciate the link too!
0

I think you just have to remove [^x]:

"f(g(2))".match(/\b[a-z]\(/g)
// ["f(", "g("]
