
I am trying to extract a version number from a string using a regular expression. The version number has the format "D.D.Dc", where each 'D' is one or more digits and 'c' is an optional alphabetic character. The whole version is surrounded by whitespace on both sides.

The string I want to extract it from is something like:

FOO 5.1.7d BAR 5.0.2 2019/06/18

The regular expression I'm using is:

\s(\d+)\.(\d+)\.(\d+)([a-zA-Z])?\s

Below is the code I'm using.

static regex FWVersionFormat{ R"(\s(\d+)\.(\d+)\.(\d+)([a-zA-Z])?\s)" }; 

auto matches = cmatch{};
if (regex_search(strVersion.c_str(), matches, FWVersionFormat))
{
    int maj = 0, min = 0, maint = 0, build = 0;
    if (!matches[1].str().empty()) maj = strtol(matches[1].str().c_str(), nullptr, 10);
    if (!matches[2].str().empty()) min = strtol(matches[2].str().c_str(), nullptr, 10);
    if (!matches[3].str().empty()) maint = strtol(matches[3].str().c_str(), nullptr, 10);
    if (!matches[4].str().empty()) build = matches[4].str().c_str()[0] - ('a' - 1);
    return{ maj, min, maint, build };
}

This works fine if there is only one match in the version string, but the issue is that regex_search() is putting the second instance of the version ("5.0.2") into matches.

I want to be able to only extract the first match. Is there any way to do this using regex?

  • 2
    Try adding a negative lookahead to prevent being followed by a digit regex101.com/r/x6Cnr6/1 (\s(\d+)\.(\d+)\.(\d+)([a-zA-Z])?\s(?!\d)) Commented Oct 1, 2019 at 14:06
  • 1
    Unable to reproduce with g++/libstdc++ 7.4.0. It matches 5.1.7d from that string. Commented Oct 1, 2019 at 14:10
  • 1
    Also, why are you using std::cmatch instead of std::smatch and std::strtol() instead of std::stoi(), which works directly with std::string so you don't have to convert to a C style string? It'd make your code a lot cleaner and simpler... maj = std::stoi(matches[1].str()); etc. Commented Oct 1, 2019 at 14:13
  • 1
    @Minato std::cmatch is used when matching against C style strings, std::smatch when matching against std::string. OP has the latter but is converting them to the former first for some reason. Commented Oct 1, 2019 at 14:15
  • 1
    @ChrisJ If you only want the first match and you only need a single capturing group, you might also try ^.*?(\s\d+\.\d+\.\d+[a-zA-Z]?\s) regex101.com/r/oH8D8X/1 Commented Oct 1, 2019 at 14:20
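The std::smatch and std::stoi suggestions from the comments above could look roughly like the sketch below. The Version struct and the parseFirstVersion name are hypothetical, introduced only for illustration, and the std::tolower call is an added assumption so that an uppercase suffix does not produce a negative build number. Note that std::stoi throws std::out_of_range if a component does not fit in an int.

```cpp
#include <cctype>
#include <regex>
#include <string>

// Hypothetical holder for the parsed fields (not from the original post).
struct Version {
    int maj = 0, min = 0, maint = 0, build = 0;
};

// Fills `out` with the leftmost version found in `text`.
// Returns false (leaving `out` untouched) when nothing matches.
bool parseFirstVersion(const std::string& text, Version& out) {
    static const std::regex pattern{R"(\s(\d+)\.(\d+)\.(\d+)([a-zA-Z])?\s)"};

    std::smatch m;  // smatch pairs with std::string, so no c_str() is needed
    if (!std::regex_search(text, m, pattern)) return false;

    Version v;
    // Groups 1-3 always contain at least one digit when the pattern matches.
    v.maj = std::stoi(m[1].str());
    v.min = std::stoi(m[2].str());
    v.maint = std::stoi(m[3].str());

    // Group 4 is optional: map 'a' -> 1, 'b' -> 2, ... (assumption: the
    // suffix is treated case-insensitively).
    if (m[4].matched) {
        const unsigned char c = static_cast<unsigned char>(m[4].str()[0]);
        v.build = std::tolower(c) - 'a' + 1;
    }

    out = v;
    return true;
}
```

Calling parseFirstVersion("FOO 5.1.7d BAR 5.0.2 2019/06/18", v) with a conforming standard library should fill v from "5.1.7d", since regex_search reports the leftmost match. That is consistent with the comment above that could not reproduce the problem.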

1 Answer


Thanks to @TheFourthBird for the answer.

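I changed my regular expression to include a negative lookahead at the end, which rejects a candidate whose trailing whitespace is immediately followed by a digit. In the example string that rules out "5.0.2" (it is followed by the date), so the search returns "5.1.7d".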

\s(\d+)\.(\d+)\.(\d+)([a-zA-Z])?\s(?!\d)
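
To see what the lookahead changes, here is a minimal sketch that runs both patterns through std::regex_search. The firstVersion helper and the plainPattern and guardedPattern names are hypothetical, chosen only for this illustration.

```cpp
#include <regex>
#include <string>

// Pattern from the question, and the same pattern with the lookahead added.
const std::regex plainPattern{R"(\s(\d+)\.(\d+)\.(\d+)([a-zA-Z])?\s)"};
const std::regex guardedPattern{R"(\s(\d+)\.(\d+)\.(\d+)([a-zA-Z])?\s(?!\d))"};

// Returns the first version matched by `pattern` in `text`, without the
// surrounding whitespace, or an empty string when nothing matches.
std::string firstVersion(const std::string& text, const std::regex& pattern) {
    std::smatch m;
    if (!std::regex_search(text, m, pattern)) return "";
    // An unmatched optional group yields an empty string from str().
    return m[1].str() + "." + m[2].str() + "." + m[3].str() + m[4].str();
}
```

With guardedPattern, "FOO 5.1.7d BAR 5.0.2 2019/06/18" yields "5.1.7d", while "BAR 5.0.2 2019/06/18" yields no match at all, because the whitespace after "5.0.2" is followed by a digit. With plainPattern the second string yields "5.0.2". One side effect to keep in mind: the lookahead also rejects a genuine version that happens to be followed by whitespace and then a digit, and both patterns still require whitespace after the version, so a version at the very end of the string is not matched.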