1

I am trying to extract the TPS number from the following strings using Java. The strings will be read from a file, so they can appear in any order and I won't know in advance which form I am dealing with. It could be either of these two:

Testing performance TPS..  ok. (795 TPS recorded for run)

Testing performance TPS..  warning: TPS seems low - it was 10 TPS and I expected to achieve over 50

E.g. for the first string I would want the number 795, and for the second string I would want the number 10.

Does anyone know how to do this with regex or similar using Java?

Many thanks

2
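  • The regex to get the first number in a text line is something like this: ^[^0-9]*([0-9]+). (With a greedy ^.* in front and [0-9]* as the group, the .* would swallow the whole line and the group would always be empty.) Commented Dec 12, 2011 at 14:45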
  • This is quite a flaky way of getting these numbers. Are you sure you can't hook directly into whatever produces these lines in the file? That would be a far better way to get the data you want. If not, the regexes in the answers below will do the trick, but make sure you validate (at runtime) that each line you read has the expected format. Commented Dec 12, 2011 at 14:50

3 Answers

4

You need to find the first run of digits in the input. In both example strings the number is terminated by a space.

You can use this regex:

    String regex = "[^\\d]+(\\d+) .*";

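The number is captured in group one, which you read with Matcher.group(1) after a successful match. Note that the pattern is written for matches(), so it has to cover the whole line, and it requires at least one non-digit character before the number.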

Here is a simple test:

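    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class TpsTest {
        public static void main(String[] args) {
            String[] lines = {
                "Testing performance TPS..  ok. (795 TPS recorded for run)",
                "Testing performance TPS..  warning: TPS seems low - it was 10 TPS and I expected to achieve over 50"
            };

            // Skip leading non-digits, capture the first run of digits,
            // and require a space right after it.
            String regex = "[^\\d]+(\\d+) .*";
            Pattern p = Pattern.compile(regex);
            for (String s : lines) {
                Matcher m = p.matcher(s);
                // matches() must succeed before group(1) can be read.
                if (m.matches()) {
                    System.out.println(m.group(1));
                }
            }
        }
    }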

The output is:

795
10

2 Comments

Your use of groupCount() is incorrect. It just tells you how many capturing groups there are in the regex. It doesn't say anything about what was actually matched. To find out if group #1 participated in the match, you use if (m.group(1) != null) or if (m.start(1) != -1).
@Alan Thanks for pointing that out, I didn't know. So really you only need to check that Matcher.matches() returns true before attempting to access groups. If the pattern matches then you can look up all your groups, though some might be null or empty.
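
To make the point from these comments concrete, here is a small sketch. The pattern with two alternatives and the class and method names are made up for illustration; they are not from the answer. It shows that groupCount() only reports how many capturing groups the pattern has, while group(n) != null (or start(n) != -1) tells you whether a group actually took part in a match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupCountDemo {
    // Hypothetical pattern: two alternatives, so at most one group
    // participates in any single match.
    private static final Pattern P = Pattern.compile("(\\d+) TPS|over (\\d+)");

    /** Number of capturing groups in the pattern; independent of the input. */
    public static int groupCount(String input) {
        return P.matcher(input).groupCount();
    }

    /** Lists the groups that actually participated in each match found. */
    public static List<String> participatingGroups(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = P.matcher(input);
        while (m.find()) {
            // m.groupCount() is 2 on every iteration here.
            for (int g = 1; g <= m.groupCount(); g++) {
                if (m.group(g) != null) { // equivalently: m.start(g) != -1
                    found.add("group" + g + "=" + m.group(g));
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String line = "Testing performance TPS..  warning: TPS seems low - "
                + "it was 10 TPS and I expected to achieve over 50";
        System.out.println(groupCount("no numbers at all")); // 2
        System.out.println(participatingGroups(line));
    }
}
```

For the second example line this finds "10 TPS" first (group 1 set, group 2 null) and then "over 50" (group 1 null, group 2 set), even though groupCount() is 2 throughout.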
3

If you're always looking for an integer followed by the string "TPS" you can do

"(\\d+) TPS"

But you'd better be sure it will always be in this format - it would be better to modify the output format, if that's possible.
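
A minimal sketch of how this pattern might be used, assuming the number is always followed by a single space and "TPS". The class and method names are only for illustration. Because the pattern describes just a fragment of the line, it needs find() rather than matches().

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TpsExtractor {
    // Digits immediately followed by one space and the literal "TPS".
    private static final Pattern TPS = Pattern.compile("(\\d+) TPS");

    /** Returns the first "<number> TPS" value in the line, or empty if none. */
    public static OptionalInt extractTps(String line) {
        Matcher m = TPS.matcher(line);
        // find() searches anywhere in the line; matches() would require
        // the pattern to cover the entire line.
        if (m.find()) {
            // parseInt throws NumberFormatException if the digits exceed int range.
            return OptionalInt.of(Integer.parseInt(m.group(1)));
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        String[] lines = {
            "Testing performance TPS..  ok. (795 TPS recorded for run)",
            "Testing performance TPS..  warning: TPS seems low - it was 10 TPS and I expected to achieve over 50"
        };
        for (String line : lines) {
            extractTps(line).ifPresent(System.out::println); // prints 795, then 10
        }
    }
}
```

Returning an empty OptionalInt for lines without a "<number> TPS" fragment lets the caller decide whether an unexpected line is an error or can be skipped.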

3 Comments

excellent. in java, just use if(matcher.groupCount()>=1) { String groupStr = matcher.group(1); ... } This will give you the first captured group. Zero-th group would be the whole match, i.e. "795 TPS", first group will be just "795"
@PeterPerháč: That's not what groupCount() is for. See my comment to sudocode's answer for details.
you win :-) great and simple answer... I was of course thinking about positive lookbehinds and stuff.. but those are zero-width and don't capture... and then you posted this and I was like d'oh!
1

This regex should do the trick:

    ^[^0-9]*([0-9]+).*$

It matches any line that contains a number, and extracts the first number in the line.

However, it is not really possible to generalize from just these two examples. For instance you don't show us examples that the regex shouldn't match.


I agree with the comment that says this is a flaky way to extract information. Unless you are very sure of your input text, there is always a possibility that you will encounter a different form that the regex doesn't cope with, e.g. one that matches when it shouldn't or vice versa.
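
To illustrate the risk, here is a rough sketch that compares the permissive regex above with a stricter one. The stricter pattern, the "Run 2:" input line, and all names are assumptions made up for this example. The stricter pattern assumes every valid line starts with "Testing performance TPS.." and that the number is followed by " TPS".

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StrictTpsParser {
    // The permissive pattern: first run of digits anywhere in the line.
    private static final Pattern LOOSE = Pattern.compile("^[^0-9]*([0-9]+).*$");

    // Assumed stricter format: known prefix, then the first number
    // that is followed by " TPS".
    private static final Pattern STRICT =
            Pattern.compile("^Testing performance TPS\\.\\.\\s+.*?(\\d+) TPS\\b.*$");

    public static Optional<String> loose(String line) {
        return firstGroup(LOOSE, line);
    }

    public static Optional<String> strict(String line) {
        return firstGroup(STRICT, line);
    }

    private static Optional<String> firstGroup(Pattern p, String line) {
        Matcher m = p.matcher(line);
        // Only read group(1) after matches() has succeeded.
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String expected = "Testing performance TPS..  ok. (795 TPS recorded for run)";
        // Hypothetical line in a format the regexes were not designed for.
        String unexpected = "Run 2: Testing performance TPS..  ok. (795 TPS recorded for run)";

        System.out.println(loose(expected).orElse("no match"));    // 795
        System.out.println(strict(expected).orElse("no match"));   // 795
        System.out.println(loose(unexpected).orElse("no match"));  // 2 (silently wrong)
        System.out.println(strict(unexpected).orElse("no match")); // no match
    }
}
```

The permissive pattern quietly returns 2 for the unexpected line, whereas the stricter one reports no match, which is usually easier to detect and handle.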
