1

I have a text file that contains strings separated by ",". Each string has the form "x:somestring:any string". I'm interested in extracting only the "somestring" value. I can already extract "somestring:any string" by replacing "x:" with "" using:

Pattern p = Pattern.compile("x:", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("");

But as I said, I'm interested only in "somestring". Is it possible to add a second pattern that replaces ":any string" with ""? I thought of repeating the same process, but I wanted to ask whether there is a better way. Is there any way to improve my regular expression? Please note that "somestring" and "any string" are not fixed values.

3 Answers

1

Use split:

    for (String s : subjectString.split(",")) {
        String value = s.split(":")[1]; // the part between the first and second colon
    }
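A slightly fuller sketch of the split approach, for illustration only. The class name `SplitExtractor` and the method `secondFields` are hypothetical names, not from the answer, and the input is assumed to be plain comma-separated text with no quoted fields or embedded commas.

```java
import java.util.ArrayList;
import java.util.List;

class SplitExtractor {
    // Hypothetical helper: returns the second colon-separated field
    // of every comma-separated entry in the given line.
    static List<String> secondFields(String line) {
        List<String> result = new ArrayList<>();
        for (String entry : line.split(",")) {
            // A limit of 3 keeps any further colons inside the third part.
            String[] parts = entry.trim().split(":", 3);
            if (parts.length >= 2) { // skip entries that contain no colon
                result.add(parts[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(secondFields("x:somestring:any string,x:other:more text"));
    }
}
```

The length check avoids an ArrayIndexOutOfBoundsException on entries without a colon, which the one-line version would throw.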

0

If you have a string subjectString that contains "x:somestring:any string", then the following will extract somestring:

Pattern regex = Pattern.compile(
    "(?<=x:) # Assert position right after 'x:'\n" +
    "[^:]*   # Match any number of characters except colons",
    Pattern.COMMENTS);
Matcher regexMatcher = regex.matcher(subjectString);
String resultString = null; // stays null if there is no match
if (regexMatcher.find()) {
    resultString = regexMatcher.group();
}

2 Comments

Your solution works fine. But it means I have to split the strings in my file first, since it is a CSV file, and then process each resulting string with your code in a loop. My file is going to contain a very large number of strings. Any suggestions for processing the whole sequence (string1, string2, string3, etc.) at once, without splitting it and handling each string in a loop separately?
I would steer clear of using regexes to parse a CSV file directly. This is bound to cause problems (think embedded newlines, quoted fields etc.). Better use a CSV library to handle the file itself, and then apply regexes to the fields you have parsed.
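For simple input without quoted fields or embedded newlines, one way to process a whole line without splitting it first might be to call `Matcher.find()` repeatedly. The sketch below is only an illustration: the class name `FieldScanner`, the method `scan`, and the pattern itself are assumptions, not part of the answer above, and the caveat about real CSV files still applies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FieldScanner {
    // Assumed pattern: an entry starts at the beginning of the input or
    // right after a comma; group 1 captures the text between the first
    // and second colon of that entry.
    private static final Pattern ENTRY = Pattern.compile(
        "(?:^|,)\\s*x:([^:,]*):", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: collects group 1 of every match in the line.
    static List<String> scan(String line) {
        List<String> result = new ArrayList<>();
        Matcher m = ENTRY.matcher(line);
        while (m.find()) { // each call resumes after the previous match
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(scan("x:alpha:first value, x:beta:second value,X:gamma:third"));
    }
}
```

Anchoring each match to the start of the input or to a comma keeps an "x:" that happens to appear inside the trailing text from being mistaken for the start of a new entry.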
0

Another simple way is:

"x:somestring:any string".replaceAll(".*:(.*):.*", "$1")

1 Comment

$1 refers to the first capturing group, that is, the text matched inside the first pair of parentheses. Only one such group is declared here: the text between the two colons.
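Because `.*` is greedy, the pattern above captures the text between the last two colons, which matters if "any string" can itself contain a colon. A possible variant, sketched here with the hypothetical names `ReplaceAllDemo` and `middleField`, uses `[^:]*` so the group always sits between the first and second colon:

```java
class ReplaceAllDemo {
    // Hypothetical helper: keeps only the text between the first and
    // second colon. Input with fewer than two colons does not match
    // the pattern and is returned unchanged.
    static String middleField(String entry) {
        return entry.replaceAll("^[^:]*:([^:]*):.*$", "$1");
    }

    public static void main(String[] args) {
        // Both patterns agree when there are exactly two colons.
        System.out.println(middleField("x:somestring:any string"));
        // With a third colon, the greedy original captures "any" instead.
        System.out.println("x:somestring:any:string".replaceAll(".*:(.*):.*", "$1"));
        System.out.println(middleField("x:somestring:any:string"));
    }
}
```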
