I have a requirement to split strings using a regular expression. The strings have the format shown below.

There are 3 different types of string values:

  • ABC_1234_XL.jpg
  • XYZ_7890_SM.jpg
  • PQ_R_4567_LG.jpg

The regex I have right now, which isn't working, is:

(^[a-zA-Z])(_\\d+_)([a-zA-Z]$)

In the above, ABC, XYZ and PQ_R are 3 image types which I want to extract separately and compare with the corresponding list of types fetched from the DB. If I go with a plain split on the underscore "_", it defeats the purpose when splitting the 3rd string.

So I need a solution to split these strings based on a regular expression, where the center element is always digits ([0-9]), the left part is the image type and the right part is the image size.
That is: ImageType_ImageTypeID_ImageSize. We need to split using the center element (ImageTypeID) as the base and get the left and right data, excluding the "_". How can I achieve this with split and a regex?

Please help, and let me know if you need more info.

4 Answers

2

OK, since no one has explained your problem yet, I will try. Your current regex

(^[a-zA-Z])(_\\d+_)([a-zA-Z]$)

can match only strings that start with one letter, followed by _, one or more digits, another _, and end with one letter. What you need is a regex which accepts strings that

  • [a-zA-Z]+(?:_[a-zA-Z]+)* - start with one or more letters, optionally followed by sequences of _ and letters (no digits yet)
  • _\\d+_ - then have digits surrounded by _
  • [a-zA-Z]+ - then have one or more letters

  • You probably also want to end your regex with a sequence that matches the file extension, so you will need something like [.]jpg
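To make the difference concrete, here is a small sketch that checks a few sample strings against both the pattern from the question and the pieces listed above concatenated in order. The class name `OriginalVsFixed` and the sample `A_1_B` are only illustrative assumptions, not part of the question.

```java
import java.util.regex.Pattern;

class OriginalVsFixed {
    // Pattern from the question: exactly one letter, _digits_, exactly one letter.
    static final Pattern ORIGINAL = Pattern.compile("(^[a-zA-Z])(_\\d+_)([a-zA-Z]$)");

    // The pieces described above, concatenated in the same order.
    static final Pattern FIXED = Pattern.compile(
            "[a-zA-Z]+(?:_[a-zA-Z]+)*" + "_\\d+_" + "[a-zA-Z]+" + "[.]jpg");

    public static void main(String[] args) {
        // "A_1_B" is a made-up sample: the only shape the original pattern accepts.
        String[] samples = {"A_1_B", "ABC_1234_XL.jpg", "PQ_R_4567_LG.jpg"};
        for (String s : samples) {
            System.out.println(s
                    + " original=" + ORIGINAL.matcher(s).matches()
                    + " fixed=" + FIXED.matcher(s).matches());
        }
    }
}
```

The original pattern only accepts single-letter parts such as `A_1_B`, while the concatenated pieces accept all three file names from the question.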

So try with

([a-z]+(?:_[a-z]+)*)_(\\d+)_([a-z]+)[.]jpg

Demo:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String[] data = {
        "ABC_1234_XL.jpg",
        "XYZ_7890_SM.jpg",
        "PQ_R_4567_LG.jpg",
};
Pattern p = Pattern.compile(
            "([a-z]+(?:_[a-z]+)*)_(\\d+)_([a-z]+)[.]jpg",
   //group 1  ^^^^^^^^^^^^^^^^^^
   //group 2                       ^^^^
   //group 3                              ^^^^^^
            Pattern.CASE_INSENSITIVE);
for (String s : data) {
    Matcher m = p.matcher(s);
    if (m.matches())
        System.out.println(m.group(1)+" : "+m.group(2)+" : "+m.group(3));
    else
        System.out.println(s+" doesn't match pattern");
}

Output:

ABC : 1234 : XL
XYZ : 7890 : SM
PQ_R : 4567 : LG
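
If the three parts are needed for further logic, such as the comparison against the types fetched from the DB, the same pattern could be wrapped in a small helper. This is only a sketch: `ImageNameParser`, `parse` and the hard-coded `knownTypes` set are hypothetical stand-ins (the real list would come from the database), and `Set.of` assumes Java 9 or later.

```java
import java.util.Optional;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ImageNameParser {
    private static final Pattern NAME = Pattern.compile(
            "([a-z]+(?:_[a-z]+)*)_(\\d+)_([a-z]+)[.]jpg",
            Pattern.CASE_INSENSITIVE);

    /** Returns {type, id, size}, or empty if the name does not fit the pattern. */
    static Optional<String[]> parse(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new String[] {m.group(1), m.group(2), m.group(3)});
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the list of image types loaded from the DB.
        Set<String> knownTypes = Set.of("ABC", "XYZ", "PQ_R");

        parse("PQ_R_4567_LG.jpg").ifPresent(parts -> {
            String imageType = parts[0];
            String imageTypeId = parts[1];
            String imageSize = parts[2];
            System.out.println(imageType + " known=" + knownTypes.contains(imageType)
                    + " id=" + imageTypeId + " size=" + imageSize);
        });
    }
}
```

Returning an `Optional` keeps the "doesn't match" case explicit for the caller instead of handing back null.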

3 Comments

Let me try this and get back. Thank You Pshemo. :)
I am glad you like it :)
Yes, this works and is exactly what I wanted. Now I can use these string values for different business logic. Thanks once again, mate. :)
0

Try this:

([a-zA-Z][_][a-zA-Z]*)(\d+)([\w]+[.][\w]+)

The first group is meant to take as many letters or _ as it can get, the second group finds the digits (the 123123 pattern), and the last one gets you the size and the file type.

0

If the first part is allowed to contain a _, I think it is enough to add it to the character class of that block:

(^[a-zA-Z_]+)(_\\d+_)([a-zA-Z]+)

You can even put the delimiters outside of your center block:

(^[a-zA-Z_]+)_(\\d+)_([a-zA-Z]+)
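
A sketch of how the three groups of this second pattern could be assigned to three separate strings. Because the pattern does not cover the `.jpg` extension, `find()` is used rather than `matches()`. The class and method names are illustrative assumptions. The greedy `[a-zA-Z_]+` first grabs `PQ_R_` and then backtracks by one character so that the following `_` can match, which is why `PQ_R_4567_LG.jpg` still yields `PQ_R` as the first group.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DelimitersOutside {
    static final Pattern P = Pattern.compile("(^[a-zA-Z_]+)_(\\d+)_([a-zA-Z]+)");

    static String[] parts(String fileName) {
        Matcher m = P.matcher(fileName);
        // find() rather than matches(): the pattern stops before ".jpg".
        if (!m.find()) {
            throw new IllegalArgumentException("Unexpected name: " + fileName);
        }
        String imageType = m.group(1);
        String imageTypeId = m.group(2);
        String imageSize = m.group(3);
        return new String[] {imageType, imageTypeId, imageSize};
    }

    public static void main(String[] args) {
        System.out.println(String.join(" : ", parts("PQ_R_4567_LG.jpg")));
    }
}
```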

2 Comments

Thanks. If I put the delimiters outside the center block, then PQ_R_4567_LG.jpg would be a problem when I try to split the string into 3 parts.
Also, your first regex works well, but I want to know how to split based on this pattern and assign the 3 parts to 3 different Strings.
-1

^([A-Za-z_]+)_(\\d+)_([A-Za-z]+)\\.jpg$
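
For an approach that literally uses `String.split`, one possible sketch is to drop the extension and then split only on the underscores that touch the digit block, using lookarounds. This is a different technique from the capturing-group regex above, and it assumes the image type itself never contains digits; the class and method names are made up for illustration.

```java
class SplitOnIdUnderscores {
    static String[] split(String fileName) {
        // Drop the extension (".jpg") if there is one.
        int dot = fileName.lastIndexOf('.');
        String base = dot >= 0 ? fileName.substring(0, dot) : fileName;

        // Split on an underscore that is followed by "digits_"
        // or on an underscore that is directly preceded by a digit.
        return base.split("_(?=\\d+_)|(?<=\\d)_");
    }

    public static void main(String[] args) {
        for (String s : new String[] {"ABC_1234_XL.jpg", "PQ_R_4567_LG.jpg"}) {
            System.out.println(String.join(" : ", split(s)));
        }
    }
}
```

The underscore inside `PQ_R` is left alone because it is neither followed by digits nor preceded by one.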

3 Comments

While it is good to provide a solution, it is even better to explain how it solves the problem.
Thanks. The regex is fine, but I need logic to split this into 3 different strings.
Pshemo - Thanks for the echo
