13

Would anyone be able to assist me with some regex?

I want to split the following string into a number, a string, and a number:

"810LN15"

One method requires 810 to be returned, another requires LN, and another should return 15.

The only real solution to this is using regex, as the numbers will grow in length.

What regex can I use to accommodate this?

2
  • Your question is not clear. Do you want to split on "LN", or on any alphabetic sequence? Commented Apr 7, 2011 at 13:49
  • Hi Laurent. In different methods I need to get a different part of this string: one method requires 810 to be returned, another requires LN, and the last requires 15. I don't want to go down the route of using substrings and string counts, as the lengths of the numbers are liable to change. Your help with this is much appreciated. Commented Apr 7, 2011 at 13:51

5 Answers

20

String.split won't give you the desired result, which I guess would be "810", "LN", "15", since it would have to look for a token to split at and would strip that token.

Try Pattern and Matcher instead, using this regex: (\d+)|([a-zA-Z]+). Each match is either a run of digits or a run of letters, so you get distinct number/text tokens (e.g. "AA810LN15QQ12345" would result in the tokens "AA", "810", "LN", "15", "QQ" and "12345").

Example:

Pattern p = Pattern.compile("(\\d+)|([a-zA-Z]+)");
Matcher m = p.matcher("810LN15");
List<String> tokens = new LinkedList<String>();
while(m.find())
{
  String token = m.group(); //group 0 is always the entire match; group 1 would be null for letter runs
  tokens.add(token);
}
//now iterate through 'tokens' and check whether you have a number or text
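
The "number or text" check mentioned in the last comment could be sketched as follows. This is only an illustration: the class name `TokenClassifier`, the method `classify`, and the `NUMBER:`/`TEXT:` labels are made up for the example. It relies on the fact that, for each match, exactly one of the two capture groups is non-null.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenClassifier {
    // Same pattern as above: group 1 captures digit runs, group 2 captures letter runs.
    private static final Pattern TOKEN = Pattern.compile("(\\d+)|([a-zA-Z]+)");

    // Returns each token prefixed with its kind (hypothetical labels for illustration).
    static List<String> classify(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            if (m.group(1) != null) {
                // The digit alternative matched.
                result.add("NUMBER:" + m.group(1));
            } else {
                // Otherwise the letter alternative matched.
                result.add("TEXT:" + m.group(2));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(classify("810LN15")); // [NUMBER:810, TEXT:LN, NUMBER:15]
    }
}
```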

1 Comment

Hi Thomas, many thanks for your input, my problem is now solved
10

In Java, as in most regex flavors (Python before version 3.7 being a notable exception), the split() regex isn't required to consume any characters when it finds a match. Here I've used lookaheads and lookbehinds to match any position that has a digit on one side of it and a non-digit on the other:

String source = "810LN15";
String[] parts = source.split("(?<=\\d)(?=\\D)|(?<=\\D)(?=\\d)");
System.out.println(Arrays.toString(parts));

output:

[810, LN, 15]


7

(\\d+)([a-zA-Z]+)(\\d+) should do the trick. The first capture group will be the first number, the second capture group will be the letters in between, and the third capture group will be the second number. The double backslashes are needed because the pattern is written as a Java string literal.
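
A minimal sketch of how the three groups might back the three separate methods the question asks for. The class and method names (`PartExtractor`, `leadingNumber`, `letters`, `trailingNumber`) are hypothetical, and the sketch assumes the whole input always has the digits-letters-digits shape; anything else is rejected with an exception.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PartExtractor {
    // Group 1: leading digits, group 2: letters, group 3: trailing digits.
    private static final Pattern PARTS = Pattern.compile("(\\d+)([a-zA-Z]+)(\\d+)");

    // Matches the entire input and returns the requested capture group.
    private static String part(String input, int group) {
        Matcher m = PARTS.matcher(input);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected format: " + input);
        }
        return m.group(group);
    }

    static String leadingNumber(String input)  { return part(input, 1); }
    static String letters(String input)        { return part(input, 2); }
    static String trailingNumber(String input) { return part(input, 3); }

    public static void main(String[] args) {
        System.out.println(leadingNumber("810LN15"));  // 810
        System.out.println(letters("810LN15"));        // LN
        System.out.println(trailingNumber("810LN15")); // 15
    }
}
```

Because `\d+` matches runs of any length, the same methods keep working as the numbers grow, for example with "81000LN1500".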

1 Comment

Thanks very much for your input Mark, it helped me solve my problem
0

This gives you exactly what you are looking for:

        // Group 1 matches either a run of letters or a run of digits
        Pattern p = Pattern.compile("([a-zA-Z]+|\\d+)");
        Matcher m = p.matcher("810LN15");
        List<Object> tokens = new LinkedList<Object>();
        while(m.find())
        {
          String token = m.group( 1 ); 
          tokens.add(token);
        }
        System.out.println(tokens);


0

If you want to separate the letters from the numbers, you can split on letters with the following regex:

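        String str = "810LN15";
        String[] arr = str.split("[A-Za-z]");
        // arr = "810", "", "15" (the empty string lies between 'L' and 'N')
        int sum = 0;
        for (int i = 0; i < arr.length; i++) {
            // Skip the empty strings produced by adjacent letters
            if (!arr[i].equals("")) {
                sum += Integer.parseInt(arr[i]);
            }
        }
        return sum;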

 

2 Comments

As I understand the question, the OP wants to also keep the LN. The code in your answer discards it. Did you look at the accepted answer to this question?
As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.
