As input I get a string like "[email protected]". All parts are variable. The only rules are that there are always exactly 10 digits at the front, followed by "@". I need a regex that lets me extract "12345" (i.e. the digits at positions 2 to 6) and the "site.com" substring. For example, in the case above the result could be either "12345site.com" or "12345:site.com". Can this be done with a single regular expression? How can I skip the first digit, the digits at positions 7 to 10, and the '@'? Examples in Java would be appreciated.
What language do you use? – Casimir et Hippolyte, Feb 21, 2015 at 12:44
I use Java. But I thought it was not so important in the context of a regex-related question. – Max, Feb 21, 2015 at 12:54
Absolutely not! Different languages use different regex flavors and implementations. – Casimir et Hippolyte, Feb 21, 2015 at 12:59
OK. I added a remark about Java. – Max, Feb 21, 2015 at 13:07
Since the indexes and positions are known, why do you still need a regex for this? Can't substring be used? – Arun A K, Feb 21, 2015 at 13:13
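The substring approach suggested in the last comment could look roughly like the sketch below. The class name `SubstringExtract`, the method name `extract`, and the sample input `0123456789@site.com` are made up for illustration; the sketch only checks where the '@' sits and does not verify that the first 10 characters are digits.

```java
final class SubstringExtract {
    // Returns the characters at positions 2 to 6 plus everything after '@', joined by ':'.
    static String extract(String input) {
        int at = input.indexOf('@');
        if (at != 10) {
            // The question guarantees exactly 10 digits before '@'.
            throw new IllegalArgumentException("expected 10 characters before '@': " + input);
        }
        // substring(1, 6) covers zero-based indices 1..5, i.e. positions 2 to 6.
        return input.substring(1, 6) + ":" + input.substring(at + 1);
    }

    public static void main(String[] args) {
        // Hypothetical sample input, not taken from the question.
        System.out.println(extract("0123456789@site.com")); // prints 12345:site.com
    }
}
```

This avoids a regex entirely, at the cost of not validating that the leading characters really are digits.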
2 Answers
If I understood you correctly, this regex will do:
\d(\d{5})\d{4}@(.+)
and then use
matcher.group(1) + matcher.group(2)
to concatenate the groups.
Java code:
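import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String s = "[email protected]";
        // Skip 1 digit, capture 5 digits, skip 4 digits and '@', capture the rest.
        String patternString = "\\d(\\d{5})\\d{4}@(.+)";
        Pattern pattern = Pattern.compile(patternString);
        Matcher matcher = pattern.matcher(s);
        // matches() requires the whole input to match the pattern.
        if (matcher.matches()) {
            System.out.println(matcher.group(1) + matcher.group(2));
            // shows "12345site.com"
        }
    }
}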
8 Comments
Max
Will that translate in Java to "\\d(\\d{5})\\d{4}@(.+)"? If so, then it results in one matching group "[email protected]"
Max
OK. Actually it works. "[email protected]" was group(0). There are also group(1) and group(2) with the right values.
Max
How would you change the code if you don't know beforehand the number of matching groups?
Maksim
You should always know the number of capturing groups once your patternString is final. If you don't, I would use a "for" loop to go through them. Or did you mean something else?
Max
Ok. I can use groupCount() and iterate over all groups to concatenate all results.
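The groupCount() idea from the last comment might be sketched as follows. The class name `GroupJoin`, the helper `joinGroups`, and the sample input are hypothetical; the loop starts at 1 because group(0) is the whole match, and it skips groups that did not participate in the match (group(i) returns null for those).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupJoin {
    // Concatenates all capturing groups of a full match; returns null if the input does not match.
    static String joinGroups(Pattern pattern, String input) {
        Matcher matcher = pattern.matcher(input);
        if (!matcher.matches()) {
            return null;
        }
        StringBuilder result = new StringBuilder();
        // Group 0 is the entire match, so capturing groups run from 1 to groupCount().
        for (int i = 1; i <= matcher.groupCount(); i++) {
            String group = matcher.group(i);
            if (group != null) { // an optional group may not have matched
                result.append(group);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("\\d(\\d{5})\\d{4}@(.+)");
        // Hypothetical sample input, not taken from the question.
        System.out.println(joinGroups(pattern, "0123456789@site.com")); // prints 12345site.com
    }
}
```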
Specifically for your input pattern:
\d{1}(\d{5})\d*@(.*)
2 capturing groups:
group 1: (\d{5})
group 2: (.*)
Input: [email protected]
Output: 12345
site.com
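If the goal is a single call that yields the combined string, one option is String.replaceFirst with group references in the replacement. This is only a sketch: the class name `SingleCallExtract`, the method name `extract`, the ':' separator, and the sample input are assumptions, and the pattern is the one above with `^` and `$` anchors added. Note that replaceFirst returns the input unchanged when nothing matches, and that `\d*` does not enforce exactly 10 leading digits.

```java
final class SingleCallExtract {
    // Rewrites the whole input as "<group 1>:<group 2>" in one call.
    static String extract(String input) {
        // $1 and $2 in the replacement refer to the two capturing groups.
        return input.replaceFirst("^\\d(\\d{5})\\d*@(.*)$", "$1:$2");
    }

    public static void main(String[] args) {
        // Hypothetical sample input, not taken from the question.
        System.out.println(extract("0123456789@site.com")); // prints 12345:site.com
    }
}
```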
3 Comments
Max
If I try pattern = "\\d(\\d{5})\\d*@(.*)" in Java, the output is the complete input string.
Max
Yes. It works. I just should not use group(0). But rather group(1) and group(2).
DeDee
@Max Okay. Initially you did not mention it's for use with Java. Glad you figured it out :)