1

I want to capture multiple substrings that match a specific pattern. For example, my string looks like this:

String textData = "#1_Label for UK#2_Label for US#4_Label for FR#";

I want to get every string between two # characters that contains a given search string, such as UK.

If the search string is UK, the output should be 1_Label for UK.

If the search string is label, the output should be 1_Label for UK, 2_Label for US and 4_Label for FR.

If the search string is 1_, the output should be 1_Label for UK.

I don't want to extract the data via an ArrayList, and the matching should be case insensitive.

Can you please help me with this problem?

Regards, Ashish Mishra

1
  • I don't understand what you mean by "if match string is UK/label/1_". From what I can see, your input string contains all three of those strings. Commented Oct 9, 2014 at 7:33

4 Answers

2

You can use this regex for search:

#([^#]*?Label[^#]*)(?=#)

Replace Label with your search keyword.

RegEx Demo

Java Pattern:

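// CASE_INSENSITIVE covers the question's requirement; group(1) holds the text between the hashes
Pattern p = Pattern.compile("#([^#]*?" + Pattern.quote(keyword) + "[^#]*)(?=#)", Pattern.CASE_INSENSITIVE);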
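As a rough illustration of how this pattern might be used end to end, here is a minimal sketch. The class name HashSegmentFinder and the method findSegments are hypothetical, and the CASE_INSENSITIVE flag is added here because the question asks for case-insensitive matching. The matches are joined into a single String with a StringJoiner, since the question asks to avoid an ArrayList.

```java
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class HashSegmentFinder {

    // Returns every '#'-delimited segment that contains the keyword, joined by ", ".
    static String findSegments(String text, String keyword) {
        Pattern p = Pattern.compile(
                "#([^#]*?" + Pattern.quote(keyword) + "[^#]*)(?=#)",
                Pattern.CASE_INSENSITIVE); // assumption: flag added for case-insensitive search
        Matcher m = p.matcher(text);
        StringJoiner joined = new StringJoiner(", ");
        while (m.find()) {
            // Group 1 is the segment without the leading '#'.
            joined.add(m.group(1));
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        String textData = "#1_Label for UK#2_Label for US#4_Label for FR#";
        System.out.println(findSegments(textData, "UK"));
        System.out.println(findSegments(textData, "label"));
        System.out.println(findSegments(textData, "1_"));
    }
}
```

The closing hash is only checked by the look-ahead and not consumed, so it is still available as the opening hash of the next segment.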

1 Comment

Thanks Anubhava for your quick reply
1

If the data is always between two hashes, try a regex like this: (?i)#.*your_match.*# where your_match would be UK, label, 1_ etc.

Then use this expression in conjunction with the Pattern and Matcher classes.

If you want to match multiple strings, you'd need to exclude the hashes from the match by using look-around assertions as well as reluctant quantifiers, e.g. (?i)(?<=#).*?label.*?(?=#).

Short breakdown:

  • (?i) will make the expression case insensitive
  • (?<=#) is a positive look-behind, i.e. the match must be preceded by a hash (but doesn't include the hash)
  • .*? matches any sequence of characters but is reluctant, i.e. it tries to match as few characters as possible
  • (?=#) is a positive look-ahead, which means the match must be followed by a hash (also not included in the match)

Without the look-around assertions the hashes would be included in the match, so with Matcher.find() you'd skip every other label in your test string: each match consumes the hash that the next segment needs as its opening delimiter. You'd get the matches #1_Label for UK# and #4_Label for FR# but not #2_Label for US#.

Without the reluctant quantifiers the expression would match everything between the first and the last hash.
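The two pitfalls just described can be reproduced with a small sketch. The class LookaroundPitfalls and the method allMatches are hypothetical names used only for this demonstration; the three regexes are variations of the one given above, applied to the test string from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class LookaroundPitfalls {
    static final String TEXT = "#1_Label for UK#2_Label for US#4_Label for FR#";

    // Returns every match of the regex in TEXT, one match per line.
    static String allMatches(String regex) {
        Matcher m = Pattern.compile(regex).matcher(TEXT);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            if (out.length() > 0) {
                out.append('\n');
            }
            out.append(m.group());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hashes are consumed by each match, so the middle segment is skipped.
        System.out.println(allMatches("(?i)#.*?label.*?#"));
        // Greedy quantifiers: a single match from after the first hash to before the last.
        System.out.println(allMatches("(?i)(?<=#).*label.*(?=#)"));
        // Look-arounds plus reluctant quantifiers: all three segments are found.
        System.out.println(allMatches("(?i)(?<=#).*?label.*?(?=#)"));
    }
}
```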

A better alternative is to replace .*? with [^#]*, which means that the match cannot contain any hash. This removes the need for reluctant quantifiers and also fixes the problem that searching for US would match 1_Label for UK#2_Label for US.

So most probably the final regex you're after looks like this: (?i)(?<=#)[^#]*your_match[^#]*(?=#).
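A minimal sketch of this final regex in use, assuming Java 9 or later for Matcher.results(). The class LookaroundSegments and the method matchSegments are hypothetical, and wrapping the keyword in Pattern.quote is an addition so that characters such as . or _ in the keyword are treated literally.

```java
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of the original answer.
class LookaroundSegments {

    // Returns all segments between hashes that contain the keyword, joined by ", ".
    static String matchSegments(String text, String keyword) {
        // (?i) makes the whole expression case insensitive, including the quoted keyword.
        String regex = "(?i)(?<=#)[^#]*" + Pattern.quote(keyword) + "[^#]*(?=#)";
        return Pattern.compile(regex)
                .matcher(text)
                .results()               // Java 9+: stream of all successive matches
                .map(MatchResult::group) // the whole match; the hashes are not part of it
                .collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        String textData = "#1_Label for UK#2_Label for US#4_Label for FR#";
        System.out.println(matchSegments(textData, "us"));
        System.out.println(matchSegments(textData, "label"));
    }
}
```

Because the look-arounds keep the hashes out of the match, no capture group is needed and group() already returns the bare segment.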

1 Comment

Thanks Thomas for your quick reply.
1
([^#]*UK[^#]*)   for UK

([^#]*Label[^#]*) for Label

([^#]*1_[^#]*)    for 1_

Try these. Grab the captures. See the demos:

http://regex101.com/r/kQ0zR5/3

http://regex101.com/r/kQ0zR5/4

http://regex101.com/r/kQ0zR5/5

1 Comment

Thanks VKS for your quick reply
0

I have solved this problem with the pattern below:

(?i)([^#]*?us[^#]*)(?=#)

Thank you so much, Anubhava, VKS and Thomas, for your replies.

Regards,
Ashish Mishra

1 Comment

This was precisely my answer also. It's better not to post it as a separate answer, and instead to vote up / accept the answers that worked for you.
