2

I want to remove every occurrence of the phrase "is happy" from a very large text, ignoring case. Here are some sentences from that text:

  1. "She is happy. I like that."

  2. "His happy son"

  3. "He is happy all the day"

  4. "Tasha is Happy"

  5. "Choose one of the following: is sad-is happy-is crying"

My initial code is :

String largeText = "....";  // The very large text here.
String removeText = "is happy";
largeText = largeText.replaceAll( "(?i)" + removeText , "" ); 

This code works fine for sentences 1, 3, 4, and 5. But I do not want to delete the phrase from sentence 2, where it has a different meaning. How can I do that?

1
  • You'll need to be more specific about when you don't want to replace: just in this exact sentence, or in all sentences of a particular form? Can you write some rules about when you should and should not match? If so, can you write those rules in code? Commented Dec 23, 2010 at 20:23

2 Answers

4

Use \b around your pattern so that it matches only at word boundaries, i.e.:

String largeText = "....";  // The very large text here.
String removeText = "is happy";
largeText = largeText.replaceAll( "(?i)\\b" + removeText + "\\b" , "" ); 
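
As a rough check of the idea above, here is a minimal sketch that runs the word-boundary pattern against the five sentences from the question. The class and method names (PhraseRemover, removePhrase) are hypothetical, and the use of Pattern.quote is an addition that is not in the answer's code; it only keeps regex metacharacters in the phrase literal.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original answer.
final class PhraseRemover {

    // Removes every case-insensitive occurrence of phrase that starts and
    // ends at a word boundary.
    static String removePhrase(String text, String phrase) {
        // Assumption: Pattern.quote is added here so that characters such as
        // '.' or '(' inside the phrase are matched literally.
        String regex = "(?i)\\b" + Pattern.quote(phrase) + "\\b";
        return text.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        String[] samples = {
            "She is happy. I like that.",
            "His happy son",
            "He is happy all the day",
            "Tasha is Happy",
            "Choose one of the following: is sad-is happy-is crying"
        };
        for (String sample : samples) {
            System.out.println(removePhrase(sample, "is happy"));
        }
    }
}
```

Sentence 2 is left untouched because there is no word boundary between "H" and "i" in "His". Note that the whitespace around a removed phrase stays in place, so "He is happy all the day" becomes "He  all the day" with two spaces.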

2 Comments

That works fine. Just one question: will this also work for Unicode letters (other languages)?
@Brad: From the documentation for java.util.regex.Pattern, it looks like [a-zA-Z_0-9] is used for "word" characters, so I assume that's also the definition used for word boundaries. You could try using negative assertions instead of \b to look for certain Unicode character classes, but note that this will not work for Chinese or any other language that does not put spaces between words, unless you first segment the input.
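
The negative-assertion idea from the comment above could look roughly like the sketch below. It is only one possible reading of that suggestion: the class name UnicodePhraseRemover is hypothetical, and the choice of \p{L}, \p{N}, and underscore as "word" characters is an assumption, not something the comment specifies. The sample strings use \u escapes (\u00e4 is a lowercase a-umlaut) so the source stays ASCII.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; the definition of a "word"
// character used here (any Unicode letter, any Unicode digit, or '_')
// is an assumption.
final class UnicodePhraseRemover {

    static String removePhrase(String text, String phrase) {
        Pattern pattern = Pattern.compile(
            // Negative lookbehind: no word character directly before the phrase.
            "(?<![\\p{L}\\p{N}_])"
                + Pattern.quote(phrase)
                // Negative lookahead: no word character directly after the phrase.
                + "(?![\\p{L}\\p{N}_])",
            // UNICODE_CASE makes the case-insensitive match apply to
            // non-ASCII letters as well.
            Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        return pattern.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        // Phrase and samples contain a-umlaut, written as \u00e4 / \u00c4.
        System.out.println(removePhrase("Hon \u00e4r glad.", "\u00e4r glad"));
        System.out.println(removePhrase("P\u00e4r glad", "\u00e4r glad"));
    }
}
```

As the comment warns, this still depends on words being separated by non-letter characters, so it does not help for unsegmented text such as Chinese.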
0

You might want to look into atomic zero-width assertions -- patterns that match against positions inside a string (such as a word boundary), rather than text itself.

This question was previously asked; see this link for more info:

java String.replaceAll regex question
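
To illustrate what "zero-width" means here, the following sketch lists the positions at which \b matches in a string. Each match is empty (its start equals its end), so the pattern consumes no characters. The class and method names are hypothetical and only serve the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
final class BoundaryPositions {

    // Returns the index of every position in text where \b matches.
    static List<Integer> boundaries(String text) {
        List<Integer> positions = new ArrayList<>();
        Matcher matcher = Pattern.compile("\\b").matcher(text);
        while (matcher.find()) {
            // A zero-width match: start() and end() are the same index.
            positions.add(matcher.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        System.out.println(boundaries("is happy"));
        System.out.println(boundaries("His happy"));
    }
}
```

For "His happy" the list does not contain position 1, the gap between "H" and "i", which is why a pattern that begins with \b cannot start matching inside "His".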
