
Hi, I want to remove certain words from a long string. The problem is that some words end with "s" and some start with a capital letter. Basically, I want to turn:

"Hello cat Cats cats Dog dogs dog fox foxs Foxs"

into:

"Hello"

At the moment I have this code, but I want to improve on it. Thanks in advance:

                    // "input" is an assumed name for the string shown above
                    String result = input
                    .replace("foxs", "")
                    .replace("Fox", "")
                    .replace("Dogs", "")
                    .replace("Cats", "")
                    .replace("dog", "")
                    .replace("cat", "");
    Use the case-insensitive flag (?i), for example: (?i)\s(?:fox|dog|cat)s? Commented Feb 28, 2018 at 15:50

4 Answers 4

7

Try this:

String input = "Hello cat Cats cats Dog dogs dog fox foxs Foxs";
input = input.replaceAll("(?i)\\s*(?:fox|dog|cat)s?", "");

Demo
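
To make the whitespace behaviour concrete, here is a minimal, self-contained sketch. It is a variation on the pattern above rather than the exact code from this answer: the \b word boundaries and the final trim() are my own assumptions about what is wanted (so that a word such as "category" is left alone), and the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

public class RemoveWordsDemo {
    // Optional leading whitespace, then fox/dog/cat with an optional plural "s",
    // matched only as a whole word (the \b anchors are an added assumption).
    private static final Pattern WORDS =
            Pattern.compile("\\s*\\b(?:fox|dog|cat)s?\\b", Pattern.CASE_INSENSITIVE);

    static String strip(String input) {
        // Remove every match, then drop any whitespace left at the ends.
        return WORDS.matcher(input).replaceAll("").trim();
    }

    public static void main(String[] args) {
        System.out.println(strip("Hello cat Cats cats Dog dogs dog fox foxs Foxs")); // Hello
        System.out.println(strip("foo cat bar"));                                    // foo bar
        System.out.println(strip("category dogma"));                                 // category dogma
    }
}
```

Because only the whitespace in front of each word is consumed, "foo cat bar" keeps a single space between the remaining words.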


3 Comments

I would remove one \\s*, otherwise foo cat bar would become foobar instead of (I am guessing preferred) foo bar.
@Pshemo Yes you're right ... the commenter S Jovan left a brilliant and flawless pattern about 30 seconds before I posted.
Yes, writing an answer containing fully executable code takes more time than writing only the solution :)
3

Maybe you can try to match everything except the word Hello. Something like:

string.replaceAll("(?!Hello)\\b\\S+", "");

You can test it in this link.

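The idea is to use a negative lookahead to skip the word Hello and match every other word. Note that the spaces between the removed words are left in place, so trim the result if you need exactly "Hello".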
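
As a rough sketch of the lookahead approach, the snippet below applies the same pattern and then tidies the whitespace it leaves behind. The whitespace clean-up step and the class and method names are my own additions for illustration, not part of the answer above.

```java
public class KeepOnlyHello {
    static String keepHello(String input) {
        // (?!Hello)\b\S+ : at a word boundary that is not followed by "Hello",
        // match the run of non-space characters and remove it.
        String stripped = input.replaceAll("(?!Hello)\\b\\S+", "");
        // The spaces between removed words remain, so collapse and trim them.
        return stripped.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        System.out.println(keepHello("Hello cat Cats cats Dog dogs dog fox foxs Foxs")); // Hello
        System.out.println(keepHello("cat Hello dog Hello fox"));                        // Hello Hello
    }
}
```

This only works when the set of words to keep is known and small; for removing a fixed list of words, an alternation such as (?:fox|dog|cat)s? is more direct.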

2 Comments

One of \\b is redundant.
Yes, you are right. I edited the answer and removed one of them.
1

You could pre-compile a pattern listing the words you want to remove and make it case-insensitive, something like:

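    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String str = "Hello cat Cats cats Dog dogs dog fox foxs Foxs";
    // Compile once; CASE_INSENSITIVE covers Cat, cat, CAT, and so on
    Pattern p = Pattern.compile("fox[s]?|dog[s]?|cat[s]?", Pattern.CASE_INSENSITIVE);
    Matcher m = p.matcher(str);
    String result = m.replaceAll("");
    System.out.println(result); // "Hello" plus the spaces left between the removed words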

[s]? handles the plural form: the ? quantifier matches the preceding element zero or one time.
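
A small sketch of what "zero or one" means in practice, using a single word. The class and method names are hypothetical, and matches() is used here only because it requires the whole input to match, which makes the quantifier's limit easy to see.

```java
import java.util.regex.Pattern;

public class OptionalPluralDemo {
    private static final Pattern DOG = Pattern.compile("dog[s]?", Pattern.CASE_INSENSITIVE);

    static boolean isDogWord(String s) {
        // matches() succeeds only if the entire string fits the pattern
        return DOG.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isDogWord("dog"));   // true  (zero "s")
        System.out.println(isDogWord("Dogs"));  // true  (one "s", case ignored)
        System.out.println(isDogWord("dogss")); // false (two "s" is one too many)
    }
}
```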

3 Comments

You don't have to put it inside brackets. That was more in case you wanted to match a set or range of characters at the end, such as [0-9] or [s|es]. I agree this was not necessarily clear. So it could be just: Pattern.compile("foxs?|dogs?|cats?", Pattern.CASE_INSENSITIVE);
It is good that you realise s? and [s]? work the same way (IMO adding [ ] makes it harder to understand, especially for someone new to regex, but that is a matter of personal preference). Aside from that, "or [s|es]" doesn't look like a proper example (or you are misunderstanding it), since [...] can match only a single character from the set defined inside it. So [s|es] can match only s, | or e (listing s a second time doesn't change anything here).
Valid points. In my second case it should have been (s|es), and I do agree that the brackets make it harder to understand. Always room for improvement.
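
For anyone following this thread, the difference between the character class and the group can be checked with a short sketch like the one below. The word "fox" and all names here are just illustrative choices.

```java
public class ClassVersusGroup {
    // [s|es] is a character class: exactly one character out of s, | or e.
    static boolean withClass(String s) {
        return s.matches("fox[s|es]");
    }

    // (s|es) is a group with alternation: either "s" or "es".
    static boolean withGroup(String s) {
        return s.matches("fox(s|es)");
    }

    public static void main(String[] args) {
        System.out.println(withClass("foxes")); // false: the class consumes only one character
        System.out.println(withGroup("foxes")); // true
        System.out.println(withClass("fox|"));  // true: | is a literal inside [...]
        System.out.println(withGroup("fox|"));  // false
    }
}
```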
0

You can generate a pattern that matches all the wanted variants of a word. For example, for dog you need the pattern [Dd]ogs?:

  • [Dd] is a character class that matches both cases
  • s? matches zero or one s
  • the rest of the word is case sensitive, so dOGS, for example, will not match.

This is how you can put it together:

public class RemoveWords { // class name is arbitrary
    public static void main(String[] args) {
        String original = "Hello cat Cats cats Dog dogs dog fox foxs Foxs";
        // it's easy to add any other word here
        String[] words = {"fox", "dog", "cat"};
        String tmp = original;
        for (String word : words) {
            String firstChar = word.substring(0, 1);
            String firstCharClass = "[" + firstChar.toUpperCase() + firstChar.toLowerCase() + "]";
            String patternSrc = firstCharClass + word.substring(1) + "s?"; // [Ww]ords?
            tmp = tmp.replaceAll(patternSrc, "");
        }
        tmp = tmp.trim(); // remove the leftover leading and trailing spaces
        System.out.println(tmp);
    }
}
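
The same idea can also be folded into a single generated pattern, so the input is scanned once instead of once per word. This is only a sketch under a few assumptions of mine: every word starts with a letter, the rest of each word is wrapped in Pattern.quote in case it contains regex metacharacters, and \b boundaries plus a leading \s* are added so that partial words are left alone and no stray spaces remain in the middle. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class GeneratedPattern {
    // For {"fox", "dog"} this builds \s*\b(?:[Ff]\Qox\E|[Dd]\Qog\E)s?\b
    static Pattern build(String... words) {
        String alternatives = Arrays.stream(words)
                .map(w -> "[" + w.substring(0, 1).toUpperCase()
                        + w.substring(0, 1).toLowerCase() + "]"
                        + Pattern.quote(w.substring(1))) // rest of the word stays case sensitive
                .collect(Collectors.joining("|"));
        return Pattern.compile("\\s*\\b(?:" + alternatives + ")s?\\b");
    }

    static String remove(String input, String... words) {
        return build(words).matcher(input).replaceAll("").trim();
    }

    public static void main(String[] args) {
        String original = "Hello cat Cats cats Dog dogs dog fox foxs Foxs";
        System.out.println(remove(original, "fox", "dog", "cat")); // Hello
        System.out.println(remove("dOGS here", "dog"));            // dOGS here
    }
}
```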
