
I'm trying to write a program that splits a given string into parts and converts it to lowercase if it contains two or more consecutive capital letters. After splitting the string, it should strip any non-alphabetical characters, convert the rest to lowercase, then put the non-alphabetical characters back in. I have the logic to convert everything to lowercase, but the split doesn't behave quite how I want. Currently, I'm trying to make it so that it:

  • Splits the string where there are two or more consecutive capital letters, then makes it all lowercase (THIS_IS_A_TEST)
  • Splits the string where there is a "!", "?", or "." (THIS.IS!A?TEST)
  • Splits the string where there is whitespace (THIS IS A TEST)

Here's everything I currently have for this: http://pastebin.com/ppBykvY4

The "[A-Z]{2}" is for two consecutive capitals, but I don't know how to include the rest. \p{Punct} would only work if I could exclude everything except "!", "?", and ".".

Also, I'm using the Bukkit API.

EXAMPLE: If a user entered all of the above examples (in the bullets), they should be:

  • this_is_a_test
  • this.is!a?test
  • this is a test
  • Before any other discussion, are you sure that split("[A-Z]{2}") does what you expect? What it actually does is use any two consecutive capital letters as a delimiter. E.g. for the string "I LIke to eat pizza in LA. And drink wine in Detroit", the splits will be: "I ", "ke to eat pizza in ", and ". And drink wine in Detroit". Commented May 24, 2014 at 17:43
  • To be more clear, edit your post to add expected results for example strings. Commented May 24, 2014 at 17:45
  • Sorry, I forgot to mention that part. The purpose of the above line of code was simply to check whether any of these conditions were met. It would then convert it all to a lowercase string, so your example would be: I like to eat pizza in la. And drink wine in detroit. Commented May 24, 2014 at 17:47
  • Ok, but be aware that the separator will not be included in any of the splits. Commented May 24, 2014 at 17:48
  • @Andrei Nicusan Look at my pastebin example. I don't think I'm explaining myself well enough. Commented May 24, 2014 at 18:07

1 Answer

If you want multiple delimiters, separate the alternatives with | (regex alternation).

String[] message = chat.split("regex|regex|regex");

So I assume it will end up something like this:

String[] message = chat.split("[A-Z]{2}|[.!?]|\\s");

Tested using: http://www.regexplanet.com/advanced/java/index.html
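
As a rough sketch of how the alternation behaves in plain Java (no Bukkit involved): the class name SplitDemo and the helper normalize are hypothetical names made up for illustration, and normalize assumes the goal described in the comments on the question, namely lowercasing the whole message when it contains two consecutive capitals. Note that split discards whatever matched the delimiter, so for detection alone a find() is probably closer to what is wanted.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.regex.Pattern;

class SplitDemo {
    // Two consecutive capitals, OR one of . ! ?, OR a single whitespace character.
    // The backslash is doubled because this is a Java string literal.
    static final Pattern DELIMS = Pattern.compile("[A-Z]{2}|[.!?]|\\s");

    // Matches any run of two consecutive capital letters.
    static final Pattern TWO_CAPS = Pattern.compile("[A-Z]{2}");

    // Hypothetical helper (an assumption about the intended behaviour):
    // lowercase the whole message if it contains two consecutive capitals,
    // otherwise return it unchanged. Punctuation and spaces are kept as-is.
    static String normalize(String chat) {
        return TWO_CAPS.matcher(chat).find() ? chat.toLowerCase(Locale.ROOT) : chat;
    }

    public static void main(String[] args) {
        // The delimiters themselves are not part of the resulting tokens.
        System.out.println(Arrays.toString(DELIMS.split("Hello world.Bye"))); // [Hello, world, Bye]

        System.out.println(normalize("THIS.IS!A?TEST")); // this.is!a?test
        System.out.println(normalize("Hello World"));    // Hello World
    }
}
```

Inside a character class such as [.!?], the dot is matched literally, so it does not need a backslash there.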

1 Comment

Actually escape the dot. It is a special character in regex.
