
The intention is to take the current line (a String that contains commas), remove its white space, and store the split String elements in an array.

Why doesn't this work?

String[] textLine = currentInputLine.replace("\\s", "").split(",");
  • I tried replaceAll("\\s", "").split(","), and it seems to work. Is this correct? Commented Jun 14, 2010 at 2:52
  • An explanation of what it's doing versus what you expect it to do would be helpful. Commented Jun 14, 2010 at 2:53
  • It looks like you're trying to parse a CSV file. Be aware that the CSV format is a lot more complex than it appears to be at first glance (due to the complexity of handling metacharacters in values). Should you use a library for doing this? Commented Jun 15, 2010 at 12:40

4 Answers

10

On regex vs non-regex methods

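The String class has the following methods:

  • Non-regex methods: replace(char oldChar, char newChar), replace(CharSequence target, CharSequence replacement)
  • Regex methods: replaceAll(String regex, String replacement), replaceFirst(String regex, String replacement), split(String regex), matches(String regex)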

So here we see the immediate cause of your problem: you're using a regex pattern in a non-regex method. Instead of replace, you want to use replaceAll.
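A minimal sketch of the difference, using a made-up input line. The class and helper names are illustrative only and are not part of the question:

```java
public class ReplaceVsReplaceAll {
    // replace() is literal: "\\s" means the two characters backslash and 's'.
    static String literalReplace(String line) {
        return line.replace("\\s", "");
    }

    // replaceAll() is regex-based: "\\s" matches one whitespace character.
    static String regexReplace(String line) {
        return line.replaceAll("\\s", "");
    }

    public static void main(String[] args) {
        String line = "a, b ,c";
        System.out.println(literalReplace(line)); // a, b ,c  (unchanged)
        System.out.println(regexReplace(line));   // a,b,c
    }
}
```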

Other common pitfalls include:

  • split(".") (when a literal period is meant); the regex "." matches any character, so use split("\\.") instead
  • matches("pattern") is a whole-string match!
    • There's no regex version of contains("pattern"); use matches(".*pattern.*") instead
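The pitfalls above can be reproduced with a short sketch. The class name and sample strings are made up for illustration, and the find()-based helper is an alternative that the answer itself does not mention:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class RegexPitfalls {
    // The regex "." matches every character, so every piece is empty;
    // split() then drops them all because they are trailing empty strings.
    static String[] splitOnAnyChar(String s) {
        return s.split(".");
    }

    // Escaping the dot makes it a literal period.
    static String[] splitOnLiteralDot(String s) {
        return s.split("\\.");
    }

    // Matcher.find() searches for the pattern anywhere in the input.
    static boolean containsPattern(String s, String regex) {
        return Pattern.compile(regex).matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnAnyChar("a.b.c")));    // []
        System.out.println(Arrays.toString(splitOnLiteralDot("a.b.c"))); // [a, b, c]
        System.out.println("hello world".matches("wor"));                // false
        System.out.println("hello world".matches(".*wor.*"));            // true
        System.out.println(containsPattern("hello world", "wor"));       // true
    }
}
```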

On Guava's Splitter

Depending on your need, String.replaceAll and split combo may do the job adequately. A more specialized tool for this purpose, however, is Splitter from Guava.

Here's an example to show the difference:

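import java.util.Arrays;

import com.google.common.base.Splitter; // requires Guava on the classpath

public class SplitterDemo { // class name is arbitrary
    public static void main(String[] args) {
        String text = "  one, two, , five (three sir!) ";

        // Strips ALL whitespace first, then splits on commas.
        dump(text.replaceAll("\\s", "").split(","));
        // prints "[one] [two] [] [five(threesir!)] "

        // Splits on commas, trims each piece, and drops empty pieces.
        dump(Splitter.on(",").trimResults().omitEmptyStrings().split(text));
        // prints "[one] [two] [five (three sir!)] "
    }

    static void dump(String... ss) {
        dump(Arrays.asList(ss));
    }

    // Prints every element in brackets on a single line.
    static void dump(Iterable<String> ss) {
        for (String s : ss) {
            System.out.printf("[%s] ", s);
        }
        System.out.println();
    }
}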

Note that String.split cannot omit empty strings at the beginning or in the middle of the returned array; it can omit trailing empty strings only. Also note that replaceAll("\\s", "") may "trim" too much, because it also removes the spaces inside a value. You can make the regex more complicated so that it trims only around the delimiter, but the Splitter solution is more readable and simpler to use.
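To make the behavior around empty strings concrete, here is a small sketch that uses only the JDK. The input string and the class and method names are invented for the example:

```java
import java.util.Arrays;

public class SplitEmptyStrings {
    // Default split: trailing empty strings are removed, others are kept.
    static String[] defaultSplit(String s) {
        return s.split(",");
    }

    // A negative limit keeps every piece, including trailing empty strings.
    static String[] keepAllSplit(String s) {
        return s.split(",", -1);
    }

    public static void main(String[] args) {
        String text = ",a,,b,,";
        System.out.println(Arrays.toString(defaultSplit(text))); // [, a, , b]
        System.out.println(Arrays.toString(keepAllSplit(text))); // [, a, , b, , ]
    }
}
```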

Guava also has (among many other wonderful things) a very convenient Joiner.

System.out.println(
    Joiner.on("... ").skipNulls().join("Oh", "My", null, "God")
);
// prints "Oh... My... God"

6

I think you want replaceAll rather than replace.

And replaceAll("\\s","") will remove all spaces, not just the redundant ones. If that's not what you want, you should try replaceAll("\\s+","\\s") or something like that.

3 Comments

"\s" is not a valid Java String. It should be "\\s" (same for "\\s+")
@Carlos - interestingly, that's what I wrote, but because I didn't put it in code marks, it showed it as \s instead of \\s.
First, you can’t use a regexp in the replacement, only in the search part. Second, this doesn’t remove all whitespace, because it misses common non-ASCII whitespace code points like U+00A0 NO-BREAK SPACE due to a Java bug not fixed till Java 7, and even then you have to embed a "(?U)" into your pattern to get \s to match Unicode whitespace. If you’re used to languages like Perl whose regexes already pick up Unicode by default, it is easy to miss that they do not do so in Java.
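A short sketch of both points raised in that last comment, assuming Java 7 or later for the (?U) flag. The class and method names are hypothetical:

```java
public class UnicodeWhitespace {
    // The replacement is not a regex: a backslash there only escapes the
    // next character, so "\\s" inserts a literal 's'.
    static String regexInReplacement(String s) {
        return s.replaceAll("\\s+", "\\s");
    }

    // Collapses each run of whitespace into a single plain space.
    static String collapse(String s) {
        return s.replaceAll("\\s+", " ");
    }

    // By default \s matches only ASCII whitespace.
    static String stripAscii(String s) {
        return s.replaceAll("\\s", "");
    }

    // (?U) turns on UNICODE_CHARACTER_CLASS, so \s also matches U+00A0.
    static String stripUnicode(String s) {
        return s.replaceAll("(?U)\\s", "");
    }

    public static void main(String[] args) {
        System.out.println(regexInReplacement("a  b"));          // asb
        System.out.println(collapse("a \t b"));                  // a b
        System.out.println(stripAscii("a\u00A0b").length());     // 3 (NO-BREAK SPACE kept)
        System.out.println(stripUnicode("a\u00A0b"));            // ab
    }
}
```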
2

What you wrote does not match the code:

Intention is to take a current line which contains commas, store trimmed values of all space and store the line into the array.

It seems, from the code, that you want all whitespace removed and the resulting string split at the commas (which the description does not mention). That can be done as Paul Tomblin suggested.

String[] currentLineArray = currentInputLine.replaceAll("\\s", "").split(",");

If you want to split at the commas and remove leading and trailing spaces (trim) from the resulting parts, use:

String[] currentLineArray = currentInputLine.trim().split("\\s*,\\s*");

(trim() is needed to remove the leading spaces of the first part and the trailing spaces of the last part.)
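As a quick check of the trim-and-split variant, here is a sketch with a made-up input line. The class and method names are illustrative:

```java
import java.util.Arrays;

public class TrimAroundCommas {
    // Trims the whole line, then splits on commas together with any
    // whitespace directly around them; spaces inside a value survive.
    static String[] splitTrimmed(String line) {
        return line.trim().split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        String line = "  one , two,three four  ";
        System.out.println(Arrays.toString(splitTrimmed(line))); // [one, two, three four]
    }
}
```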


0

If you need to perform this operation repeatedly, I'd suggest using java.util.regex.Pattern and java.util.regex.Matcher instead.

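// Compile the regex once, outside the loop.
final Pattern pattern = Pattern.compile(regex);
for (String inp : inps) {
    final Matcher matcher = pattern.matcher(inp);
    final String replaced = matcher.replaceAll(replacementString);
    // use replaced here, e.g. replaced.split(",")
}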

Compiling a regex is a relatively costly operation, so calling String's replaceAll repeatedly is not recommended: each invocation compiles the regex again before doing the replacement.
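Applied to the question's case, a self-contained sketch might look like the following. The class name, the constant names, and the parseLines helper are assumptions made for the example, not part of the answer:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class PrecompiledWhitespace {
    // Compiled once and reused for every line.
    private static final Pattern WHITESPACE = Pattern.compile("\\s");
    private static final Pattern COMMA = Pattern.compile(",");

    // Removes all whitespace from each line, then splits it on commas.
    static List<String[]> parseLines(List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            String stripped = WHITESPACE.matcher(line).replaceAll("");
            rows.add(COMMA.split(stripped));
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String[] row : parseLines(Arrays.asList("a, b ,c", " d,e "))) {
            System.out.println(Arrays.toString(row));
        }
        // prints "[a, b, c]" and then "[d, e]"
    }
}
```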
