1

I have a line of plaintext that contains a series of tags delimited by a plus sign:

event name @location +tag1 +tag2 +tag3 +tag4

The data fields always appear in the same order: Name, Location, Tags. There is always exactly one name and one location, but there can be one or more tags. I'd like to replicate the .NET String.Split method (write all delimited strings to an array) in Java, but can't seem to wrap my head around doing it.

My desired output for the tag field from the above example would be:

tag[0] = tag1
tag[1] = tag2
tag[2] = tag3
tag[3] = tag4

The closest method I can find is split, which uses a regex. But I'm not sure how to write the regex so that the array EXCLUDES any characters before the first +.

I thought of getting a count of + in a particular row and using a for loop to parse and create tagString[count-of-plusses], but would that step through multiple instances of +nnnnn on a single line?

Any suggestions on a good way to approach this?

7 Answers 7

2

Look at the Javadoc for String, specifically:

public String[] split(String regex)

This splits a string around every match of the regex you give it and returns the pieces as an array of strings.

There is also an overload that takes a limit parameter, split(String regex, int limit). Call that first with a limit of 2 to separate everything before the first + from the rest, then call split again without the limit on the remainder to get the individual tags.
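The two-step split described above could look roughly like this. It is a minimal sketch, not the answerer's own code: the class name TagSplitter and the helper extractTags are hypothetical, and it assumes each tag is introduced by a space followed by a plus sign, as in the question's example.

```java
import java.util.Arrays;

public class TagSplitter {
    // Hypothetical helper: returns the tags that follow the first " +".
    static String[] extractTags(String line) {
        // Step 1: a limit of 2 splits only at the first " +",
        // giving {everything before, everything after}.
        String[] parts = line.split(" \\+", 2);
        if (parts.length < 2) {
            return new String[0]; // no " +" found, so no tags
        }
        // Step 2: split the remainder at every " +" (no limit).
        return parts[1].split(" \\+");
    }

    public static void main(String[] args) {
        String line = "event name @location +tag1 +tag2 +tag3 +tag4";
        System.out.println(Arrays.toString(extractTags(line)));
        // prints [tag1, tag2, tag3, tag4]
    }
}
```

The plus sign has to be escaped as \\+ because split treats its argument as a regular expression, where an unescaped + is a quantifier.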


2

You can split the string and copy the returned array without the first item:

String s = "event name @location +tag1 +tag2 +tag3 +tag4";
String[] items = s.split("\\+");

//remove the `event name  @location` part
String[] tags = new String[items.length - 1];
System.arraycopy(items, 1, tags, 0, items.length - 1); 

Make sure you add relevant sanity checks (for example, that items.length is greater than 1).
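One way to fold the length check into the same approach is sketched below. The class name TagArrayCopy and the method tagsOf are hypothetical names for illustration; the sketch also trims each element, because splitting on the plus sign alone leaves the trailing space on every tag except the last.

```java
import java.util.Arrays;

public class TagArrayCopy {
    // Hypothetical helper: split on '+', drop the first item, trim the rest.
    static String[] tagsOf(String s) {
        String[] items = s.split("\\+");
        // Sanity check: with no '+' in the input, split returns a single item.
        if (items.length <= 1) {
            return new String[0];
        }
        // Copy everything after the "event name @location" part.
        String[] tags = Arrays.copyOfRange(items, 1, items.length);
        for (int i = 0; i < tags.length; i++) {
            tags[i] = tags[i].trim(); // "tag1 " becomes "tag1"
        }
        return tags;
    }

    public static void main(String[] args) {
        String s = "event name @location +tag1 +tag2 +tag3 +tag4";
        System.out.println(Arrays.toString(tagsOf(s)));
        // prints [tag1, tag2, tag3, tag4]
    }
}
```

Arrays.copyOfRange does the same job as the System.arraycopy call, without having to allocate the target array separately.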


0

Use the String methods indexOf() and substring() to get the relevant part of the string, i.e. discard everything up to and including the first +.

Then call split() on that substring with + as the delimiter (escaped as "\\+", since split() takes a regex) and trim() each element to discard whitespace.
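A minimal sketch of those two steps, assuming the question's input format. The class name SubstringTags and the method parseTags are hypothetical.

```java
import java.util.Arrays;

public class SubstringTags {
    // Hypothetical helper implementing the indexOf/substring/split/trim steps.
    static String[] parseTags(String line) {
        int firstPlus = line.indexOf('+');
        if (firstPlus < 0) {
            return new String[0]; // no tags on this line
        }
        // Keep only the text after the first '+'.
        String tagPart = line.substring(firstPlus + 1);
        String[] tags = tagPart.split("\\+");
        for (int i = 0; i < tags.length; i++) {
            tags[i] = tags[i].trim(); // strip the space left before the next '+'
        }
        return tags;
    }

    public static void main(String[] args) {
        String line = "event name @location +tag1 +tag2 +tag3 +tag4";
        System.out.println(Arrays.toString(parseTags(line)));
        // prints [tag1, tag2, tag3, tag4]
    }
}
```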


0

If you don't want to use regex then try http://docs.oracle.com/javase/1.4.2/docs/api/java/util/StringTokenizer.html.
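As a rough illustration of the StringTokenizer route, here is a sketch under the question's input format. The class name TokenizerTags and the method tagsWithTokenizer are hypothetical. It uses indexOf to skip the name and location first, because StringTokenizer silently skips leading delimiters and would otherwise return the prefix as the first token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerTags {
    // Hypothetical helper: collects tags without using any regex.
    static List<String> tagsWithTokenizer(String line) {
        List<String> tags = new ArrayList<>();
        int firstPlus = line.indexOf('+');
        if (firstPlus < 0) {
            return tags; // no tags on this line
        }
        // Tokenize only the part starting at the first '+'.
        StringTokenizer st = new StringTokenizer(line.substring(firstPlus), "+");
        while (st.hasMoreTokens()) {
            String tag = st.nextToken().trim();
            if (!tag.isEmpty()) {
                tags.add(tag);
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        String line = "event name @location +tag1 +tag2 +tag3 +tag4";
        System.out.println(tagsWithTokenizer(line));
        // prints [tag1, tag2, tag3, tag4]
    }
}
```

If an array is needed rather than a list, tags.toArray(new String[0]) converts the result.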


0

To get rid of the characters before the first +, just ignore the first item in the array.

    String input = "+tag1+tag2+tag3+tag4";
    String[] splitted = input.split("\\+");     
    System.out.println(Arrays.toString(splitted));
    // prints [, tag1, tag2, tag3, tag4]

    input = "xxx+tag1+tag2+tag3+tag4";
    splitted = input.split("\\+");      
    System.out.println(Arrays.toString(splitted));
    // prints [xxx, tag1, tag2, tag3, tag4]


0
input.split(" \\+", 2)[1]  // split once at the first " +": you have "tag1 +tag2 +tag3 +tag4" now
     .split("\\s+\\+");    // you have {tag1, tag2, tag3, tag4} now

1 Comment

It is difficult to add a check with that syntax (i.e. you will get an IndexOutOfBoundsException if there is no " +" in the original string).
0

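Try this. It assumes the event name is exactly two words and the location is one word, so the tags start at index 3: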

String string = "event name @location +tag1 +tag2 +tag3 +tag4";                        
String[] ss = string.split(" ");
String[] tag = new String[ss.length - 3];
for (int i = 3 ; i < ss.length; i++) {
    tag[i-3] = ss[i].replace("+", "");
}

