I am trying to read words from a text file and store them in an array. The problem with the code below is that it keeps the punctuation attached to each token, such as "words," and "read.", but I only want "words" and "read" in the array.

public String[] openFile() throws IOException
{
    int noOfWords=0;
    Scanner sc2 = new Scanner(new File(path));
    while(sc2.hasNext()) 
    {
         noOfWords++;
         sc2.next();
    }

    Scanner sc3 = new Scanner(new File(path));
    String bagOfWords[] = new String[noOfWords];
    for(int i = 0;i<noOfWords;i++)
    {
         bagOfWords[i] =sc3.next();
    }

    sc3.close();
    sc2.close();
    return bagOfWords;
}

3 Answers

Use a regex replacement to strip everything that is not a letter:

replaceAll("([^a-zA-Z]+)","");

Apply it to the line that fills the array:

bagOfWords[i] = sc3.next().replaceAll("([^a-zA-Z]+)","");
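
For context, here is a minimal sketch of how that replacement could fit into the whole method. The names `WordReader`, `readWords`, and `stripNonLetters` are made up for illustration and are not from the question. It assumes words are ASCII letters only, collects them in a list so the file is scanned once instead of twice, and skips tokens that become empty after stripping (for example a lone "-"), which is a choice the question does not specify.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class WordReader {
    // Removes every character that is not an ASCII letter from a token,
    // so "words," becomes "words" and "read." becomes "read".
    public static String stripNonLetters(String token) {
        return token.replaceAll("[^a-zA-Z]", "");
    }

    // Reads whitespace-separated tokens from the file, cleans each one,
    // and returns the non-empty results as an array.
    public static String[] readWords(File file) throws IOException {
        List<String> words = new ArrayList<>();
        // try-with-resources closes the Scanner even if reading fails.
        try (Scanner sc = new Scanner(file)) {
            while (sc.hasNext()) {
                String word = stripNonLetters(sc.next());
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words.toArray(new String[0]);
    }
}
```

Calling `WordReader.readWords(new File(path))` would then stand in for the two-pass loop in the question.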
2 Comments

The parentheses and the + are not necessary; you just need [^a-zA-Z]. It would probably benefit the OP if you explained the regex pattern and how replaceAll uses it.
Yes, I know. I think the + makes it replace a run of characters in one match rather than one character at a time, so it does not waste memory on a separate regex match for every character.
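
To illustrate the point in these comments: when the replacement is the empty string, the pattern with the group and `+` and the bare character class produce the same output. The only difference is whether a run of adjacent non-letters is removed as one match or as several. A small sketch (the class and method names are hypothetical):

```java
class RegexCompare {
    // Pattern from the answer: a capturing group around one or more non-letters.
    public static String withGroupAndPlus(String token) {
        return token.replaceAll("([^a-zA-Z]+)", "");
    }

    // Simplified pattern from the comment: a single non-letter per match.
    public static String withoutPlus(String token) {
        return token.replaceAll("[^a-zA-Z]", "");
    }

    public static void main(String[] args) {
        String token = "\"words,\"";
        // Both calls strip the quotes and the comma.
        System.out.println(withGroupAndPlus(token));
        System.out.println(withoutPlus(token));
    }
}
```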

Use this code, which keeps letters and digits and removes everything else:

for (int i = 0; i < noOfWords; i++) {
     bagOfWords[i] = sc3.next().replaceAll("[^A-Za-z0-9 ]", "");
}

You probably want only letters. In that case, you can use the Character.isLetter(char) method.

Snippet:

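String token = "word1";
// Collect only the letters; StringBuilder avoids creating a new String on every append.
StringBuilder newToken = new StringBuilder();
for (int i = 0; i < token.length(); i++) {
    char c = token.charAt(i);
    if (Character.isLetter(c)) {
        newToken.append(c);
    }
}
System.out.println(newToken); // prints "word"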
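
One thing worth knowing about this approach: `Character.isLetter` accepts Unicode letters, not just A to Z, so it behaves differently from the `[^a-zA-Z]` regex on accented words. The sketch below wraps the loop in a helper and compares it with two regex variants. The class and method names are hypothetical, and the `\p{L}` variant is offered as a rough regex equivalent of the loop, not as something from the answers above.

```java
class LetterFilter {
    // Loop version: keeps every char that Character.isLetter accepts,
    // which includes non-ASCII letters such as 'é'.
    public static String lettersOnly(String token) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (Character.isLetter(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Regex version using the Unicode "letter" category \p{L}.
    public static String lettersOnlyRegex(String token) {
        return token.replaceAll("[^\\p{L}]", "");
    }

    // ASCII-only regex from the other answers, for comparison.
    public static String asciiLettersOnly(String token) {
        return token.replaceAll("[^a-zA-Z]", "");
    }
}
```

For a token like "café!", the first two helpers keep the "é" while the ASCII-only pattern drops it, so which one to use depends on the text you expect in the file.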
