
I'm having trouble writing a method that reads a file and returns its words as an array. I have the basic outline of the method but need some pointers on filling it in.

  public static String [] readFileAndReturnWords(String filename){
     //create array
     //read one word at a time from file and store in array
     //return the array
  }

This is what I have so far:

public static String readFileAndReturnWords(String filename){   
      String[] temp = new String[];

      //connects file
      File file = new File(filename);
      Scanner inputFile = null;

     try{

          inputFile = new Scanner(file);

         }
          //When arg is mistyped
      catch(FileNotFoundException Exception1) {
          System.out.println("File not found!");
          System.exit(0);      
     }


     //Loops through a file
    if (inputFile != null) {

    try { //I draw a blank here

I understand that some calls to .next and .hasNext are in order; I'm just not sure how to use these methods in the context of this problem.

  • That's where the documentation becomes useful: you don't know how to use them, so you read their documentation, and then you know better: docs.oracle.com/javase/7/docs/api/java/util/Scanner.html Commented Feb 21, 2015 at 8:23
  • @JBNizet True, however in the context of this problem, that's where I am having difficulty understanding the syntax and other such things about these particular methods. Reading the Oracle documentation gives me some context, but does not necessarily help me understand the syntax or how to apply it to a given problem. Perhaps not being able to apply what I read in the documentation can be chalked up to my inexperience in programming. Commented Feb 21, 2015 at 8:35
  • The cool thing with programming is that you can try things and make mistakes. hasNext() returns true while there are still tokens. next() consumes the next token and returns it. You want to read every token, so you need a loop. The loop should stop when there is no token anymore. If there is no token, hasNext() returns false. This should be enough to at least try something. Commented Feb 21, 2015 at 8:39

2 Answers

3

Splitting into individual words is actually a little trickier than it might first seem - what do you split on?

If you split on spaces, then full stops, commas and other punctuation will end up attached to a word, so

quick, the lazy dog.

would be split into:

  1. quick,
  2. the
  3. lazy
  4. dog.

That may or may not be what you want. If you split on non-word characters instead, then you end up splitting on apostrophes, hyphens, etc., so:

  • can't, won't ->
    1. can
    2. t
    3. won
    4. t
  • no-one suspects hyper-space ->
    1. no
    2. one
    3. suspects
    4. hyper
    5. space

So each of these approaches has its issues. I would suggest using the regex word-boundary matcher (\b). It's a little more sophisticated, but it has issues of its own, so try different approaches and see which one produces the output you need.
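To see the two simpler strategies side by side, here is a minimal sketch using String.split. The class and method names (SplitDemo, splitOnWhitespace, splitOnNonWord) are made up for illustration, and the regexes \s+ and \W+ are just one reasonable reading of "split on spaces" and "split on non-word characters".

```java
import java.util.Arrays;

class SplitDemo {
    // Split on runs of whitespace: punctuation stays attached to the words.
    static String[] splitOnWhitespace(String text) {
        return text.trim().split("\\s+");
    }

    // Split on runs of non-word characters: apostrophes and hyphens break words apart.
    static String[] splitOnNonWord(String text) {
        return text.split("\\W+");
    }

    public static void main(String[] args) {
        // prints [quick,, the, lazy, dog.]
        System.out.println(Arrays.toString(splitOnWhitespace("quick, the lazy dog.")));
        // prints [can, t, won, t]
        System.out.println(Arrays.toString(splitOnNonWord("can't, won't")));
        // prints [no, one, suspects, hyper, space]
        System.out.println(Arrays.toString(splitOnNonWord("no-one suspects hyper-space")));
    }
}
```

Running it against a sample of your real input is probably the quickest way to decide which behaviour you can live with.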

The solution I propose uses Java 8:

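import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Pattern;
import java.util.stream.Stream;

public static String[] readFileAndReturnWords(String filename) throws IOException {
    final Path path = Paths.get(filename);
    final Pattern pattern = Pattern.compile("\\b");

    try (final Stream<String> lines = Files.lines(path)) {
        return lines.flatMap(pattern::splitAsStream).toArray(String[]::new);
    }
}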

So first you convert your String to a Path, a Java NIO representation of a file location. You then create your Pattern, which decides how to break up words.

Now you simply use Files.lines to stream all the lines in the file and then Pattern.splitAsStream to turn each line into words. We use flatMap because we need to "flatten" the stream: splitAsStream turns each line into a Stream<String>, so a plain map over the Stream<String> of lines would give us a Stream<Stream<String>>. flatMap applies the function to each element and concatenates the resulting streams into a single Stream<String>.
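One thing to watch for when splitting on \b: the pieces between the words (spaces, commas, full stops) come back as tokens too. Below is a minimal sketch of one way to drop them, assuming you only want tokens made up entirely of word characters. The filter step and the names WordBoundaryDemo and wordsOf are illustrative additions, not part of the method above, and the sketch takes a Stream<String> of lines directly so it can be tried without a file.

```java
import java.util.regex.Pattern;
import java.util.stream.Stream;

class WordBoundaryDemo {
    private static final Pattern BOUNDARY = Pattern.compile("\\b");
    // Assumption: a "word" is a run of word characters (letters, digits, underscore).
    private static final Pattern WORD = Pattern.compile("\\w+");

    static String[] wordsOf(Stream<String> lines) {
        return lines
                // each line becomes a stream of tokens; flatMap merges them into one stream
                .flatMap(BOUNDARY::splitAsStream)
                // keep only tokens that are entirely word characters
                .filter(token -> WORD.matcher(token).matches())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] words = wordsOf(Stream.of("quick, the lazy dog.", "can't"));
        // prints quick|the|lazy|dog|can|t
        System.out.println(String.join("|", words));
    }
}
```

Note that "can't" still comes out as "can" and "t", so the word boundary approach shares the apostrophe problem described earlier.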


2

Store it in an ArrayList, since you don't know how many words are stored in your file.

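import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Scanner;

public class Test
{
  static ArrayList<String> words;
  public static void main(String[] args) throws FileNotFoundException
  {
    words = new ArrayList<String>();
    //try-with-resources closes the scanner (and the file) when the loop is done
    try (Scanner s = new Scanner(new File("Blah.txt")))
    {
      while(s.hasNext())
      {
        //next() returns the next whitespace-delimited token
        String token = s.next();
        if(isAWord(token))
        {
          //the scanner gets tokens like
          // here we are, < "are," would be a token
          //so strip full stops and commas (replace does nothing if they are absent)
          token = token.replace(".", "").replace(",", "");
          //and remove other characters like braces and parentheses
          words.add(token);
        }
      }
    }
  }

  private static boolean isAWord(String token)
  {
    //check if the token is a word
    //placeholder so the class compiles (an assumption): a word contains at least one letter
    return token.matches(".*\\p{L}.*");
  }
}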

It should work.

If you really want to use an array, you can convert your ArrayList into a plain array with

String[] wordArray = words.toArray();
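Putting the pieces together in the shape the question asks for might look like the sketch below. It assumes whitespace-delimited tokens are acceptable as "words" (punctuation is left attached), lets FileNotFoundException propagate to the caller instead of calling System.exit, and uses the class name WordReader purely for illustration. Note that toArray is given a typed array argument so that it returns a String[] rather than an Object[].

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class WordReader {
    static String[] readFileAndReturnWords(String filename) throws FileNotFoundException {
        // A list grows as needed, so the number of words need not be known up front.
        List<String> words = new ArrayList<>();

        // try-with-resources closes the scanner and the underlying file.
        try (Scanner inputFile = new Scanner(new File(filename))) {
            // hasNext() is true while tokens remain; next() consumes and returns one.
            while (inputFile.hasNext()) {
                words.add(inputFile.next());
            }
        }

        // Passing a String[] tells toArray which array type to return.
        return words.toArray(new String[0]);
    }

    public static void main(String[] args) throws FileNotFoundException {
        // Expects the file name as the first command-line argument.
        for (String word : readFileAndReturnWords(args[0])) {
            System.out.println(word);
        }
    }
}
```

Any punctuation stripping or isAWord-style filtering can then be applied inside the loop before the token is added.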

1 Comment

toArray() with no argument returns Object[], so that assignment won't compile. You need to pass a typed array: String[] wordArray = words.toArray(new String[0]);
