0

I've created a program that scrapes data from a website by reading its <tr> tags. When I run it and print my List<String> getDataList, I get each row of data as a single string in the printout.

[Flights on time 93%, Within 1 hour 99%, FLIGHT FROM TO DEPART ARRIVE STATUS, FR 9083 Bournemouth Alicante 10:10 13:35 Landed 19:00, FR 1902 Krakow Dublin 10:45 12:55 Estimated Arrival 23:05,

So basically, I want to put the data into a usable, readable format. As you can see, each comma separates one flight from the next, but I need to separate the fields within each flight and add text in between, like the following. (I've used code formatting for the text I want to add manually.)

FR 9083 From: Bournemouth To: Alicante Dep:10:10 Arr:13:35 Status Landed 19:00 and so on.......

How do I go about separating the data so that I can add text elements in the middle?

I've thought of splitting on the spaces, but some destinations, like 'London Stansted', have spaces in their names as well as between the fields.

Can someone tell me how I would go about doing this, please? I'm learning as I go here.

Many Thanks

3
  • Please post only your input and the output you want, not all of this. Commented Feb 1, 2014 at 20:27
  • If you want property names in addition to values, a Map<String, String> is a better fit than a List<String>. Commented Feb 1, 2014 at 20:31
  • While there are some nice answers on String splitting and maps, all of them fail to address the point that it's impossible to parse the String, if the flight is from New York to San Diego, unless you can compare the words found versus every known airfield/town and know whether the separate words are tied to origin or destination. There simply isn't enough information to parse the String properly otherwise. If the website actually has a table with cells, instead of just rows for the information, you can solve this by scraping the data more reasonably. Commented Feb 1, 2014 at 20:59

3 Answers

3

Take a look at Google Guava's Splitter. It's a useful library in cases like this.

http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/base/Splitter.html


1

Well, since you're using the List interface, it has probably been assigned an ArrayList.

So first of all you have to convert it to a String.

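String unSort = getDataList.toString();
String[] sort = unSort.split(",");
// Calling toString() on an array prints its identity, not its contents,
// so join the pieces back together instead (String.join needs Java 8+).
String noComma = String.join("", sort);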

This will solve your comma separation problem; you can split the resulting text further using other regexes such as ":".

String[] noColon = noComma.split(":");
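Splitting on ":" also cuts the times (10:10, 13:35) in half. As an alternative, here is a minimal sketch that uses a regular expression anchored on the two HH:MM times, assuming every flight row has the shape shown in the question (flight code, route, departure, arrival, status). The class name FlightRowSplitter and its split method are hypothetical, and the route still comes back as one piece, because the row alone does not say where the origin ends and the destination begins.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FlightRowSplitter {
    // Flight code (letters, space, digits), route text, two HH:MM times, then the status.
    private static final Pattern ROW = Pattern.compile(
            "^(\\p{Alpha}+ \\d+) (.+) (\\d{1,2}:\\d{2}) (\\d{1,2}:\\d{2}) (.+)$");

    /** Returns {flight, route, depart, arrive, status}, or null if the row does not match. */
    public static String[] split(String row) {
        Matcher m = ROW.matcher(row.trim());
        if (!m.matches()) {
            return null; // header and summary rows end up here
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4), m.group(5) };
    }

    public static void main(String[] args) {
        String[] parts = split(" FR 9083 Bournemouth Alicante 10:10 13:35 Landed 19:00");
        System.out.println(String.join(" | ", parts));
    }
}
```

Rows such as the "FLIGHT FROM TO DEPART ARRIVE STATUS" header do not match the pattern and return null, so they can be skipped while looping over the list.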

0

This is the interesting part of the String.

FLIGHT FROM TO DEPART ARRIVE STATUS, FR 9083 Bournemouth Alicante 10:10 13:35 Landed 19:00,

And to me it seems like a job for a HashMap wrapped up in an Object.

Example

import java.util.LinkedHashMap;
import java.util.Map;

public class Flight
{
    // LinkedHashMap keeps the headers in their original order.
    private final Map<String, String> fields;

    public Flight(String input)
    {
        // Input is currently equal to the value above.
        fields = new LinkedHashMap<String, String>();
        parseInput(input);
    }

    private void parseInput(String input)
    {
        // First split up each section.
        String[] tokens = input.split(",");

        // Next, split the headers up (trim first to drop the leading space).
        String[] headers = tokens[0].trim().split(" ");
        // And the values.
        String[] values = tokens[1].trim().split(" ");

        // Pairing is positional, so a value that contains a space
        // (such as "FR 9083") shifts every value after it.
        for(int x = 0; x < headers.length && x < values.length; x++)
        {
            // Map each header to its respective value.
            fields.put(headers[x], values[x]);
        }
    }

    @Override
    public String toString()
    {
        // Output the object as a string.
        StringBuilder builder = new StringBuilder();

        for(String key : fields.keySet())
        {
            builder.append(key).append(":").append(fields.get(key)).append(" ");
        }

        return builder.toString();
    }
}

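With this code, you pass in the String, starting with the field headers, and call the toString method on the resulting Flight object; it outputs each header next to its value. The pairing is by position, so values that contain spaces (the flight number FR 9083, or a city such as London Stansted) need extra handling before the output matches the desired format.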
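Multi-word city names cannot be separated by position alone. One possible way around it, sketched here under the assumption that the fields have already been separated into flight, route, times and status, is to compare the route against a list of known airport names. The class RouteFormatter, its methods, and the KNOWN_AIRPORTS list are all hypothetical; a real list would have to come from the website or a lookup table.

```java
import java.util.Arrays;
import java.util.List;

public class RouteFormatter {
    // Hypothetical sample of airport names, for illustration only.
    private static final List<String> KNOWN_AIRPORTS = Arrays.asList(
            "Bournemouth", "Alicante", "Krakow", "Dublin", "London Stansted");

    /** Splits "London Stansted Dublin" into {"London Stansted", "Dublin"}, or returns null. */
    public static String[] splitRoute(String route) {
        for (String from : KNOWN_AIRPORTS) {
            if (route.startsWith(from + " ")) {
                String to = route.substring(from.length() + 1);
                if (KNOWN_AIRPORTS.contains(to)) {
                    return new String[] { from, to };
                }
            }
        }
        return null; // unknown airport: the route cannot be split reliably
    }

    /** Builds the labelled line from fields that have already been separated. */
    public static String format(String flight, String route, String dep, String arr, String status) {
        String[] ends = splitRoute(route);
        String middle = (ends == null)
                ? " Route: " + route
                : " From: " + ends[0] + " To: " + ends[1];
        return flight + middle + " Dep:" + dep + " Arr:" + arr + " Status " + status;
    }

    public static void main(String[] args) {
        System.out.println(format("FR 9083", "Bournemouth Alicante", "10:10", "13:35", "Landed 19:00"));
    }
}
```

When a route contains a name that is not in the list, the sketch falls back to printing the whole route under a single label rather than guessing where to split it.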
