
I am trying to connect to a website so that I can extract its HTML contents. My application never connects to the site; it only times out.

Here is my code:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

URL url = new URL("www.website.com");
URLConnection connection = url.openConnection();
// Both timeouts are in milliseconds
connection.setConnectTimeout(2000);
connection.setReadTimeout(2000);
BufferedReader reader = new BufferedReader(new InputStreamReader(connection.getInputStream()));
String line;

while ((line = reader.readLine()) != null) {
  // do stuff with line
}

reader.close();

Any ideas would be greatly appreciated. Thanks!

  • Are you able to access that website using your browser? Is any proxy set? Commented Dec 27, 2010 at 16:42
  • Yeah, I'm able to access the website just fine via the browser. There shouldn't be a proxy set. How can I tell? Commented Dec 27, 2010 at 17:04
  • Are you really setting the timeout to 2 seconds? How complicated is the page you are loading? Change the timeout to something much higher, like 10 minutes, and see if you are able to load any data. Commented Dec 27, 2010 at 17:14
  • Are you getting a connection timeout or a read timeout? Are you seeing any exceptions? Have you tried using telnet against the host you are connecting to, to check whether you can connect at all? Commented Dec 27, 2010 at 17:26
  • Andrew, I've tested it without any limit on the timeout and let the web page try to load until Tomcat throws a ConnectException reporting that the connect timed out. The page is not very complicated: a static page with an HTML table. Commented Dec 27, 2010 at 17:50

1 Answer

I believe the URL needs a protocol (i.e. a scheme such as http://), so it should be:

URL url = new URL("http://www.website.com"); 

If that doesn't help, then post a real SSCCE that demonstrates the problem so we don't have to guess what you are really doing. As it stands, we can't tell whether you are using your try/catch block correctly or just ignoring exceptions.
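
To illustrate the point about the protocol, here is a minimal sketch (the class name `UrlSchemeCheck` and the helper `parses` are made up for this example). It only constructs the `URL` objects and never opens a connection. Note that a missing scheme fails right away with a `MalformedURLException`, which is a different symptom from a timeout.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlSchemeCheck {
    // Hypothetical helper: true if the string can be parsed as an absolute URL.
    // No network access happens here; the constructor only parses the string.
    static boolean parses(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException e) {
            // For "www.website.com" the message is along the lines of "no protocol"
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parses("www.website.com"));        // prints "false"
        System.out.println(parses("http://www.website.com")); // prints "true"
    }
}
```

So if the real code already has the `http://` prefix and still times out, the cause is most likely somewhere else (network path, proxy, firewall), not the URL string.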


5 Comments

Right. The URL syntax is: <scheme>://<authority><path>?<query>#<fragment>
In my actual code, it had the http protocol. What you see in my code is pretty much all there is, except for the method signature, which is public String testURLConnection() throws IOException. The while loop just prints each line. My main objective is to parse the webpage's HTML contents, but I figured it would be best to start small and make sure the connection can be made first.
I agree, you should test the connection first before parsing the contents. So your test program should be about 15 lines of code. Post your SSCCE so we don't have to guess what you are doing. That way we can also copy/paste and test the code ourselves. Also, why are you setting the timeout value for a simple test? Have you tried other URLs, like stackoverflow for example?
camickr, you won't be able to test the code because the URL leads to a webpage that is on an intranet. This is why I put the bogus website.com link as the URL. I set the timeout value because without it, the connection would stall for a few minutes. This way, if the connection isn't made immediately like it should be, the attempt halts and saves me some time.
Setting connect and read timeouts is the preferred option. Try to make them configurable on a per-URL basis, though, and do some testing to find the optimal values.
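
The comments ask whether the failure is a connect timeout or a read timeout, and suggest making both values configurable per URL. One way to tell the two apart is to call `connect()` explicitly before reading, as in the rough sketch below. The class `TimeoutProbe`, the `Outcome` enum and the `probe` method are hypothetical names for this example, and the clean split between the two phases is an assumption that holds best for plain `http://` URLs (with `https://`, the TLS handshake also happens during `connect()`).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.SocketTimeoutException;
import java.net.URL;
import java.net.URLConnection;

class TimeoutProbe {
    enum Outcome { OK, CONNECT_TIMEOUT, READ_TIMEOUT, OTHER_IO_ERROR }

    // Hypothetical helper: timeouts are passed in (milliseconds) so they can
    // be tuned per URL instead of being hard-coded.
    static Outcome probe(String spec, int connectTimeoutMs, int readTimeoutMs) {
        URLConnection connection;
        try {
            connection = new URL(spec).openConnection();
            connection.setConnectTimeout(connectTimeoutMs);
            connection.setReadTimeout(readTimeoutMs);
            connection.connect(); // phase 1: establish the connection only
        } catch (SocketTimeoutException e) {
            return Outcome.CONNECT_TIMEOUT;
        } catch (IOException e) {
            // Includes MalformedURLException, and ConnectException
            // (e.g. connection refused, or the OS giving up first)
            return Outcome.OTHER_IO_ERROR;
        }

        // phase 2: read the response body line by line
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream()))) {
            while (reader.readLine() != null) {
                // do stuff with line
            }
            return Outcome.OK;
        } catch (SocketTimeoutException e) {
            return Outcome.READ_TIMEOUT;
        } catch (IOException e) {
            return Outcome.OTHER_IO_ERROR;
        }
    }

    public static void main(String[] args) {
        // Placeholder URL taken from the question; replace with the real host.
        System.out.println(probe("http://www.website.com", 2000, 2000));
    }
}
```

A `CONNECT_TIMEOUT` result points at reachability (routing, firewall, proxy), while a `READ_TIMEOUT` means the server was reached but did not answer within the allowed time.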
