I'm looking for a pure-Java HTML client library. I need to retrieve HTML forms, fill in the fields, and submit them programmatically.

The library should act as a browser when connecting to a website: handle cookies, parse the document's forms, and take care of form submission on its own.

In the past I used Apache HttpClient, but it wasn't simple enough, as I was responsible for parsing the document and handling the cookies myself.
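
To illustrate the kind of manual work involved, here is a minimal sketch of just one piece of it: encoding form fields into an application/x-www-form-urlencoded request body using only the JDK (Java 10 or later for this URLEncoder overload). The class name FormBodySketch and the field names q and hl are hypothetical, chosen for illustration; this is not part of Apache HttpClient or any other library.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration only; not part of any library.
final class FormBodySketch {

    // Encodes name/value pairs as an application/x-www-form-urlencoded body,
    // preserving the iteration order of the map.
    static String encode(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.add(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8)
                    + "="
                    + URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Assumed field names; a real form's names must be read from its HTML.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("q", "html unit");
        fields.put("hl", "en");
        System.out.println(encode(fields)); // prints q=html+unit&hl=en
    }
}
```

Even with this in place, the caller would still have to find the form's action URL and field names in the HTML and carry cookies between requests, which is the part a browser-like library is expected to take over.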

2 Answers

You may be looking for HtmlUnit -- a "GUI-Less browser for Java programs".

Here's sample code that opens google.com, searches for "htmlunit" using the form, and prints the number of results.

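import com.gargoylesoftware.htmlunit.*;
import com.gargoylesoftware.htmlunit.html.*;

public class HtmlUnitFormExample {
    public static void main(String[] args) throws Exception {
        // WebClient is the headless browser; it keeps track of cookies across requests.
        WebClient webClient = new WebClient();
        try {
            HtmlPage page = webClient.getPage("http://www.google.com");

            // Fill in the search field, located by its name attribute.
            HtmlInput searchBox = page.getElementByName("q");
            searchBox.setValueAttribute("htmlunit");

            // Clicking the submit button submits the form and returns the resulting page.
            HtmlSubmitInput googleSearchSubmitButton =
                    page.getElementByName("btnG"); // sometimes it's "btnK"
            page = googleSearchSubmitButton.click();

            // getFirstByXPath returns null when nothing matches.
            HtmlDivision resultStatsDiv =
                    page.getFirstByXPath("//div[@id='resultStats']");

            if (resultStatsDiv != null) {
                System.out.println(resultStatsDiv.asText()); // e.g. About 301,000 results
            }
        } finally {
            // Newer HtmlUnit releases replace this call with webClient.close().
            webClient.closeAllWindows();
        }
    }
}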
2 Comments

Thanks, I think it's exactly what I need. The downside is that it has a lot of dependencies, but it looks like a really good choice.
Yeah. Maven can help you with the dependencies, if you aren't already using it. Anyway, if you need more help with HtmlUnit, just come back and we'll gladly help.
Try Lobo, a pure Java web browser. It has an API to embed it in a program.

If you only want HTML (and CSS, etc.) rendering, you can use its rendering engine directly.

1 Comment

Lobo seems to be aimed at rendering the page; I liked the HtmlUnit approach better. Thanks for the contribution anyway.
