I'm a professional lawyer and an amateur programmer (mainly C++). Every day I have to access the same website and search for large numbers of court cases (plain text). Recently, I came up with the idea of writing a simple C++ function to submit the form data to the site automatically, which would save me a lot of work time. I've been learning WinInet over the past few days and, after many hours, I managed to send a simple GET request and retrieve the main page's source code, but I can't get any further than that. My objective is to send a POST request (I think) with the search parameters to the site and obtain, as the response, the source code of the actual search results page.

What I managed to program so far is almost worthless, but at least it gives a glimpse of my current state.

#include <windows.h>
#include <wininet.h>
#include <string>

#pragma comment (lib, "Wininet.lib")

bool GetRequest() {

    HINTERNET hSession = InternetOpen(L"Mozilla/5.0", INTERNET_OPEN_TYPE_PRECONFIG, NULL, NULL, 0);
    if (!hSession) return false;

    HINTERNET hConnect = InternetConnect(hSession, L"scon.stj.jus.br", INTERNET_INVALID_PORT_NUMBER, L"", L"", INTERNET_SERVICE_HTTP, 0, 0);
    if (!hConnect) return false;

    HINTERNET hRequest = HttpOpenRequest(hConnect, L"GET", L"/", NULL, NULL, NULL, INTERNET_FLAG_KEEP_CONNECTION, 0);
    if (!hRequest) return false;

    if (!HttpSendRequest(hRequest, 0, 0, 0, 0)) return false;

    //from here on, I am able to read the main page source code using InternetReadFile, but I can't submit any form data

    InternetCloseHandle(hRequest);
    InternetCloseHandle(hConnect);
    InternetCloseHandle(hSession);

    return true;

}

EDIT:

I'm still having trouble trying to replicate a search request. Whenever I run the following code, I keep receiving the same webpage as a response. I think the values I'm passing with the GET request are wrong. I just copied the URL link generated by the browser when I perform a generic search, but the webform seems to have a different input list (I don't know if that's possible). Anyway, here's my current code:

#include <windows.h>
#include <wininet.h>
#include <string>

#pragma comment (lib, "Wininet.lib")

#define BUFFERSIZE 2048

bool GetRequest() {

    std::wstring inputValues = L"/SCON/pesquisar.jsp?preConsultaPP=&pesquisaAmigavel=+fraude&acao=pesquisar&novaConsulta=true&i=1&b=ACOR&livre=fraude&filtroPorOrgao=&filtroPorMinistro=&filtroPorNota=&data=&operador=e&thesaurus=JURIDICO&p=true&tp=T&processo=&classe=&uf=&relator=&dtpb=&dtpb1=&dtpb2=&dtde=&dtde1=&dtde2=&orgao=&ementa=&nota=&ref=";

    HINTERNET hSession = InternetOpen(L"Mozilla/5.0", INTERNET_OPEN_TYPE_PRECONFIG, NULL, NULL, 0);
    if (!hSession) return false;

    HINTERNET hConnect = InternetConnect(hSession, L"processo.stj.jus.br", INTERNET_INVALID_PORT_NUMBER, L"", L"", INTERNET_SERVICE_HTTP, 0, 0);
    if (!hConnect) return false;

    HINTERNET hGetRequest = HttpOpenRequest(hConnect, L"GET", inputValues.c_str(), NULL, NULL, NULL, INTERNET_FLAG_KEEP_CONNECTION, 0);
    if (!hGetRequest) return false;
    if (!HttpSendRequest(hGetRequest, NULL, 0, 0, 0)) return false;
          
    char buffer[BUFFERSIZE + 1] = "";
    DWORD dwBytesRead;
    BOOL bRead;

    FILE* file=NULL;
    fopen_s(&file, "output.txt", "w");
    if (!file) return false;   

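    while (true) {

        bRead = InternetReadFile(hGetRequest, buffer, BUFFERSIZE, &dwBytesRead);
        // Stop on a read error or at the end of the response.
        if (!bRead || dwBytesRead == 0) break;
        // Write exactly the bytes that were read; strlen() would stop early at an embedded 0 byte.
        fwrite(buffer, sizeof(char), dwBytesRead, file);
    }

    fclose(file);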

    InternetCloseHandle(hGetRequest);
    InternetCloseHandle(hConnect);
    InternetCloseHandle(hSession);

    return true;
}

1 Answer

If you look at the HTML for the page in question, you will find it has multiple <form> elements, for example:

<form id="frmPesquisaJurHeader" name="frmPesquisaJurHeader"
                                    action="https://scon.stj.jus.br/SCON/pesquisar.jsp" method="post" target="_blank"
                                    style="display: none">
                                    <input name="b" value="ACOR" type="hidden">
                                    <input name="O" value="JT" type="hidden">
                                    <input name="livre" id="headerLivre" type="hidden" value="">
                                </form>

You can see that this webform requires a POST request to https://scon.stj.jus.br/SCON/pesquisar.jsp with 3 <input> values. Each value is sent under the input's name attribute (b, O, and livre), not its id.

As there is no enctype attribute specified on the <form> element, the media type for the POST body is expected to be application/x-www-form-urlencoded, which takes name=value pairs delimited by & (I'm not going to cover the multipart/form-data format in this answer, as it doesn't apply to the site in question, but you should research that as well).

So, such a webform would be submitted like this:

#include <windows.h>
#include <wininet.h>
#include <string>

#pragma comment (lib, "Wininet.lib")

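bool GetRequest() {

    // Form fields are sent by their name attribute ("livre"), not their id ("headerLivre").
    std::string inputValues = "b=ACOR&O=JT&livre=";

    HINTERNET hSession = InternetOpen(L"Mozilla/5.0", INTERNET_OPEN_TYPE_PRECONFIG, NULL, NULL, 0);
    if (!hSession) return false;

    // The form's action URL is https://, so connect on the HTTPS port.
    HINTERNET hConnect = InternetConnect(hSession, L"scon.stj.jus.br", INTERNET_DEFAULT_HTTPS_PORT, L"", L"", INTERNET_SERVICE_HTTP, 0, 0);
    if (!hConnect) {
        InternetCloseHandle(hSession);
        return false;
    }

    // INTERNET_FLAG_SECURE is required for the request to actually use SSL/TLS.
    HINTERNET hRequest = HttpOpenRequest(hConnect, L"POST", L"/SCON/pesquisar.jsp", NULL, NULL, NULL, INTERNET_FLAG_KEEP_CONNECTION | INTERNET_FLAG_SECURE, 0);
    if (!hRequest) {
        InternetCloseHandle(hConnect);
        InternetCloseHandle(hSession);
        return false;
    }

    // lpOptional is a non-const LPVOID, so the const body pointer needs a cast.
    BOOL bSent = HttpSendRequest(hRequest, L"Content-Type: application/x-www-form-urlencoded\r\n", (DWORD)-1, const_cast<char*>(inputValues.c_str()), static_cast<DWORD>(inputValues.size()));

    //... if bSent is TRUE, read the response here

    InternetCloseHandle(hRequest);
    InternetCloseHandle(hConnect);
    InternetCloseHandle(hSession);

    return bSent != FALSE;

}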
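The name=value body above is hard-coded with an empty search term. Once real search text is involved, each name and value has to be percent-encoded before being joined with &. The following is a minimal sketch of that encoding, not part of WinInet: FormEncode, BuildFormBody, and FormFields are hypothetical helper names, and the code assumes the input strings are already in the byte encoding the site expects (which encoding that is would need to be checked against the page itself).

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical alias: an ordered list of (name, value) pairs from a form.
using FormFields = std::vector<std::pair<std::string, std::string>>;

// Encodes one name or value for an application/x-www-form-urlencoded body.
// ASCII letters, digits, and - _ . * pass through, a space becomes '+',
// and every other byte becomes %XX (uppercase hex).
std::string FormEncode(const std::string& text) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : text) {
        const bool keep = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                          (c >= '0' && c <= '9') ||
                          c == '-' || c == '_' || c == '.' || c == '*';
        if (keep) {
            out += static_cast<char>(c);
        } else if (c == ' ') {
            out += '+';
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Joins the encoded pairs as name=value&name=value...
std::string BuildFormBody(const FormFields& fields) {
    std::string body;
    bool first = true;
    for (const auto& field : fields) {
        if (!first) body += '&';
        first = false;
        body += FormEncode(field.first);
        body += '=';
        body += FormEncode(field.second);
    }
    return body;
}
```

The resulting string could be used as inputValues in the POST example above, or appended after the ? for a form that uses GET.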

Let's look at a different webform example on the same page:

<form id="frmPesquisaProcHeader" action="https://ww2.stj.jus.br/processo/pesquisa/"
                                    method="get" name="frmPesquisaProcHeader" onsubmit="" target="_blank"
                                    style="display: none">
                                    <input name="termo" id="headerTermo" type="hidden" value="">
                                    <input name="aplicacao" value="processos.ea" type="hidden">
                                    <input name="tipoPesquisa" value="tipoPesquisaGenerica" type="hidden">
                                    <input id="chkordem" name="chkordem" value="DESC" type="hidden">
                                    <input id="chkMorto" name="chkMorto" value="MORTO" type="hidden">
                                </form>

Notice that this webform wants a GET request instead of a POST, so you would need to include the input values in the URL itself, after a ?, rather than in the request body, e.g.:

#include <windows.h>
#include <wininet.h>
#include <string>

#pragma comment (lib, "Wininet.lib")

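bool GetRequest() {

    std::wstring inputValues = L"termo=&aplicacao=processos.ea&tipoPesquisa=tipoPesquisaGenerica&chkordem=DESC&chkMorto=MORTO";

    // For a GET form, the input values go in the query string after the path.
    std::wstring resource = L"/processo/pesquisa/?" + inputValues;

    HINTERNET hSession = InternetOpen(L"Mozilla/5.0", INTERNET_OPEN_TYPE_PRECONFIG, NULL, NULL, 0);
    if (!hSession) return false;

    // The form's action URL is https://, so connect on the HTTPS port.
    HINTERNET hConnect = InternetConnect(hSession, L"ww2.stj.jus.br", INTERNET_DEFAULT_HTTPS_PORT, L"", L"", INTERNET_SERVICE_HTTP, 0, 0);
    if (!hConnect) {
        InternetCloseHandle(hSession);
        return false;
    }

    // INTERNET_FLAG_SECURE is required for the request to actually use SSL/TLS.
    HINTERNET hRequest = HttpOpenRequest(hConnect, L"GET", resource.c_str(), NULL, NULL, NULL, INTERNET_FLAG_KEEP_CONNECTION | INTERNET_FLAG_SECURE, 0);
    if (!hRequest) {
        InternetCloseHandle(hConnect);
        InternetCloseHandle(hSession);
        return false;
    }

    BOOL bSent = HttpSendRequest(hRequest, NULL, 0, NULL, 0);

    //... if bSent is TRUE, read the response here

    InternetCloseHandle(hRequest);
    InternetCloseHandle(hConnect);
    InternetCloseHandle(hSession);

    return bSent != FALSE;

}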

Repeat this as needed for each HTML <form> element that you want to submit. In general, you need to determine:

  • the action URL to send the request to.
  • the method of the request.
  • the <input> values that must be sent in the specified (or implied) enctype format.
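
The three items in the list above can be turned into the pieces of a request mechanically. The sketch below is one possible way to do that and is not tied to WinInet: FormSubmission, HttpRequestParts, and PlanRequest are hypothetical names, the input pairs are assumed to be URL-encoded already, and the action path is assumed not to carry a query string of its own.

```cpp
#include <string>

// Hypothetical description of one HTML <form>, filled in by hand from the page source.
struct FormSubmission {
    std::string method;   // the form's method attribute ("get" or "post", any case)
    std::string path;     // path part of the action URL, e.g. "/SCON/pesquisar.jsp"
    std::string encoded;  // already-encoded name=value pairs joined by '&'
};

// Hypothetical result: the pieces to hand to HttpOpenRequest / HttpSendRequest.
struct HttpRequestParts {
    std::string verb;     // "GET" or "POST"
    std::string target;   // object name (path, plus query string for GET)
    std::string headers;  // extra request headers, empty for GET
    std::string body;     // request body, empty for GET
};

// Uppercases ASCII letters only; the method attribute is case-insensitive in HTML.
std::string ToUpperAscii(std::string s) {
    for (char& c : s) {
        if (c >= 'a' && c <= 'z') c = static_cast<char>(c - 'a' + 'A');
    }
    return s;
}

HttpRequestParts PlanRequest(const FormSubmission& form) {
    HttpRequestParts parts;
    if (ToUpperAscii(form.method) == "POST") {
        // POST: the pairs travel in the body, announced by a Content-Type header.
        parts.verb = "POST";
        parts.target = form.path;
        parts.headers = "Content-Type: application/x-www-form-urlencoded\r\n";
        parts.body = form.encoded;
    } else {
        // GET (also the HTML default when method is missing): the pairs go after '?'.
        parts.verb = "GET";
        parts.target = form.path + "?" + form.encoded;
    }
    return parts;
}
```

With WinInet built for Unicode, as in the examples above, verb and target would still need converting to wide strings before being passed to HttpOpenRequest, while body can be sent as is.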

3 Comments

Thank you very much, @Remy Lebeau for the complete answer.
Hi, @RemyLebeau. I edited my OG post with an update. Basically, whichever request I make, either GET or POST, returns the same generic web page without the search results. I'm very confused on what I'm missing here.
@TylerD007 there are multiple webforms on the page you are trying to access. Are you sure you are submitting the correct one? Also, maybe HTTP cookies are another factor that you have to account for, too. Best to use your browser's built-in debugger to see the actual request it sends, and then you can replicate that request in your code. Use a tool like Fiddler to make sure what you are sending matches what you see in the browser.
