I normally write a lot of TypeScript/JavaScript code and have always taken HTTP for granted. To learn more about how HTTP works, I decided to write a C++ program that makes GET requests to a given URL using sockets. I've coded up what I think is the basic structure of what I should be doing:

  1. Create a socket
  2. Connect the socket to the host
  3. Send the request byte by byte
  4. Receive the response byte by byte

The issue is that the request always returns status code 400:

>>>> Request sent:
GET /links HTTP/1.1


>>>> Response received:
HTTP/1.1 400 Bad Request
Server: cloudflare
Date: Sun, 11 Oct 2020 05:57:11 GMT
Content-Type: text/html
Content-Length: 155
Connection: close
CF-RAY: -

<html>
<head><title>400 Bad Request</title></head>
<body>
<center><h1>400 Bad Request</h1></center>
<hr><center>cloudflare</center>
</body>
</html>

Here's the code I'm using to create the socket and connect it to the host:

    // ---- Create the socket
    int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0) error("!!!! ERROR opening socket");

    // ---- Look up (and parse) the server host, route
    std::string host = parseHost(url);
    std::string route = parseRoute(url);
    struct hostent *server = gethostbyname(host.c_str());
    if (server == NULL)
        error("!!!! ERROR, no such host");

    // ---- Fill in the server address structure
    struct sockaddr_in serv_addr;
    memset(&serv_addr, 0, sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_port = htons(PORT);
    memcpy(&serv_addr.sin_addr.s_addr, server->h_addr, server->h_length);

    // ---- Connect the socket to the given url's host
    const int didConnect = connect(sockfd, (struct sockaddr *)&serv_addr, sizeof(serv_addr));
    if (didConnect < 0) error("!!!! ERROR connecting to host");

If that logic seems correct, then here's the code that sends the request and receives the response:

    // ---- Send the GET request byte by byte
    int bytes, sent, received;
    char response[MAX_CONTENT_LENGTH];
    int total = sizeof(response) - 1;
    sent = 0;
    std::string message = "GET " + route + "HTTP/1.1" + "\r\n\r\n";
    do
    {
        bytes = write(sockfd, message.c_str() + sent, total - sent);
        if (bytes < 0) error("!!!! ERROR writing message to socket");
        if (bytes == 0) break;
        sent += bytes;
    } while (sent < total);

    std::cout << ">>>> Request sent: \n" << message << std::endl;

    // ---- Receive the response byte by byte
    memset(response, 0, sizeof(response));
    total = sizeof(response) - 1;
    received = 0;
    do {
        bytes = read(sockfd, response + received, total - received);
        if (bytes < 0)
            error("!!!! ERROR reading response from socket");
        if (bytes == 0)
            break;
        received += bytes;
    } while (received < total);

    if (received == total)
        error("!!!! ERROR storing complete response from socket (not enough space allocated)");

TL;DR: Am I sending the right char array to the server? Right now it has this format: "GET /route HTTP/1.1\r\n\r\n". Am I forgetting any headers or other information that lets the server know that it's a GET request?

  • Do you initialize total? Commented Oct 11, 2020 at 4:46
  • Ahh I didn't! I initialized it to sizeof(response)-1 and it works. Now there's an issue with how I'm receiving the response. Commented Oct 11, 2020 at 4:49
  • 1) Crank up your warnings. I'm sure your toolchain would emit a warning for this. 2) Try to declare one variable per line, and always initialize them. If you can't initialize the variable, figure out if you really need it declared at that point. 3) The debugger is your friend :) Commented Oct 11, 2020 at 4:51
  • Gotcha :) I'm a C/C++ noob so this is great advice! Commented Oct 11, 2020 at 4:52
  • Still need help :( Commented Oct 11, 2020 at 6:01

1 Answer

>>>> Request sent:
GET /links HTTP/1.1

This is not a valid HTTP/1.1 request: it is missing the Host header with the target domain name, which HTTP/1.1 requires. It should look at least something like this:

GET /links HTTP/1.1
Host: www.example.com
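
To make that concrete in C++, here is a minimal sketch of building such a request and writing it to the socket. The names `buildGetRequest` and `sendAll` are hypothetical helpers, not part of the question's code; `host` and `path` stand in for the results of the question's `parseHost(url)` and `parseRoute(url)`. Note that the question's write loop is bounded by `sizeof(response) - 1` rather than by the length of the message, so it can read past the end of the string; the sketch bounds the loop by the message size instead.

```cpp
#include <cerrno>
#include <cstddef>
#include <string>
#include <sys/types.h>
#include <unistd.h>

// Build a minimal HTTP/1.1 GET request with the mandatory Host header.
// `host` and `path` are assumed inputs (e.g. "www.example.com", "/links").
std::string buildGetRequest(const std::string& host, const std::string& path) {
    std::string target = path;
    if (target.empty()) {
        target = "/";  // an empty path is requested as "/"
    }
    std::string request;
    request += "GET " + target + " HTTP/1.1\r\n";  // request line
    request += "Host: " + host + "\r\n";           // required by HTTP/1.1
    request += "\r\n";                             // empty line ends the headers
    return request;
}

// Write all of `data` to the descriptor `fd`, retrying on short writes.
// Returns true once every byte is written, false on error.
bool sendAll(int fd, const std::string& data) {
    std::size_t sent = 0;
    while (sent < data.size()) {
        const ssize_t n = write(fd, data.data() + sent, data.size() - sent);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            return false;
        }
        if (n == 0) return false;          // nothing written: give up
        sent += static_cast<std::size_t>(n);
    }
    return true;
}
```

With the question's variables this would be called roughly as `sendAll(sockfd, buildGetRequest(host, route))`, assuming `route` holds only the path.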

Apart from that, your code simply reads until the server closes the connection or your internal buffer fills up. But HTTP/1.1 uses persistent connections by default, so the server may keep the connection open after sending the response because it expects your next request. In that case your program will simply block in read(), doing nothing.
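
One way to avoid blocking is to stop reading once the whole response has arrived, instead of waiting for the connection to close. The following is only a rough sketch under simplifying assumptions: it handles responses that carry a Content-Length header and ignores chunked transfer encoding and other cases the standard defines. `parseContentLength` and `responseComplete` are hypothetical helper names, and `received` is assumed to hold all bytes read so far.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Look for a Content-Length header in `headers` (status line plus header
// lines, each ending in CRLF). Header names are case-insensitive.
// Returns true and stores the value in `length` if one is found.
bool parseContentLength(const std::string& headers, std::size_t& length) {
    std::string lower = headers;
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    const std::string key = "\r\ncontent-length:";
    const std::size_t pos = lower.find(key);
    if (pos == std::string::npos) return false;

    std::size_t i = pos + key.size();
    while (i < lower.size() && (lower[i] == ' ' || lower[i] == '\t')) ++i;

    std::size_t value = 0;
    bool sawDigit = false;
    while (i < lower.size() && std::isdigit(static_cast<unsigned char>(lower[i]))) {
        value = value * 10 + static_cast<std::size_t>(lower[i] - '0');
        sawDigit = true;
        ++i;
    }
    if (!sawDigit) return false;
    length = value;
    return true;
}

// True once `received` contains the full header block and at least
// Content-Length bytes of body. Returns false when no Content-Length is
// present; a real client needs another strategy for that case.
bool responseComplete(const std::string& received) {
    const std::size_t headerEnd = received.find("\r\n\r\n");
    if (headerEnd == std::string::npos) return false;  // headers not finished

    std::size_t length = 0;
    if (!parseContentLength(received.substr(0, headerEnd + 2), length)) return false;

    const std::size_t bodyBytes = received.size() - (headerEnd + 4);
    return bodyBytes >= length;
}
```

In the question's read loop, a check like this could run after each `read()` so the loop exits as soon as the body is complete.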

Note that HTTP is far more complex than it looks. Please study the actual standard if you want to implement it instead of using existing libraries. That's what standards are for.

4 Comments

Would this require some form of "ack" to the server to let it know to close the connection?
@JasperHuang: The response header and body have a clear structure, and the header tells you how to determine the length of the body. This means the client can determine when to stop reading the response. There is also a Connection header, which can be used to override the default expectation regarding persistent connections. Again: please study the actual standard, it is all defined there.
@JasperHuang if you want the server to close the connection at the end of the response, you have to explicitly ask for that by sending a Connection: close header in the request.
@JasperHuang: "... sending a Connection: close header in the request ..." - Or, even better, use HTTP/1.0 instead of HTTP/1.1, since this simplifies the protocol even more. With HTTP/1.1 you have to be aware of chunked transfer encoding in the response; with HTTP/1.0 you do not.
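
To give an idea of what handling chunked transfer encoding involves, here is a rough sketch of decoding a body that has already been received in full. It assumes each chunk is a hexadecimal size line, CRLF, the chunk data, and CRLF, with a zero-size chunk marking the end; chunk extensions and trailers are skipped rather than interpreted. `decodeChunked` is a hypothetical helper name, not something from the question or a library.

```cpp
#include <cstddef>
#include <string>

// Decode a complete chunked body into `out`.
// Returns false if the input is malformed or truncated.
bool decodeChunked(const std::string& body, std::string& out) {
    out.clear();
    std::size_t pos = 0;
    while (true) {
        const std::size_t lineEnd = body.find("\r\n", pos);
        if (lineEnd == std::string::npos) return false;  // size line incomplete

        // Parse the hexadecimal chunk size at the start of the line.
        std::size_t size = 0;
        std::size_t i = pos;
        bool sawDigit = false;
        for (; i < lineEnd; ++i) {
            const char c = body[i];
            int digit = 0;
            if (c >= '0' && c <= '9') digit = c - '0';
            else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
            else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
            else break;
            size = size * 16 + static_cast<std::size_t>(digit);
            sawDigit = true;
        }
        if (!sawDigit) return false;
        // Anything after the size must be a chunk extension, which is ignored.
        if (i < lineEnd && body[i] != ';') return false;

        pos = lineEnd + 2;
        if (size == 0) return true;  // last chunk; any trailers are ignored

        const std::size_t remaining = body.size() - pos;
        if (size > remaining || remaining - size < 2) return false;  // truncated
        out.append(body, pos, size);
        pos += size;
        if (body.compare(pos, 2, "\r\n") != 0) return false;  // chunk must end in CRLF
        pos += 2;
    }
}
```

Sending an HTTP/1.0 request, as suggested above, sidesteps this step entirely.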
