Let's say we have an HTML form that sends data to a PHP web server:

<!DOCTYPE html>
<html>
<head>
  <title>Form</title>
</head>
<body>
  <form action="https://website.com/form.php" method="POST" accept-charset="utf-8">
    <label for="name">Name:</label>
    <input type="text" id="name" name="name"><br><br>

    <label for="email">Email:</label>
    <input type="email" id="email" name="email"><br><br>

    <input type="submit" value="Send">
  </form>
</body>
</html>

When a user submits this form, how does the server know which encoding the client used for the data? I inspected the request headers and found that, despite accept-charset="utf-8", no charset parameter was added to the Content-Type header.

In general, I am interested in the following questions:

  1. How does the server know the encoding of the data sent?
  2. Is there a default encoding for HTTP request data?
  3. Can a charset be specified in the Content-Type header of an HTTP request? If so, how do I add charset?
  • It looks like UTF-8 is the default charset; see stackoverflow.com/a/16829056 Commented Sep 13 at 15:07
  • Please share how this problem is related to PHP. Commented Sep 13 at 15:27
  • @KenLee That sets the type of the response, not the request. Commented Sep 13 at 15:27
  • Since this is labelled as a PHP question: PHP doesn't really know or care about the encoding of string data. PHP strings are binary data. The encoding only matters when you call a function that needs to know it (e.g. mb_strtoupper()); in that case, you either pass it as an argument or set a default to rely on. In current versions the default is UTF-8, but it was ISO-8859-1 in the early days. Commented Sep 16 at 14:36

1 Answer

The default encoding is UTF-8. If you want to specify some other encoding, you can use the accept-charset attribute of the <form>.

The character encoding accepted by the server. The specification allows a single case-insensitive value of "UTF-8", reflecting the ubiquity of this encoding (historically multiple character encodings could be specified as a comma-separated or space-separated list).

However, the HTML specification doesn't specify the use of any other charset than UTF-8 for form submissions.
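
To illustrate why the server has to assume an encoding, here is a minimal Python sketch (not PHP, and not taken from the specification) of what ends up in an application/x-www-form-urlencoded body. The field names come from the form in the question; the sample value "Zoë" and the windows-1252 comparison are assumptions made up for the example.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form values; the field names match the form in the question.
fields = {"name": "Zoë", "email": "zoe@example.com"}

# Roughly what ends up in the request body when the form data is
# serialized as application/x-www-form-urlencoded using UTF-8.
body_utf8 = urlencode(fields, encoding="utf-8")

# The same fields serialized with a legacy single-byte encoding instead.
body_legacy = urlencode(fields, encoding="windows-1252")

# The body is just percent-encoded bytes with no charset label,
# so the receiving side has to pick an encoding when decoding it.
decoded_ok = parse_qs(body_utf8, encoding="utf-8")
decoded_wrong = parse_qs(body_utf8, encoding="windows-1252")

print(body_utf8)                 # name=Zo%C3%AB&email=zoe%40example.com
print(body_legacy)               # name=Zo%EB&email=zoe%40example.com
print(decoded_ok["name"][0])     # Zoë
print(decoded_wrong["name"][0])  # ZoÃ«
```

Nothing in either body says which encoding produced it, so a server that decodes with the wrong assumption gets mojibake rather than an error. That is why it helps when both sides simply agree on UTF-8.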

5 Comments

How do you know that the default encoding is UTF-8? Can you provide a reference to the official specification?
It's in the place I linked to.
The Mozilla website is not the official specification :) But thank you so much anyway!
MDN is the user-friendly documentation. The official specification is made by WHATWG, and Mozilla is one of the original founders of WHATWG. So rest assured that one of the deciding members of WHATWG follows its own specifications in its documentation.
@SamuelSmith The bottom of the linked page does provide a link to the official specification.