What encoding to use when interpreting HTTP/1.1 header field values

Question

The HTTP/1.1 specification defines headers as follows:

message-header = field-name ":" [ field-value ]

[...]

field-value = *( field-content | LWS )

field-content = <the OCTETs making up the field-value and consisting of either *TEXT or combinations of token, separators, and quoted-string>

and the definitions of OCTET and TEXT are:

OCTET = <any 8-bit sequence of data>

TEXT = <any OCTET except CTLs, but including LWS> ; where CTL refers to control characters from US-ASCII charset.

Question: Header names (field-name in the grammar above) are US-ASCII encoded, as the HTTP/1.1 specification states, but how is a server application supposed to know which encoding to use for header values?

Note: I would expect values to be US-ASCII encoded as well, but the definition leaves room for other encodings.

Julian Reschke · Accepted Answer · 2015-04-24 09:02:16Z


The semantics of non-ASCII octets in header field values are essentially undefined. Avoid them.

Recipients usually decode using ISO-8859-1, which at least allows recovery later on (because it'll preserve all octets).
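The ISO-8859-1 round trip described above can be sketched as follows. This is a minimal illustration, not how any particular server does it: the function names `decode_header_value` and `recover_utf8` are hypothetical, and the choice to try UTF-8 afterwards is an assumption about what the sender might have meant, not something the specification guarantees.

```python
def decode_header_value(raw: bytes) -> str:
    # ISO-8859-1 maps each byte 0x00-0xFF to exactly one code point,
    # so this decode cannot fail and loses no information.
    return raw.decode("iso-8859-1")


def recover_utf8(value: str) -> str:
    # Undo the lossless ISO-8859-1 step to get the original octets back,
    # then try UTF-8 (an assumption about the sender's intent).
    raw = value.encode("iso-8859-1")
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: keep the ISO-8859-1 interpretation.
        return value


# Octets as they might arrive on the wire (here: UTF-8 for "caf\u00e9").
wire_octets = b"caf\xc3\xa9"
value = decode_header_value(wire_octets)

# The original octets are still recoverable from the decoded string.
assert value.encode("iso-8859-1") == wire_octets

recovered = recover_utf8(value)
```

Because the first step preserves every octet, the decision about what the bytes "really" mean can be deferred to code that knows the header's own conventions.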

(Also: you're looking at the wrong spec; RFC 2616 has been obsoleted by RFC 7230.)

edited Apr 24, 2015 at 9:02

answered Apr 24, 2015 at 8:08


1 Comment

Chris Wesseling · over a year ago

RFC 9110 by now.
