.NET UrlEncode not working?

Question

According to http://www.w3schools.com/tags/ref_urlencode.asp, when requests are submitted they are URL encoded, so for example a space is transformed to %20. So far so good.

I have a problem with !. Submitting it in a form converts it to %21, as it should. However, HttpUtility.UrlEncode (or its WebUtility counterpart) and Uri.EscapeDataString all return !. Is this expected behaviour? How should I encode my input from C# so that it is converted to the proper values?

I can't give you a technical explanation, but I can confirm that there are divergences. I also found many different URL-encoding implementations.
– Boas Enkler, Commented Jun 19, 2015 at 8:03
Note that JavaScript's encodeURIComponent does the same. There is a note in the Mozilla documentation for that function that says: To be more stringent in adhering to RFC 3986 (which reserves !, ', (, ), and *), even though these characters have no formalized URI delimiting uses, the following can be safely used:
– xanatos, Commented Jun 19, 2015 at 8:07
This is very interesting/weird. So we have some characters that are in a 'grey' zone and can be, but don't have to be, encoded.
– Marcin Waligora, Commented Jun 19, 2015 at 8:14
On my machine (VS2015RC, Win 7 x64) Uri.EscapeDataString produces %21 for me.
– Damien_The_Unbeliever, Commented Jun 19, 2015 at 8:17
@Damien_The_Unbeliever - Your project is likely targeting .NET 4.5. If it targets anything below, EscapeDataString shouldn't percent-encode it.
– keyboardP, Commented Jun 19, 2015 at 8:41

Community · Accepted Answer · 2017-05-23 11:59:23Z

7

An exclamation mark is considered a URL-safe ASCII character and is therefore not percent-encoded.

From MSDN:

The UrlEncode method URL-encodes any character that is not in the set of ASCII characters that is considered to be URL-safe. Spaces are encoded as the ASCII "+" character. URL-safe ASCII characters include the ASCII characters (A to Z and a to z), numerals (0 to 9), and some punctuation marks. The following table lists the punctuation marks that are considered URL-safe ASCII characters.

The table contains - _ . ! * ( )

Update

According to this answer, Uri.EscapeDataString should encode the ! when targeting .NET 4.5 projects, but I'm unable to test it on my current machine. EscapeDataString on previous .NET frameworks does not percent-encode the characters above. You may simply need to use String.Replace on the escaped URI to replace those characters with their percent-encoded forms.
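The String.Replace approach could look roughly like the sketch below. StrictUrlEncoder and Encode are hypothetical names, not part of the framework, and the choice of characters is an assumption: it covers ! * ( ) from the table plus ', and leaves - _ . alone because RFC 3986 treats them as unreserved.

```csharp
using System;
using System.Text;

// Hypothetical helper: not a framework type, just an illustration.
public static class StrictUrlEncoder
{
    public static string Encode(string value)
    {
        // Depending on the target framework, EscapeDataString may leave
        // ! ' ( ) * unescaped, so they are replaced explicitly afterwards.
        var escaped = new StringBuilder(Uri.EscapeDataString(value));

        escaped.Replace("!", "%21")
               .Replace("'", "%27")
               .Replace("(", "%28")
               .Replace(")", "%29")
               .Replace("*", "%2A");

        return escaped.ToString();
    }
}
```

With this, StrictUrlEncoder.Encode("a b!") should give a%20b%21 whether or not the framework already escapes the !, since the replacements are no-ops when the characters are already encoded.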

edited May 23, 2017 at 11:59

CommunityBot

answered Jun 19, 2015 at 8:04

keyboardP

1 Comment

Marcin Waligora Over a year ago

Thanks keyboardP. Is there any way in .NET to encode these?

Jon Hanna · Accepted Answer · 2015-06-19 10:03:12Z

So we have some characters that are in 'grey' zone and can be but don't have to be encoded.

All characters can be encoded. http://stackoverflow.com/questions and http://stackoverflow.com/%71%75%65%73%74%69%6F%6E%73 are identical.

The only time a character cannot be encoded is if it is being used in a way that has a special meaning within URIs, such as the / separating path elements.
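As a quick check of that equivalence, the encoded path segment can be decoded and compared with the plain one. This is only a sketch; PercentDecodingDemo and SameAfterDecoding are hypothetical names used for illustration.

```csharp
using System;

// Hypothetical demo class: decodes the percent-encoded segment from the
// second URL and compares it with the plain segment from the first.
public static class PercentDecodingDemo
{
    public static bool SameAfterDecoding()
    {
        string plain = "questions";
        string encoded = "%71%75%65%73%74%69%6F%6E%73";

        // Each %XX triplet is the hexadecimal value of one ASCII character.
        return Uri.UnescapeDataString(encoded) == plain;
    }
}
```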

The only times a character must be encoded are when:

1. It is one of those special-meaning characters, and not being used with that special meaning.
2. It is one of the reserved characters that may have a special meaning in a particular URI scheme or particular place.
3. It has a code point above U+007F.

There are exceptions to the last two though.

In the third case, if you use an IRI then you don't encode such characters, which is pretty much the definition of an IRI. You can convert between IRI and URI by doing or undoing that encoding. (Any such characters in the host portion must be Punycode-encoded though, not URI-encoded.)
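A minimal sketch of the two encodings involved in an IRI-to-URI conversion, assuming the path and host have already been split apart. IriToUriDemo and its method names are hypothetical; Uri.EscapeDataString and System.Globalization.IdnMapping are the framework pieces being illustrated.

```csharp
using System;
using System.Globalization;

// Hypothetical demo class showing the two different encodings.
public static class IriToUriDemo
{
    // Path or query data: non-ASCII characters become percent-encoded UTF-8 bytes.
    public static string EncodePathSegment(string segment)
    {
        return Uri.EscapeDataString(segment);
    }

    // Host names: non-ASCII labels are converted to Punycode instead.
    public static string EncodeHost(string host)
    {
        return new IdnMapping().GetAscii(host);
    }
}
```

For example, EncodePathSegment("caf\u00E9") should return caf%C3%A9, while EncodeHost("b\u00FCcher.example") should return xn--bcher-kva.example.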

In the second case it's safe to not encode the character if it isn't used as a delimiter in the context in question. So for example, & can be left as it is in some URIs but not in HTTP URIs where it is often used as a separator for query data. This though depends upon having particular knowledge of the particular URI scheme. It's also probably just not worth the risk of some other process not realising it's okay.

! is an example of this. RFC 3986 includes the production:

reserved    = gen-delims / sub-delims

gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"

sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
            / "*" / "+" / "," / ";" / "="

And so ! is in the set of characters that can be safe to leave unencoded or not, depending on the scheme in use.

Generally, if you're writing your own encoding code (such as when writing an HttpEncoder implementation) you're probably better off just always encoding !, but if you're using an encoder that doesn't encode ! all the time that's probably okay too; certainly in HTTP URIs it shouldn't make any difference.
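One way to "always encode" is to percent-encode everything outside RFC 3986's unreserved set (letters, digits, and - . _ ~), which necessarily covers ! and the other sub-delims. The sketch below is a standalone helper under that assumption, not an actual HttpEncoder subclass; UnreservedOnlyEncoder is a hypothetical name.

```csharp
using System;
using System.Text;

// Hypothetical helper: encodes every byte that is not an RFC 3986
// unreserved character, so ! $ & ' ( ) * + , ; = are always escaped.
public static class UnreservedOnlyEncoder
{
    private const string Unreserved =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";

    public static string Encode(string value)
    {
        var result = new StringBuilder();

        // Work on UTF-8 bytes so characters above U+007F become
        // multi-byte %XX sequences.
        foreach (byte b in Encoding.UTF8.GetBytes(value))
        {
            char c = (char)b;
            if (b < 0x80 && Unreserved.IndexOf(c) >= 0)
            {
                result.Append(c);
            }
            else
            {
                result.Append('%').Append(b.ToString("X2"));
            }
        }

        return result.ToString();
    }
}
```

This is only suitable for encoding a single data component such as a query value; applied to a whole URI it would also escape the / and ? delimiters that are being used with their special meaning.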
