I am getting a string from a third-party program that I don't control. My piece of the code outputs this string in HTML. This works fine in English, but text in other languages displays incorrectly. For example, accents in Spanish look funny, and characters in East Asian languages (e.g. Korean) look very funny. I am pretty sure I need to do some encoding work so that all languages display correctly.

My understanding of encoding is kind of poor, so before posting the real question, which I intuitively think is "How do I encode this to UTF-8 in C#?", I would like to understand the matter better by posting simpler questions.

My question here is: how do I know which encoding my input string has? In Spanish, it looks like this when I get an accent: "Acción", instead of "Acción". Is this ANSI, or what am I dealing with?

Thanks a lot in advance!

1
  • 3
    It is pretty much impossible to tell just from the byte stream. You need to ask the makers of the third party program what encoding it outputs in and read using the same encoding. Chances are (from your description) that this is a Unicode encoding. Commented Dec 21, 2012 at 15:52

1 Answer

I get an accent: "Acción"

The presence of the Ã character is a dead give-away. Accented capital A characters have character code 0xC0 and up, which is often the first byte in a two-byte utf-8 encoded character. The ó glyph is codepoint U+00F3, and the utf-8 encoding for it is 0xC3 + 0xB3, which are the codepoints for Ã and ³.
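
The byte values above can be checked with a minimal sketch like the following. The class and method names are made up for illustration, and ISO-8859-1 is only an assumed stand-in for whatever 8-bit encoding is actually being used to read the string; for the bytes 0xC3 and 0xB3 it gives the same characters as the Western European Windows code page.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of any library.
public static class MojibakeDemo
{
    // Encodes the text as UTF-8, then decodes those bytes as ISO-8859-1,
    // which reproduces the garbling described above.
    public static string MisreadUtf8AsLatin1(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Encoding.GetEncoding("ISO-8859-1").GetString(utf8Bytes);
    }

    public static void Main()
    {
        // U+00F3 (ó) becomes the two bytes 0xC3 0xB3 in UTF-8.
        byte[] bytes = Encoding.UTF8.GetBytes("\u00F3");
        Console.WriteLine(BitConverter.ToString(bytes)); // C3-B3

        // Each byte read as its own character gives U+00C3 (Ã) and U+00B3 (³).
        string garbled = MisreadUtf8AsLatin1("Acci\u00F3n");
        Console.WriteLine(garbled == "Acci\u00C3\u00B3n"); // True
    }
}
```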

The strings are encoded in utf-8 but you are reading them with an 8-bit encoding like Encoding.Default.
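
As a rough sketch of what reading with the right encoding could look like: if the raw bytes are available, decode them with `Encoding.UTF8`. If only the already-garbled string is available, re-encoding it with the same 8-bit encoding that was wrongly applied recovers the original bytes, which can then be decoded as UTF-8. The names `Utf8Repair`, `DecodeUtf8` and `RepairMisdecoded` are hypothetical, and ISO-8859-1 is an assumption here; the round trip is only lossless when the wrong encoding maps every byte to a distinct character.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of any library.
public static class Utf8Repair
{
    // Preferred route: decode the third-party bytes as UTF-8 from the start.
    public static string DecodeUtf8(byte[] rawBytes)
    {
        return Encoding.UTF8.GetString(rawBytes);
    }

    // Fallback: undo a wrong decode by turning the garbled characters back
    // into bytes with the encoding that was wrongly used, then decoding
    // those bytes as UTF-8.
    public static string RepairMisdecoded(string garbled, Encoding wrongEncoding)
    {
        byte[] originalBytes = wrongEncoding.GetBytes(garbled);
        return Encoding.UTF8.GetString(originalBytes);
    }

    public static void Main()
    {
        // Assumption: the string was misread as ISO-8859-1.
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");

        string garbled = "Acci\u00C3\u00B3n"; // the garbled form from the question
        string repaired = RepairMisdecoded(garbled, latin1);
        Console.WriteLine(repaired == "Acci\u00F3n"); // True

        byte[] raw = { 0x41, 0x63, 0x63, 0x69, 0xC3, 0xB3, 0x6E };
        Console.WriteLine(DecodeUtf8(raw) == "Acci\u00F3n"); // True
    }
}
```

Once the text is in a .NET `string` it is stored as UTF-16 internally; UTF-8 only matters again at the point where the string is written out, for example to the HTML response.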

1 Comment

Thanks a lot, Hans. This totally answers the question. Do you know how I can save this in a String with UTF-8 in C#? Would you suggest posting this as a new question?
