1

Same question as this, but with UTF-8 instead of ASCII

In JavaScript, how can you convert a hex string of UTF-8 bytes into the character it encodes?

e.g. how to turn "c385" into "Å"?

or how to turn "E28093" into "–" (en dash)?

or how to turn "E282AC" into "€" (euro sign)?

My question is NOT a duplicate of Hex2Asc. You can see for yourself: hex2a("E282AC") returns "â¬" instead of "€" (euro sign).

5
  • Take a look at this: stackoverflow.com/questions/834316/… Commented Jun 12, 2013 at 4:04
  • Actually the same answer. However, I wonder what char code "c3 85" would represent? And \u00c3 is Ã, not Å. Commented Jun 12, 2013 at 4:08
  • 1
    It's not the same question at all. In fact, the question I quoted is the same as the one you pointed at. According to Wikipedia, the UTF-8 hexadecimal representation of Å is "c3 85". The answer there transforms the string into a different character: Ã Commented Jun 12, 2013 at 4:21
  • Or it will transform "E28093" into "â" instead of "–" (en dash). Commented Jun 12, 2013 at 4:28
  • Here's a simple algorithm to do what you wish: jsfiddle.net/consultcory/9K6th/2. If this question gets reopened I'll post it as an answer. Commented Jun 12, 2013 at 13:25

2 Answers

3

I think this will do what you want:

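function convertHexToString(input) {

    // split input into two-character groups, one per byte
    // (a trailing unpaired character is silently dropped)
    var hex = input.match(/[\s\S]{2}/g) || [];
    var output = '';

    // build a percent-encoded string, e.g. 'c385' becomes '%c3%85'
    for (var i = 0, j = hex.length; i < j; i++) {
        output += '%' + ('0' + hex[i]).slice(-2);
    }

    // decodeURIComponent interprets the percent-encoded bytes as UTF-8
    // (it throws a URIError if they are not valid UTF-8)
    output = decodeURIComponent(output);

    return output;
}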

console.log("'" + convertHexToString('c385') + "'");   // => 'Å'
console.log("'" + convertHexToString('E28093') + "'"); // => '–'
console.log("'" + convertHexToString('E282AC') + "'"); // => '€'
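
If you would rather not rely on the decodeURIComponent trick, here is a minimal sketch of the same conversion using TextDecoder. It assumes an environment that provides TextDecoder as a global (current browsers and recent Node.js versions do). The name hexToUtf8String is my own for illustration and is not part of the answer above. Unlike the version above, it rejects odd-length or non-hex input instead of dropping the trailing character.

```javascript
// Sketch: decode a hex string of UTF-8 bytes with TextDecoder.
// hexToUtf8String is a hypothetical helper name used for illustration.
function hexToUtf8String(hex) {
    // reject input that cannot be split into whole bytes
    if (hex.length % 2 !== 0 || /[^0-9a-fA-F]/.test(hex)) {
        throw new RangeError('expected an even number of hex digits');
    }

    // parse each pair of hex digits into one byte
    var bytes = new Uint8Array(hex.length / 2);
    for (var i = 0; i < bytes.length; i++) {
        bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
    }

    // interpret the bytes as UTF-8
    return new TextDecoder('utf-8').decode(bytes);
}

console.log(hexToUtf8String('c385'));   // => Å
console.log(hexToUtf8String('E28093')); // => –
console.log(hexToUtf8String('E282AC')); // => €
```

By default TextDecoder replaces invalid byte sequences with U+FFFD rather than throwing, so this behaves differently from decodeURIComponent on malformed input.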



1
var hex = "c5";
String.fromCharCode(parseInt(hex, 16));

You have to use c5, not c3 85 (reference: http://rishida.net/tools/conversion/).

Learn more about code points and code units:

  1. http://en.wikipedia.org/wiki/Code_point
  2. http://www.coderanch.com/t/416952/java/java/Unicode-code-unit-Unicode-code
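
To make the distinction concrete, here is a small sketch showing that c5 is the code point of "Å" while c3 85 is its UTF-8 byte sequence. It assumes TextEncoder is available as a global, and toUtf8Hex is a hypothetical helper name of my own.

```javascript
// U+00C5 is the code point of "Å"; String.fromCharCode works on code units,
// which equal the code point for characters in the Basic Multilingual Plane.
var fromCodePoint = String.fromCharCode(parseInt('c5', 16));
console.log(fromCodePoint); // => Å

// Going the other way: encode a string as UTF-8 and print the bytes as hex.
// toUtf8Hex is a hypothetical helper name used for illustration.
function toUtf8Hex(str) {
    var bytes = new TextEncoder().encode(str); // TextEncoder always emits UTF-8
    var hex = '';
    for (var i = 0; i < bytes.length; i++) {
        hex += ('0' + bytes[i].toString(16)).slice(-2);
    }
    return hex;
}

console.log(toUtf8Hex(fromCodePoint)); // => c385
console.log(toUtf8Hex('€'));           // => e282ac
```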

1 Comment

Thanks, but I need to convert from UTF-8, not from a single-byte code. C5 is the code point (U+00C5), not an ASCII code, and C3 85 is its UTF-8 encoding. Most characters cannot be encoded in ASCII, but all of them can be encoded in Unicode (and in UTF-8).
