
I have encountered an interesting issue. I'm using Node v8.1.4.

I have the following buffer:

[ 191, 164, 235, 131, 30, 28, 164, 179, 101, 138, 94, 36, 115, 176, 83, 193, 9, 177, 85, 228, 189, 193, 127, 71, 165, 16, 211, 132, 228, 241, 57, 207, 254, 152, 122, 98, 100, 71, 67, 100, 29, 218, 165, 101, 25, 17, 177, 173, 92, 173, 162, 186, 198, 1, 80, 94, 228, 165, 124, 171, 78, 49, 145, 158 ] 

When I try to convert it to UTF-8 in Node.js and in the browser, I get different results; even the length of the resulting string is not the same.

Is there a way to decode bytes to a UTF-8 string in the browser the same way Node.js does?

It seems that some invalid byte sequences that Node.js replaces with U+FFFD are longer than the sequences the browser replaces, so the output UTF-8 string differs.

The code I use in the browser and in Node.js is the same; I have a Buffer object tmpString:

  tmpString.toString('utf-8')

tmpString.toString('utf-8').length differs between the browser and Node.js for the same source bytes.

In Node.js I use the native Buffer implementation; in the browser, webpack loads a polyfill (feross/buffer, I think).

More accurately, I should say that I am trying to interpret the buffer's bytes as a UTF-8 string.
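As a sketch of what is going on (my own illustration, not from the question): the first bytes of the buffer above are not valid UTF-8, for example 0xBF (191) is a lone continuation byte, so any decoder must emit U+FFFD replacement characters. How many input bytes each replacement character consumes is exactly where decoder implementations can differ, which would explain the differing string lengths:

```javascript
// First few bytes of the buffer from the question. 0xBF and 0xA4 are lone
// continuation bytes; 0xEB 0x83 starts a 3-byte sequence that is cut short.
const bytes = new Uint8Array([191, 164, 235, 131, 30, 28]);

// Decode with the WHATWG TextDecoder (global in browsers and modern Node.js).
const decoded = new TextDecoder('utf-8').decode(bytes);

console.log(decoded.length);             // depends on how many bytes each U+FFFD swallows
console.log(decoded.includes('\uFFFD')); // true: the invalid bytes became replacement chars
```

If two decoders disagree on how many bytes to consume per U+FFFD, the same input produces strings of different lengths, which matches the symptom described above.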

4 Comments
  • What do you mean by "convert to UTF8"? Do you mean "interpret as UTF8 string", or do you mean "transform this XY-encoded buffer to a UTF8 buffer"? Please show the code you are using in node, and the code you tried to use in the browser. Commented Sep 3, 2017 at 14:41
  • Updated with details. It seems it would be more correct to say that I try to interpret the buffer as a UTF-8 string. Commented Sep 3, 2017 at 14:49
  • If you are using a node Buffer polyfill and it does something different than the native one, you probably should report this test case as a bug. Commented Sep 3, 2017 at 14:52
  • Thank you, I will try to find the solution. Commented Sep 3, 2017 at 15:05

2 Answers

7

Have you tried the TextEncoder/TextDecoder APIs? I've used them for converting strings in both Node.js and the browser and haven't seen any differences.

E.g.:

const encoder = new TextEncoder(); // TextEncoder is always UTF-8; the constructor takes no label
const decoder = new TextDecoder('utf-8');

const foo = 'Hello world!';
const encoded = encoder.encode(foo);
console.log(encoded);

const decoded = decoder.decode(encoded);
console.log(decoded);
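For the asker's actual use case (raw bytes to string, rather than string to bytes), TextDecoder can be applied directly to the byte array; since the same WHATWG-specified decoder is available as a global in browsers and in Node.js (v11+), both environments should produce the same result. A minimal sketch:

```javascript
// Decode raw bytes to a string; the byte values below spell "Hello" in ASCII,
// which is also valid UTF-8.
const bytes = new Uint8Array([72, 101, 108, 108, 111]);
const text = new TextDecoder('utf-8').decode(bytes);
console.log(text); // "Hello"
```

Because both environments run the same specified algorithm, this sidesteps any differences between Node's native Buffer and a browser-side Buffer polyfill.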


5 Comments

Yes, it produces the same output as the browser toString(); maybe it is more correct than the Node.js one, but I'm looking for the same behaviour.
Are you sure the browser toString() method converts a buffer to an encoded utf-8 string? I can't find any information suggesting that is a method that exists. When I use toString() on a Uint8Array buffer it prints the byte values joined by commas, and if I use toString() on an ArrayBuffer it prints "[object ArrayBuffer]". Are we doing something differently?
Node.js Buffer overrides the toString method with one that decodes the bytes using the supplied encoding, instead of outputting the elements.
In Nodejs yes, but not in the browser. Or are you using some script that overrides it in the browser as well? And if that's the case, that is probably where the issue lies.
Thank you, I'm trying to find where the issue is. The polyfill code looks very close to the original, so it seems the devil is in the small details.
1

If you are reading a file and sending its content directly from the server side via Node.js:

const { readFile } = require('fs/promises'); // readFile used below comes from here

const content = await readFile(fullpath);
socket.clients.forEach(ws => ws.send(content));

On the receiving end you will get the JSON-serialized form of the Buffer, shaped like this:

{ type: 'Buffer', data: [35, 32, ...] }

This can be converted into a string like so:

const decoder = new TextDecoder('utf-8');
const array = new Uint8Array(value.data); // `value` is the received { type: 'Buffer', data: [...] } object
const textContent = decoder.decode(array);
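The { type: 'Buffer', data: [...] } shape comes from Buffer's toJSON() method, which JSON.stringify invokes automatically. A minimal round-trip sketch (my own example, not from the answer) showing how that shape arises and how to decode it back:

```javascript
// Buffer#toJSON() is what produces { type: 'Buffer', data: [...] } when a
// Buffer passes through JSON.stringify (requires Node.js for Buffer).
const original = Buffer.from('# hello', 'utf-8');
const wire = JSON.stringify(original);  // '{"type":"Buffer","data":[35,32,...]}'
const value = JSON.parse(wire);         // { type: 'Buffer', data: [...] }

const text = new TextDecoder('utf-8').decode(new Uint8Array(value.data));
console.log(text); // "# hello"
```

Decoding on the receiving side with TextDecoder (rather than relying on a Buffer polyfill) keeps the behaviour consistent with the server.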

Comments
