
I have a problem converting a byte array to a string in the right format. I'm reading a byte array over a TCP socket, and one of the bytes is 158. If I decode it with:

Encoding.Latin1.GetString(data)

it gives me a string like "blahblah\u009eblahblah", but byte 158 (0x9E) is supposed to be the letter ž. The string I need is "blahblahžblahblah". How can I get the string in the right format?

I already tried other encodings such as ASCII and UTF8; none of them gave me the right result.

EDIT: a code example showing how I'm getting the data and what I'm doing with it:

TcpClient client = new TcpClient(terminal.server_IP, terminal.port);
NetworkStream stream = client.GetStream();
stream.ReadTimeout = 2000;

string message = "some message for terminal";
byte[] msg = Encoding.Latin1.GetBytes(message);

stream.Write(msg, 0, msg.Length);

// The declaration of data was not shown in the original snippet; the buffer size here is an assumption.
byte[] data = new byte[1024];
int bytes = stream.Read(data, 0, data.Length);
string rsp = Encoding.Latin1.GetString(data, 0, bytes);

EDIT2: I don't know what the problem was. I just created a new project targeting .NET Framework version 4.7.2, and in that project it works fine. Thanks to everyone for the suggestions; credit goes to @Jeppe Stig Nielsen.

  • stackoverflow.com/questions/14057434/… Commented Oct 19, 2021 at 7:48
  • That looks a lot like Unicode. I really wonder why UTF8 didn't work. Can you post a minimal reproducible example? Commented Oct 19, 2021 at 7:49
  • Does the byte array actually contain the textual representation of Unicode characters? How are you viewing the results? Where are you getting the data from? Commented Oct 19, 2021 at 7:50
  • Could you provide the byte array, please? You can dump it with string dump = string.Join(" ", msg); Console.WriteLine(dump);. Then please also provide the desired string. Commented Oct 19, 2021 at 8:04
  • @Taliga There are a lot of smart people trying to help you here. If someone asks you to supply something they feel is pertinent to the clarity of the question, you should oblige and not dismiss such requests. Commented Oct 19, 2021 at 8:33

1 Answer

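Encoding.Latin1 is not usable in your case. True Latin-1 (ISO-8859-1) does not contain ž (LATIN SMALL LETTER Z WITH CARON, U+017E); it maps byte 0x9E to the control character U+009E, which is the \u009e you see.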

If you want Windows-1252, use

Encoding.GetEncoding("Windows-1252").GetString(data)

This will turn bytes of decimal value 158 (hex 0x9E) into lowercase ž.


It may also be "Windows-1250" that you have. What other non-English letters do you expect in your text? Compare Windows-1252 and Windows-1250; they are different in general, but both agree that hex byte 0x9E (dec 158) is ž.
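One way to tell the two code pages apart is to decode a byte on which they disagree. The following is only a sketch, assuming .NET 5 or later with top-level statements; the sample bytes are hypothetical (0x9E is the byte from the question, 0xE8 is added purely for illustration). It prints code points rather than the characters themselves, so the result does not depend on the console's own encoding.

```csharp
using System;
using System.Text;

// Needed on .NET Core / .NET 5+, where code-page encodings are not registered by default.
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

// Hypothetical sample: 0x9E is the byte from the question, 0xE8 is an illustrative extra byte.
byte[] sample = { 0x9E, 0xE8 };

string as1252 = Encoding.GetEncoding("windows-1252").GetString(sample);
string as1250 = Encoding.GetEncoding("windows-1250").GetString(sample);

// Both code pages map 0x9E to ž (U+017E).
// They differ on 0xE8: è (U+00E8) in Windows-1252, č (U+010D) in Windows-1250.
Console.WriteLine($"1252: U+{(int)as1252[0]:X4} U+{(int)as1252[1]:X4}");
Console.WriteLine($"1250: U+{(int)as1250[0]:X4} U+{(int)as1250[1]:X4}");
```

If your real data contains letters such as č or ć, Windows-1250 is the more likely candidate; if it contains letters such as è or ñ, Windows-1252 is.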


On .NET Core or .NET 5+, where the above throws because code-page encodings are not registered by default, register the provider first:

Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
var goodText = Encoding.GetEncoding("Windows-1252").GetString(data);

If the type CodePagesEncodingProvider cannot be found, you may need a reference to the assembly System.Text.Encoding.CodePages.dll (available as a NuGet package of the same name).
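To see the difference between the two decodings side by side, here is a minimal sketch, again assuming .NET 5 or later. The data array and the bytes count are hypothetical stand-ins for what stream.Read would fill in and return in the question's code.

```csharp
using System;
using System.Text;

Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

// Hypothetical stand-in for the received buffer: 'a', byte 158, 'b'.
byte[] data = { (byte)'a', 158, (byte)'b' };
int bytes = data.Length; // stands in for the value returned by stream.Read

string latin1 = Encoding.Latin1.GetString(data, 0, bytes);
string cp1252 = Encoding.GetEncoding("windows-1252").GetString(data, 0, bytes);

// Latin-1 keeps the byte value as the code point: U+009E, a control character.
Console.WriteLine($"Latin1: U+{(int)latin1[1]:X4}");
// Windows-1252 maps the same byte to ž: U+017E.
Console.WriteLine($"1252:   U+{(int)cp1252[1]:X4}");
```

Registering the provider once at application startup should be enough; it does not have to be repeated before every GetEncoding call.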


4 Comments

Tried Encoding.GetEncoding("Windows-1252") and got the error: 'Windows-1252' is not a supported encoding name. For information on defining a custom encoding, see the documentation for the Encoding.RegisterProvider method.
@Taliga You are right, I was on the old .NET Framework (which also explains why I did not see the Latin1 property, which is new in .NET 5). You need to figure out whether you have Windows-1252, Windows-1250, or something similar. Edit: Are you on Windows or another OS?
Windows-1250 throws the same error. I'm on Windows, in a WPF project with .NET 5.0.
@Taliga I added more to my answer above. See if it works.
