3

I'm trying to convert a byte array to a String with this simple code:

System.out.printf("%s \n", new String(b));

where b has this content (in hex chars):

32d001000001000000000000246d3639653331697769736374683134633439687763796c7862796f74697167786f786c7504696e666f0000010001

If I run my code on Windows I get the entire decoded String, but on Linux it seems to be dropped up to the null byte (00). If I skip these bytes, the correct String is produced.

How can I get the same result on Linux? Sorry, but I can't attach an image due to site restrictions :'(

Thanks in advance!

3
  • 1
    Are you sure that the input on Windows and Linux is identical? Java is not platform dependent; if you provide the same input, it should give the same output. Commented Aug 12, 2015 at 8:46
  • 4
    @Stultuske: That's not true. The constructor the OP is using uses the platform-default encoding. Just because Java can run on multiple platforms doesn't mean that every possible program gives the same result on every platform. There are plenty of platform-specific parts, e.g. file separators, path separators and line breaks. Commented Aug 12, 2015 at 8:50
  • @Stultuske: The input is the same. I've tried specifying a charset in the String constructor, but the result is the same. Commented Aug 12, 2015 at 8:59

3 Answers

3

Yes, that's because you're using the constructor that uses the platform default encoding to convert binary data to text. It's entirely reasonable for it to create different strings on different platforms - although I suspect your interpretation of Linux "dropping until the null byte" is incorrect, and may be due to the way you're displaying the strings.

Don't use the platform default encoding - or do so explicitly if you really want it. Assuming this really is text data, specify an appropriate encoding e.g. using StandardCharsets.
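
As a minimal sketch of what specifying the encoding looks like (the class and method names here are made up for illustration, and UTF-8 is only an assumption about the data): passing a `Charset` from `StandardCharsets` makes the result independent of the platform default, and zero bytes are decoded like any other character rather than ending the string.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, named only for this example.
final class ExplicitCharsetDemo {

    // Decodes with a fixed charset instead of the platform default.
    // UTF-8 is an assumption; use whatever encoding the text really is in.
    static String decodeUtf8(byte[] b) {
        return new String(b, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A short slice in the style of the question's data, including zero bytes.
        byte[] b = {0x04, 'i', 'n', 'f', 'o', 0x00, 0x00, 0x01};
        String s = decodeUtf8(b);
        // Every byte in this sample becomes one char, including the 0x00 bytes.
        System.out.println("length = " + s.length());
    }
}
```

Whether a console then displays the control characters is a separate matter from what the String contains.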

However, if this is arbitrary binary data (e.g. the result of encryption or compression) then you shouldn't be converting it into a string in this way at all - you should use a hex or base64 conversion.
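
For the binary case, a rough sketch of both conversions using only the standard library (the class and method names are invented for this example; `java.util.Base64` requires Java 8 or later):

```java
import java.util.Base64;

// Hypothetical helper class, named only for this example.
final class BinaryToTextDemo {

    // Renders each byte as two lowercase hex digits.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte x : data) {
            // Mask to 0..255 so negative bytes do not sign-extend.
            sb.append(String.format("%02x", x & 0xff));
        }
        return sb.toString();
    }

    // Encodes the bytes as standard Base64 text.
    static String toBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        // The first six bytes of the data shown in the question.
        byte[] b = {0x32, (byte) 0xd0, 0x01, 0x00, 0x00, 0x01};
        System.out.println(toHex(b));
        System.out.println(toBase64(b));
    }
}
```

Both forms are lossless and look the same on every platform, which is not guaranteed when arbitrary bytes are pushed through a text decoder.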


3 Comments

The input contains mixed data: text and other content. On Windows I can see all printable chars in the Eclipse console, but not on Linux.
@DomenicoChiarito: Well the first thing you need to do is work out which parts of the input are meant to be text. Next you should understand that you need to know which encoding the text is in... Basically there isn't nearly enough information for us to give specific help at the moment. You could decide to treat the whole thing as ISO-8859-1, but that's unlikely to give you a useful result...
UTF-8 is fine as the charset. I then found it was just a console issue: the console doesn't show me the output string, but the variable contains all the chars, printable or not. Thanks!
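
One way to confirm that kind of console issue, as a sketch only (the class and method names are hypothetical): escape the control characters before printing, so the console cannot hide or mangle them.

```java
// Hypothetical helper class, named only for this example.
final class VisibleStringDemo {

    // Replaces ISO control characters with a visible escape of the form
    // backslash, 'u', four hex digits; other characters pass through unchanged.
    static String escapeControls(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isISOControl(c)) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "info" + (char) 0 + (char) 1;
        // The escaped form shows the characters that a console may not display.
        System.out.println(escapeControls(s));
    }
}
```

If the escaped output looks the same on both systems, the Strings are the same and only the display differs.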
0

To convert a byte array to a String, explicitly create a String object from the byte array:

String s = new String(bytes);

reference: http://www.mkyong.com/java/how-do-convert-byte-array-to-string-in-java/

1 Comment

Yes, but this gives me a different result on different OSes with the same input.
0

Choose the right charset:

String s = new String(bytes, "UTF-16");

or

String s = new String(bytes, "EUC-JP");

http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#String%28byte[],%20java.nio.charset.Charset%29

https://docs.oracle.com/javase/7/docs/technotes/guides/intl/encoding.doc.html
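
To illustrate why the choice matters, here is a small sketch (the class name is invented for this example) that decodes the same four bytes with two charsets, using the `Charset` overload linked above. As UTF-8 they are four ASCII letters; as UTF-16 with no byte order mark, Java reads them big-endian as two 16-bit code units.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, named only for this example.
final class CharsetChoiceDemo {

    public static void main(String[] args) {
        byte[] bytes = {0x69, 0x6e, 0x66, 0x6f}; // the ASCII bytes of "info"

        String asUtf8 = new String(bytes, StandardCharsets.UTF_8);
        String asUtf16 = new String(bytes, StandardCharsets.UTF_16);

        // Same bytes, different text: four chars versus two.
        System.out.println(asUtf8 + " has length " + asUtf8.length());
        System.out.println("UTF-16 result has length " + asUtf16.length());
    }
}
```

The `Charset` overload also avoids the checked `UnsupportedEncodingException` that the overload taking a charset name declares.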
