23

I need to encode a String to a byte array using UTF-8. I am using Google Guava, whose Charsets class already defines a Charset instance for UTF-8. I see two ways to do it:

  1. String.getBytes( charsetName )

    try {
        byte[] bytes = my_input.getBytes("UTF-8");
    } catch (UnsupportedEncodingException ex) {
        // Cannot happen for UTF-8, but the checked exception must still be handled
    }
    
  2. String.getBytes( Charset object )

    // Charsets.UTF_8 is an instance of Charset    
    
    byte[] bytes = my_input.getBytes ( Charsets.UTF_8 );
    

My question is: which one should I use? They return the same result, and with way 2 I don't need a try/catch. I looked at the Java source code and saw that way 1 and way 2 are implemented differently.

Does anyone have any ideas?

4
  • Do you get equivalent results from both? If so, I would favor the latter case. If not, you need to decide which you consider to be correct. Commented Apr 26, 2014 at 21:35
  • Yes, they return the same result. But my concern is: why are they implemented differently? Why doesn't way 1 call way 2 internally? Commented Apr 26, 2014 at 21:37
  • @Loc What makes you think the former isn't calling the latter internally? (or, that they both wouldn't be calling some other common internal method?) docjar.com/html/api/java/lang/String.java.html lines 951 - 980 Commented Apr 26, 2014 at 21:42
  • @BrianRoach They both call StringCoding.encode, but way 1 calls it with a String as the first parameter, while way 2 calls it with a Charset instance. If we look at the two versions of this method, they are implemented differently. Commented Apr 26, 2014 at 21:46

4 Answers

25

If you are going to use a string literal (e.g. "UTF-8") ... you shouldn't. Instead use the second version and supply the constant value from StandardCharsets (specifically, StandardCharsets.UTF_8, in this case).
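As a minimal sketch of that recommendation, assuming Java 7 or later (where java.nio.charset.StandardCharsets is available); the class and method names here are made up for illustration:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8EncodeExample {
    // Encodes with the JDK-provided constant. getBytes(Charset) declares
    // no checked exception, so no try/catch is needed.
    static byte[] encodeUtf8(String input) {
        return input.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u00e9" is the letter e with an acute accent; it takes two bytes in UTF-8.
        byte[] bytes = encodeUtf8("h\u00e9llo");
        System.out.println(Arrays.toString(bytes)); // [104, -61, -87, 108, 108, 111]
    }
}
```

Guava's Charsets.UTF_8 from the question can be passed in exactly the same way, since it is also a Charset instance.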

The first version is used when the charset is dynamic. This is going to be the case when you don't know what the charset is at compile time; it's being supplied by an end user, read from a config file or system property, etc.

Internally, both methods are calling a version of StringCoding.encode(). The first version of encode() is simply looking up the Charset by the supplied name first, and throwing an exception if that charset is unknown / not available.
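The visible difference between the two overloads can be sketched roughly as follows. This only illustrates the behaviour seen from calling code, not the JDK's internal implementation; the class and method names are hypothetical, and "no-such-charset" is just an arbitrary name that is assumed not to be installed:

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.util.Arrays;

class CharsetLookupExample {
    // Way 1: the name is resolved when the call is made, so the method
    // has to declare the checked exception for unknown names.
    static byte[] encodeByName(String input, String charsetName)
            throws UnsupportedEncodingException {
        return input.getBytes(charsetName);
    }

    // Way 2: the Charset already exists, so nothing can fail at lookup time.
    static byte[] encodeByCharset(String input, Charset charset) {
        return input.getBytes(charset);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        byte[] byName = encodeByName("abc", "UTF-8");
        byte[] byCharset = encodeByCharset("abc", Charset.forName("UTF-8"));
        System.out.println(Arrays.equals(byName, byCharset)); // true

        try {
            encodeByName("abc", "no-such-charset");
        } catch (UnsupportedEncodingException ex) {
            System.out.println("Unknown charset: " + ex.getMessage());
        }
    }
}
```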


2 Comments

No. Internally, they both call StringCoding.encode(), but there are two versions of it. Way 1 calls it with a charsetName as the first parameter; way 2 calls it with a Charset instance. The two versions of StringCoding.encode() are implemented differently, and I don't know why.
Sorry, I'll edit to clarify - the lookup is happening in encode()
12

The first API is for situations when you do not know the charset at compile time; the second one is for situations when you do. Since it appears that your code needs UTF-8 specifically, you should prefer the second API:

byte[] bytes = my_input.getBytes ( Charsets.UTF_8 ); // <<== UTF-8 is known at compile time

The first API is for situations when the charset comes from outside your program - for example, from the configuration file, from user input, as part of a client request to the server, and so on. That is why there is a checked exception thrown from it - for situations when the charset specified in the configuration or through some other means is not available.
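For the case where the charset name comes from outside the program, a rough sketch might look like this. The class name, the method name, and the choice to fall back to UTF-8 are assumptions made for illustration; a real program might prefer to report the configuration error instead:

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

class ConfiguredCharsetExample {
    // configuredName is assumed to be non-null and to come from a config
    // file, user input, a client request, and so on.
    static byte[] encodeWithConfiguredCharset(String input, String configuredName) {
        try {
            // The name is only known at runtime, so the String overload fits here.
            return input.getBytes(configuredName);
        } catch (UnsupportedEncodingException ex) {
            // Hypothetical policy: fall back to UTF-8 when the name is not available.
            return input.getBytes(StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        // "\u00e9" is one byte in ISO-8859-1 and two bytes in UTF-8.
        System.out.println(encodeWithConfiguredCharset("\u00e9", "ISO-8859-1").length);
        System.out.println(encodeWithConfiguredCharset("\u00e9", "no-such-charset").length);
    }
}
```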


4

Since they return the same result, you should use method 2: it is generally safer and more efficient to avoid asking the library to parse, and possibly fail on, a charset name passed as a string. Avoiding the try-catch also makes your own code cleaner.

Charsets.UTF_8 is a Charset constant that the compiler can check, so there is no charset name left to look up at runtime, which is most likely the reason you do not need a try-catch.


3

If you already have the Charset, then use the second version, as it's less error-prone.
