
The code below checks whether a string contains a duplicate character:

static boolean hasDuplicateCharacter(String s) {  // e.g. s = "Bengaluru"
    boolean[] characters = new boolean[128];
    for (int i = 0; i < s.length(); i++) {
        char ch = s.charAt(i);
        if (characters[ch]) {        // ch is implicitly widened to an int index
            return true;             // duplicate found
        }
        characters[ch] = true;       // true is stored at the character's code value
    }
    return false;
}
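The same char-as-index trick generalizes beyond a duplicate check. Here is a minimal sketch (assuming ASCII-range input; the class name is made up for illustration) that counts character frequencies by letting each char widen to an array index:

```java
public class CharIndexDemo {
    public static void main(String[] args) {
        int[] counts = new int[128];     // one counter per 7-bit character code
        for (char ch : "Bengaluru".toCharArray()) {
            counts[ch]++;                // ch widens to int, so no explicit cast is needed
        }
        System.out.println(counts['u']); // 2 -- 'u' appears twice in "Bengaluru"
    }
}
```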
  • There is no conversion, characters are numbers. Commented Feb 8, 2018 at 12:10

3 Answers


The full answer is much more complicated than what dasblinkenlight is suggesting.

Since Java 5, the data type char no longer represents a character or Unicode code point, but a UTF-16 code unit, which might be a complete character or a fraction of one. This UTF-16 value is in reality just a 16-bit unsigned integer in the range 0 to 65535 and is cast automatically to an int when used as an array index, just like the other numeric data types such as short or byte. If you really want a Unicode code point, you should use the method codePointAt(int index) instead of charAt(int index). A Unicode code point can be in the range 0 to 1114111 (0x10ffff).
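A short sketch of the difference: a supplementary character such as U+1F64F occupies two UTF-16 code units, so charAt returns only a surrogate half while codePointAt returns the full code point (the class name here is made up for illustration):

```java
public class CodePointDemo {
    public static void main(String[] args) {
        String s = "\uD83D\uDE4F";               // U+1F64F, a character outside the BMP
        System.out.println(s.length());          // 2 -- two UTF-16 code units
        System.out.println((int) s.charAt(0));   // 55357 (0xD83D, the high surrogate only)
        System.out.println(s.codePointAt(0));    // 128591 (0x1F64F, the full code point)
    }
}
```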

How the charAt and codePointAt methods work internally is implementation specific. It is often incorrectly claimed that a String is just a wrapper around an array of chars, but the internal implementation of the String class is not mandated by the language or API specification. Since Java 6, the Oracle VM has used different optimization strategies to save memory and does not always use a plain char array.


2 Comments

So the base point is that when a char is used as an array index, it is automatically cast to an int, and that is how it indirectly represents the ASCII value of the character. Is my understanding correct?
@ChethanSwaroop Yes, assuming you are using ASCII as a genericized brand like "Kleenex". Please stop. It would be more accurate to say "character code" in general or "UTF-16 code unit" specifically. And, as the answer explains, char is not always the complete codepoint or "character" (e.g., "🙏".equals("\uD83D\uDE4F")).

Java represents chars using 16-bit Unicode code points*. There is no conversion to ASCII happening - it's just that the first 128 code points happen to represent the same characters as the corresponding ASCII values.

Java does perform a conversion of char to int in order to make the indexing possible. This is a built-in conversion that happens implicitly, because it is widening. In other words, any value that can be stored in a char can be represented in an int without a loss.

* Java 5 switched to the UTF-16 representation, changing the interpretation of some numbers to "partial characters". chars remained 16-bit unsigned numbers, though.
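A minimal sketch of the implicit widening described above (the class name is made up for illustration):

```java
public class WideningDemo {
    public static void main(String[] args) {
        char ch = 'A';                // code point 65
        int code = ch;                // widening char -> int, no cast required
        System.out.println(code);     // 65

        boolean[] seen = new boolean[128];
        seen[ch] = true;              // ch is widened to int to index the array
        System.out.println(seen[65]); // true -- same slot as seen['A']
    }
}
```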

2 Comments

This has been wrong since the release of Java 5, 13 years ago.
@jarnbjo I added a footnote for that, thank you. It has zero effect on the main point of the answer, though, because OP's main confusion is with suitability of chars for indexing an array.

Java supports automatic widening primitive conversions:

https://docs.oracle.com/javase/specs/jls/se8/html/jls-5.html#jls-5.1.2

How to stop Java from automatically casting a char value to an int?

char to int, long, float, or double
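Per the linked JLS section, char widens implicitly to int, long, float, and double, while the reverse direction needs an explicit cast. A quick sketch (the class name is made up for illustration):

```java
public class CharWidening {
    public static void main(String[] args) {
        char c = 'Z';           // code 90
        int i = c;              // widens to 90
        long l = c;             // widens to 90L
        float f = c;            // widens to 90.0f
        double d = c;           // widens to 90.0
        System.out.println(i + " " + l + " " + f + " " + d);

        // Narrowing the other way (int -> char) requires an explicit cast:
        char back = (char) 90;  // 'Z'
        System.out.println(back);
    }
}
```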

