0

When I read an entire XML file into a JEditorPane, everything works fine except for the BOM character: I get a BOM character (shown as a dot) at the start of the file. If I remove the dot and save the file, it is saved as ANSI; Notepad++ shows the encoding of that file as "ANSI as UTF-8". If I don't remove the dot, the XML parser fails to parse the document. Can you help me with this? Thanks.

1
  • Which parser? Have you defined the content type on the JEditorPane? Commented Oct 18, 2010 at 6:26

4 Answers

1

If your XML file contains only ASCII characters, it is valid ASCII/ANSI as well as valid UTF-8, so don't worry about Notepad++ recognizing the file as ANSI.

While you can use a BOM with UTF-8, it is discouraged because it breaks a lot of Unix programs, so you really shouldn't add one.


1

Continue using UTF-8 without a BOM. In EditPlus, go to the menu Document -> File Encoding -> Change File Encoding, then choose UTF-8.

0

Using the -D option of the java command, set the system property file.encoding, as suggested in this answer.

java -Dfile.encoding=utf-8
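One caveat, as far as I know: file.encoding only changes the default charset, and Java's UTF-8 decoder does not drop a leading BOM by itself. The sketch below decodes a few hand-written bytes with an explicit UTF-8 charset to show that the BOM survives as the character U+FEFF. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the question's code.
final class ExplicitUtf8Decode {

    // Decode with an explicit charset instead of relying on file.encoding.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // EF BB BF is the UTF-8 BOM, followed by the ASCII text "<a/>".
        byte[] bytes = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};
        String text = decodeUtf8(bytes);

        System.out.println(text.length());               // prints 5
        System.out.println(text.charAt(0) == '\uFEFF');  // prints true
    }
}
```

So even with the right charset, the first character of the decoded text can still be the BOM, and it has to be removed separately before the text reaches the XML parser.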

0

Problem:

UTF-8 does not need a BOM, so many programs don't expect one and fail to parse or handle it. As far as I know, only some Microsoft programs insert it, to detect the UTF-8 encoding faster.

Solution:

  • Remove the BOM; nobody needs it.
  • Don't use buggy editors with non-standard encodings. (my opinion)
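If the file can't be fixed at the source, removing the BOM in code is one option. A minimal sketch, assuming the file content is already in a String (for example the text loaded into the JEditorPane); BomStripper and stripBom are hypothetical names, not an existing API.

```java
// Hypothetical helper, not part of any library.
final class BomStripper {

    // After decoding, the byte order mark shows up as the single char U+FEFF.
    static final char BOM = '\uFEFF';

    // Returns the text without a leading BOM; other input is returned unchanged.
    static String stripBom(String text) {
        if (text != null && !text.isEmpty() && text.charAt(0) == BOM) {
            return text.substring(1);
        }
        return text;
    }

    public static void main(String[] args) {
        String withBom = BOM + "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root/>";
        String clean = stripBom(withBom);

        System.out.println(withBom.startsWith("<?xml")); // prints false
        System.out.println(clean.startsWith("<?xml"));   // prints true
    }
}
```

Pass the cleaned string to the XML parser instead of the raw text, so the document starts directly with the XML declaration.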
