When I read the entire XML file into a JEditorPane, everything works fine except for the BOM character: I get a BOM character (shown as a dot) at the start of the file. If I remove the dot and save the file, it is saved as ANSI; Notepad++ shows the encoding as "ANSI as UTF-8" for the same file. If I don't remove the dot, the XML parser fails to parse the document. Can you help me with this? Thanks.
4 Answers
Using the -D option of the java command, set the system property file.encoding, as suggested in this answer.
java -Dfile.encoding=utf-8
Problem:
UTF-8 does not need a BOM, so many programs don't expect one and fail to parse or handle it. As far as I know, only some Microsoft programs insert it, to detect the UTF-8 encoding faster.
Solution:
- Remove the BOM; nobody needs it.
- Don't use buggy editors with non-standard encodings. (=> my opinion)
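If you can't control how the file was saved, one way to "remove the BOM" is to do it in code before the text reaches the JEditorPane or the XML parser. Here is a minimal sketch, assuming the file really is UTF-8 and that you read it yourself; the class and method names (BomStripper, stripBom, readUtf8WithoutBom) are made up for illustration and are not part of any library. Java's UTF-8 decoder keeps the BOM bytes EF BB BF as the character U+FEFF, so it is enough to drop that character if it comes first.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, named here only for illustration.
class BomStripper {
    // U+FEFF is what the UTF-8 BOM bytes EF BB BF decode to.
    private static final char BOM = '\uFEFF';

    // Returns the text without a leading BOM character, if there is one.
    static String stripBom(String text) {
        if (!text.isEmpty() && text.charAt(0) == BOM) {
            return text.substring(1);
        }
        return text;
    }

    // Reads the whole file as UTF-8 (assumed encoding) and drops a leading BOM.
    static String readUtf8WithoutBom(Path file) throws IOException {
        byte[] bytes = Files.readAllBytes(file);
        return stripBom(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

The string returned by readUtf8WithoutBom can then be passed to setText on the editor pane or wrapped in a StringReader for the parser. Decoding with an explicit charset also means the result does not depend on the platform default encoding.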