
The following servlet does not display the characters correctly. Instead, it shows something like this:  ршншнщ олрршш. Could you help me fix it? I would be very grateful. I am a beginner in Java, so could you please send me code that makes the encoding work? I was advised to use:

.getBytes("UTF-8");

Here's the code:

import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.List;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class servlet extends HttpServlet {
    /**
     * 
     */
    private static final long serialVersionUID = 1L;

    public static List<String> getFileNames(File directory, String extension) {

        List<String> list = new ArrayList<String>();
        File[] total = directory.listFiles();
        for (File file : total) {
            if (file.getName().endsWith(extension)) {
                list.add(file.getName());
            }
            if (file.isDirectory()) {
                List<String> tempList = getFileNames(file, extension);
                list.addAll(tempList);          
            }
        }
        return list;
    }



@SuppressWarnings("resource")
        protected void doPost(HttpServletRequest request, HttpServletResponse response) 
                throws ServletException, IOException{ 
                request.setCharacterEncoding("utf8");
                response.setContentType("text/html; charset=UTF-8");
                String myName = request.getParameter("text");

                List<String> files = getFileNames(new File("C:\\Users\\vany\\Desktop\\test"), "txt");
                for (String string : files) {
                if (myName.equals(string)) {
                       try {
                            File file = new File("C:\\Users\\vany\\Desktop\\test\\" + string);
                            FileReader reader = new FileReader(file);
                            int b;
                            PrintWriter writer = response.getWriter();
                            writer.print("<html>");
                            writer.print("<head>");
                            writer.print("<title>HelloWorld</title>");
                            writer.print("<body>");
                            writer.write("<div>");
                            while((b = reader.read()) != -1) {
                                writer.write((char) b);
                            }
                            writer.write("</div>");
                            writer.print("</body>");
                            writer.print("</html>");

                        } 
                       catch (Exception ex) {

                        }
                    }

                }
               }
        }

I have solved the problem, so this question can be closed. Thanks, everyone. Special thanks to @BalusC; please upvote his answer.

  • Sorry if the formatting is not right. Commented Nov 9, 2012 at 13:06
  • For Java programmers this is probably a no-brainer, but for me as a beginner it is hard, so I am asking you to write the code, please. Commented Nov 9, 2012 at 13:08
  • Is that supposed to be ршншнщ олрршш? Commented Nov 9, 2012 at 13:13
  • Then it means UTF-8 is being misinterpreted as Windows-1251. Check in your browser that the server is sending the header properly. In the Google Chrome developer tools, check the headers in the Network tab. Commented Nov 9, 2012 at 13:17
  • I can't handle it that way: I can't be sure that whoever uses the software will change the encoding, so please help me with the code. Commented Nov 9, 2012 at 13:19

1 Answer


This problem is two-fold.

First, you forgot to set the response encoding, so the response is written using the server platform default encoding. Add the following line before writing any byte or character to the response.

response.setCharacterEncoding(StandardCharsets.UTF_8.name());
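
To see why the writer's encoding matters, here is a minimal sketch that uses only the JDK and no servlet API. An `OutputStreamWriter` over a `ByteArrayOutputStream` stands in for the response writer, and the class name `WriterEncodingDemo` and its helpers `encode` and `hex` are hypothetical names made up for this illustration. It shows that the same three Cyrillic characters become different bytes depending on the charset the writer uses.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WriterEncodingDemo {

    // Hypothetical helper: writes the text through a Writer that uses the
    // given charset and returns the bytes that would be sent to the client.
    static byte[] encode(String text, Charset charset) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(sink, charset)) {
            writer.write(text);
        }
        return sink.toByteArray();
    }

    // Hypothetical helper: formats bytes as space-separated upper-case hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        // "ршн", written with escapes so the source file encoding is irrelevant
        String text = "\u0440\u0448\u043D";

        // UTF-8 can represent Cyrillic: two bytes per character here
        System.out.println(hex(encode(text, StandardCharsets.UTF_8)));      // D1 80 D1 88 D0 BD

        // ISO-8859-1 cannot: each character is replaced by '?' (0x3F)
        System.out.println(hex(encode(text, StandardCharsets.ISO_8859_1))); // 3F 3F 3F
    }
}
```

In the servlet, the charset of the writer returned by `response.getWriter()` plays the role of the `charset` argument here, which is why it has to be set before the writer is obtained.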

Second, you're reading the file using server platform default encoding.

Reader reader = new FileReader(file);

You should be reading the file using an explicitly specified encoding that matches the encoding actually used by the text file itself. This can be done with the help of InputStreamReader.

Reader reader = new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8);
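
As a rough, self-contained sketch of the difference (the class name `ExplicitCharsetRead` and the helper `readAll` are hypothetical, and a temporary file stands in for the text file on disk), the following writes a UTF-8 file and reads it back once with the matching charset and once with a deliberately wrong one. The try-with-resources block also closes the reader, which the original code left open.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ExplicitCharsetRead {

    // Hypothetical helper: reads the whole file character by character,
    // decoding its bytes with the given charset.
    static String readAll(File file, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new FileInputStream(file), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "ршншнщ", written with escapes so the source file encoding is irrelevant
        String text = "\u0440\u0448\u043D\u0448\u043D\u0449";

        // Create a temporary file that is known to be UTF-8 encoded
        Path path = Files.createTempFile("encoding-demo", ".txt");
        path.toFile().deleteOnExit();
        Files.write(path, text.getBytes(StandardCharsets.UTF_8));

        // Matching charset: the original text comes back
        System.out.println(text.equals(readAll(path.toFile(), StandardCharsets.UTF_8)));      // true

        // Mismatched charset: every byte becomes its own (wrong) character
        System.out.println(text.equals(readAll(path.toFile(), StandardCharsets.ISO_8859_1))); // false
    }
}
```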

See also:


Unrelated to the concrete problem, HTML code doesn't belong in a servlet. It belongs in a JSP. Continue here to learn how to deal with it: Generate an HTML Response in a Java Servlet.


6 Comments

Thank you, and where do I enter it?
I believe it's just the content-type header being sent; ршншнщ олрршш are already the correct bytes, just misinterpreted.
@Esailija: the content type header doesn't tell the server what encoding to use to write the response. It tells the client what encoding to use to read the response. But if the server itself didn't write the response in UTF-8 at all, then the client would of course misinterpret it, which is exactly what is happening here. The server should have been told to write the response in UTF-8. The content type header should of course be kept there; I never said to remove it. I recommend reading the "See also" link to understand this better.
@BalusC the server is writing the response п»ї СЂС€РЅС€РЅС‰ РѕР»СЂСЂС€С€; in other words, the raw bytes in the response are EF BB BF 20 D1 80 D1 88 D0 BD D1 88 D0 BD D1 89 20 D0 BE D0 BB D1 80 D1 80 D1 88 D1 88. When these are interpreted as UTF-8, one will see ршншнщ олрршш in the browser. When they are interpreted as windows-1251, one sees п»ї СЂС€РЅС€РЅС‰ РѕР»СЂСЂС€С€. So the browser was interpreting them as windows-1251, which means the content-type header must have been wrong, and setting it to UTF-8 is enough to fix it.
@Esailija: Those garbled characters are in the first place not caused by a wrong response encoding, but by a wrong FileReader encoding. It is reading a UTF-8 file using the platform default encoding (which in turn is also written to the response using the platform default encoding). If the FileReader encoding had been fixed, but the response encoding had not, then you would have seen the problem the other way round.
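
The byte sequence quoted in the comments can be decoded both ways with a few lines of plain Java. This is only a sketch: the class name `DecodeBytesDemo` and its `decode` helper are hypothetical, and it assumes the JDK in use ships the `windows-1251` charset, which is not one of the charsets every Java platform is required to support.

```java
import java.nio.charset.Charset;

public class DecodeBytesDemo {

    // The raw response bytes quoted in the comment thread
    static final byte[] RESPONSE_BYTES = {
        (byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 0x20,
        (byte) 0xD1, (byte) 0x80, (byte) 0xD1, (byte) 0x88, (byte) 0xD0, (byte) 0xBD,
        (byte) 0xD1, (byte) 0x88, (byte) 0xD0, (byte) 0xBD, (byte) 0xD1, (byte) 0x89,
        0x20,
        (byte) 0xD0, (byte) 0xBE, (byte) 0xD0, (byte) 0xBB, (byte) 0xD1, (byte) 0x80,
        (byte) 0xD1, (byte) 0x80, (byte) 0xD1, (byte) 0x88, (byte) 0xD1, (byte) 0x88
    };

    // Hypothetical helper: interprets the same bytes using the named charset.
    static String decode(String charsetName) {
        return new String(RESPONSE_BYTES, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Decoded as UTF-8: a byte order mark, a space, then the Cyrillic text
        System.out.println(decode("UTF-8"));

        // Decoded as windows-1251: one character per byte, i.e. the garbled form.
        // Assumption: this charset is available in the running JDK.
        System.out.println(decode("windows-1251"));
    }
}
```

What the console actually shows depends on the terminal's own encoding, so comparing the resulting strings in code is more reliable than reading the printed output.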