1

I want to read .csv files that contain special characters (Polish language).

I'm using ExcelJS to read the .csv file:

    var workbook = new Excel.Workbook();
    workbook.csv.readFile(uploadsPath + "/" + filename, {delimiter: ';'})
        .then(function () {
            var worksheet = workbook.getWorksheet(1);

            console.log(worksheet.getRow(3).getCell(7).value);
        });

With this code I'm getting "Wroc�aw" instead of "Wrocław".

I tried setting the encoding option:

    var workbook = new Excel.Workbook();
    workbook.csv.readFile(uploadsPath + "/" + filename, {encoding: 'utf-16le'})
        .then(function () {
            var worksheet = workbook.getWorksheet(1);

            console.log(worksheet.getRow(3).getCell(7).value);
        });

But then I'm getting this error:

    TypeError [ERR_INVALID_ARG_TYPE]: The "buf" argument must be one of type Buffer, TypedArray, or DataView. Received type object

How can I deal with this?

4
  • In the second variant, shouldn't the encoding be utf16le? And shouldn't the delimiter also be included in the options? Commented Feb 7, 2019 at 23:36
  • I'm not sure that part of the word would have decoded properly if the file were UTF-16 and the first example read it as UTF-8. Could it be an ANSI encoding for Polish, like Windows-1250 or ISO-8859-2? If so, you may need a decoder like iconv-lite. Commented Feb 7, 2019 at 23:48
  • @vsemozhetbyt, with utf16le and the delimiter the error is the same. I just tried iconv-lite with the ANSI encodings and the results are: Wroc�aw for Windows-1250, and Wroc?aw for ISO-8859-2. Commented Feb 8, 2019 at 0:10
  • Maybe try other ones with this letter? en.wikipedia.org/wiki/%C5%81#Computer_usage Commented Feb 8, 2019 at 0:32

2 Answers

1

OK, I found a simple solution.

I created this function:

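    var fs = require("fs");
    var iconv = require("iconv-lite");

    // Re-encode the file in place from Windows-1250 to UTF-8.
    function changeEncoding(path) {
        var buffer = fs.readFileSync(path);
        var output = iconv.encode(iconv.decode(buffer, "win1250"), "utf-8");
        fs.writeFileSync(path, output);
    }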

I simply read the file and, with the help of iconv-lite, first decode it from win1250 and then save it back with utf-8 encoding. After that, the file can be read as UTF-8.
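For anyone wondering why the original read produced "Wroc�aw", here is a minimal sketch of what is probably going on, assuming the file really is Windows-1250 as the fix above suggests. It also shows an alternative that decodes in memory instead of overwriting the file. Note that this swaps iconv-lite for Node's built-in TextDecoder, which only knows "windows-1250" when Node is built with full ICU (an assumption about your runtime). The helper name readWin1250 is made up for illustration.

```javascript
const fs = require("fs");

// Hypothetical helper (not part of ExcelJS or iconv-lite):
// reads a file and decodes its bytes as Windows-1250 without modifying it.
// Assumption: Node has full ICU; otherwise the TextDecoder constructor
// throws a RangeError for "windows-1250".
function readWin1250(path) {
    const buffer = fs.readFileSync(path);
    return new TextDecoder("windows-1250").decode(buffer);
}

// "Wrocław" as Windows-1250 bytes: "ł" is the single byte 0xB3.
const bytes = Buffer.from([0x57, 0x72, 0x6f, 0x63, 0xb3, 0x61, 0x77]);

// Interpreted as UTF-8, a lone 0xB3 is invalid and is replaced by U+FFFD,
// which is the "�" seen in the question.
const asUtf8 = bytes.toString("utf8");

// Decoded with the matching code page, the letter survives.
const asWin1250 = new TextDecoder("windows-1250").decode(bytes);

console.log(JSON.stringify(asUtf8), asWin1250);
```

The in-memory version leaves the uploaded file untouched, which may matter if the same file is processed more than once: running changeEncoding twice on the same path would decode already-converted UTF-8 bytes as win1250 and garble the text.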

0

First, I think ł can be represented in UTF-8.

Try printing it in the browser; it may be the console that makes it look like this.

2 Comments

I'm inserting it into the database and it's the same problem.
Is the DB utf-8?
