I'm losing my mind here. I've read and tried so many things that I'm totally lost. I don't usually use Python, and I'm trying to update some existing code. Previously, the CSV files did not contain any special characters (like "é"), and now they do. The current code raises a UnicodeEncodeError:

    try:
        self.FichierE = codecs.open(self.CheminFichierE,"r", "utf-8")
        self.ReaderFichierE = csv.reader(self.FichierE, delimiter=';')
    except IOError:
        self.TextCtrl.AppendText(u"Fichier E n'a pas été trouvé")
        return

    try:
        DataFichierE = [ligne for ligne in self.ReaderFichierE]
    except UnicodeDecodeError:
        self.TextCtrl.AppendText(self.NomFichierE+ u" n'est pas lisible")
        return
    except UnicodeEncodeError:
        self.TextCtrl.AppendText(self.NomFichierE+ u" n'est pas lisible (ASCII)")
        return

I've tried so many things; I'll just post my last attempt, which I thought should work:

    try:
        DataFichierE = []
        for utf8_row in self.ReaderFichierE:
            unicode_row = [x.decode('utf8') for x in utf8_row]
            DataFichierE.append(unicode_row)
    except UnicodeDecodeError:
        self.TextCtrl.AppendText(self.NomFichierE+ u" n'est pas lisible")
        return
    except UnicodeEncodeError:
        self.TextCtrl.AppendText(self.NomFichierE+ u" n'est pas lisible (ASCII)")
        return

Any help would be much appreciated!

1 Answer

You can try using pandas.

import pandas
# Pass the path directly so that pandas opens the file and applies the encoding itself.
data = pandas.read_csv('myfile.csv', encoding='utf-8', quotechar='"', delimiter=';')
print(data.values)

Or unicodecsv:

import unicodecsv
# unicodecsv expects a byte stream, so open the file in binary mode.
myfile = open('myfile.csv', 'rb')
data = unicodecsv.reader(myfile, encoding='utf-8', delimiter=';')
for row in data:
    print(row)

You may be able to install them using pip:

pip install pandas

pip install unicodecsv
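
If moving to Python 3 is an option (an assumption on my part, since the code in the question looks like Python 2), the standard csv module reads text directly and no extra package is needed. A minimal sketch; the helper name read_csv_rows is hypothetical:

```python
import csv
import io


def read_csv_rows(path, delimiter=";", encoding="utf-8"):
    """Read a delimited text file into a list of rows (Python 3 only)."""
    # newline="" lets the csv module handle line endings inside quoted fields.
    with io.open(path, "r", encoding=encoding, newline="") as handle:
        return [row for row in csv.reader(handle, delimiter=delimiter)]
```

Each row comes back as a list of str, so accented characters such as "é" need no further decoding.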

Depending on your needs, you may also try simple string operations:

# Skip the header row, then split each line on ';' (quoted fields are not handled).
data = [line.strip().split(';') for line in open('./foo.csv').readlines()[1:]]
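
The one-liner above leaves the file open and does not decode anything. Here is a slightly longer sketch of the same idea that decodes the file as UTF-8 using io.open, which exists on both Python 2.6+ and Python 3. The function name and the skip_header flag are my own additions, not something from the question:

```python
import io


def read_semicolon_rows(path, encoding="utf-8", skip_header=True):
    """Split each line of a ';'-separated file into a list of unicode strings."""
    rows = []
    with io.open(path, "r", encoding=encoding) as handle:
        for index, line in enumerate(handle):
            if skip_header and index == 0:
                continue  # drop the header row, like the `i != 0` test above
            line = line.strip()
            if line:
                # Plain split: a quoted field containing ';' is NOT handled.
                rows.append(line.split(";"))
    return rows
```

This is only reasonable when no field is quoted or contains the delimiter; otherwise a real CSV parser is the safer choice.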

Update: You can also try replacing non-ASCII characters with their closest ASCII equivalents (note that the accents are lost):

from StringIO import StringIO
import codecs
import unicodedata

...

    try:
        self.FichierE =  StringIO(
            unicodedata.normalize(
                'NFKD', codecs.open(self.CheminFichierE, "r", "utf-8").read()
            ).encode('ascii', 'ignore'))
        self.ReaderFichierE = csv.reader(self.FichierE, delimiter=';')

    except IOError:
        self.TextCtrl.AppendText(u"Fichier E n'a pas été trouvé")
        return

    try:
        DataFichierE = [ligne for ligne in self.ReaderFichierE]
    except UnicodeDecodeError:
        self.TextCtrl.AppendText(self.NomFichierE+ u" n'est pas lisible")
        return
    except UnicodeEncodeError:
        self.TextCtrl.AppendText(self.NomFichierE+ u" n'est pas lisible (ASCII)")
        return
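
To see what the normalize/encode step in the update does on its own, here is a small standalone sketch. The helper name to_ascii is hypothetical. NFKD splits an accented letter into its base letter plus a combining mark, and encoding to ASCII with 'ignore' then drops the mark:

```python
import unicodedata


def to_ascii(text):
    """Return `text` with accents stripped and other non-ASCII characters removed."""
    # NFKD decomposes u"\u00e9" (é) into u"e" followed by a combining acute accent.
    decomposed = unicodedata.normalize("NFKD", text)
    # 'ignore' silently drops every code point that has no ASCII encoding.
    return decomposed.encode("ascii", "ignore").decode("ascii")
```

With this, "trouvé" becomes "trouve". Keep in mind that the conversion is lossy: a character with no decomposition, such as "ø", disappears entirely rather than being replaced.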