I've read a CSV file into Python, and many of its entries have the value \N. I need to replace all of those instances with 'NaN'.

I've gotten the file to read in correctly, but I get an error when I try to replace the \Ns.

import pandas as pd

df = pd.read_csv(r'file.csv')

df.replace('\N', 'NaN')

File "<ipython-input-63-a631ab1f5217>", line 3
    df.replace('\N', 'NaN')
              ^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: malformed \N character escape
  • Is that a typo, or do you have an r before the path to your CSV file? Also, maybe use np.nan instead of 'NaN'. Commented Oct 10, 2019 at 21:42
  • If it's really \N, use replace('\\N', ...)? Commented Oct 10, 2019 at 21:42
  • Adding the extra backslash worked, thank you! Commented Oct 10, 2019 at 21:44
  • pandas read_csv accepts a parameter called na_values, which can be used to treat any string as NaN directly when reading the file. You could do: df = pd.read_csv(r'file.csv', na_values='\\N') (the backslash has to be doubled here too) and drop the df.replace call. Commented Oct 10, 2019 at 21:47

2 Answers

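Python uses the backslash to introduce escape sequences such as newlines, tabs, and quotes. '\N' in particular starts a named Unicode escape (for example '\N{BULLET}'), which is why a bare '\N' raises the "malformed \N character escape" error. To put a literal backslash in a string, write it as a double backslash, or use a raw string such as r'\N':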

df = df.replace('\\N', 'NaN')  # replace returns a new DataFrame, so assign the result
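As a rough illustration of the escaping rule, the sketch below uses only the standard library. The variable names are made up for the example, and compile() is used just to show the syntax error without stopping the script.

```python
# '\N' begins a named Unicode escape, which must be followed by {NAME}.
bullet = "\N{BULLET}"  # a valid named escape: the bullet character

# Two equivalent ways to write a backslash followed by N.
escaped = "\\N"  # doubled backslash
raw = r"\N"      # raw string: backslashes are taken literally

# Compiling source that contains a bare '\N' reproduces the error
# from the question as a SyntaxError.
try:
    compile(r"s = '\N'", "<example>", "exec")
    bare_is_valid = True
except SyntaxError:
    bare_is_valid = False

print(escaped == raw, len(escaped), bare_is_valid)
```

Either spelling can be passed to df.replace; the raw string is often easier to read when the pattern contains several backslashes.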

Pass the na_values="\\N" parameter to read_csv:

df = pd.read_csv('file.csv', na_values="\\N")

The double backslash is needed so that the string contains a literal backslash.
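A minimal sketch of this approach, assuming a small in-memory CSV in place of file.csv. The column names and values are invented for the example.

```python
import io

import pandas as pd

# Hypothetical stand-in for file.csv; "\\N" here is the literal text \N.
csv_text = "id,score\n1,\\N\n2,7\n3,\\N\n"

# na_values tells read_csv to treat the string \N as missing while parsing.
df = pd.read_csv(io.StringIO(csv_text), na_values="\\N")

print(df)
print(df["score"].isna().tolist())
```

Because the missing values are recognized at parse time, the score column comes back as a numeric column with real NaN entries rather than strings, so no separate df.replace call is needed.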
