pandas-read_excel syntax error [duplicate]

Question

I'm trying to read data from the second sheet of an Excel spreadsheet, skipping the first 18 rows and keeping only columns C to F. This is what I have tried:

import pandas as pd

new_file=pd.read_excel("C:\Users\denis\Documents\Dissertation\Raw Data\CO\1213Q1.xls",sheetname=1, skiprows=18, parse_cols=[2,5])

When I run this, I get the following error:

runfile('C:/Users/denis/Documents/Dissertation/Code/test.py', wdir='C:/Users/denis/Documents/Dissertation/Code')
  File "C:/Users/denis/Documents/Dissertation/Code/test.py", line 9
    new_file=pd.read_excel("C:\Users\denis\Documents\Dissertation\Raw Data\CO\1213Q1.xls",sheetname=1, skiprows=18, parse_cols=[2,5])
                          ^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

Does anyone know what could be causing this?

Does this solve your problem? Maybe you need to add r before the filename: pd.read_excel(r"C:\Users\denis\Documents\Dissertation\Raw Data\CO\1213Q1.xls",..)
– harpan, Commented Jun 15, 2018 at 20:51

rafaelc · Accepted Answer · 2018-06-15 20:53:24Z

4

You have to either escape the backslashes or put r in front of the string to make it a raw string, i.e.

new_file=pd.read_excel(r"C:\Users\denis\Documents\Dissertation\Raw Data\CO\1213Q1.xls",sheetname=1, skiprows=18, parse_cols=[2,5])

or

new_file=pd.read_excel("C:\\Users\\denis\\Documents\\Dissertation\\Raw Data\\CO\\1213Q1.xls",sheetname=1, skiprows=18, parse_cols=[2,5])
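If it helps to see why this works, here is a small standard-library sketch (no pandas involved). It compiles a snippet containing the non-raw literal to reproduce the error, then checks that the raw and escaped spellings are the same string. The forward-slash variant and the names bad_source, raw_path, escaped_path and forward_path are my own additions for illustration, not something from the question.

```python
from pathlib import PureWindowsPath

# Source text of an assignment that uses an ordinary (non-raw) literal.
# The r prefix on this outer string keeps the backslashes intact in the text.
bad_source = r'path = "C:\Users\denis"'

try:
    compile(bad_source, "<example>", "exec")
    error_message = None
except SyntaxError as exc:
    # \U starts an eight-hex-digit Unicode escape, and "sers\den" is not hex.
    error_message = str(exc)

print(error_message)

# Two spellings of the same string, plus a forward-slash variant.
raw_path = r"C:\Users\denis\Documents\Dissertation\Raw Data\CO\1213Q1.xls"
escaped_path = "C:\\Users\\denis\\Documents\\Dissertation\\Raw Data\\CO\\1213Q1.xls"
forward_path = "C:/Users/denis/Documents/Dissertation/Raw Data/CO/1213Q1.xls"

same_string = raw_path == escaped_path
same_path = PureWindowsPath(raw_path) == PureWindowsPath(forward_path)
print(same_string, same_path)
```

The raw and escaped literals compare equal as plain strings. The forward-slash version is a different string, but it compares equal once both are parsed as Windows paths.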


Tomas Farias · Accepted Answer · 2018-06-15 21:08:06Z

1

Take a look at this question: "Unicode Error 'unicodeescape' codec can't decode bytes... Cannot open text files in Python 3".

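I suggest not passing a str as the first argument and letting pathlib.Path handle the path instead. Note that the string literal given to Path still needs the r prefix (or escaped backslashes), otherwise you get the same SyntaxError. Also, the docs say that sheetname and parse_cols are deprecated (in favor of sheet_name and usecols) and that skiprows should be list-like.

from pathlib import Path
import pandas as pd

# raw string, so \U is not treated as the start of a Unicode escape
p = Path(r'C:\Users\denis\Documents\Dissertation\Raw Data\CO\1213Q1.xls')
df = pd.read_excel(
    p,
    sheet_name=1,  # second sheet (0-indexed)
    skiprows=list(range(18)),  # skip first 18 rows (0-indexed)
    usecols=list(range(2, 6))  # only parse columns 2 (C) to 5 (F)
)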
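To see what the two column lists actually select, here is a rough sketch in plain Python (no pandas needed). column_letters is a hypothetical helper I made up for illustration, not part of pandas, and it only handles columns A to Z.

```python
from string import ascii_uppercase


def column_letters(indices):
    """Map 0-based column indices to Excel letters (only valid for A..Z)."""
    return [ascii_uppercase[i] for i in indices]


two_columns = column_letters([2, 5])        # the list used in the question
four_columns = column_letters(range(2, 6))  # the list used in this answer
skipped_rows = list(range(18))              # 0-based rows 0 through 17

print(two_columns)
print(four_columns)
print(skipped_rows[0], skipped_rows[-1], len(skipped_rows))
```

So a list of ints names individual columns rather than a start and an end, which is why [2,5] from the question would give only C and F instead of the range C to F.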
