0

I am having an issue trying to use a code for converting a file into csv. I am using the code below as a start

directory = 'C:\OI Data'

filename = 'OpenInterest08-24-16'

data_xls = pd.read_excel(os.path.join(directory,filename), 'Sheet1',    index_col=None)
    data_xls.to_csv(os.path.join(directory,filename +'.csv'), encoding='utf-8')

and I am getting the following error:

Traceback (most recent call last):

File "", line 1, in

File "C:\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 714, in runfile execfile(filename, namespace)

File "C:\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 74, in execfile exec(compile(scripttext, filename, 'exec'), glob, loc)

File "C:/Users/Public/Documents/Python Scripts/work.py", line 26, in data_xls = pd.read_excel(os.path.join(directory,filename), 'Sheet1', index_col=None)

File "C:\Anaconda2\lib\site-packages\pandas\io\excel.py", line 170, in read_excel io = ExcelFile(io, engine=engine)

File "C:\Anaconda2\lib\site-packages\pandas\io\excel.py", line 227, in init self.book = xlrd.open_workbook(io)

File "C:\Anaconda2\lib\site-packages\xlrd__init__.py", line 441, in open_workbook ragged_rows=ragged_rows,

File "C:\Anaconda2\lib\site-packages\xlrd\book.py", line 91, in open_workbook_xls biff_version = bk.getbof(XL_WORKBOOK_GLOBALS)

File "C:\Anaconda2\lib\site-packages\xlrd\book.py", line 1230, in getbof bof_error('Expected BOF record; found %r' % self.mem[savpos:savpos+8])

File "C:\Anaconda2\lib\site-packages\xlrd\book.py", line 1224, in bof_error raise XLRDError('Unsupported format, or corrupt file: ' + msg) xlrd.biffh.XLRDError: Unsupported format, or corrupt file: Expected BOF record; found '\n\n\n\n\n '

I am struggling to figure out the file format I am using https://www.theice.com/marketdata/reports/icefuturesus/PreliminaryOpenInterest.shtml?futuresExcel=&tradeDate=8%2F24%2F16

opening the file myself I get the following enter image description here

I am still a beginner at python and some help would be much appreciated.

Thanks

3
  • Is your file an xls or xlsx file? pandas will choose the appropriate parser depending on the extension and there doesn't appear to be one... Commented Sep 6, 2016 at 11:33
  • This is problematic: data_xls.to_csv(os.path.join(directory,filename,'.csv'), encoding='utf-8')... Commented Sep 6, 2016 at 11:33
  • Try instead: data_xls.to_csv(os.path.join(directory,filename+'.csv'), encoding='utf-8') Commented Sep 6, 2016 at 11:34

1 Answer 1

1

You can start by fixing this part:

data_xls.to_csv(os.path.join(directory,filename,'.csv'), encoding='utf-8')

What happens when you do that is:

'C:\OI Data\\OpenInterest08-24-16\\.csv'

Which is not what you want. Instead do:

os.path.join(directory,filename+'.csv')

Which will give you:

'C:\OI Data\\OpenInterest08-24-16.csv'

Also, this is not a problem here, but in general be careful with this because a single backslash and a character can indicate an escape sequence, e.g. \n is a newline:

directory = 'C:\OI Data'

Instead escape the backslash like so:

directory = 'C:\\OI Data'
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.