I am reading a DataFrame directly from a database using pandas.io.sql.read_frame:
cnx = pandas.io.sql.connect(host='srv',user='me',password='pw',database='db')
df = pandas.io.sql.read_frame('sql_query',cnx)
It retrieves the data nicely. But I would like to parse one of the columns as a datetime64, akin to what can be done when reading from a CSV file, e.g.:
df2 = pandas.read_csv(csv_file, parse_dates=[0])
But there is no parse_dates flag for read_frame. What alternative approach is recommended?
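In the absence of a parse_dates flag, one workaround is to convert the column after reading. Here is a minimal sketch; the column name 'date_col' and the sample data are hypothetical stand-ins for what read_frame would return:

```python
import pandas as pd

# Hypothetical frame standing in for the result of read_frame,
# where the datetime column came back as plain strings
df = pd.DataFrame({'date_col': ['2013-01-01', '2013-01-02'],
                   'value': [1, 2]})

# Parse the column to datetime64 after the fact
df['date_col'] = pd.to_datetime(df['date_col'])
print(df['date_col'].dtype)  # datetime64[ns]
```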
The same question applies to the index_col argument of read_csv, which indicates which column should be the index. Is there a recommended way to do this with read_frame?
The pandas.io.sql module is still a work in progress, particularly the detection of specific datatypes; I expect an upcoming version will contain big improvements. You can catch up on some recent discussion here: github.com/pydata/pandas/issues/1662 and here: github.com/pydata/pandas/issues/2717

Comment: For me the datetime column comes back as pd.tslib.Timestamp objects. And there is an index_col argument for read_frame. Are you using the latest stable release of pandas?

Comment (asker): I tried index_col=[0], as I do with pandas.read_csv, and it failed with KeyError: u'no item named 0'. After reading your comment, I tried index_col=[key_name_string] instead, and it worked. Also, as the required index column is a datetime, pandas now correctly identifies the DataFrame as having a DatetimeIndex. So my problem is solved, thank you! However, before I set the column as the index, the datetime type was not parsed correctly, so a parse_dates argument for pandas.io.sql.read_frame would still be great.
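The name-vs-position distinction above can be reproduced without a database. A sketch using DataFrame.set_index on placeholder data ('event_time' is a hypothetical key column name): a positional integer fails with a KeyError, while the column name works and, since the column is a datetime, produces a DatetimeIndex:

```python
import pandas as pd

# Placeholder frame; 'event_time' stands in for the real key column
df = pd.DataFrame({'event_time': pd.to_datetime(['2013-01-01', '2013-01-02']),
                   'value': [1, 2]})

# A positional index fails: columns are keyed by name, not position
try:
    df.set_index(0)
    failed = False
except KeyError:
    failed = True

# The column *name* works, and the datetime column yields a DatetimeIndex
out = df.set_index('event_time')
print(failed, type(out.index).__name__)  # True DatetimeIndex
```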