3

I have a dataframe from CSV file as follows,

         TimeStamp
0        12/7/2017 8:00
1        12/7/2017 7:00
2        12/7/2017 6:00
3        12/7/2017 5:00
4        12/7/2017 4:00
5        12/7/2017 3:00
6        12/7/2017 2:00
7        12/7/2017 1:00
8        12/7/2017 0:00
9       11/7/2017 23:00
10      11/7/2017 22:00
...
9996     3/12/2015 6:00
9997     3/12/2015 5:00
9998     3/12/2015 4:00
9999     3/12/2015 3:00
Name: TimeStamp, Length: 10000, dtype: object

I am trying to use Pandas to read the data from a specific date and time range for example, from 11/7/2017 8:00 to 12/7/2017 8:00.

I have tried using Boolean mask, DatetimeIndex and .Between methods and it read data out of that range and from 2016 and 2015 as well. Here are my codes,

import pandas as pd

eurusd = pd.read_csv('fxhistoricaldata_EURUSD_hour.csv')
eurusd = eurusd[(eurusd['TimeStamp'] >= '11/7/2017 8:00') & 
(eurusd['TimeStamp'] <= '12/7/2017 8:00')]

print(eurusd['TimeStamp'])

or using .between,

eurusd = eurusd[eurusd['TimeStamp'].between('11/7/2017 8:00', '12/7/2017 8:00')]

The results are as such,

2        12/7/2017 6:00
3        12/7/2017 5:00
4        12/7/2017 4:00
5        12/7/2017 3:00
6        12/7/2017 2:00
7        12/7/2017 1:00
8        12/7/2017 0:00
23       11/7/2017 9:00
24       11/7/2017 8:00
513     12/6/2017 23:00
514     12/6/2017 22:00
515     12/6/2017 21:00
516     12/6/2017 20:00
517     12/6/2017 19:00
518     12/6/2017 18:00
519     12/6/2017 17:00
520     12/6/2017 16:00
521     12/6/2017 15:00
522     12/6/2017 14:00
523     12/6/2017 13:00
524     12/6/2017 12:00
525     12/6/2017 11:00
         ...       
8827     12/2/2016 5:00
8828     12/2/2016 4:00
8829     12/2/2016 3:00
Name: TimeStamp, Length: 305, dtype: object

Can anyone help me rectify my problem or are there any function that can help me fulfill my task? Any help is greatly appreciated!

2
  • 2
    convert the timestamp column like eurusd['Timestamp'] = pd.to_datetime(eurusd['Timestamp']) and your solution will work. Commented Jul 17, 2017 at 10:04
  • or use parse_dates=True for read_csv Commented Jul 17, 2017 at 10:07

2 Answers 2

3

You need to Ensure eurusd['TimeStamp'] is a series with dtype datetime64[ns]:

eurusd['TimeStamp'] = pd.to_datetime(eurusd['TimeStamp'])

Create a boolean mask:

mask = (eurusd['TimeStamp'] > start_date) & (eurusd['TimeStamp'] <= end_date)

Now, Re-assign (or whatever you want to do with the output):

eurusd = eurusd.loc[mask]
Sign up to request clarification or add additional context in comments.

Comments

1

You can do this with the datetime and dateutil libraries:

from dateutil import parser
from datetime import datetime

df['TimeStamp'] = df['TimeStamp'].apply(lambda x: parser.parse(x))

df[(df['TimeStamp'] < datetime(2017, 12, 7)) & (df['TimeStamp']> datetime(2015, 3, 12))] #or whichever dates you're looking for

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.