Hi, I want to convert a DataFrame column (string) into dates. Some of the dates were converted correctly, but others are wrong.

df
  Id       Date     Rev
1605380 1/12/2018   3000.0
2237851 27/11/2018  3000.0
1797180 11/2/2018   2000.0
1156126 9/1/2018    2000.0
1205792 8/4/2017    2000.0

df['Date'] = pd.to_datetime(df['Date'])

The output I got

 Id       Date      Rev
1605380 2018-01-12  3000.0
2237851 2018-11-27  3000.0
1797180 2018-11-02  2000.0
1156126 2018-09-01  2000.0
1205792 2017-08-04  2000.0

It seems that when the day is not two digits, to_datetime treats it as the month instead of the day. For example, 1/12/2018 should be 2018-12-01, not 2018-01-12. How can I fix this issue?

I actually only need year and month for the output.

Ideal output

  Id       Date     Rev
1605380 2018-12    3000.0
2237851 2018-11    3000.0
1797180 2018-02    2000.0
1156126 2018-01    2000.0
1205792 2017-04    2000.0
  • You can use the format argument of to_datetime to spell out the exact layout of your dates; here it would be format="%d/%m/%Y". After that you can convert the datetime object to any string, cf. pandas.pydata.org/pandas-docs/stable/generated/…. Commented Jan 18, 2019 at 4:19
  • Note that pandas is trying to be helpful and interprets your date strings as m/d/Y, falling back to alternatives such as d/m/Y only when that fails (as it did for 27/11/2018). @Ascurion's suggestion is the correct way forward, since the format in your case appears to always be '%d/%m/%Y'. Commented Jan 18, 2019 at 4:22

1 Answer

You just need to set the format parameter to '%d/%m/%Y' to state the date format explicitly, as suggested in the comments, or set dayfirst to True. A datetime object always carries year, month, day, and time, so to display just the year and month you'll have to convert back to a string:

df['Date'] = pd.to_datetime(df['Date'], dayfirst=True).dt.strftime('%Y-%m')
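
For reference, here is a minimal self-contained sketch of the explicit-format variant, assuming the column really is day/month/year throughout, as the sample rows in the question suggest. The DataFrame is rebuilt from those rows, and the names `parsed` and `months` are only illustrative:

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "Id": [1605380, 2237851, 1797180, 1156126, 1205792],
    "Date": ["1/12/2018", "27/11/2018", "11/2/2018", "9/1/2018", "8/4/2017"],
    "Rev": [3000.0, 3000.0, 2000.0, 2000.0, 2000.0],
})

# An explicit format leaves no room for guessing: day first, then month.
# %d and %m also accept single-digit values such as "1" or "2".
parsed = pd.to_datetime(df["Date"], format="%d/%m/%Y")

# Alternative to strings: a monthly Period keeps the column sortable as dates
months = parsed.dt.to_period("M")

# Year-month strings, as asked for in the question
df["Date"] = parsed.dt.strftime("%Y-%m")
print(df)
```

With the string version the column is plain text again, so if you still need date arithmetic or resampling later, the `to_period("M")` result may be the more convenient one to keep.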

2 Comments

Hi, thanks for the reply. It still extracts the day as the month for dates whose day is not two digits. For example, 1/12/2018 becomes 2018-01, but it should be 2018-12.
That's odd. It works for me. Is that the case for both dayfirst=True and format='%d/%m/%Y'?
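
One possible explanation, which the thread does not confirm, is that df['Date'] had already been overwritten by the first pd.to_datetime(df['Date']) call. In that case the column holds timestamps rather than strings, and dayfirst or format has nothing left to reinterpret. The sketch below simulates that situation; the variable names are made up for illustration:

```python
import pandas as pd

raw = pd.Series(["1/12/2018", "9/1/2018"])

# Simulate an earlier month-first parse (assumed scenario, not from the thread)
wrong = pd.to_datetime(raw, format="%m/%d/%Y")

# Re-running on a column that is already datetime changes nothing:
# the values are timestamps, so there are no strings left to parse day-first
still_wrong = pd.to_datetime(wrong, dayfirst=True).dt.strftime("%Y-%m")

# Parsing the original strings again gives the intended months
fixed = pd.to_datetime(raw, format="%d/%m/%Y").dt.strftime("%Y-%m")

print(still_wrong.tolist())
print(fixed.tolist())
```

If this is what happened, checking df['Date'].dtype before converting and reloading the original strings should resolve it.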
