How can I store datetimes in a pd.DataFrame?

import pandas as pd
from datetime import datetime

df = pd.DataFrame({"a": ['2002-02-02', '2002-02-03', '2002-02-04']})
df["b"] = df["a"].apply(lambda t: datetime.strptime(t, '%Y-%m-%d'))  # datetime.strptime returns datetime.datetime
print(datetime(2002, 2, 2) in df["b"])

outputs False.

Similarly,

df["c"] = df["b"].apply(lambda t: t.to_pydatetime())
print(datetime(2002, 2, 2) in df["c"])

outputs False.

Note that neither this nor this works. Following any of those approaches, I end up with Timestamps instead of datetimes in the data frame.

I am using Python 3.8.5 and Pandas 1.2.1.

  • datetime(2002, 2, 2) in list(df['b'])? Commented Jul 9, 2021 at 10:06
  • @Epsi95 True. However, that means that I have to convert this every time. Commented Jul 9, 2021 at 10:10
  • @Epsi95 But datetime(2002, 2, 2) in list(pd.to_datetime(df['b']).unique()) is again False. Commented Jul 9, 2021 at 10:20
  • datetime(2002, 2, 2) in list(pd.to_datetime(pd.to_datetime(df['b']).unique())) Commented Jul 9, 2021 at 10:54
  • The title of the question confuses me; in pandas you'll want to work with the built-in datatype (datetime64 from numpy). Note that pandas will auto-convert Python standard library datetime objects to its built-in datatype. Only if you have a pd.Series of type datetime.date or datetime.time will the type not be modified. Commented Jul 9, 2021 at 11:14

1 Answer

After all your manipulations, you can see that every series of datetime objects is automatically converted to timestamps when it is added to the dataframe:

>>> df
            a          b          c
0  2002-02-02 2002-02-02 2002-02-02
1  2002-02-03 2002-02-03 2002-02-03
2  2002-02-04 2002-02-04 2002-02-04
>>> df.dtypes
a            object
b    datetime64[ns]
c    datetime64[ns]
dtype: object
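
One detail that may explain the False results, sketched here under the assumption that the frame uses the default integer index: the `in` operator on a Series tests the index labels, not the values. The variable names below are only illustrative.

```python
import pandas as pd
from datetime import datetime

df = pd.DataFrame({"a": ["2002-02-02", "2002-02-03", "2002-02-04"]})
df["b"] = pd.to_datetime(df["a"])  # a datetime64 column like the one above

# `in` on a Series looks at the index labels (0, 1, 2 here), not the values.
label_found = 0 in df["b"]                     # 0 is an index label
value_found = datetime(2002, 2, 2) in df["b"]  # a datetime is not an index label

# Converting the values to a list makes `in` compare against the values;
# a Timestamp compares equal to the matching datetime.
in_list = datetime(2002, 2, 2) in df["b"].to_list()

print(label_found, value_found, in_list)
```

So the conversion to Timestamp is not what makes the membership test fail; the test is simply looking in the wrong place.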

I suggest you use the built-in pandas datetime handling; it is not much harder than working with Python datetime objects:

>>> pd.Timestamp(2002, 2, 2) in df['b'].to_list()
True
>>> df['b'].eq(pd.Timestamp(2002, 2, 2))
0     True
1    False
2    False
Name: b, dtype: bool
>>> df['b'].eq(pd.Timestamp(2002, 2, 2)).any()
True
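
If you would rather not construct a pd.Timestamp yourself, the same checks should also accept a plain Python datetime, since pandas compares it against the column values for you. A minimal sketch (the names `target`, `eq_any` and `isin_mask` are just illustrative):

```python
import pandas as pd
from datetime import datetime

df = pd.DataFrame({"a": ["2002-02-02", "2002-02-03", "2002-02-04"]})
df["b"] = pd.to_datetime(df["a"])

target = datetime(2002, 2, 2)

# Element-wise equality against a plain datetime, reduced to a single bool.
eq_any = bool(df["b"].eq(target).any())

# isin() checks several candidate datetimes at once and returns a boolean mask.
isin_mask = df["b"].isin([target, datetime(2002, 2, 4)])

print(eq_any)
print(isin_mask.tolist())
```

This avoids converting the column to a list every time you need a membership test.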

Additionally, this opens up a wealth of possibilities for handling dates and times that you don't get with Python datetime objects.

For example, you can compare directly against a str instead of building Timestamp objects:

>>> df['b'].eq('2002-02-02')
0     True
1    False
2    False
Name: b, dtype: bool
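
As a further illustration of what the built-in handling offers (a sketch only, with `day_names` and `in_range` as illustrative names), the `.dt` accessor and range checks work on the whole column at once:

```python
import pandas as pd

df = pd.DataFrame({"a": ["2002-02-02", "2002-02-03", "2002-02-04"]})
df["b"] = pd.to_datetime(df["a"])

# Vectorised access to datetime components through the .dt accessor.
day_names = df["b"].dt.day_name().tolist()

# Range check using strings for the bounds (both ends inclusive by default).
in_range = df["b"].between("2002-02-02", "2002-02-03").tolist()

print(day_names)
print(in_range)
```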