Hi, I am new to data science and was experimenting with a data set.
When I check the dtype of a column, it shows as object. I want to convert it to a string so I can split its components into separate columns, but the conversion does not seem to work and the string methods fail.

I have a Date column with Date and Time in it like: 2025-05-19 12:08:22

My idea is to convert it to string with:

data_frame_H['Date'] = data_frame_H['Date'].astype('str')

and apply this to it:

df['clean_date'] = df['Date'].str.extract(r'0-9*-0-9*-0-9*', '', regex=True).str.strip()

But when I run this I get:
AttributeError: Can only use .str accessor with string values!

  • What are you trying to do? Dates aren't strings, they're binary values, both in Python and in pandas. There's no reason to "extract" anything from them; just use the date functions through the .dt accessor to get what you want. They aren't "dirty" either. There are no spaces to strip precisely because they aren't strings. Commented Nov 14 at 13:10
  • If you want the date part without the time, use data_frame_H['Date'].dt.date. You'll find all the date accessors in the docs. The same goes for the time, .dt.time, and the other components, e.g. .dt.year and .dt.microsecond. Commented Nov 14 at 13:17
  • Oh, it works, thanks a lot. It was so simple. I had the date and time in the same column, so I thought that to extract them I could split the string into two columns and convert it back to datetime. But this is much more straightforward. What should I do with this question now? Should I delete it? I am new to Stack Overflow as well; this is my first post. Commented Nov 14 at 13:25

1 Answer

dtype=object in pandas does not guarantee strings. An object column can hold any Python objects, and yours most likely contains datetime objects, so .str won’t work.
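
To see what an object column actually holds, you can inspect the element types directly. This is a minimal sketch with made-up sample data (the two timestamps and the frame itself are assumptions for illustration, not the asker's data):

```python
import datetime as dt

import pandas as pd

# Hypothetical sample: an object column that stores datetime objects, not strings.
values = [dt.datetime(2025, 5, 19, 12, 8, 22), dt.datetime(2025, 5, 20, 9, 30, 0)]
df = pd.DataFrame({"Date": pd.Series(values, dtype=object)})

# The column dtype only says "object"; it does not say what is inside.
print(df["Date"].dtype)

# Look at the Python type of each element instead.
element_types = set(df["Date"].map(type))
print(element_types)

# pandas can also summarize the contents of an object column.
inferred = pd.api.types.infer_dtype(df["Date"])
print(inferred)
```

If the inferred kind is not a string type, the .str accessor refuses to work on the column, which matches the AttributeError in the question.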

You don’t need regex for this. Convert to datetime properly and then split date/time using datetime accessors.

Recommended way (datetime accessors)

df['Date'] = pd.to_datetime(df['Date'])
df['clean_date'] = df['Date'].dt.date
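
For completeness, here is a self-contained sketch of the same approach that also pulls out the time and the year. The sample rows and the extra column names (clean_time, year, date_midnight) are assumptions for illustration:

```python
import pandas as pd

# Hypothetical sample data in the format shown in the question.
df = pd.DataFrame({"Date": ["2025-05-19 12:08:22", "2025-05-20 09:30:00"]})

# Parse once; afterwards the column has a datetime64 dtype and supports .dt.
df["Date"] = pd.to_datetime(df["Date"])

df["clean_date"] = df["Date"].dt.date   # Python date objects (object dtype)
df["clean_time"] = df["Date"].dt.time   # Python time objects (object dtype)
df["year"] = df["Date"].dt.year         # integer component

# If you want to keep a datetime dtype while dropping the time of day,
# normalize() sets every timestamp to midnight.
df["date_midnight"] = df["Date"].dt.normalize()

print(df)
```

Note that .dt.date and .dt.time produce object columns again, so further .dt calls only work on the original Date column or on date_midnight.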

Regex way (string extraction)

df['Date'] = df['Date'].astype(str)
df['clean_date'] = df['Date'].str.extract(r'(\d{4}-\d{2}-\d{2})')
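
If you do go the string route, a pattern with named groups can split out both parts in one pass. This is only a sketch on made-up sample rows; the group names clean_date and clean_time are assumptions, and the results are plain strings rather than dates:

```python
import pandas as pd

# Hypothetical sample data in the format shown in the question.
df = pd.DataFrame({"Date": ["2025-05-19 12:08:22", "2025-05-20 09:30:00"]})

# Each named group becomes a column of the extracted frame.
pattern = r"(?P<clean_date>\d{4}-\d{2}-\d{2})\s+(?P<clean_time>\d{2}:\d{2}:\d{2})"
parts = df["Date"].astype(str).str.extract(pattern)

# Attach the extracted columns to the original frame.
df = df.join(parts)
print(df)
```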