I have a dataset that contains a date_time column in this format:

date_time
01.01.2018 00:00 - 01.01.2018 00:15 (UTC)
01.01.2018 00:15 - 01.01.2018 00:30 (UTC)
01.01.2018 00:30 - 01.01.2018 00:45 (UTC)
01.01.2018 00:45 - 01.01.2018 01:00 (UTC)
.
.
.

After reading this dataset with pandas, I need to convert the date_time column from object dtype to datetime, using the following:

df['date_time'] = pd.to_datetime(df['date_time'])

However, I get this error:

  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/dateutil/parser/_parser.py", line 643, in parse
    raise ParserError("Unknown string format: %s", timestr)
dateutil.parser._parser.ParserError: Unknown string format: 01.01.2018 00:00 - 01.01.2018 00:15 (UTC)

Clearly, pandas does not recognize the original format of date_time. How can I solve this?

1 Answer

First off, what you have is a range, not a single datetime.

Start by splitting it into start and end datetimes:

df[['dt_start', 'dt_end']] = df['date_time'].str.split(' - ', expand=True)

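Now the start datetime can be converted. The strings are day-first (dd.mm.yyyy), so pass the format explicitly instead of relying on inference, which by default is month-first and may misread or reject dates such as 13.01.2018 (use utc=True if you want to keep it timezone-aware):

df['dt_start'] = pd.to_datetime(df['dt_start'], format='%d.%m.%Y %H:%M', utc=True)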

The end datetime needs an explicit format, because with default arguments the to_datetime parser can't recognize the trailing (UTC) marker:

df['dt_end'] = pd.to_datetime(df['dt_end'], format='%d.%m.%Y %H:%M (UTC)', utc=True)
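
For reference, here is a minimal end-to-end sketch of the same idea on the four sample rows from the question. The helper name parse_interval_column is made up for illustration. Stripping the " (UTC)" suffix before splitting is one possible variation, not the only way to do it. It lets both columns share a single explicit day-first format.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "date_time": [
        "01.01.2018 00:00 - 01.01.2018 00:15 (UTC)",
        "01.01.2018 00:15 - 01.01.2018 00:30 (UTC)",
        "01.01.2018 00:30 - 01.01.2018 00:45 (UTC)",
        "01.01.2018 00:45 - 01.01.2018 01:00 (UTC)",
    ]
})


def parse_interval_column(col: pd.Series) -> pd.DataFrame:
    """Hypothetical helper: split 'start - end (UTC)' strings into two UTC columns."""
    # Drop the literal " (UTC)" suffix, then split on the " - " separator.
    parts = (
        col.str.replace(" (UTC)", "", regex=False)
           .str.split(" - ", expand=True)
    )
    fmt = "%d.%m.%Y %H:%M"  # explicit day-first format, no inference
    return pd.DataFrame({
        "dt_start": pd.to_datetime(parts[0], format=fmt, utc=True),
        "dt_end": pd.to_datetime(parts[1], format=fmt, utc=True),
    })


# Attach the parsed columns to the original frame (aligned on the index).
df = df.join(parse_interval_column(df["date_time"]))
print(df.dtypes)
```

Both new columns come back timezone-aware, so subtracting them (df['dt_end'] - df['dt_start']) gives the interval length directly.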