
Update: This was fixed by pull/6673


I have the following dataframe:

import polars as pl

df = (
    pl.DataFrame(
        {
            "int": [1, 2, 3],
            "date": [
                "2010-01-31T23:00:00+00:00",
                "2010-02-01T00:00:00+00:00",
                "2010-02-01T01:00:00+00:00",
            ],
        }
    )
    .with_columns(
        # Parse the ISO 8601 strings, then shift to the target time zone
        pl.col("date").str.to_datetime()
        .dt.convert_time_zone("Europe/Amsterdam")
    )
)

which gives:

┌─────┬────────────────────────────────┐
│ int ┆ date                           │
│ --- ┆ ---                            │
│ i64 ┆ datetime[μs, Europe/Amsterdam] │
╞═════╪════════════════════════════════╡
│ 1   ┆ 2010-02-01 00:00:00 CET        │
│ 2   ┆ 2010-02-01 01:00:00 CET        │
│ 3   ┆ 2010-02-01 02:00:00 CET        │
└─────┴────────────────────────────────┘

I would like to convert this datetime column to a string column that includes the UTC offset, e.g. 2010-02-01 00:00:00+0100

I tried the following:

df.with_columns(pl.col("date").dt.to_string("%Y-%m-%d %H:%M:%S%z"))

which gives the following error:

pyo3_runtime.PanicException: a formatting trait implementation returned an error: Error

My desired output is shown below. It is what pandas produces when you convert a datetime column to strings using "%Y-%m-%d %H:%M:%S%z" as the format:

┌─────┬──────────────────────────┐
│ int ┆ date                     │
│ --- ┆ ---                      │
│ i64 ┆ str                      │
╞═════╪══════════════════════════╡
│ 1   ┆ 2010-02-01 00:00:00+0100 │
│ 2   ┆ 2010-02-01 01:00:00+0100 │
│ 3   ┆ 2010-02-01 02:00:00+0100 │
└─────┴──────────────────────────┘
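For reference, the target strings can be reproduced with the Python standard library alone. This is only a sketch for cross-checking the expected text: it assumes a fixed +01:00 offset as a stand-in for Europe/Amsterdam, which holds for these winter (CET) timestamps but not during daylight saving time. The names `cet`, `utc_values` and `expected` are made up for the illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed +01:00 offset stands in for Europe/Amsterdam (CET in winter).
cet = timezone(timedelta(hours=1))

# The same three instants as in the dataframe, expressed in UTC.
utc_values = [
    datetime(2010, 1, 31, 23, 0, tzinfo=timezone.utc),
    datetime(2010, 2, 1, 0, 0, tzinfo=timezone.utc),
    datetime(2010, 2, 1, 1, 0, tzinfo=timezone.utc),
]

# %z renders the UTC offset without a colon, e.g. +0100.
fmt = "%Y-%m-%d %H:%M:%S%z"
expected = [ts.astimezone(cet).strftime(fmt) for ts in utc_values]

for line in expected:
    print(line)
```

The printed values are the strings I expect in the date column of the result.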

Is there a way to achieve this result? Omitting %z from the format works, but I need the UTC offset.

  • I think this qualifies as a bug and is worth reporting. As a workaround you could try .apply(str), although that seems to produce the %:z form of the offset rather than %z. Commented Jan 29, 2023 at 0:27

1 Answer

py-polars v0.16.3 fixes the issue:

import polars as pl

df = (
    pl.DataFrame(
        {
            "int": [1, 2, 3],
            "date": ["2010-01-31T23:00:00+00:00","2010-02-01T00:00:00+00:00","2010-02-01T01:00:00+00:00"]
        }
    )
    .with_columns(
        pl.col("date").str.to_datetime()
        .dt.convert_time_zone("Europe/Amsterdam")
    )
)

print(
    df.with_columns(pl.col("date").dt.to_string("%Y-%m-%d %H:%M:%S%z"))
)

shape: (3, 2)
┌─────┬──────────────────────────┐
│ int ┆ date                     │
│ --- ┆ ---                      │
│ i64 ┆ str                      │
╞═════╪══════════════════════════╡
│ 1   ┆ 2010-02-01 00:00:00+0100 │
│ 2   ┆ 2010-02-01 01:00:00+0100 │
│ 3   ┆ 2010-02-01 02:00:00+0100 │
└─────┴──────────────────────────┘

Notes

  1. To get a colon-separated UTC offset (+01:00 instead of +0100), use %:z. See also the formatting directives of the Rust chrono crate, which Polars uses for format strings.
  2. convert_time_zone is the new name for with_time_zone. I hope it stays that way ;-)