I have a data frame with an index column and a column holding a list of values (the lists can have different lengths):

df2 = pl.DataFrame({'x': [1, 2, 3], 'y': [['a', 'b', 'c'], ['d', 'e', 'f', 'g'], ['h', 'i', 'j']]})

shape: (3, 2)
┌─────┬───────────────────┐
│ x   ┆ y                 │
│ --- ┆ ---               │
│ i64 ┆ list[str]         │
╞═════╪═══════════════════╡
│ 1   ┆ ["a", "b", "c"]   │
│ 2   ┆ ["d", "e", … "g"] │
│ 3   ┆ ["h", "i", "j"]   │
└─────┴───────────────────┘


I'm trying to transpose each list into rows (one value per row) while retaining the index, so that the resulting data frame looks like this:

┌─────┬─────┐
│ x   ┆ yp  │
│ --- ┆ --- │
│ i64 ┆ str │
╞═════╪═════╡
│ 1   ┆ "a" │
│ 1   ┆ "b" │
│ 1   ┆ "c" │
│ 2   ┆ "d" │
│ 2   ┆ "e" │
│ 2   ┆ "f" │
│ 2   ┆ "g" │
│ 3   ┆ "h" │
│ ... ┆ ... │
└─────┴─────┘

I could probably iterate through the data frame, but I don't think that would be the most efficient way to do this. Any help would be appreciated.

1 Answer

import polars as pl

df2 = pl.DataFrame({'x': [1, 2, 3], 'y': [['a', 'b', 'c'], ['d', 'e', 'f', 'g'], ['h', 'i', 'j']]})

# Explode the list column 'y' into one row per element;
# the matching 'x' value is repeated for each element
df_exploded = df2.explode('y')

# Print the resulting DataFrame
print(df_exploded)