I have the following pandas DataFrame:

id          term            code
2445 | 2716 abcd | efgh     2345
1287        hgtz            6567

I would like to explode the id and term columns. How can I explode multiple columns at once so that the values in id, term and code stay together row by row?

The expected output is:

id          term            code
2445        abcd            2345
2716        efgh            2345
1287        hgtz            6567

What I have tried so far: df.assign(id=df['id'].str.split(' | ')).explode('id')

2 Answers

You're on the right track; you just need some help from pd.concat with a list comprehension:

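import pandas as pd

# Split each column on "|" (ignoring surrounding whitespace), explode it,
# then place the exploded columns side by side and re-attach "code" by index.
out = (
    pd.concat([df[col].str.split(r"\s*\|\s*")
                      .explode() for col in ["id", "term"]], axis=1)
      .join(df["code"])
)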

Output of print(out):

     id  term  code
0  2445  abcd  2345
0  2716  efgh  2345
1  1287  hgtz  6567
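
For reference, here is a self-contained sketch of the same idea that rebuilds the sample frame from the question and wraps the concat step in a helper. The helper name explode_columns and the SEP constant are made up for this illustration and are not part of pandas. It assumes every row has the same number of parts in each split column; if the counts differ, the exploded indexes no longer match and the concat step is expected to fail.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "id": ["2445 | 2716", "1287"],
        "term": ["abcd | efgh", "hgtz"],
        "code": [2345, 6567],
    }
)

# Regex: a pipe with optional whitespace on either side.
SEP = r"\s*\|\s*"


def explode_columns(frame, columns):
    """Hypothetical helper: split and explode each column, keep the rest."""
    parts = [frame[col].str.split(SEP).explode() for col in columns]
    exploded = pd.concat(parts, axis=1)
    # The remaining columns are re-attached through the (repeated) index.
    rest = frame.drop(columns=columns)
    return exploded.join(rest).reset_index(drop=True)


out = explode_columns(df, ["id", "term"])
print(out)
```

The final reset_index(drop=True) is optional; it only replaces the repeated 0, 0, 1 index with a fresh 0, 1, 2.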

2 Comments

You can drop the replace call and split with a regex instead: str.split(r'\s*\|\s*'). This helps especially if the term column contains multiple words for one term.
This makes the code shorter and better. Thanks for the trick, Corralien ;)
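
A small sketch of what the regex buys you, using made-up sample values. A pattern longer than one character is treated as a regular expression by str.split by default, so ' | ' reads as "space or space" and splits on every single space, which is likely why the attempt in the question did not give the expected lists.

```python
import pandas as pd

s = pd.Series(["2445 | 2716", "heart attack | stroke"])

# ' | ' is interpreted as a regex alternation: split on each space.
naive = s.str.split(" | ").tolist()

# Split on the pipe only, trimming the whitespace around it.
trimmed = s.str.split(r"\s*\|\s*").tolist()

print(naive)
print(trimmed)
```

With the trimmed pattern, a multi-word term such as "heart attack" stays in one piece.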
Here is a way using .str.split() and explode(), which accepts a list of columns as of pandas 1.3:

(df[['id','term']].stack()
.str.split(' | ',regex=False)
.unstack()
.explode(['id','term'])
.join(df[['code']]))

Output:

     id  term  code
0  2445  abcd  2345
0  2716  efgh  2345
1  1287  hgtz  6567
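
A variation on the same approach, sketched here with assign instead of stack/unstack. It assumes pandas 1.4 or later (for regex=False in str.split) and that each row has the same number of parts in id and term; explode with several columns requires matching list lengths, so the sketch checks that first.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "id": ["2445 | 2716", "1287"],
        "term": ["abcd | efgh", "hgtz"],
        "code": [2345, 6567],
    }
)

cols = ["id", "term"]

# Turn each target column into a column of lists (literal ' | ' separator).
split = df.assign(**{c: df[c].str.split(" | ", regex=False) for c in cols})

# Multi-column explode needs the same list length in every exploded column.
same_length = (split["id"].map(len) == split["term"].map(len)).all()
if not same_length:
    raise ValueError("id and term have a different number of parts in some row")

# Explode both columns together; 'code' is repeated for each new row.
out = split.explode(cols, ignore_index=True)
print(out)
```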
