I need to create a DataFrame from a nested list.

I have tried different methods, but none of them worked.

R = Row("id","age","serial")
List=[[1,2,3],[4,5,6],[7,8,9]]
sp=spark.createDataFrame([R(i) for i in (List)])

Expected:

Please find the expected output here

1 Answer

Instead of R(i) you must use R(*i). This passes individual elements of the inner list to the Row object.
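
To see the difference between R(i) and R(*i) without starting Spark, here is a minimal sketch in plain Python. The helper `collect_args` is hypothetical and only stands in for a callable such as Row; it is not part of PySpark.

```python
def collect_args(*args):
    """Hypothetical stand-in for a callable like Row: return the positional arguments received."""
    return args


inner = [1, 2, 3]

without_star = collect_args(inner)  # the whole list arrives as a single argument
with_star = collect_args(*inner)    # each element arrives as its own argument

print(without_star)  # ([1, 2, 3],)
print(with_star)     # (1, 2, 3)
```

With three field names declared, only the second form supplies one value per field.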

In addition, because the expected output is the transpose of the input, zip(*L) must be applied to the input list to get a list of tuples like this:

[(1, 4, 7), (2, 5, 8), (3, 6, 9)]
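
The transposition step can be checked on its own in plain Python, with no Spark session needed. This is only a small sketch of what zip does to the nested list used in this answer:

```python
L = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# zip(*L) unpacks the three inner lists as separate arguments to zip,
# which then groups the first elements, the second elements, and so on.
transposed = list(zip(*L))

print(transposed)  # [(1, 4, 7), (2, 5, 8), (3, 6, 9)]
```

Each resulting tuple becomes one row of the DataFrame.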

Full code:

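from pyspark.sql import Row

# Assumes an active SparkSession named `spark` (e.g. in the pyspark shell).
R = Row("id", "age", "serial")
L = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
# zip(*L) transposes the nested list; R(*i) unpacks each tuple into the Row fields.
sp = spark.createDataFrame([R(*i) for i in zip(*L)])
sp.show()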

Output:

+---+---+------+
| id|age|serial|
+---+---+------+
|  1|  4|     7|
|  2|  5|     8|
|  3|  6|     9|
+---+---+------+

2 Comments

He is asking just for the transpose of your output.
My bad. Must've overlooked it. I've updated my answer. Thanks.
