I need to create a DataFrame from a nested list.
I have tried different methods, but none worked.
R = Row("id","age","serial")
List=[[1,2,3],[4,5,6],[7,8,9]]
sp=spark.createDataFrame([R(i) for i in (List)])
Expected:
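Instead of R(i), use R(*i). The * unpacks each inner list so that its elements are passed to the Row object as separate arguments, whereas R(i) passes the whole list as a single value.
In addition, to get the output shown in this answer, where each inner list becomes a column, apply zip to the input list to transpose it. zip(*L) yields the following tuples: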
[(1, 4, 7), (2, 5, 8), (3, 6, 9)]
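Both steps can be checked in plain Python without a Spark session. The sketch below is only an illustration: make_row is a hypothetical stand-in for Row("id","age","serial") and is not part of PySpark.

```python
L = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# zip(*L) unpacks the outer list so that each inner list becomes one
# argument to zip, which then groups elements by position (a transpose).
transposed = list(zip(*L))
print(transposed)

# Hypothetical stand-in for Row("id", "age", "serial"): it pairs the
# field names with whatever positional arguments it receives.
def make_row(*values):
    return dict(zip(("id", "age", "serial"), values))

# Without *, the whole tuple arrives as one argument and fills only "id".
packed = make_row(transposed[0])

# With *, each element arrives as its own argument and fills its own field.
rows = [make_row(*t) for t in transposed]
print(packed)
print(rows)
```

With the stand-in, the packed call fills only the first field, which mirrors why R(i) does not produce three columns.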
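Full code:
from pyspark.sql import Row

# Assumes an active SparkSession named spark, as in the question.
R = Row("id", "age", "serial")
L = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
# zip(*L) transposes the list; R(*i) unpacks each tuple into the Row fields.
sp = spark.createDataFrame([R(*i) for i in zip(*L)])
sp.show()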
Output:
+---+---+------+
| id|age|serial|
+---+---+------+
|  1|  4|     7|
|  2|  5|     8|
|  3|  6|     9|
+---+---+------+