I have a data frame which looks like below:
+---+-----+--------------------------------------------------------------------------------------------------+------+
|uid|label|features |weight|
+---+-----+--------------------------------------------------------------------------------------------------+------+
|1 |1.0 |[WrappedArray([animal_indexed,2.0,animal_indexed]), WrappedArray([talk_indexed,3.0,talk_indexed])]|1 |
|2 |0.0 |[WrappedArray([animal_indexed,1.0,animal_indexed]), WrappedArray([talk_indexed,2.0,talk_indexed])]|1 |
|3 |1.0 |[WrappedArray([animal_indexed,0.0,animal_indexed]), WrappedArray([talk_indexed,1.0,talk_indexed])]|1 |
|4 |2.0 |[WrappedArray([animal_indexed,0.0,animal_indexed]), WrappedArray([talk_indexed,0.0,talk_indexed])]|1 |
+---+-----+--------------------------------------------------------------------------------------------------+------+
and the schema is
root
|-- uid: integer (nullable = false)
|-- label: double (nullable = false)
|-- features: array (nullable = false)
| |-- element: array (containsNull = true)
| | |-- element: struct (containsNull = true)
| | | |-- name: string (nullable = true)
| | | |-- value: double (nullable = false)
| | | |-- term: string (nullable = true)
|-- weight: integer (nullable = false)
But I want to convert the features from Array[Array] to just Array i.e. flatMap a column array into the same column to get a schema like
root
|-- uid: integer (nullable = false)
|-- label: double (nullable = false)
|-- features: array (nullable = false)
| | |-- element: struct (containsNull = true)
| | | |-- name: string (nullable = true)
| | | |-- value: double (nullable = false)
| | | |-- term: string (nullable = true)
|-- weight: integer (nullable = false)
Thanks in advance.