I have a dataset that contains some nested pyspark rows stored as strings. When I read them into pyspark, one of the columns is read as a string that looks something like this:
'Row(name='Bob', updated='Sat Nov 21 12:57:54', isProgrammer=True)'
My goal is to parse some of these subfields into separate columns, but I am having trouble reading them in.
df.select(col('user')['name'].alias('name'))
is the syntax I am trying, but it doesn't seem to be working. It gives me this error:
Can't extract value from user#11354: need struct type but got string
Is there an easy way to read this type of data?
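For reference, here is a minimal standalone sketch of the kind of string I'm dealing with, and the regex-based extraction I was considering as a fallback (the pattern and field names are just illustrative; in Spark I'd presumably use `regexp_extract` with the same pattern):

```python
import re

# One value from the 'user' column, stored as the repr of a Row, not a struct
s = "Row(name='Bob', updated='Sat Nov 21 12:57:54', isProgrammer=True)"

# Illustrative pattern: pull a single quoted subfield out of the stringified Row
m = re.search(r"name='([^']*)'", s)
name = m.group(1)
print(name)  # → Bob
```

This works on a single string, but I'm hoping there is a cleaner way than writing a regex per field.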