I have a list containing a single string, like this:

ListofString = ['Column1,Column2,Column3,\nCol1Value1,Col2Value1,Col3Value1,\nCol1Value2,Col2Value2,Col3Value2']

How do I convert this string to a PySpark DataFrame like the one below, with each '\n' starting a new row?

Column1         Column2         Column3
-----------------------------------------
Col1Value1      Col2Value1      Col3Value1
Col1Value2      Col2Value2      Col3Value2

1 Answer

You need to convert the list of strings into a list of rows first (one list of field values per row), like this:

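# join the list into one string, then split it into rows on '\n'
>>> text = '\n'.join(ListofString)
# drop the trailing comma on each row, then split the row into fields on ','
# (splitting on ',' rather than ' ' keeps values that contain spaces intact)
>>> l = [row.strip().rstrip(',').split(',') for row in text.split('\n')]

>>> print(l)
[['Column1', 'Column2', 'Column3'], ['Col1Value1', 'Col2Value1', 'Col3Value1'], ['Col1Value2', 'Col2Value2', 'Col3Value2']]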

>>> df = spark.createDataFrame(l[1:],l[0])

>>> df.show()

+----------+----------+----------+
|   Column1|   Column2|   Column3|
+----------+----------+----------+
|Col1Value1|Col2Value1|Col3Value1|
|Col1Value2|Col2Value2|Col3Value2|
+----------+----------+----------+
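
If the values can contain quoted commas, plain string splitting may not be enough. The following is a minimal sketch, not part of the original answer, that uses Python's standard `csv` module for the parsing step. The helper names `parse_rows` and `to_dataframe` are hypothetical, and the sketch assumes that an empty last field on a line comes from the trailing comma seen in the question rather than from a genuinely empty value.

```python
import csv
import io

def parse_rows(list_of_strings):
    """Hypothetical helper: parse newline-delimited CSV text into (header, rows)."""
    text = "\n".join(list_of_strings)
    rows = []
    for record in csv.reader(io.StringIO(text)):
        # Assumption: an empty last field is caused by a trailing comma,
        # as in the question's data, so it is dropped.
        if record and record[-1] == "":
            record = record[:-1]
        if record:
            rows.append(record)
    return rows[0], rows[1:]

def to_dataframe(spark, header, data):
    """Hypothetical helper: build the DataFrame from an existing SparkSession."""
    # createDataFrame accepts a list of rows plus a list of column names.
    return spark.createDataFrame(data, header)

ListofString = ['Column1,Column2,Column3,\nCol1Value1,Col2Value1,Col3Value1,\nCol1Value2,Col2Value2,Col3Value2']
header, data = parse_rows(ListofString)
```

With a running session, `to_dataframe(spark, header, data).show()` should then display the same table as above, where `spark` is the session object available in the PySpark shell.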