I want to create a DataFrame in PySpark with the following code:

from pyspark.sql import *
from pyspark.sql.types import *

temp = Row("DESC", "ID")
temp1 = temp('Description1323', 123)

print(temp1)

schema = StructType([StructField("DESC", StringType(), False),
                     StructField("ID", IntegerType(), False)])

df = spark.createDataFrame(temp1, schema)

But I am receiving the following error:

TypeError: StructType can not accept object 'Description1323' in type <type 'str'>

What's wrong with my code?

1 Answer

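The problem is that you are passing a single Row where createDataFrame expects a list of Rows. Because a Row is itself a tuple, createDataFrame iterates over its fields and tries to treat each field as a separate record, which is why the string 'Description1323' is rejected by the StructType. Wrap the Row in a list: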

from pyspark.sql import *
from pyspark.sql.types import *

temp = Row("DESC", "ID")
temp1 = temp('Description1323', 123)

print(temp1)

schema = StructType([StructField("DESC", StringType(), False),
                     StructField("ID", IntegerType(), False)])

# Wrap the Row in a list so it is treated as one record, not as two separate ones
df = spark.createDataFrame([temp1], schema)

df.show()

And the result:

+---------------+---+
|           DESC| ID|
+---------------+---+
|Description1323|123|
+---------------+---+
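
To see why the bare Row fails, it may help to look at what an iterating consumer receives. The sketch below does not use Spark at all: `FakeRow` is a hypothetical stand-in built with `collections.namedtuple`, chosen only because pyspark's Row is likewise a tuple subclass, and `records_seen` is an illustrative helper, not part of any library.

```python
from collections import namedtuple

# Hypothetical stand-in for pyspark.sql.Row (an assumption for illustration);
# like Row, it is a tuple subclass with named fields.
FakeRow = namedtuple("FakeRow", ["DESC", "ID"])
row = FakeRow("Description1323", 123)


def records_seen(data):
    """Return the items an iterating consumer would treat as separate records."""
    return list(data)


# Passing the bare row: each field is seen as its own "record".
single = records_seen(row)

# Passing a list containing the row: one record holding both fields.
wrapped = records_seen([row])

print(single)
print(wrapped)
```

With the bare row, the first "record" is the plain string, which matches the object named in the TypeError. With the list, the only record is the full row, which is what the schema describes.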