Converting string list to Python dataframe - pyspark python sparksql

Question

I have the following Python / Pyspark code:

sql_command = ''' query '''
df = spark.sql(sql_command)
ls_colnames = df.schema.names
ls_colnames
     ['id', 'level1', 'level2', 'level3', 'specify_facts']

cSchema = StructType([
    StructField("colname", StringType(), False)
  ])
df_colnames = spark.createDataFrame(dataset_array,schema=cSchema)

File "/opt/mapr/spark/spark-2.1.0/python/pyspark/sql/types.py", line 1366, in _verify_type
    raise TypeError("StructType can not accept object %r in type %s" % (obj, type(obj)))
TypeError: StructType can not accept object 'id' in type <class 'str'>

What can I do to get a Spark DataFrame of the column names?

Neeraj Bhadani · Accepted Answer · 2017-08-10 11:20:08Z

I'm not sure I have understood your question correctly, but if you are trying to create a DataFrame from the given list, you can use the code below. It assumes an existing SparkContext (sc) and SQLContext (sqlContext), as in the PySpark shell.

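from pyspark.sql import Row

# Column names to turn into a single-column DataFrame
l = ['id', 'level1', 'level2', 'level3', 'specify_facts']
rdd1 = sc.parallelize(l)
# Wrap each string in a Row so that it counts as a one-field record
row_rdd = rdd1.map(lambda x: Row(x))
sqlContext.createDataFrame(row_rdd, ['col_name']).show()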
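The error in the question comes from handing bare strings to a StructType schema, which expects each record to be a row-like object (a tuple, list, dict or Row) with one value per field. As a minimal sketch of an alternative that skips the RDD step, each name can be wrapped in a one-element tuple before calling createDataFrame. The helper names names_to_rows and names_to_dataframe are hypothetical, and an active SparkSession called spark is assumed, as in the question.

```python
def names_to_rows(names):
    """Wrap each column name in a one-element tuple so it matches a one-field StructType."""
    return [(name,) for name in names]


def names_to_dataframe(spark, names):
    """Build a single-column DataFrame of column names (hypothetical helper)."""
    # Imported here so the pure-Python helper above also works without pyspark installed.
    from pyspark.sql.types import StructType, StructField, StringType

    # Same one-field schema as in the question
    schema = StructType([StructField("colname", StringType(), False)])
    return spark.createDataFrame(names_to_rows(names), schema=schema)


ls_colnames = ['id', 'level1', 'level2', 'level3', 'specify_facts']
rows = names_to_rows(ls_colnames)

# With an active SparkSession named `spark` (an assumption, not created here):
# names_to_dataframe(spark, ls_colnames).show()
```

If the column name does not matter, passing the list together with StringType() as the schema, as in spark.createDataFrame(ls_colnames, StringType()), should also work; Spark then wraps the type in a struct whose only field is named value.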

Hope it helps.

Regards,

Neeraj
