
I have some code in PySpark. I need to convert a column to string, then convert it to a date type, etc.

I can't find any method to convert this type to string. I tried str() and .to_string(), but neither works. The code is below.

from pyspark.sql import functions as F

df = in_df.select('COL1')
> type(df) 
> <class 'pyspark.sql.dataframe.DataFrame'>

> df.printSchema() 
> |-- COL1: offsetdatetimeudt (nullable = true)
    Can you please add the output of df.printSchema() to your question? Commented Jul 7, 2019 at 14:46
  • Is this what you're looking for? stackoverflow.com/questions/38610559/… Just converting df to a string is kind of pointless, since it is an entire column. Commented Jul 7, 2019 at 14:47
  • |-- COL1: offsetdatetimeudt (nullable = true) is the output of df.printSchema(). Commented Jul 7, 2019 at 15:12
  • I need to convert each row to Date, therefore I need it to be a string. Commented Jul 7, 2019 at 15:13
  • Your column values look like this: 2019-07-07T00:00:00.000Z? Commented Jul 7, 2019 at 16:40

1 Answer


It's straightforward to cast the column directly to string (note the required import):

from pyspark.sql.types import StringType

df2 = df.withColumn('COL1', df['COL1'].cast(StringType()))
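The comments suggest the underlying values look like 2019-07-07T00:00:00.000Z, so once the column is cast to string, each value can be parsed to a date. Here is a minimal pure-Python sketch of that parse step; the sample value and the format string are assumptions based on the example in the comments. In Spark itself, something like F.to_date(df['COL1'].cast('string')) on the cast column would be the equivalent, assuming the string is in a format to_date accepts.

```python
from datetime import datetime

# Sample value assumed from the comments, not taken from the actual data.
s = "2019-07-07T00:00:00.000Z"

# %f consumes the fractional seconds (".000") and the trailing 'Z'
# is matched literally.
d = datetime.strptime(s, "%Y-%m-%dT%H:%M:%S.%fZ").date()
print(d)  # 2019-07-07
```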

7 Comments

Thanks a lot, but it fails: AnalysisException: u"cannot resolve 'unix_timestamp(... I tried all these things; I just need to convert the column into a string.
Don't forget from pyspark.sql import functions as F in your Spark job. A missing import causes the cannot resolve error.
Can anyone convert the single column into a string? I want to work on a column of strings. Isn't it too trivial?
I imported all the required functions.
I added a cast to String in my answer; check str_col1, you can try it. (My DF type is string by default, but it should work in your case.)
