1

I'm reading a file where columns can be struct when they have a value else can be string when there is no data. Inline example assigned_to and group are struct and have data.

root
 |-- number: string (nullable = true)
 |-- assigned_to: struct (nullable = true)
 |    |-- display_value: string (nullable = true)
 |    |-- link: string (nullable = true)
 |-- group: struct (nullable = true)
 |    |-- display_value: string (nullable = true)
 |    |-- link: string (nullable = true)

To flatten the JSON I'm doing the following,

df23 = spark.read.parquet("dbfs:***/test1.parquet")
val_cols4 = []

#the idea is the day when the data type of the columns in struct I dynamically extract values otherwise create new columns and default to None.
for name, cols in df23.dtypes:
  if 'struct' in cols:
    val_cols4.append(name+".display_value") 
  else:
    df23 = df23.withColumn(name+"_value", lit(None))

Now if I had to use val_cols4 to select from dataframe df23 all the struct columns have the same name "display_value".

root
 |-- display_value: string (nullable = true)
 |-- display_value: string (nullable = true)

How do I rename the columns to appropriate values? I tried the following,

for name, cols in df23.dtypes:
  if 'struct' in cols:
    val_cols4.append("col('"+name+".display_value').alias('"+name+"_value')") 
  else:
    df23 = df23.withColumn(name+"_value", lit(None))

This doesn't work and errors out when I do a select on the dataframe.

1 Answer 1

2

You can append an aliased column object rather than a string to val_cols4, e.g.

from pyspark.sql.functions import col, lit

val_cols4 = []

for name, cols in df23.dtypes:
  if 'struct' in cols:
    val_cols4.append(col(name+".display_value").alias(name+"_value")) 
  else:
    df23 = df23.withColumn(name+"_value", lit(None))

Then you can select the columns, e.g.

newdf = df23.select(val_cols4)
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.