
Say I run this

(DF1.withColumn("Is_elite",
                array_intersect(DF1.year, DF1.elite_years))
    .show())

I get the result I want: a new column called Is_elite with the correct values. Then, in the next command, I run

DF1.show()

It just shows me what DF1 would have looked like had I not run the first command. My new column is missing.

1 Answer

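The problem is the .show() chained onto the end of the line. withColumn returns a new DataFrame, but show() only prints it and returns None, so the new DataFrame is never stored and DF1 itself is unchanged. Assign the result of withColumn to a variable, then call show() on it: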

elite_df = DF1.withColumn("Is_elite",array_intersect(DF1.year,DF1.elite_years))
elite_df.show()

If you are ever unsure what a Python name refers to, print its type.

# This should print a DataFrame type (it would be NoneType if .show() were chained on).
print(type(elite_df))

DataFrames are immutable: every transformation returns a new DataFrame and leaves the original untouched, so printing the old DataFrame will not show the revised result.
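To see the mechanism without a Spark session, here is a minimal pure-Python sketch. ToyFrame, with_column and the column names are hypothetical stand-ins invented for illustration; this is not PySpark's implementation. It only mimics the two behaviours that matter here: a transformation that returns a new object, and a show() that returns nothing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyFrame:
    """Hypothetical stand-in for an immutable DataFrame (not PySpark itself)."""
    columns: tuple

    def with_column(self, name):
        # Like withColumn: build and return a NEW frame; self is untouched.
        return ToyFrame(self.columns + (name,))

    def show(self):
        # Like show(): print for display only; implicitly returns None.
        print(" | ".join(self.columns))


df1 = ToyFrame(("year", "elite_years"))

# Chaining show() prints the new column, but the new frame is then discarded.
result = df1.with_column("Is_elite").show()
print(result)        # None, because show() returns nothing
print(df1.columns)   # the original frame still has only its two columns

# Keeping a reference to the returned frame preserves the new column.
elite_df = df1.with_column("Is_elite")
print(elite_df.columns)
```

The same reasoning applies to the real code: either store the result under a new name such as elite_df, or reassign it to DF1 if you no longer need the original.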
