I have a PySpark dataframe with a large number of columns (around 320).
I need to look for the keyword baz in column A. If baz is found in a row, the existing values in every column listed in columns_for_replacement should be replaced with None for that row.
columns_for_replacement = ["B", "C", "D", "E", "F", "G", "H", "I"]
I am trying to modify the code below to do this:
from pyspark.sql.functions import col, when

for i in columns_for_replacement:
    df = df.withColumn(i, when(col(i) == 'baz', None).otherwise(col(i)))
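The code above checks each listed column for baz and only replaces the value in that same column. What I need is for the check to be made on column A and the replacement applied to all of the listed columns.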
Base dataframe:
A   B   C   D   E   F   G   H   I   J
baz abc abc abc abc abc abc abc abc abc
baz abc abc abc abc abc abc abc abc abc
def abc abc abc abc abc abc abc abc abc
baz abc abc abc abc abc abc abc abc abc
map abc abc abc abc abc abc abc abc abc
baz abc abc abc abc abc abc abc abc abc
noo abc abc abc abc abc abc abc abc abc
Expected dataframe:
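A    B    C    D    E    F    G    H    I    J
baz  null null null null null null null null abc
baz  null null null null null null null null abc
def  abc  abc  abc  abc  abc  abc  abc  abc  abc
baz  null null null null null null null null abc
map  abc  abc  abc  abc  abc  abc  abc  abc  abc
baz  null null null null null null null null abc
noo  abc  abc  abc  abc  abc  abc  abc  abc  abc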