I have a dataframe with a text column and a name column. I would like to check whether the name occurs in the text column and, if it does, replace it with some value. I was hoping that the following would work:

df = df.withColumn("new_text", regexp_replace(col("text"), col("name"), "NAME"))

but it fails with "Column is not iterable". Do I have to write a UDF to do that? What would that look like?

1 Answer

You are close. Here is a detailed example using the withColumn and selectExpr options:

Sample df

df = spark.createDataFrame([('This is','This'),
('That is','That'),
('That is','There')],
['text','name'])

#+-------+-----+
#|   text| name|
#+-------+-----+
#|This is| This|
#|That is| That|
#|That is|There|
#+-------+-----+

Option 1: withColumn using expr function

from pyspark.sql.functions import expr, regexp_replace

df.withColumn("new_col1",expr("regexp_replace(text,name,'NAME')")).show()

#+-------+-----+--------+
#|   text| name|new_col1|
#+-------+-----+--------+
#|This is| This| NAME is|
#|That is| That| NAME is|
#|That is|There| That is|
#+-------+-----+--------+

Option 2: selectExpr using regexp_replace

from pyspark.sql.functions import regexp_replace

df.selectExpr("*",
          "regexp_replace(text,name,'NAME') AS new_text").show()

#+-------+-----+--------+
#|   text| name|new_text|
#+-------+-----+--------+
#|This is| This| NAME is|
#|That is| That| NAME is|
#|That is|There| That is|
#+-------+-----+--------+
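
On the UDF part of the question: the two options above suggest a UDF is not required, but for comparison here is a minimal sketch of what one might look like. The helper names replace_name and add_new_text are hypothetical and not part of PySpark. The sketch uses Python's re module, whose pattern and replacement syntax is not identical to the Java regex engine behind regexp_replace, so treat it as an illustration rather than a drop-in equivalent.

```python
import re


def replace_name(text, name, replacement="NAME"):
    """Replace every match of the regex `name` in `text`.

    Returns None when either input is None, so null rows stay null.
    """
    if text is None or name is None:
        return None
    return re.sub(name, replacement, text)


def add_new_text(df):
    """Add a new_text column to `df` by applying replace_name row by row."""
    # Imported inside the function so replace_name can be tried
    # without a Spark installation.
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import StringType

    replace_name_udf = udf(replace_name, StringType())
    return df.withColumn("new_text", replace_name_udf(col("text"), col("name")))
```

With the sample df above, add_new_text(df).show() would be the call to try. A Python UDF sends each row through the Python interpreter, so the expr-based options are usually the better choice when they are sufficient.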
3 Comments

Do you happen to know how to handle the case when name is a regular expression? I'm seeing an issue with expr("regexp_replace(column, 'regex', 'replace_value')")
To add, I think it's because 'regex' is a regular expression, and wrapping it in a string inside expr seems to interfere.
I think I solved it, but not sure why. ^([^.]+)?\\. worked instead of ^.*?\\. (but the latter works when I don't use expr)
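
One possible factor in the escaping trouble described in these comments: with the default parser settings (spark.sql.parser.escapedStringLiterals set to false), Spark SQL processes backslash escapes inside string literals before the regex engine sees the pattern, so a pattern embedded in an expr string needs its backslashes doubled. The sketch below builds such an expression string. The helper to_sql_string_literal is hypothetical, and this does not necessarily explain the exact behaviour reported above.

```python
def to_sql_string_literal(pattern):
    """Quote `pattern` as a Spark SQL string literal.

    Assumes the default parser setting, where the SQL parser
    unescapes backslashes inside string literals.
    """
    # Double each backslash so one survives SQL parsing,
    # then escape single quotes so they do not end the literal.
    escaped = pattern.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


# A raw string keeps the regex readable on the Python side.
pattern = r"^.*?\."
expression = "regexp_replace(text, {}, 'NAME')".format(
    to_sql_string_literal(pattern)
)
print(expression)  # regexp_replace(text, '^.*?\\.', 'NAME')
```

The resulting string can then be passed to expr or selectExpr in the same way as in the options above.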
