
In PySpark I have a DataFrame composed of two columns:

+-----------+----------------------+
| str1      | array_of_str         |
+-----------+----------------------+
| John      | [mango, apple, ...   |
| Tom       | [mango, orange, ...  |
| Matteo    | [apple, banana, ...  | 

I want to add a column, concat_result, that contains each element of array_of_str concatenated with the string in the str1 column.

+-----------+----------------------+----------------------------------+
| str1      | array_of_str         | concat_result                    |
+-----------+----------------------+----------------------------------+
| John      | [mango, apple, ...   | [mangoJohn, appleJohn, ...       |
| Tom       | [mango, orange, ...  | [mangoTom, orangeTom, ...        |
| Matteo    | [apple, banana, ...  | [appleMatteo, bananaMatteo, ...  |

I'm trying to use map to iterate over the array:

from pyspark.sql import functions as F
from pyspark.sql.types import StringType, ArrayType

# START EXTRACT OF CODE
ret = (df
  .select(['str1', 'array_of_str'])
  .withColumn('concat_result', F.udf(
     map(lambda x: x + F.col('str1'), F.col('array_of_str')), ArrayType(StringType))
  )
)

return ret
# END EXTRACT OF CODE

but I get this error:

TypeError: argument 2 to map() must support iteration

1 Answer


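You only need small tweaks to make this work. F.col(...) is a column expression, not a Python iterable, so map cannot loop over it. Instead, wrap a plain Python function in udf and pass the columns as arguments when you call it: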

from pyspark.sql.types import StringType, ArrayType
from pyspark.sql.functions import udf, col

concat_udf = udf(lambda con_str, arr: [x + con_str for x in arr],
                   ArrayType(StringType()))
ret = df \
  .select(['str1', 'array_of_str']) \
  .withColumn('concat_result', concat_udf(col("str1"), col("array_of_str")))

ret.show()

You don't need map here; a standard list comprehension is sufficient.


1 Comment

Only caveat is that this will break if any of the str1 or array_of_str values are null. You'd have to add explicit error checking in your udf.
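
To illustrate the null caveat, here is a minimal sketch of a null-safe variant. The names concat_each and make_concat_udf are hypothetical, and the policy shown (a null str1 or a null array gives a null result, while null elements inside the array stay null) is just one assumption about what you might want; adjust it to your data.

```python
def concat_each(con_str, arr):
    """Append con_str to every element of arr, tolerating nulls."""
    # Assumed policy: a null on either side yields a null result for the row.
    if con_str is None or arr is None:
        return None
    # Null elements inside the array are passed through as null.
    return [x + con_str if x is not None else None for x in arr]


def make_concat_udf():
    """Wrap concat_each as a Spark UDF returning array<string>."""
    # PySpark is imported here so concat_each can be checked without Spark.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import ArrayType, StringType
    return udf(concat_each, ArrayType(StringType()))
```

With that in place, concat_udf = make_concat_udf() should be usable in the same withColumn call as in the answer, i.e. concat_udf(col("str1"), col("array_of_str")).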
