Is there any way I can use UDFs created in PySpark in a Java Spark job?

I know there is a way to use a Java UDF in PySpark, but I am looking for the other way around.

1 Answer

First, I have to say that I don't recommend doing that. Calling into Python like this adds significant latency to every UDF invocation, and I really suggest you try to write the UDF in Scala or Java instead.

If you still want to do that, here is how: write a UDF that creates an embedded Python interpreter and executes your code. Here is a Scala example using Jython's PythonInterpreter:

import org.python.core.PyString
import org.python.util.PythonInterpreter

// Skip loading site-packages to speed up interpreter startup
System.setProperty("python.import.site", "false")
val interpreter = new PythonInterpreter
interpreter.exec("from __builtin__ import *")
// Look up a Python function that takes a string and returns its length
val someFunc = interpreter.get("len")
val result = someFunc.__call__(new PyString("Test!"))
// Convert the PyObject result back into a JVM Int
val realResult = result.__tojava__(classOf[Integer]).asInstanceOf[Int]
println(realResult)

This code calls the Python len function on the string "Test!" and converts the result back into a JVM Int.
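To actually use this from a Spark job, the interpreter call can be wrapped in a regular Spark SQL UDF. The following is only a rough sketch based on the snippet above, not a tested implementation: the object name Jython, the UDF name pyLen, and the sample DataFrame are made up for illustration, and it assumes Jython (which supports only Python 2 syntax) is on the classpath.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf
import org.python.core.PyString
import org.python.util.PythonInterpreter

// Hypothetical helper: one interpreter per executor JVM, created lazily,
// because PythonInterpreter itself is not serializable.
object Jython {
  lazy val interpreter: PythonInterpreter = {
    System.setProperty("python.import.site", "false")
    val interp = new PythonInterpreter
    interp.exec("from __builtin__ import *")
    interp
  }
}

object JythonUdfExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("jython-udf").getOrCreate()
    import spark.implicits._

    // Wrap the Python `len` call from the snippet above as a Spark SQL UDF.
    val pyLen = udf { (s: String) =>
      Jython.interpreter
        .get("len")
        .__call__(new PyString(s))
        .__tojava__(classOf[Integer])
        .asInstanceOf[Int]
    }

    // Use the UDF like any other column expression.
    Seq("Test!", "longer string").toDF("value")
      .withColumn("length", pyLen($"value"))
      .show()

    spark.stop()
  }
}

Note that each executor JVM gets its own interpreter through the lazy val, since the interpreter cannot be serialized and shipped from the driver.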

I still think this will hurt your job's performance badly, and you should reconsider this plan.
