
I want to convert a Scala DataFrame into a pandas DataFrame:

    val collection = spark.read.sqlDB(config)
    collection.show()

    // should end up as something like: df = collection

2 Answers


You are asking for a way of using a Python library from Scala. This is a bit weird to me. Are you sure you have to do that? Maybe you know that, but Scala DataFrames have a good API that will probably give you the functionality you need from pandas.

If you still need pandas, I would suggest writing the data you need to a file (a CSV, for example). Then, from a Python application, you can load that file into a pandas DataFrame and work from there.
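A minimal sketch of the Python side of that handoff, assuming the Scala job has already exported the DataFrame as a headered CSV (the file path and column names below are hypothetical; here a small file is created in place just to make the example self-contained):

```python
import csv
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in for the file a Scala job would produce with something like:
#   collection.write.option("header", "true").csv("/output/collection_csv")
out = Path(tempfile.mkdtemp()) / "part-00000.csv"
with out.open("w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "name"])           # hypothetical columns
    writer.writerows([[1, "alice"], [2, "bob"]])

# Python side: load the exported file into a pandas DataFrame
df = pd.read_csv(out)
print(df.shape)  # (2, 2)
```

Note that Spark writes a directory of `part-*` files rather than a single file, so in practice you would either glob those parts or have the Scala job coalesce to one partition before writing.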

Trying to create a pandas object from Scala is probably overcomplicating things (and I am not sure it is currently possible).


0


If you want to use a pandas-based API in Spark code, you can install the Koalas Python library. Then whatever functions you want from the pandas API can be embedded directly in your Spark code.

To install Koalas:

pip install koalas

2 Comments

I think the collection variable here is a DataFrame, so toPandas() should be available on it. If you apply toPandas(), it will return a pandas-based DataFrame. This link gives more information about installing Koalas and how to use it: medium.com/future-vision/…
Not related to my question.
