I want to convert a Spark DataFrame created in Scala into a pandas DataFrame.
val collection = spark.read.sqlDB(config)
collection.show()
// Should be something like: df = collection
You are asking for a way to use a Python library from Scala, which seems odd to me. Are you sure you have to do that? You may already know this, but the Spark DataFrame API in Scala is rich and will probably give you the functionality you need from pandas.
If you still need pandas, I would suggest writing the data you need to a file (a CSV, for example). A Python application can then load that file into a pandas DataFrame and work from there.
Trying to create a pandas object from Scala is probably overcomplicating things (and I am not sure it is currently possible).
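The file-based handoff suggested above could look roughly like this on the Python side. This is only a sketch under some assumptions: the Scala job exported the data with something like collection.write.option("header", "true").csv(...), so Spark produced a directory of part-*.csv files that each start with a header row. The function name load_spark_csv and any path you pass to it are hypothetical, not something from the question.

```python
import glob
import os

import pandas as pd


def load_spark_csv(export_dir: str) -> pd.DataFrame:
    """Read the part files Spark wrote under export_dir into one pandas DataFrame.

    Assumes the export was written with a header row in every part file.
    """
    # Spark writes one part-*.csv file per partition, plus marker files
    # such as _SUCCESS, which the glob pattern skips.
    part_files = sorted(glob.glob(os.path.join(export_dir, "part-*.csv")))

    # Skip zero-byte part files, which pandas cannot parse.
    part_files = [path for path in part_files if os.path.getsize(path) > 0]
    if not part_files:
        raise FileNotFoundError(f"no non-empty part-*.csv files in {export_dir}")

    frames = [pd.read_csv(path) for path in part_files]
    return pd.concat(frames, ignore_index=True)
```

You would then call it with the export directory, for example df = load_spark_csv("/tmp/collection_csv"), where the path is just a placeholder. Reading the part files one by one avoids having to force Spark to write a single file, which it does not do by default.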
If you want to use a pandas-style API in Spark code, I think you can install Koalas, a Python library. You can then use functions from the pandas API directly in your Spark code.
To install Koalas:
pip install koalas
If collection is a DataFrame in PySpark, it has a toPandas() method; calling it returns a pandas DataFrame. This link gives more information about installing Koalas and how to use it: medium.com/future-vision/…