I have been using Spark-excel (https://github.com/crealytics/spark-excel) to write output to a single sheet of an Excel file. However, I am unable to write the output to different sheets (tabs) of the same file.
Can anyone suggest any alternative?
Thanks, Sai
I would suggest splitting the problem into two phases: first write one DataFrame to the Excel file, then append the other DataFrames to the same file as additional sheets.
My Python code for this:
location = "/mnt/my_folder"  # Adapt to the folder you want to save to.

# First write: creates (or replaces) the file with the first sheet.
(
    first_df.write
    .format('com.crealytics.spark.excel')
    .option("dataAddress", "'First sheet'!A1")
    .option("header", "true")
    .mode("overwrite")
    .save(f"{location}/export.xlsx")
)

# Second write: same file, different sheet.
(
    second_df.write
    .format('com.crealytics.spark.excel')
    .option("dataAddress", "'Second sheet'!A1")
    .option("header", "true")
    .mode("append")  # Append rather than overwrite, so the first sheet is kept and the new sheet (tab) is added to the same file.
    .save(f"{location}/export.xlsx")
)
Note: this worked for me with library version com.crealytics:spark-excel_2.12:0.13.5.
If you have a newer version of the library installed, check https://github.com/nightscape/spark-excel on how to update the above code snippet in order to use the newer version of the library properly. You might need to change the .format(...) argument.
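If you have more than two DataFrames, the same overwrite-then-append pattern can be wrapped in a loop. The following is only a sketch: the helper names sheet_write_plan and write_sheets are made up for illustration, it assumes the 0.13.5 behaviour described above, and it assumes sheet names contain no single quotes.

```python
def sheet_write_plan(sheet_names):
    """Return (dataAddress, save mode) pairs: overwrite first, append afterwards."""
    plan = []
    for index, name in enumerate(sheet_names):
        mode = "overwrite" if index == 0 else "append"
        plan.append((f"'{name}'!A1", mode))
    return plan


def write_sheets(frames, path):
    """Write each DataFrame in `frames` (sheet name -> DataFrame) to its own sheet.

    Hypothetical helper: `frames` is a dict whose insertion order decides
    which sheet is written first (and therefore which write overwrites).
    """
    plan = sheet_write_plan(list(frames))
    for (address, mode), df in zip(plan, frames.values()):
        (
            df.write
            .format("com.crealytics.spark.excel")
            .option("dataAddress", address)
            .option("header", "true")
            .mode(mode)
            .save(path)
        )
```

Usage would look like write_sheets({"First sheet": first_df, "Second sheet": second_df}, f"{location}/export.xlsx").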
I know the original question requested a solution in Scala. It's been a long time since I've written that. I'll make this answer a Community wiki so somebody who knows Scala can edit this answer, so the answer is in Scala rather than Python.
sheetName parameter?