I have been using spark-excel (https://github.com/crealytics/spark-excel) to write output to a single sheet of an Excel file. However, I am unable to write output to different sheets (tabs) of the same file.

Can anyone suggest an alternative?

Thanks, Sai

  • Can't you just split your data into several tables and save each one separately while specifying a different sheetName parameter? Commented Feb 23, 2018 at 23:26
  • I tried that. The data gets overwritten, since the library supports only overwrite as the save mode. Commented Feb 26, 2018 at 14:48
  • Hi @Bharath, today, 7 years later, append mode is possible. I've written a minimal code example in an answer below. Commented Sep 22 at 8:39

2 Answers


I would suggest splitting the problem into two phases:

  1. Save the data into multiple CSV files using multiple Spark flows.
  2. Write an application that converts the multiple CSV files into a single Excel file, using e.g. this Java library: http://poi.apache.org/
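
As a rough sketch of the glue between the two phases (in Python rather than Java, and not part of the original answer): Spark writes each CSV output as a directory of part-*.csv files, so a converter first has to gather those into one table per sheet. The function names and the one-subdirectory-per-sheet layout below are assumptions made for illustration. The resulting dictionary would then be handed to whichever workbook library you choose, such as Apache POI.

```python
import csv
from pathlib import Path


def read_spark_csv_dir(directory, has_header=True):
    """Return all rows from the part-*.csv files found in `directory`."""
    rows = []
    for part in sorted(Path(directory).glob("part-*.csv")):
        with part.open(newline="", encoding="utf-8") as handle:
            part_rows = list(csv.reader(handle))
        if has_header and rows:
            # A header row has already been collected from an earlier
            # part file, so drop the repeated one.
            part_rows = part_rows[1:]
        rows.extend(part_rows)
    return rows


def collect_sheets(base_dir, has_header=True):
    """Map each subdirectory name (assumed to be the sheet name) to its rows."""
    return {
        child.name: read_spark_csv_dir(child, has_header)
        for child in sorted(Path(base_dir).iterdir())
        if child.is_dir()
    }
```

Calling collect_sheets("/mnt/my_folder/csv") (a hypothetical path) would return something like {"sheet name": [[header...], [row...], ...]}, which is the shape most workbook writers can consume sheet by sheet.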

1 Comment

Thank you for the suggestion. I will try that out and let you know.

You can write the first DataFrame to the Excel file, then append the other DataFrames to the same file as additional sheets.

My Python code for this:

location = "/mnt/my_folder"  # Adapt to the folder you want to save to.

(
  first_df.write
  .format('com.crealytics.spark.excel')
  .option("dataAddress", "'First sheet'!A1")
  .option("header", "true")
  .mode("overwrite")
  .save(f"{location}/export.xlsx")
)

(
  second_df.write
  .format('com.crealytics.spark.excel')
  .option("dataAddress", "'Second sheet'!A1")
  .option("header", "true")
  .mode("append")  # Append instead of overwrite the second time, so the first sheet is kept and the file ends up with multiple sheets (tabs).
  .save(f"{location}/export.xlsx")
)

Note: this worked for me with library version com.crealytics:spark-excel_2.12:0.13.5.

If you have a newer version of the library installed, check https://github.com/nightscape/spark-excel for how to update the above code snippet to work with it. You might need to change the .format(...) argument.
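
If you have more than two DataFrames, the same overwrite-then-append pattern can be wrapped in a loop. This is only a sketch: sheet_write_plan and write_sheets are names I made up for illustration, and the only assumption is that the DataFrames expose the same .write API used above. Sheet names containing a single quote are not handled.

```python
def sheet_write_plan(sheet_names, start_cell="A1"):
    """Return (dataAddress, save mode) pairs: overwrite first, append after."""
    plan = []
    for index, name in enumerate(sheet_names):
        mode = "overwrite" if index == 0 else "append"
        plan.append((f"'{name}'!{start_cell}", mode))
    return plan


def write_sheets(frames_by_sheet, path, excel_format="com.crealytics.spark.excel"):
    """Write each DataFrame in `frames_by_sheet` to its own sheet of `path`.

    `frames_by_sheet` maps sheet name -> DataFrame; dict order decides
    which sheet is written first (and therefore creates the file).
    """
    plan = sheet_write_plan(list(frames_by_sheet))
    for (address, mode), frame in zip(plan, frames_by_sheet.values()):
        (
            frame.write
            .format(excel_format)
            .option("dataAddress", address)
            .option("header", "true")
            .mode(mode)
            .save(path)
        )
```

Usage would look like write_sheets({"First sheet": first_df, "Second sheet": second_df}, f"{location}/export.xlsx"). Keeping the plan in a separate function makes it easy to check the addresses and modes without a Spark session.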

I know the original question requested a solution in Scala, but it's been a long time since I've written any. I'll make this answer a Community wiki so that somebody who knows Scala can edit it to use Scala rather than Python.
