
I want to fetch a docx file from Azure Blob Storage, convert it to PDF, and save the result back to Azure Blob Storage. I want to use pypandoc for the docx-to-PDF conversion:

pypandoc.convert_file('abc.docx', format='docx', to='pdf', outputfile='abc.pdf')

However, I want to run this code in an Azure Function, where I will not have enough space to save files, so I am downloading the file from Azure Blob Storage into a BytesIO stream as follows:

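from io import BytesIO
from azure.storage.blob import BlobServiceClient

# cs, container_name and filename are defined elsewhere
blob_service_client = BlobServiceClient.from_connection_string(cs)
container_client = blob_service_client.get_container_client(container_name)
blob_client = container_client.get_blob_client(filename)
streamdownloader = blob_client.download_blob()

# Read the blob into an in-memory buffer instead of a file on disk
stream = BytesIO()
streamdownloader.download_to_stream(stream)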

Now I want to convert the docx file, which is available through this stream, to PDF. The converted PDF should also end up in a BytesIO stream so that I can upload it to Blob Storage without saving it to disk. However, pypandoc raises RuntimeError: source_file is not a valid path. Could you suggest another way to convert docx to PDF that can handle BytesIO input? Note that I will be working in a Linux environment, where libraries like doc2pdf are not supported.
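For illustration, one possible workaround is sketched below. It rests on the error message above, which suggests that convert_file expects a real file path. It also assumes, without having verified it for any particular hosting plan, that the function host has a writable temporary directory large enough for one document. The names convert_stream and pandoc_docx_to_pdf are hypothetical helpers, not part of pypandoc or the Azure SDK. The sketch writes the stream to a short-lived temporary directory, converts it there, and reads the PDF back into a BytesIO.

```python
import os
import tempfile
from io import BytesIO


def pandoc_docx_to_pdf(src_path, dst_path):
    """Convert with pypandoc; requires pandoc plus a PDF engine on the host."""
    import pypandoc  # imported lazily so convert_stream works without it
    pypandoc.convert_file(src_path, to="pdf", format="docx", outputfile=dst_path)


def convert_stream(docx_stream, converter=pandoc_docx_to_pdf):
    """Pass a DOCX BytesIO through a temporary directory and return a PDF BytesIO.

    `converter` is any callable taking (src_path, dst_path); it is a
    parameter only so that another conversion tool can be swapped in.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        src_path = os.path.join(tmp_dir, "input.docx")
        dst_path = os.path.join(tmp_dir, "output.pdf")

        # getvalue() returns the whole buffer regardless of the stream position
        with open(src_path, "wb") as src:
            src.write(docx_stream.getvalue())

        converter(src_path, dst_path)

        with open(dst_path, "rb") as dst:
            pdf_stream = BytesIO(dst.read())
    # The temporary directory and both files are deleted at this point
    return pdf_stream
```

With the stream from the download code above, this could be used as pdf_stream = convert_stream(stream), followed by uploading pdf_stream through a blob client for the target blob, for example with upload_blob(pdf_stream, overwrite=True). The files only exist on disk for the duration of the call, but the document is still held in memory twice (input and output), so this is not suitable for very large files.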

1
  • Have you found a solution to this issue? Commented Mar 30, 2022 at 15:11
