I use Spark to read JSON files that appear every day in a folder with the path pattern yyyy/mm/dd and convert them to Iceberg format. The JSON and Iceberg folders are in an S3 bucket, on different paths.

I'm using a stream reader like this:

jsondf = spark.readStream.format("json").schema(myschema).option("cleanSource", "archive").option("sourceArchiveDir", "s3a://mybucket/myarchivepath").load("s3a://mybucket/sourcefolder/yyyy/mm/dd").select("*")
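The `yyyy/mm/dd` in the load path above stands for the dated folders. As a minimal sketch of how one day's path could be built, assuming zero-padded month and day folders (`daily_source_path` is a hypothetical helper, not part of Spark):

```python
from datetime import date


def daily_source_path(base: str, day: date) -> str:
    """Return the source folder for one day, e.g. <base>/2024/04/17."""
    # Assumption: folders are named yyyy/mm/dd with zero-padded month and day.
    return f"{base.rstrip('/')}/{day:%Y/%m/%d}"


# Path for one specific day under the source folder from the question.
path = daily_source_path("s3a://mybucket/sourcefolder", date(2024, 4, 17))
```

The resulting string would be passed to `.load(...)` in place of the literal `yyyy/mm/dd` path.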

I have tried several stream writer configurations. A continuously running stream writer (no trigger set) works well and archives files as they appear. But we don't have that many files, so I want to use a trigger. A once=True trigger seems to be the wrong choice for archiving, but I don't know why (is there a reason for once=True to fail when archiving? It looks to me like the natural choice). Because of this I'm trying availableNow=True, as in:

jsondf.writeStream.trigger(availableNow=True).format("iceberg").option("checkpointLocation", "s3a://mybucket/chkpointfolder").outputMode("append").start(jsontable)
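For reference, a minimal sketch of the same write wrapped so that the call blocks until the triggered run has finished. Starting a streaming query returns immediately, so a script that does not wait may exit before any file is read. Whether that explains the behaviour here is not confirmed. `run_available_now` is a hypothetical helper, and `toTable` is used in place of `start(jsontable)` on the assumption that `jsontable` holds an Iceberg table identifier:

```python
def run_available_now(df, table_name, checkpoint_location):
    """Write a streaming DataFrame to an Iceberg table, then wait for completion."""
    query = (
        df.writeStream
        .trigger(availableNow=True)  # process everything currently available, then stop
        .format("iceberg")
        .option("checkpointLocation", checkpoint_location)
        .outputMode("append")
        .toTable(table_name)  # assumption: table_name is an Iceberg table identifier
    )
    # Starting the query returns right away; block here until the run ends.
    query.awaitTermination()
    return query
```

It would be called as `run_available_now(jsondf, jsontable, "s3a://mybucket/chkpointfolder")`.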

Excuse any typos. I'm writing from a mobile.

Given that the version without a trigger works and archives, why does adding a trigger make archiving fail? In fact, with the trigger I don't see the stream reading any files at all.

PS: I'm using Spark 3.4.1. It seems the once trigger is deprecated and availableNow is recommended instead.

  • Sorry, I didn't get it. Does availableNow work for your use case or not? Commented Apr 17 at 12:56
  • What does the title say? It does not work. I started with once=True and it didn't work. availableNow=True does not work either. Commented Apr 17 at 13:58
  • I forgot the title once I read the description, as the description mostly talks about once. So I asked the question without checking the title. No need to be rude about it, I suppose, considering we are all here to help each other out :) Commented Apr 17 at 14:13
