  1. Monitor the container input/landing.
  2. A .json file arrives in the format yy/mm/dd/myfile.json.
  3. If the file is valid JSON --> move it to input/staging/ as a .json file.
  4. If not valid --> copy it to input/rejected/ as a .json file.

The function triggers multiple times, once for each subfolder, and the output folder ends up with 3 copies of the same file. How can the function be modified to trigger only once and copy the file only once?

my __init__.py:

import logging
import azure.functions as func
import json

def main(myblob: func.InputStream, inputBlob: bytes, outputBlob1: func.Out[bytes], outputBlob2: func.Out[bytes]):
    logging.info(f"Python blob trigger function processed blob \n"
                 f"Name: {myblob.name}\n"
                 f"Blob Size: {myblob.length} bytes")
    
    # Read the contents of the input blob
    blob_content = myblob.read()
    processed_file = validateJSON(blob_content) # returns True or False

    # if the content passed JSON validation
    if processed_file:
        # reuse the bytes already read; the stream would be empty on a second read()
        outputBlob1.set(blob_content)
        logging.info(f"Blob copied to outputBlob1: {myblob.name}")
    else:
        outputBlob2.set(blob_content)
        logging.info(f"Blob copied to outputBlob2: {myblob.name}")

# func to validate json data (not file!)
def validateJSON(jsonData):
    try:
        json.loads(jsonData)
    except ValueError as err:
        return False
    return True
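
For illustration only (not part of the function app): json.loads accepts both bytes and str, so the validator can be fed the raw blob content directly.

# quick check of validateJSON, assuming the definition above is in scope
print(validateJSON(b'{"date": "2023-07-08"}'))  # True
print(validateJSON(b'not valid json'))          # False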

my function.json file:

{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "name": "myblob",
      "type": "blobTrigger",
      "direction": "in",
      "path": "input/landing/{name}",
      "connection": "mystorageaccount"
    },
    {
      "name": "inputBlob",
      "type": "blob",
      "dataType": "binary",
      "direction": "in",
      "path": "input/landing/{name}",
      "connection": "mystorageaccount"
    },
    {
      "name": "outputBlob1",
      "type": "blob",
      "dataType": "binary",
      "direction": "out",
      "path": "input/staging/{rand-guid}.json",
      "connection": "mystorageaccount"
    },
    {
      "name": "outputBlob2",
      "type": "blob",
      "dataType": "binary",
      "direction": "out",
      "path": "input/regected/{rand-guid}.json",
      "connection": "mystorageaccount"
    }
  ]
}

my terminal output:

[2023-07-08T14:44:03.452Z] Host lock lease acquired by instance ID '000000000000000000000000FA91B3A1'.
[2023-07-08T14:46:27.618Z] Executing 'Functions.BlobTrigger1' (Reason='New blob detected(LogsAndContainerScan): input/landing/2023/07',

[2023-07-08T14:46:28.031Z] Python blob trigger function processed blob 
Name: input/landing/2023/07
Blob Size: None bytes
[2023-07-08T14:46:28.164Z] Blob copied to outputBlob2: input/landing/2023/07
[2023-07-08T14:46:28.282Z] Executing 'Functions.BlobTrigger1' (Reason='New blob detected(LogsAndContainerScan): input/landing/2023/07/08', 

[2023-07-08T14:46:28.485Z] Python blob trigger function processed blob 
Name: input/landing/2023/07/08
Blob Size: None bytes
[2023-07-08T14:46:28.500Z] Blob copied to outputBlob2: input/landing/2023/07/08

[2023-07-08T14:46:28.991Z] Executed 'Functions.BlobTrigger1' (Succeeded, Id=6a6e5f58-b49e-46c9-a019-c8814c87e5fb, Duration=1656ms)
[2023-07-08T14:46:29.166Z] Executed 'Functions.BlobTrigger1' (Succeeded, Id=cfe1f858-fe5e-46cd-85fd-281fff7a0204, Duration=1057ms)
[2023-07-08T14:46:29.330Z] Executing 'Functions.BlobTrigger1' (Reason='New blob detected(LogsAndContainerScan): input/landing/2023/07/08/invalidJSON.json', Id=5a81c13f-b633-4be1-bdac-7281389f4403)

[2023-07-08T14:46:29.629Z] Python blob trigger function processed blob 
Name: input/landing/2023/07/08/invalidJSON.json
Blob Size: None bytes
[2023-07-08T14:46:29.629Z] Blob copied to outputBlob2: input/landing/2023/07/08/invalidJSON.json
[2023-07-08T14:46:30.211Z] Executed 'Functions.BlobTrigger1' (Succeeded, Id=5a81c13f-b633-4be1-bdac-7281389f4403, Duration=1157ms)

result: multiple copies


1 Answer

Azure blob trigger python function executes multiple times for each subfolder and creates multiple copies of the file

I have reproduced this in my environment, and below is the code that worked for me:

function.json:

{
  "bindings": [
    {
      "name": "myblob",
      "path": "samples-workitems/land/{name}",
      "connection": "AzureWebJobsStorage",
      "direction": "in",
      "type": "blobTrigger"
    },
    {
      "name": "outputBlob1",
      "direction": "out",
      "type": "blob",
      "connection": "AzureWebJobsStorage",
      "path": "samples-workitems/approved/{rand-guid}.json"
    },
    {
      "name": "outputBlob2",
      "direction": "out",
      "type": "blob",
      "connection": "AzureWebJobsStorage",
      "path": "samples-workitems/rejected/{rand-guid}.json"
    }
  ]
}

__init__.py:

import logging
import azure.functions as func
import json


def main(myblob: func.InputStream, outputBlob1: func.Out[bytes], outputBlob2: func.Out[bytes]):
    logging.info(f"Python blob trigger function processed blob \n"
                 f"Name: {myblob.name}\n"
                 f"Blob Size: {myblob.length} bytes")

    blob_content1 = myblob.read()
    processed_file = validateJSON(blob_content1)  # returns True or False

    # if pass json validation
    if processed_file:
        outputBlob1.set(blob_content1)
        logging.info(f"Blob copied to outputBlob1: {myblob.name}")
    else:
        outputBlob2.set(blob_content1)
        logging.info(f"Blob copied to outputBlob2: {myblob.name}")


# func to validate json data (not file!)
def validateJSON(jsonData1):
    try:
        json.loads(jsonData1)
    except ValueError as err:
        return False
    return True

Output:

If Success: (output screenshot)

If rejected: (output screenshot)

This is the code and process that worked for me. Change your function.json to match mine (you have 4 bindings; reduce them to 3) and your __init__.py (the inputBlob binding is unnecessary, so remove it). Adjust your code accordingly and you will get the desired output, as I did.


3 Comments

Thanks. But myblob receives a daily drop of a folder containing a JSON file (not just a bare file). So every day "input/landing" receives a dynamic folder like "2023/08/08/.json". That is why the function executes multiple times (once for each folder and subfolder).
But it seems that simply adding the .json extension to the trigger path in function.json does the trick: "path": "input/landing/{name}.json" (see the sketch after these comments).
Yes, you are correct, that does the trick. Follow my code and you will get the same output as I did.
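
For completeness, here is an editorial sketch (not taken from the answer above) of the asker's __init__.py with both fixes folded in: the trigger path narrowed to "input/landing/{name}.json" so folder-scan events never match, plus a defensive name check inside the function as a safeguard. It reuses only names already present in the question (myblob, outputBlob1, outputBlob2, validateJSON):

import logging
import azure.functions as func
import json


def main(myblob: func.InputStream, outputBlob1: func.Out[bytes], outputBlob2: func.Out[bytes]):
    # Safeguard: if the trigger path is ever broadened again, skip
    # directory-scan invocations instead of copying them.
    if not myblob.name.lower().endswith(".json"):
        logging.info(f"Skipping non-JSON event: {myblob.name}")
        return

    blob_content = myblob.read()  # read once; the stream is exhausted afterwards

    if validateJSON(blob_content):
        outputBlob1.set(blob_content)   # lands in input/staging/{rand-guid}.json
        logging.info(f"Blob copied to staging: {myblob.name}")
    else:
        outputBlob2.set(blob_content)   # lands in input/rejected/{rand-guid}.json
        logging.info(f"Blob copied to rejected: {myblob.name}")


# same helper as in the question: validates JSON content, not file existence
def validateJSON(jsonData):
    try:
        json.loads(jsonData)
    except ValueError:
        return False
    return True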
