
I'm confused about GCP Workflows data size limits. The documentation states:

  • The maximum size of an HTTP response (if saved to a variable, the memory limit for variables applies): 2 MB
  • The maximum cumulative size for variables, arguments, and events: 512 KB

Question: How can you actually use a 2MB HTTP response when you're limited to 512KB for variables?

Example scenario:

main:
  steps:
    # This works fine (256KB response)
    - testSmall:
        call: http.get
        args:
          url: "https://microsoftedge.github.io/Demos/json-dummy-data/256KB.json"
        result: smallData

    # This fails - 1MB response exceeds 512KB variable limit
    - testLarge:
        call: http.get
        args:
          url: "https://microsoftedge.github.io/Demos/json-dummy-data/1MB.json"
        result: largeData  # ← 512KB limit exceeded

    # This never executes due to previous step failure
    - useSubsetOfLargeData:
        assign:
          - largeDataSubset: ${largeData[0]}

The problem: Even though the HTTP response can be up to 2MB, you must assign it to the result variable to access the data, which hits the 512KB cumulative limit.

How can I actually utilize larger HTTP responses (1-2MB) in GCP Workflows?

  • Your example is valid in that it demonstrates a limit of Workflows, but it also shows a probable anti-pattern for Workflows use. I assume that 2MB is an evidence-based HTTP response size for the type of calls that Workflows is designed to work with. The pattern would be to extract some subset of structured (not blob) data from the response and use this as input to subsequent steps. In your example, the static URLs you include would likely be the data (variable values) and would be given to some step that processes them. Commented Jul 29 at 3:34
  • Yes, those images were just an example; assume JSON was returned instead. The HTTP response would still return 660KB of JSON data, which would cause that step to fail. I think we cannot access and use "largeData" anywhere else in the workflow because that step fails. How do we get a response (below 2MB and above 512KB) and still use it in other steps? Commented Jul 29 at 3:50
  • I changed the example now to use JSON Commented Jul 29 at 3:56
  • You're correct. You have options. In your 1MB.json example, you have 1MB of data but it's ~3000 records, so ~350 bytes/record (well under the 512KB limit). So you should be able to iterate over the response body and process the elements in each record. Alternatively, you would customarily be able to invoke a REST API (rather than a simple JSON blob GET) with filtering to restrict the returned records/fields. Alternatively, you could have Workflows invoke e.g. Cloud Run to pre-process the data for you and reduce it to the subset of data that you need. Commented Jul 29 at 14:48
  • Apologies for the delayed response, recovering from a power outage... I tried this and you're correct: there's no way to process a >512KB response directly. I assumed (incorrectly) that it would be possible to either iterate directly (stream) over the response or extract a subset, but neither appears possible. Commented Jul 30 at 17:28

1 Answer


While Workflows can receive HTTP responses up to 2MB, you cannot store the entire response in a variable if it exceeds 512KB.

To work around this:

  1. Use API pagination or filtering to fetch smaller pieces of data (each ≤512KB) across multiple steps, then iterate over those chunks inside your workflow. Call the API repeatedly with pagination parameters (for example, ?page=1&size=100) so that each response stays below the 512KB limit. This way, no single variable holds more than 512KB, and the whole dataset is processed piece by piece.

Here’s a sample YAML pattern you can adapt (the API URL and the response fields records and hasMore are placeholders for whatever your paginated API returns):

main:
  steps:
    - init:
        assign:
          - page: 1
          - processedCount: 0
    - fetchPage:
        call: http.get
        args:
          # Build the URL by concatenation; Workflows expressions cannot be
          # embedded inside a larger string literal.
          url: ${"https://api.example.com/records?page=" + string(page) + "&size=100"}
        result: response
    - processRecords:
        for:
          value: record
          in: ${response.body.records}
          steps:
            - handleRecord:
                assign:
                  # Replace with your real per-record processing.
                  - processedCount: ${processedCount + 1}
    - decideNextPage:
        switch:
          - condition: ${response.body.hasMore}
            next: nextPage
          - condition: true
            next: done
    - nextPage:
        assign:
          - page: ${page + 1}
        next: fetchPage
    - done:
        return: ${processedCount}

This pattern ensures no single variable exceeds the 512KB limit.

If pagination/filtering isn't available, delegate the data processing to a Cloud Run service or Cloud Function invoked from your workflow; those can fetch and process the large payload and return only the small subset your workflow actually needs.
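
For example, here is a minimal sketch of that approach, assuming a Cloud Run service you deploy yourself (the URL, the /reduce path, and the sourceUrl/fields request parameters are hypothetical) that downloads the large JSON, filters it server-side, and returns only a small (<512KB) subset:

main:
  steps:
    - reduceViaCloudRun:
        call: http.post
        args:
          # Hypothetical Cloud Run endpoint that fetches and filters the data.
          url: "https://data-reducer-abc123-uc.a.run.app/reduce"
          auth:
            type: OIDC
          body:
            sourceUrl: "https://microsoftedge.github.io/Demos/json-dummy-data/1MB.json"
            fields: ["id", "name"]
        result: reducedData
    - useReducedData:
        # The reduced body is small enough to store in a variable and reuse.
        return: ${reducedData.body}

Because the workflow only ever sees the reduced payload, the 512KB variable limit is never hit.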


1 Comment

Pagination/filtering is not available. I also cannot delegate data processing to Cloud Run or Cloud Functions in subsequent steps because of the 512KB data limit. Because of this, I think this question cannot be solved.
