I'm confused about GCP Workflows data size limits. The documentation states:
- The maximum size of an HTTP response (if saved to a variable, the memory limit for variables applies): 2 MB
- The maximum cumulative size for variables, arguments, and events: 512 KB
Question: How can you actually use a 2MB HTTP response when you're limited to 512KB for variables?
Example scenario:
main:
  steps:
    # This works fine (256KB response)
    - testSmall:
        call: http.get
        args:
          url: "https://microsoftedge.github.io/Demos/json-dummy-data/256KB.json"
        result: smallData
    # This fails - 1MB response exceeds 512KB variable limit
    - testLarge:
        call: http.get
        args:
          url: "https://microsoftedge.github.io/Demos/json-dummy-data/1MB.json"
        result: largeData  # ← 512KB limit exceeded
    # This never executes due to previous step failure
    - useSubsetOfLargeData:
        assign:
          - largeDataSubset: ${largeData[0]}
The problem: Even though the HTTP response can be up to 2MB, you must assign it to a result variable to access the data, and that assignment hits the 512KB cumulative limit.
How can I actually utilize larger HTTP responses (1-2MB) in GCP Workflows?
In the 1MB.json example, you have 1 MB of data, but it's ~3,000 records, so roughly 350 bytes per record (well under 512 KB). You should therefore be able to iterate over the response body and process the elements in each record. Alternatively, you would typically be able to invoke a REST API (rather than a simple GET of a JSON blob) with filtering to restrict the returned records/fields. Alternatively, you could have Workflows invoke e.g. Cloud Run to pre-process the data for you and reduce it to the subset of data that you need.
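As a rough sketch of the last two options, assuming a hypothetical filterable API endpoint and a hypothetical Cloud Run pre-processor service (the URLs, query parameters, and request body fields below are illustrative placeholders, not from the original example):

main:
  steps:
    # Option A: let the API filter server-side so only the needed
    # records/fields ever reach Workflows (hypothetical query params)
    - fetchFiltered:
        call: http.get
        args:
          url: "https://example.com/api/records"
          query:
            fields: "id,name"
            limit: 100
        result: filteredData
    # Option B: delegate to a Cloud Run service that downloads the
    # full 1MB payload and returns only the subset Workflows needs
    - preprocessViaCloudRun:
        call: http.post
        args:
          url: "https://my-preprocessor-abc123-uc.a.run.app/reduce"
          auth:
            type: OIDC
          body:
            sourceUrl: "https://microsoftedge.github.io/Demos/json-dummy-data/1MB.json"
            keepFields: ["id", "name"]
        result: reducedData
    - returnResult:
        return: ${reducedData}

In both cases the response saved to the result variable is already small enough to stay within the 512KB cumulative variable limit, because the filtering/reduction happens before the data enters the workflow.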