Wasm irreducible loop transformation #121728
Determine how to emit Wasm control flow from the JIT's control flow graph. Relies on loop-aware RPO to determine the block order. Currently only handles the main method. Assumes irreducible loops have been fixed upstream (which is not yet guaranteed; bails out if not so). Doesn't actually do any emission, just prints a textual description in the JIT dump (along with a dot markup version). Uses only LOOP and BLOCK. Tries to limit the extent of BLOCK. Run for now as an optional phase even if not targeting Wasm, to do some stress testing. Contributes to dotnet#121178
Loops for Wasm control flow codegen don't involve EH or runtime mediated control flow transfers. Implement a custom block successor enumerator for Wasm, and adjust `fgRunDFS` to allow using this and also to generalize how the DFS is initiated. Use this to build a "Wasm" DFS. In that DFS handle both the main method and all funclets (by specifying funclet entries as additional DFS starting points). Update the loop finding code to make suitable changes when it is driven from a "Wasm" DFS instead of the typical all successor / all predecessor DFS. Remove the restriction in the Wasm control flow codegen that only handles the main method; now it works for the main method and all funclets. Contributes to dotnet#121178.
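A minimal sketch of the idea in this commit, assuming a toy block model rather than the JIT's real `BasicBlock` and `fgRunDfs` signatures: the DFS is templated on a successor enumerator and takes a vector of entry blocks. All names here (`Block`, `AllSuccessors`, `WasmSuccessors`, `RunDfs`) are hypothetical.

```cpp
#include <functional>
#include <vector>

// Hypothetical block: normal flow successors plus EH successors (e.g. handler entries).
struct Block
{
    std::vector<int> normalSuccs;
    std::vector<int> ehSuccs;
};

// Enumerates every successor, as the typical DFS would.
struct AllSuccessors
{
    template <typename F>
    static void Visit(const Block& b, F f)
    {
        for (int s : b.normalSuccs) f(s);
        for (int s : b.ehSuccs) f(s);
    }
};

// Enumerates only non-exceptional successors, as assumed for the Wasm DFS.
struct WasmSuccessors
{
    template <typename F>
    static void Visit(const Block& b, F f)
    {
        for (int s : b.normalSuccs) f(s);
    }
};

// DFS from several entry blocks; returns the blocks in postorder.
template <typename SuccessorEnumerator>
std::vector<int> RunDfs(const std::vector<Block>& blocks, const std::vector<int>& entries)
{
    std::vector<bool> visited(blocks.size(), false);
    std::vector<int>  postorder;
    std::function<void(int)> visit = [&](int b) {
        if (visited[b]) return;
        visited[b] = true;
        SuccessorEnumerator::Visit(blocks[b], [&](int s) { visit(s); });
        postorder.push_back(b);
    };
    for (int e : entries) visit(e);
    return postorder;
}
```

With the Wasm enumerator a handler is not reachable from the main method entry, so in this sketch the funclet entry has to be passed as an extra starting point, e.g. `RunDfs<WasmSuccessors>(blocks, {0, 2})`.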
…nsform to handle catchret better
Pull Request Overview
This PR implements transformation of irreducible loops for WebAssembly control flow generation. The key changes include:
- Introduction of SCC (Strongly Connected Component) detection using Kosaraju's algorithm to identify irreducible loops
- Transformation of multi-entry SCCs into reducible loops via switch dispatch blocks
- New Wasm-specific DFS that respects WebAssembly control flow semantics (ignoring exceptional flow and funclet returns)
- Reordering of compilation phases to run the SCC transformation before lowering
Reviewed Changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| fgwasm.h | New header defining WasmSuccessorEnumerator, FgWasm class, and Scc class for Wasm control flow analysis |
| fgwasm.cpp | Implementation of Wasm DFS, SCC detection (Kosaraju's algorithm), and SCC transformation via switch dispatch |
| jitconfigvalues.h | Changed JitWasmControlFlow default from 0 to 1 to enable Wasm control flow by default |
| flowgraph.cpp | Updated DFS to support multiple entry blocks and special handling for Wasm control flow |
| fgdiagnostic.cpp | Updated diagnostic checks to use entry block vectors |
| compiler.hpp | Modified fgRunDfs to accept SuccessorEnumerator template parameter and entry block vector |
| compiler.h | Added declarations for new methods, forWasm flag in FlowGraphDfsTree |
| jiteh.cpp | Added overload of bbFindInnermostCommonTryRegion accepting region index |
| compphases.h | Added two new phases: PHASE_WASM_TRANSFORM_SCCS and PHASE_DFS_BLOCKS_WASM |
| compmemkind.h | Added WasmSccTransform memory kind |
| compiler.cpp | Moved Wasm control flow phases earlier in compilation pipeline |
| CMakeLists.txt | Added fgwasm.h to JIT_WASM_HEADERS |
| fgbasic.cpp | Updated fgSplitEdge to support BBJ_CALLFINALLYRET |
Comments suppressed due to low confidence (2)
src/coreclr/jit/fgwasm.cpp:1
- Corrected spelling of 'modifcations' to 'modifications'.
// Licensed to the .NET Foundation under one or more agreements.
src/coreclr/jit/fgwasm.cpp:1
- The variable 'numBlocks' is misleadingly named; it actually holds a BasicBlock pointer, not a count. Should be renamed to 'block'.
// Licensed to the .NET Foundation under one or more agreements.
@dotnet/jit-contrib PTAL. The SCC transformation is currently enabled to verify it correctly transforms flow, so there are fairly substantial diffs. Surprisingly TP seems flat, so perhaps this transformation is making life easier for later phases, despite creating more blocks and more IR. Async is a big source of irreducible loops, so many diffs are seen in [...]. This transformation will be disabled before merging.
Resolved the (self-inflicted) merge conflicts.
The parts I understand LGTM
@dotnet/jit-contrib any other comments? If not, I need one of you to approve this.
If the Wasm DFS detects improper loop headers, then we have irreducible loops that cannot be expressed in Wasm control flow.
To fix this, run a pass to find the SCCs in the flow graph using Kosaraju's algorithm. Then invoke this algorithm recursively on the subgraph formed from the nodes in each SCC, minus the SCC entry nodes (nodes in the SCC with preds not in the SCC). Repeat until all "nested" SCCs are identified. This represents the full set of irreducible loops we need to transform. Note no SCCs share headers but nested SCCs will share interior blocks.
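The recursive search described above can be sketched roughly as follows. This is an illustration over a plain adjacency list, not the code in `fgwasm.cpp`; `Graph`, `SccInfo`, `FindSccs`, and `CollectSccs` are hypothetical names, and the sketch assumes blocks are numbered densely from 0.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical stand-in for the flow graph: succs[b] lists the successors of block b.
using Graph = std::vector<std::vector<int>>;

struct SccInfo
{
    std::vector<int> blocks;  // SCC members, sorted by block number
    std::vector<int> entries; // members with at least one pred outside the SCC
};

// Kosaraju's algorithm over the blocks where live[b] is true.
// Returns only components that contain a cycle.
static std::vector<std::vector<int>> FindSccs(const Graph& succs, const std::vector<bool>& live)
{
    const int n = static_cast<int>(succs.size());
    Graph preds(n);
    for (int b = 0; b < n; b++)
        for (int s : succs[b])
            if (live[b] && live[s]) preds[s].push_back(b);

    // Pass 1: DFS postorder on the forward graph.
    std::vector<bool> seen(n, false);
    std::vector<int>  postorder;
    std::function<void(int)> forward = [&](int b) {
        seen[b] = true;
        for (int s : succs[b])
            if (live[s] && !seen[s]) forward(s);
        postorder.push_back(b);
    };
    for (int b = 0; b < n; b++)
        if (live[b] && !seen[b]) forward(b);

    // Pass 2: DFS on the reversed graph in reverse postorder; each tree is one SCC.
    std::vector<int>              comp(n, -1);
    std::vector<std::vector<int>> sccs;
    std::function<void(int)> backward = [&](int b) {
        comp[b] = static_cast<int>(sccs.size()) - 1;
        sccs.back().push_back(b);
        for (int p : preds[b])
            if (comp[p] == -1) backward(p);
    };
    for (auto it = postorder.rbegin(); it != postorder.rend(); ++it)
        if (comp[*it] == -1)
        {
            sccs.emplace_back();
            backward(*it);
        }

    std::vector<std::vector<int>> cyclic;
    for (auto& scc : sccs)
    {
        const std::vector<int>& s0 = succs[scc[0]];
        const bool selfLoop = std::find(s0.begin(), s0.end(), scc[0]) != s0.end();
        if (scc.size() > 1 || selfLoop)
        {
            std::sort(scc.begin(), scc.end());
            cyclic.push_back(scc);
        }
    }
    return cyclic;
}

// Records each SCC, then searches the SCC minus its entries for nested SCCs.
static void CollectSccs(const Graph& succs, const std::vector<bool>& live, std::vector<SccInfo>& out)
{
    for (const std::vector<int>& blocks : FindSccs(succs, live))
    {
        std::vector<bool> inScc(succs.size(), false);
        for (int b : blocks) inScc[b] = true;

        // Entries are judged against the whole graph, so removed outer entries
        // count as outside preds of a nested SCC.
        std::vector<bool> isEntry(succs.size(), false);
        for (size_t b = 0; b < succs.size(); b++)
            for (int s : succs[b])
                if (!inScc[b] && inScc[s]) isEntry[s] = true;

        SccInfo           info;
        std::vector<bool> interior = inScc;
        info.blocks = blocks;
        for (int b : blocks)
            if (isEntry[b])
            {
                info.entries.push_back(b);
                interior[b] = false;
            }
        out.push_back(info);
        // Sketch-only guard: with no entries the subgraph would not shrink.
        if (!info.entries.empty()) CollectSccs(succs, interior, out);
    }
}
```

For the graph `{{1,2},{3,5},{4},{4,2},{3,1},{}}` this reports an outer SCC {1,2,3,4} with entries {1,2} and a nested SCC {3,4} with entries {3,4}. The two SCCs have disjoint headers, and the nested one consists of interior blocks of the outer one.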
Single-entry SCCs are reducible loops and don't require any special processing, as they can be emitted as Wasm loops. But multi-entry SCCs are irreducible loops and must be transformed.
So we transform each multi-entry SCC (working inner to outer) by creating a per-SCC control var and dispatch block. Each SCC header is assigned an index from 0 to N-1, where N is the number of headers in that SCC. The dispatch block switches to each of the headers based on its index and the control var. Each pre-existing edge to a header is then logically split: the control var is assigned the index for that header and the edge is retargeted to the dispatch block. As an optimization and to handle some unsplittable edges, if an SCC header's pred has the header as its only successor, we put the control var assignment into the pred instead of splitting the edge.
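As a rough illustration of the dispatch scheme for a single SCC, here is a sketch over a toy graph. It does not use the JIT's IR or edge-splitting helpers; `Block`, `controlVarStore`, and `AddDispatch` are hypothetical, and the control var assignment is modeled as a field on the block that performs it.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified block: successor list plus the control var store (if any)
// that executes before control leaves the block.
struct Block
{
    std::vector<int> succs;
    int              controlVarStore; // header index stored here, or -1 for none
};
using FlowGraph = std::vector<Block>;

// Route every edge into one of `headers` through a new dispatch block.
// Returns the dispatch block's number.
static int AddDispatch(FlowGraph& g, const std::vector<int>& headers)
{
    assert(headers.size() > 1); // only multi-entry SCCs need the transform
    const int numOriginal = static_cast<int>(g.size());
    const int dispatch    = numOriginal;

    // The dispatch block models `switch (controlVar)`: case i jumps to headers[i].
    g.push_back(Block{headers, -1});

    for (int pred = 0; pred < numOriginal; pred++)
    {
        for (size_t e = 0; e < g[pred].succs.size(); e++)
        {
            auto it = std::find(headers.begin(), headers.end(), g[pred].succs[e]);
            if (it == headers.end())
                continue;
            const int index = static_cast<int>(it - headers.begin());

            if (g[pred].succs.size() == 1)
            {
                // Sole successor: store the index in the pred itself.
                g[pred].controlVarStore = index;
                g[pred].succs[e]        = dispatch;
            }
            else
            {
                // Otherwise split the edge with a block that stores the index.
                const int split = static_cast<int>(g.size());
                g.push_back(Block{{dispatch}, index});
                g[pred].succs[e] = split;
            }
        }
    }
    return dispatch;
}
```

For the classic two-header loop (0→1, 0→2, 1→2, 2→1, 2→3) with headers {1, 2}, the only remaining way into block 1 or 2 is through the dispatch block, which becomes the single loop header.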
This transforms each multi-entry SCC into a single-entry reducible loop. In checked builds we verify this by rerunning the DFS and asserting that there are no longer any improper headers.
Note there are other strategies for resolving SCCs into reducible loops that might offer better performance; we are intentionally picking something simple.
Defer handling cases where the original DFS found non-funclet blocks that could only be reached via EH, as we do not yet have a way of describing how Wasm control can reach such blocks. We will revisit this once we have the Wasm EH model design in place. Such cases are fairly rare (e.g., a try/catch that ends with a goto or return).
We currently run the SCC transform before lowering, both to give lowering the chance to optimize the switch and because we introduce new IR. There is a risk that a sufficiently clever later phase (say one that could do block cloning or jump threading) might undo the dispatch structure and recreate an irreducible loop, but that doesn't seem to happen. The subsequent Wasm control flow phase will also assert that its run of Wasm DFS does not have any improper headers.
Continuation of #120534.
Contributes to #121178.