# IR Optimizer Passes
The converter runs a lightweight, IR-only optimization sweep after lowering and before serialization. Passes must be structure-only (no op-specific math) and safe across `onnx_ir` variants. This guide documents the current canon of passes and the invariants each pass must respect.
## Pipeline Placement
The optimizer runs as Step 2 in the conversion pipeline (see `architecture.md`):

1. Build raw IR (`to_onnx`)
2. `optimize_graph` ← runs here
3. Late attribute overrides
4. Shape inference (no-op currently)
5. Finalize shapes
6. Return from `conversion_api`
7. Post-process (shape loosening, export prep)
This placement ensures:

- Optimization sees the raw, unpatched graph for maximum benefit.
- Late overrides only patch nodes that survived optimization.
- Shape finalization operates on an already-optimized graph.
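As a rough orientation, the ordering can be pictured as the sketch below. Only `to_onnx` and `optimize_graph` are names used in this guide; every other helper here is an illustrative stub, not the converter's actual API.

```python
# Illustrative stubs; the real steps live in the converter.
def to_onnx(fn, *args): ...                     # Step 1: build raw IR
def optimize_graph(graph): ...                  # Step 2: passes in this guide
def apply_late_attribute_overrides(model): ...  # Step 3 (assumed name)
def run_shape_inference(model): ...             # Step 4: currently a no-op
def finalize_shapes(model): ...                 # Step 5 (assumed name)

def convert(fn, *args):
    model = to_onnx(fn, *args)
    optimize_graph(model.graph)            # optimization sees the raw graph
    apply_late_attribute_overrides(model)  # only patches surviving nodes
    run_shape_inference(model)
    finalize_shapes(model)                 # runs on the optimized graph
    return model                           # returned from conversion_api (Step 6)
```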
## Transpose Pair Folding
**Pattern**

`Transpose → [pure elementwise]* → Transpose`

**Condition**

The composed permutation of the two Transpose nodes equals the identity.

**Allowed middle ops**

Elementwise operators that do not reorder elements, including `Relu`, `Gelu`, `Elu`, `Sigmoid`, `Tanh`, `LeakyRelu`, `Cast`, `CastLike`, `Identity`, `Not`, etc.

**Not folded**

Anything that crosses non-elementwise operators such as `AveragePool`, `Conv`, or similar layout-sensitive ops.
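The identity condition reduces to a permutation-composition check. A minimal sketch (the helper name is illustrative, not part of the converter):

```python
def composes_to_identity(perm_first, perm_second):
    """True if transposing by perm_first and then perm_second is a no-op.

    With numpy-style semantics, axis i of the final output reads axis
    perm_first[perm_second[i]] of the original tensor.
    """
    if len(perm_first) != len(perm_second):
        return False
    return all(perm_first[p] == i for i, p in enumerate(perm_second))

# A swap of the last two axes cancels itself:
assert composes_to_identity([0, 2, 1], [0, 2, 1])
```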
### Matching heuristics
- Follow the true consumer chain by name or object identity (some `onnx_ir` builds wrap/rename `Value` objects).
- Skip helper nodes on side branches (`Const`, `Shape`, etc.) that do not consume the current tensor.
- Require a single consumer at each hop (no branching rewires).
- Read permutations from the `perm` attribute when available.
- When `perm` is missing, treat the pair as cancellable only if the input and output shapes match and the middle segment is strictly elementwise.
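Putting these heuristics together, the chain walk might look like the following sketch; `consumers_of` is an assumed callback standing in for whatever consumer lookup the active `onnx_ir` build provides.

```python
ELEMENTWISE = {"Relu", "Gelu", "Elu", "Sigmoid", "Tanh", "LeakyRelu",
               "Cast", "CastLike", "Identity", "Not"}

def find_cancelling_transpose(first_transpose, consumers_of):
    """Walk forward from the first Transpose; return the closing one, or None."""
    node = first_transpose
    while True:
        # Ignore side-branch helpers that do not carry the tensor forward.
        users = [n for n in consumers_of(node.outputs[0])
                 if n.op_type not in ("Const", "Shape")]
        if len(users) != 1:        # branching would make the rewire unsafe
            return None
        node = users[0]
        if node.op_type == "Transpose":
            return node            # caller still checks the perm composition
        if node.op_type not in ELEMENTWISE:
            return None            # layout-sensitive op: give up
```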
### Rewiring and deletion
- `onnx_ir.Node.inputs` may be immutable; use `Node.replace_input_with(index: int, value: Value)` when the backend provides it.
- Rewire all consumers of the second transpose's output (by name or object) to the kept tensor.
- Update graph/model outputs and the var→value map so no reference points at removed nodes.
- Delete nodes in reverse order (second transpose first), maintaining any live list mirrors (`graph.nodes`, `graph._nodes`, etc.).
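A condensed sketch of that sequence; `consumers_of` and `remove_node` are placeholders for backend-specific lookups, while `replace_input_with` is the hook named above.

```python
def fold_transpose_pair(graph, first_t, second_t, consumers_of, remove_node):
    # Bypass the first Transpose: its consumers read its input directly.
    for user in consumers_of(first_t.outputs[0]):
        for i, inp in enumerate(user.inputs):
            if inp is first_t.outputs[0]:
                user.replace_input_with(i, first_t.inputs[0])
    # Bypass the second Transpose: forward its (possibly just-rewired) input.
    survivor = second_t.inputs[0]
    for user in consumers_of(second_t.outputs[0]):
        for i, inp in enumerate(user.inputs):
            if inp is second_t.outputs[0]:
                user.replace_input_with(i, survivor)
    # Graph outputs may also point at the removed tensor.
    for i, out in enumerate(graph.outputs):
        if out is second_t.outputs[0]:
            graph.outputs[i] = survivor
    remove_node(graph, second_t)   # reverse order: second transpose first
    remove_node(graph, first_t)
```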
This pass is intentionally conservative, portable across `onnx_ir` variants, and oblivious to specific operator semantics.
## Identity Reshape Removal
**Pattern**

`Reshape(x, shape)` where `shape` is a constant that exactly matches `x`'s known dimensions.

**Condition**

- The `shape` input is a constant tensor with no `-1` or `0` entries.
- Every dimension of `x` is statically known and equal to the requested target.
- Output metadata (if present) already reflects the same shape.

**Effect**

Rewire consumers of the Reshape output directly to the input and drop the node. Any now-unused shape initializers are left for later dead-code removal passes.
This trims redundant layout annotations generated by higher-level conversions (e.g., Equinox attention blocks) without touching dynamic reshape cases.
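One way to phrase the identity test, assuming a `constant_value` helper that returns the shape tensor as a list of ints when it is a compile-time constant (both names are illustrative):

```python
def is_identity_reshape(node, constant_value):
    if node.op_type != "Reshape":
        return False
    target = constant_value(node.inputs[1])   # None if not a constant
    if target is None or any(d in (-1, 0) for d in target):
        return False                          # dynamic semantics: keep the node
    dims = node.inputs[0].shape
    if dims is None or any(not isinstance(d, int) for d in dims):
        return False                          # every input dim must be static
    return list(dims) == list(target)
```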
## Redundant Cast Removal
**Patterns**

1. `Cast(x, to=T)` where `x` already has dtype `T`.
2. `Cast(x, to=T) → Cast(y, to=S)` where `S` equals the original dtype of `x`.

**Effect**

The Cast node(s) are removed and consumers are rewired to the original input `x`, provided the net effect on dtype is the identity.
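A sketch covering both patterns; `dtype_of`, `cast_target`, and `producer_of` are assumed accessors for the tensor dtype, the Cast `to` attribute, and the defining node, since those APIs differ across `onnx_ir` builds.

```python
def redundant_cast_replacement(node, dtype_of, cast_target, producer_of):
    """Return the value consumers should read instead, or None to keep the Cast."""
    if node.op_type != "Cast":
        return None
    src = node.inputs[0]
    # Pattern 1: cast to the dtype the input already carries.
    if dtype_of(src) is not None and dtype_of(src) == cast_target(node):
        return src
    # Pattern 2: Cast -> Cast that round-trips back to the original dtype.
    prev = producer_of(src)
    if (prev is not None and prev.op_type == "Cast"
            and dtype_of(prev.inputs[0]) == cast_target(node)):
        return prev.inputs[0]
    return None
```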
## Reshape Pair Folding
**Pattern**

`Reshape(A) → [elementwise]* → Reshape(B)`

**Condition**

- The allowed elementwise ops are the same as in Transpose folding (shape-preserving).
- The input shape of the first Reshape matches the output shape of the second Reshape.

**Effect**

Both Reshape nodes are removed. The elementwise ops are rewired to consume A directly, and the consumers of B are rewired to the output of the last elementwise op. This eliminates redundant flatten/unflatten pairs often emitted by high-level frameworks.
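The shape-match condition can be sketched as below, assuming static shapes surface as int sequences on `Value.shape` (which varies by `onnx_ir` build):

```python
def reshape_pair_cancels(first_reshape, second_reshape):
    a_shape = first_reshape.inputs[0].shape     # shape entering the pair
    b_shape = second_reshape.outputs[0].shape   # shape leaving the pair
    if a_shape is None or b_shape is None:
        return False
    if any(not isinstance(d, int) for d in (*a_shape, *b_shape)):
        return False                            # only fold fully static shapes
    return tuple(a_shape) == tuple(b_shape)
```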
## Authoring new passes
- Keep logic IR-only; never import ONNX protobuf utilities.
- Verify that mutations persist in the live graph (`graph`).
- Use `graph` directly as a node container.
- Avoid creating unnecessary helper functions; prefer built-in IR methods instead.
- Add focused regression tests under `tests/extra_tests/framework/`.
- Document the new rule here and reference the guide from `architecture.md`.
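As a starting point, a new pass usually follows this shape. The sketch below drops `Identity` nodes; `consumers_of` and `graph.remove` are assumptions to be swapped for whatever the active `onnx_ir` build exposes.

```python
def remove_identity_nodes(graph, consumers_of):
    """Illustrative pass: bypass Identity nodes and delete them."""
    for node in list(graph):                  # snapshot; we mutate the graph
        if node.op_type != "Identity":
            continue
        source, alias = node.inputs[0], node.outputs[0]
        for user in consumers_of(alias):
            for i, inp in enumerate(user.inputs):
                if inp is alias:
                    user.replace_input_with(i, source)
        graph.remove(node)                    # assumed removal hook
```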