# IR Optimizer Passes
The converter runs a lightweight, IR-only optimization sweep after lowering and before serialization. Passes must be structure-only (no op-specific math) and safe across `onnx_ir` variants. This guide documents the current canon of passes and the invariants each pass must respect.
## Pipeline Placement
The optimizer runs as Step 2 in the conversion pipeline (see `architecture.md`):

1. Build raw IR (`to_onnx`)
2. `optimize_graph` ← runs here
3. Late attribute overrides
4. Shape inference (no-op currently)
5. Finalize shapes
6. Return from `conversion_api`
7. Post-process (shape loosening, export prep)
This placement ensures:

- Optimization sees the raw, unpatched graph for maximum benefit.
- Late overrides only patch nodes that survived optimization.
- Shape finalization operates on an already-optimized graph.
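As a rough orientation, the ordering can be pictured as the sketch below. Only `to_onnx` and `optimize_graph` are names used in this guide; every other helper here is an illustrative stub, not the converter's actual API.

```python
# Illustrative stubs; the real steps live in the converter.
def to_onnx(fn, *args): ...                     # Step 1: build raw IR
def optimize_graph(graph): ...                  # Step 2: passes in this guide
def apply_late_attribute_overrides(model): ...  # Step 3 (assumed name)
def run_shape_inference(model): ...             # Step 4: currently a no-op
def finalize_shapes(model): ...                 # Step 5 (assumed name)

def convert(fn, *args):
    model = to_onnx(fn, *args)
    optimize_graph(model.graph)            # optimization sees the raw graph
    apply_late_attribute_overrides(model)  # only patches surviving nodes
    run_shape_inference(model)
    finalize_shapes(model)                 # runs on the optimized graph
    return model                           # returned from conversion_api (Step 6)
```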
## Transpose Pair Folding
**Pattern**

`Transpose → [pure elementwise]* → Transpose`

**Condition**

The composed permutation of the two Transpose nodes equals the identity.

**Allowed middle ops**

Elementwise operators that do not reorder elements, including `Relu`, `Gelu`, `Elu`, `Sigmoid`, `Tanh`, `LeakyRelu`, `Cast`, `CastLike`, `Identity`, `Not`, etc.

**Not folded**

Anything that crosses non-elementwise operators such as `AveragePool`, `Conv`, or similar layout-sensitive ops.
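The identity condition reduces to a permutation-composition check. A minimal sketch (the helper name is illustrative, not part of the converter):

```python
def composes_to_identity(perm_first, perm_second):
    """True if transposing by perm_first and then perm_second is a no-op.

    With numpy-style semantics, axis i of the final output reads axis
    perm_first[perm_second[i]] of the original tensor.
    """
    if len(perm_first) != len(perm_second):
        return False
    return all(perm_first[p] == i for i, p in enumerate(perm_second))

# A swap of the last two axes cancels itself:
assert composes_to_identity([0, 2, 1], [0, 2, 1])
```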
### Matching heuristics
- Follow the true consumer chain by name or object identity (some `onnx_ir` builds wrap/rename `Value` objects).
- Skip helper nodes on side branches (`Const`, `Shape`, etc.) that do not consume the current tensor.
- Require a single consumer at each hop (no branching rewires).
- Read permutations from the `perm` attribute when available.
- When `perm` is missing, treat the pair as cancellable only if the input and output shapes match and the middle segment is strictly elementwise.
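Putting these heuristics together, the chain walk might look like the following sketch; `consumers_of` is an assumed callback standing in for whatever consumer lookup the active `onnx_ir` build provides.

```python
ELEMENTWISE = {"Relu", "Gelu", "Elu", "Sigmoid", "Tanh", "LeakyRelu",
               "Cast", "CastLike", "Identity", "Not"}

def find_cancelling_transpose(first_transpose, consumers_of):
    """Walk forward from the first Transpose; return the closing one, or None."""
    node = first_transpose
    while True:
        # Ignore side-branch helpers that do not carry the tensor forward.
        users = [n for n in consumers_of(node.outputs[0])
                 if n.op_type not in ("Const", "Shape")]
        if len(users) != 1:        # branching would make the rewire unsafe
            return None
        node = users[0]
        if node.op_type == "Transpose":
            return node            # caller still checks the perm composition
        if node.op_type not in ELEMENTWISE:
            return None            # layout-sensitive op: give up
```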
### Rewiring and deletion
- `onnx_ir.Node.inputs` may be immutable; use `Node.replace_input_with(index: int, value: Value)` when the backend provides it.
- Rewire all consumers of the second transpose's output (by name or object) to the kept tensor.
- Update graph/model outputs and the var→value map so no reference points at removed nodes.
- Delete nodes in reverse order (second transpose first), maintaining any live list mirrors (`graph.nodes`, `graph._nodes`, etc.).
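A condensed sketch of that sequence; `consumers_of` and `remove_node` are placeholders for backend-specific lookups, while `replace_input_with` is the hook named above.

```python
def fold_transpose_pair(graph, first_t, second_t, consumers_of, remove_node):
    # Bypass the first Transpose: its consumers read its input directly.
    for user in consumers_of(first_t.outputs[0]):
        for i, inp in enumerate(user.inputs):
            if inp is first_t.outputs[0]:
                user.replace_input_with(i, first_t.inputs[0])
    # Bypass the second Transpose: forward its (possibly just-rewired) input.
    survivor = second_t.inputs[0]
    for user in consumers_of(second_t.outputs[0]):
        for i, inp in enumerate(user.inputs):
            if inp is second_t.outputs[0]:
                user.replace_input_with(i, survivor)
    # Graph outputs may also point at the removed tensor.
    for i, out in enumerate(graph.outputs):
        if out is second_t.outputs[0]:
            graph.outputs[i] = survivor
    remove_node(graph, second_t)   # reverse order: second transpose first
    remove_node(graph, first_t)
```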
This pass is intentionally conservative, portable across `onnx_ir` variants, and oblivious to specific operator semantics.
## Identity Reshape Removal
**Pattern**

`Reshape(x, shape)` where `shape` is a constant that exactly matches `x`'s known dimensions.

**Condition**

- The `shape` input is a constant tensor with no `-1` or `0` entries.
- Every dimension of `x` is statically known and equal to the requested target.
- Output metadata (if present) already reflects the same shape.

**Effect**

Rewire consumers of the Reshape output directly to the input and drop the node. Any now-unused shape initializers are left for later dead-code removal passes.
This trims redundant layout annotations generated by higher-level conversions (e.g., Equinox attention blocks) without touching dynamic reshape cases.
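One way to phrase the identity test, assuming a `constant_value` helper that returns the shape tensor as a list of ints when it is a compile-time constant (both names are illustrative):

```python
def is_identity_reshape(node, constant_value):
    if node.op_type != "Reshape":
        return False
    target = constant_value(node.inputs[1])   # None if not a constant
    if target is None or any(d in (-1, 0) for d in target):
        return False                          # dynamic semantics: keep the node
    dims = node.inputs[0].shape
    if dims is None or any(not isinstance(d, int) for d in dims):
        return False                          # every input dim must be static
    return list(dims) == list(target)
```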
## Redundant Cast Removal
**Patterns**

1. `Cast(x, to=T)` where `x` already has dtype `T`.
2. `Cast(x, to=T) → Cast(y, to=S)` where `S` equals the original dtype of `x`.

**Effect**

The Cast node(s) are removed and consumers are rewired to the original input `x`, provided the net effect on dtype is the identity.
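A sketch covering both patterns; `dtype_of`, `cast_target`, and `producer_of` are assumed accessors for the tensor dtype, the Cast `to` attribute, and the defining node, since those APIs differ across `onnx_ir` builds.

```python
def redundant_cast_replacement(node, dtype_of, cast_target, producer_of):
    """Return the value consumers should read instead, or None to keep the Cast."""
    if node.op_type != "Cast":
        return None
    src = node.inputs[0]
    # Pattern 1: cast to the dtype the input already carries.
    if dtype_of(src) is not None and dtype_of(src) == cast_target(node):
        return src
    # Pattern 2: Cast -> Cast that round-trips back to the original dtype.
    prev = producer_of(src)
    if (prev is not None and prev.op_type == "Cast"
            and dtype_of(prev.inputs[0]) == cast_target(node)):
        return prev.inputs[0]
    return None
```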
## Reshape Pair Folding
**Pattern**

`Reshape(A) → [elementwise]* → Reshape(B)`

**Condition**

- The allowed elementwise ops are the same as in Transpose folding (shape-preserving).
- The input shape of the first Reshape matches the output shape of the second Reshape.

**Effect**

Both Reshape nodes are removed. The elementwise ops are rewired to consume A directly, and the consumers of B are rewired to the output of the last elementwise op. This eliminates redundant flatten/unflatten pairs often emitted by high-level frameworks.
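The shape-match condition can be sketched as below, assuming static shapes surface as int sequences on `Value.shape` (which varies by `onnx_ir` build):

```python
def reshape_pair_cancels(first_reshape, second_reshape):
    a_shape = first_reshape.inputs[0].shape     # shape entering the pair
    b_shape = second_reshape.outputs[0].shape   # shape leaving the pair
    if a_shape is None or b_shape is None:
        return False
    if any(not isinstance(d, int) for d in (*a_shape, *b_shape)):
        return False                            # only fold fully static shapes
    return tuple(a_shape) == tuple(b_shape)
```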
## Authoring new passes
- Keep logic IR-only; never import ONNX protobuf utilities.
- Verify that mutations persist in the live graph (`graph`).
- Use `graph` directly as a node container.
- Avoid creating unnecessary helper functions; prefer built-in IR methods instead.
- Add focused regression tests under `tests/extra_tests/framework/`.
- Document the new rule here and reference the guide from `architecture.md`.
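As a starting point, a new pass usually follows this shape. The sketch below drops `Identity` nodes; `consumers_of` and `graph.remove` are assumptions to be swapped for whatever the active `onnx_ir` build exposes.

```python
def remove_identity_nodes(graph, consumers_of):
    """Illustrative pass: bypass Identity nodes and delete them."""
    for node in list(graph):                  # snapshot; we mutate the graph
        if node.op_type != "Identity":
            continue
        source, alias = node.inputs[0], node.outputs[0]
        for user in consumers_of(alias):
            for i, inp in enumerate(user.inputs):
                if inp is alias:
                    user.replace_input_with(i, source)
        graph.remove(node)                    # assumed removal hook
```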