jax2onnx

ONNX IR Builder Guide

This guide distills the guardrails we enforce around onnx_ir._tape.Builder: how to wire values, record initializers, and keep tests green now that the IR pipeline is builder-first.

Policy Checklist

Quick Checklist

Plugin Metadata Requirements

Validation Hooks

Everything below expands on the why and how behind those rules.

Prerequisites and Imports

import onnx_ir as ir
from onnx_ir._tape import Builder

Stability note: _tape.Builder is currently internal API (the leading underscore is intentional) and can change. Keep the wrapper that instantiates it confined to XY so updates are easy.

Legacy note: The converter no longer maintains a builder.value_info list. Plugins should rely exclusively on _ensure_value_metadata(...) and the fields on each ir.Value when they need shape/type information. Avoid appending to or expecting a global value_info registry.

Core Concept

Builder subclasses onnx_ir.tape.Tape. It records nodes, initializers, and the opsets they require while exposing every ONNX operator as a dynamic method (for example, builder.Add, builder.Conv).
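The "dynamic method" idea can be pictured with a toy recorder. The sketch below is not onnx_ir's implementation; `MiniRecorder` and its string-valued outputs are hypothetical names made up for illustration. It only shows how `__getattr__` can turn any operator name into a callable that records a node, with positional arguments as inputs and keyword arguments as attributes.

```python
from typing import Any, Callable


class MiniRecorder:
    """Toy recorder (hypothetical); it only illustrates dynamic operator methods."""

    def __init__(self) -> None:
        # Each entry is (op_type, inputs, attributes).
        self.nodes: list[tuple[str, tuple[Any, ...], dict[str, Any]]] = []

    def __getattr__(self, op_type: str) -> Callable[..., str]:
        # Only reached when normal lookup fails, so `self.nodes` is unaffected.
        if op_type.startswith("_"):
            raise AttributeError(op_type)

        def emit(*inputs: Any, **attributes: Any) -> str:
            self.nodes.append((op_type, inputs, attributes))
            # Return a placeholder output name instead of a real ir.Value.
            return f"{op_type.lower()}_{len(self.nodes) - 1}"

        return emit


recorder = MiniRecorder()
summed = recorder.Add("x", "y")
pooled = recorder.ReduceSum(summed, keepdims=0)
```

The real Builder returns ir.Value objects and also tracks initializers and opsets; the toy version keeps only the dispatch pattern.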

Use it when you want to script graph construction but still hand the collected nodes to ir.Graph or ir.Function later. If you need finer-grained control (custom outputs, metadata, overload selection, or pre-existing ir.Value objects), drop down to Tape.op / Tape.op_multi_out or construct ir.Node directly.

End-to-End Workflow

import numpy as np
import onnx_ir as ir
from onnx_ir._tape import Builder

# 1. Provide typed graph values up front.
X = ir.val("X", dtype=ir.DataType.FLOAT, shape=[1])
Y = ir.val("Y", dtype=ir.DataType.FLOAT, shape=[1])

# 2. Create a builder (optionally tie it to an existing graph/function).
builder = Builder()

# 3. Register any constant tensors through the builder so outputs stay in sync.
weight_init = builder.initializer(
    ir.tensor(np.array([0.25], dtype=np.float32)),
    name="weight",
)

# 4. Emit operators. Positional args become inputs; keyword args become ONNX attributes.
scaled = builder.Mul(X, weight_init, _outputs=["scaled"])  # returns ir.Value
summed = builder.Add(scaled, Y, _domain="", _version=18)

# 5. Package the recording into a graph/model when ready.
def to_opset_imports(used_opsets: set[tuple[str, int | None]]):
    result: dict[str, int] = {}
    for domain, version in used_opsets:
        if version is None:
            continue  # fall back to the containing graph's default
        previous = result.get(domain)
        if previous is not None and previous != version:
            raise ValueError(
                f"Mixed opset versions requested for domain '{domain}': {previous} vs {version}"
            )
        result[domain] = version
    return result or {"": 18}  # choose an explicit default for the model

graph = ir.Graph(
    inputs=[X, Y],
    outputs=[summed],
    nodes=builder.nodes,
    initializers=builder.initializers,
    opset_imports=to_opset_imports(builder.used_opsets),
    name="scale_and_sum",
)
model = ir.Model(graph=graph, ir_version=10)

Bringing Existing Models Into the Builder

The official docs highlight two ways into the IR: converting an in-memory onnx.ModelProto with ir.from_proto, or loading a model file directly with onnx_ir.load. Either route makes it easy to combine scripted nodes with imported graphs:

import onnx
import onnx_ir as ir
from onnx_ir._tape import Builder

# MODEL_TEXT is a placeholder for a model written in ONNX's textual format.
model_proto = onnx.parser.parse_model(MODEL_TEXT)
model = ir.from_proto(model_proto)

builder = Builder(model.graph)
extra = builder.Identity(model.graph.outputs[0])
model.graph.outputs.append(extra)

You can reverse the process with ir.to_proto(model) when you need to serialize back to protobuf.

What the Builder Does for You

Reserved Keyword Arguments

Builder intercepts a few underscore-prefixed keyword arguments (the examples in this guide use _domain, _version, and _outputs) before treating the remainder as ONNX attributes.

Everything else in **kwargs is fed to _convenience.convert_attributes, which automatically turns Python scalars, sequences, tensors, and graphs into the right ir.Attr instances.
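As a rough picture of that split, the sketch below separates reserved keys from attribute keys in plain Python. It assumes the reserved names are exactly the three that appear in this guide's examples (`_domain`, `_version`, `_outputs`). `split_builder_kwargs` is a hypothetical helper, not part of onnx_ir, and it leaves out the attribute conversion step entirely.

```python
from typing import Any

# Assumption: the reserved keys are the ones used in this guide's examples.
RESERVED_KEYS = ("_domain", "_version", "_outputs")


def split_builder_kwargs(
    kwargs: dict[str, Any],
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Separate builder-reserved keyword arguments from ONNX attributes."""
    reserved = {k: v for k, v in kwargs.items() if k in RESERVED_KEYS}
    attributes = {k: v for k, v in kwargs.items() if k not in RESERVED_KEYS}
    return reserved, attributes


reserved, attributes = split_builder_kwargs(
    {"axis": 1, "keepdims": 0, "_outputs": ["reduced"], "_version": 18}
)
```

Here `axis` and `keepdims` would go on to become ONNX attributes, while `_outputs` and `_version` steer node construction and never reach the attribute map.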

Tape API Highlights

The public documentation for onnx_ir.tape at https://onnx.ai/ir-py/api/ir_tape.html spells out the signatures for Tape.op, Tape.op_multi_out, and Tape.initializer.

Keep these signatures in mind when deciding between builder convenience and direct tape usage.

Handling Multi-Output Operators

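# `condition` is a boolean ir.Value; the then_branch/else_branch graph attributes that If requires are omitted here.
values = builder.If(condition, _outputs=["then_out", "else_out"], _version=18)
then_out, else_out = values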

Managing Attributes Explicitly

Graph Ownership & Cloning

Integrating with Existing Graphs or Functions

# X and Z are ir.Value objects defined elsewhere.
graph = ir.Graph(inputs=[X], outputs=[Z], nodes=[])
builder = Builder(graph)
intermediate = builder.Relu(X)
# The node is already appended to `graph`, and names are assigned by the graph's name authority.

Limitations Compared to Tape.op

Because _make_node forwards the remaining keyword arguments into the attribute map, the builder cannot set certain Tape parameters at construction time, such as the overload selection and node metadata that Tape.op accepts.

Common Pitfalls and How to Avoid Them

Initializer Deduplication

Example

import numpy as np

w1 = builder.add_initializer_from_array("weight", np.array([1.0], dtype=np.float32))
# Re-adding with identical payload reuses the same Value (no-op):
w2 = builder.add_initializer_from_array("weight", np.array([1.0], dtype=np.float32))
assert w1 is w2

# Re-adding with different payload raises:
builder.add_initializer_from_array("weight", np.array([2.0], dtype=np.float32))  # ValueError
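The rule shown in that example (same name with an identical payload reuses the value, same name with a different payload raises) can be sketched without any IR dependency. `InitializerRegistry` below is a hypothetical stand-in, not the converter's actual builder. It stores payloads as raw bytes and uses a plain `object()` where the real code would hold an ir.Value.

```python
import struct


class InitializerRegistry:
    """Hypothetical name-keyed store that mimics the deduplication rule."""

    def __init__(self) -> None:
        self._by_name: dict[str, tuple[bytes, object]] = {}

    def add(self, name: str, payload: bytes) -> object:
        existing = self._by_name.get(name)
        if existing is not None:
            stored_payload, value = existing
            if stored_payload == payload:
                return value  # identical payload: reuse the stored value
            raise ValueError(
                f"Initializer '{name}' already exists with a different payload"
            )
        value = object()  # placeholder for an ir.Value
        self._by_name[name] = (payload, value)
        return value


registry = InitializerRegistry()
w1 = registry.add("weight", struct.pack("<f", 1.0))
w2 = registry.add("weight", struct.pack("<f", 1.0))  # same bytes: reused

try:
    registry.add("weight", struct.pack("<f", 2.0))  # different bytes: rejected
    conflict = ""
except ValueError as exc:
    conflict = str(exc)
```

Comparing payloads by bytes is a simplification chosen for the sketch; a real implementation would also need to compare dtype and shape.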

Rationale

Checklist Before Serializing

Keeping these conventions in one place ensures the “builder” layer stays predictable for Codex agents and humans alike, reducing churn when the upstream library evolves.

Validation Routine

  1. poetry run python scripts/check_ir_builder_usage.py --diff (lints only staged files; drop --diff to scan the whole tree).
  2. poetry run ruff check . followed by poetry run ruff format --check . (or let the pre-commit hooks fix issues automatically).
  3. poetry run pytest -q plus any focused suites you touched (for example tests/primitives/test_jnp.py::Test_linspace).
  4. For builder-heavy refactors, run the structural policy tests directly: poetry run pytest -q tests/extra_tests/framework/test_ir_builder_contracts.py.