Library / Symbolic Computation

What Is An E-Graph?

An e-graph is a data structure for representing many equivalent expressions compactly at the same time. It is central to equality saturation, a style of optimization that applies many valid rewrites first and chooses a good result later.

Key Insight

Keep Multiple Equivalent Forms Alive

Traditional simplification often commits to one rewrite path at a time. If the chosen path is poor, useful alternatives may be lost. E-graphs solve part of this problem by storing equivalence classes of expressions, allowing multiple forms to coexist inside one shared structure.

Instead of asking “which rewrite should I apply next,” equality saturation asks “which rewrites are valid here?” and adds all of them into the graph. Only after the graph has been enriched with many equivalent possibilities does a cost model extract a preferred result.

Why It Matters

Search Without Immediate Commitment

  • E-graphs reduce the need to commit early to a single rewrite order.
  • Shared structure prevents many equivalent expressions from being stored redundantly.
  • Cost models can be tailored to algebraic simplicity, runtime cost, tensor shape, or another target.

This is one reason e-graphs are useful for modern symbolic optimization. They let the system explore a rich space of equivalent expressions without naively branching into separate full copies each time.

Connection To Sym

Rule Packs And Cost Models

Sym uses an e-graph-based solver and cost models, including tensor-aware cost logic, to reason about equivalent expressions. That makes the data structure relevant not only to textbook algebra, but also to expression-level optimization in more modern workloads.

That connection is useful because it shows e-graphs as a working engineering choice rather than a purely academic abstraction.

Practical Outcome

Better Extraction

Once many valid forms are present in the graph, the extraction step can choose an expression that is smaller, cheaper, more numerically stable, or better matched to the target execution environment.

That is the practical payoff for carrying many equivalent forms at once: better choices become available at the moment the system actually needs to commit.

Equality Saturation

Add First, Choose Later

Equality saturation changes the usual control flow of simplification. Instead of selecting one rewrite and immediately discarding alternatives, the system keeps adding equivalent forms into the e-graph. Saturation continues until a stopping criterion is reached, such as a node budget, a rewrite budget, or a fixed point where no new useful equalities are discovered.

Extraction then becomes a separate problem: given all the equivalent expressions present in the graph, which one should be chosen? The answer depends on the cost model. A cost model might favor printed simplicity, arithmetic cost, tensor shape behavior, fusion opportunities, or some domain-specific goal.

Tradeoff

More Search, Better Options

E-graphs do not eliminate complexity; they manage it better. The graph can still grow large, and the rewrite space can still be expensive. But the data structure gives the system a disciplined way to represent many alternatives together instead of repeatedly branching and copying full expressions.

That makes e-graphs especially attractive in domains where many rewrites are valid and the best final form depends on a meaningful notion of cost rather than a fixed rewrite order.

Conceptually, an e-graph separates two ideas that are often tangled together in simpler systems: which expressions are equivalent, and which equivalent expression should be preferred. That separation is one of the reasons the technique scales to richer optimization problems.