E-Graph Explained
Table of Contents
E-graph is a compact data structure that maintains a set of term rewriting rules. One can specify a group of rewriting rules, and use e-graph to search how to rewrite an expression to a desired form w/ some good property (e.g. minimal ops, simplest) by designing a cost model on "e-nodes".
Why E-graph
We must notice that the rewriting rules are not commutative, which means the rewrite order matters!
In Machine Learning compilers, a common optimization is kernel-fusion: fusing several small operators into a single large operators, thus reducing the kernel launching overhead and memory footprint. Such fusion can we viewed as term-writing on computational graphs, a simple fusion strategy is defining a cost model on operators, and always fuse operators towards reducing the overall cost. However, in this way we cannot find the global optimal computational graph, because the rewrite order matters, and the best rewrite trace might look like:
traces | rewrite1 | rewrite2 | rewrite3 | rewrite4 | rewrite5 |
---|---|---|---|---|---|
cost | 1 | 0.9 | 1.2 | 1.1 | 0.5 |
note that rewrite3 harms total cost temporarily but lay pavement for better possible rewrites (e.g. rewrite5).
A simple idea to perform such equivalent-form search is back-tracking 1 , which means we allow some increase on cost during the search.