NDArray expression fusion pass.
Identifies chains of element-wise ndarray operations that produce single-use temporaries, and fuses them into a single loop. For example:
// Before fusion:
let t1 = a.add(&b); // allocates + loops
let t2 = t1.mul(&c); // allocates + loops
let result = t2.sub(&d); // allocates + loops
// After fusion (conceptually):
let result = fused_loop(a, b, c, d, |a, b, c, d| (a + b) * c - d);

This eliminates intermediate allocations and reduces memory traffic from 3 full passes over data to 1 pass. The energy savings are substantial: each eliminated pass saves ~5 pJ per cacheline of data.
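The difference between the two forms can be sketched in plain Rust over slices. This is only an illustration: `unfused` and this `fused_loop` signature are assumptions written for the example and operate on `&[f64]`, not on the crate's ndarray type.

```rust
/// Unfused form: each step allocates a temporary and makes a full pass.
fn unfused(a: &[f64], b: &[f64], c: &[f64], d: &[f64]) -> Vec<f64> {
    let t1: Vec<f64> = a.iter().zip(b).map(|(x, y)| x + y).collect(); // pass 1
    let t2: Vec<f64> = t1.iter().zip(c).map(|(x, y)| x * y).collect(); // pass 2
    t2.iter().zip(d).map(|(x, y)| x - y).collect() // pass 3
}

/// Fused form (hypothetical signature): one output allocation, one pass,
/// with the composed expression supplied as a closure.
fn fused_loop<F>(a: &[f64], b: &[f64], c: &[f64], d: &[f64], f: F) -> Vec<f64>
where
    F: Fn(f64, f64, f64, f64) -> f64,
{
    // Stop at the shortest input so indexing never goes out of bounds.
    let n = a.len().min(b.len()).min(c.len()).min(d.len());
    (0..n).map(|i| f(a[i], b[i], c[i], d[i])).collect()
}
```

Both functions compute the same values; the fused version never materializes `t1` or `t2`, which is where the saved allocations and passes come from.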
The fusion pass works at the MIR level by:
- Identifying Terminator::Call sequences targeting ndarray element-wise functions
- Building a FusedExpr tree from the chain
- Replacing the chain with a single fused call
The C codegen then emits the fused call as one loop with the composed expression.
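The chain-identification step can be sketched on a simplified model. Everything here is hypothetical: `Call` is a stand-in for a MIR call terminator, `is_elementwise` assumes a small set of function names, and use counts only consider call arguments. The real pass works on actual MIR and may apply different rules.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a MIR call: `dest = func(args...)`.
struct Call {
    dest: &'static str,
    func: &'static str,
    args: Vec<&'static str>,
}

/// Assumed set of element-wise function names (illustrative only).
fn is_elementwise(func: &str) -> bool {
    matches!(func, "add" | "sub" | "mul" | "div")
}

/// Returns inclusive index ranges of runs of consecutive element-wise calls
/// in which every intermediate result is read exactly once, by the next call.
fn find_chains(calls: &[Call]) -> Vec<(usize, usize)> {
    // Count how many times each value appears as a call argument.
    let mut uses: HashMap<&str, usize> = HashMap::new();
    for c in calls {
        for a in &c.args {
            *uses.entry(*a).or_insert(0) += 1;
        }
    }

    let mut chains = Vec::new();
    let mut i = 0;
    while i < calls.len() {
        if !is_elementwise(calls[i].func) {
            i += 1;
            continue;
        }
        let start = i;
        // Extend the run while the current result is a single-use temporary
        // consumed by the next element-wise call.
        while i + 1 < calls.len()
            && is_elementwise(calls[i + 1].func)
            && uses.get(calls[i].dest).copied() == Some(1)
            && calls[i + 1].args.contains(&calls[i].dest)
        {
            i += 1;
        }
        // A single call on its own is not worth fusing.
        if i > start {
            chains.push((start, i));
        }
        i += 1;
    }
    chains
}
```

The single-use check matters: if a temporary is read anywhere else, it must still be materialized, so the chain is cut at that point.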
Structs
- FusionChain - A fusion opportunity: a chain of element-wise ops that can be fused.
Enums
- FusedBinOp - Binary operations that can be fused
- FusedExpr - A fused expression tree representing composed element-wise operations.
- FusedUnaryOp - Unary operations that can be fused
Functions
- find_fusion_chains - Analyze a function’s MIR to find fusion opportunities.
- fused_expr_to_c - Generate the C expression string for a fused expression tree. The sources are accessed as src0->data[i], src1->data[i], etc.
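A minimal sketch of how an expression tree might be rendered to a C expression string follows. The types `SketchExpr` and `SketchBinOp` and the function `sketch_expr_to_c` are hypothetical stand-ins; the actual variants of FusedExpr and FusedBinOp and the exact output format of fused_expr_to_c are not shown on this page. Only the src0->data[i] access pattern is taken from the description above.

```rust
/// Hypothetical stand-in for FusedBinOp; the real variants may differ.
enum SketchBinOp {
    Add,
    Sub,
    Mul,
}

/// Hypothetical stand-in for FusedExpr.
enum SketchExpr {
    /// Element `i` of the k-th source array.
    Source(usize),
    /// A binary operation applied to two sub-expressions.
    Binary(SketchBinOp, Box<SketchExpr>, Box<SketchExpr>),
}

/// Convenience constructors for building trees in the example.
fn src(k: usize) -> SketchExpr {
    SketchExpr::Source(k)
}

fn bin(op: SketchBinOp, l: SketchExpr, r: SketchExpr) -> SketchExpr {
    SketchExpr::Binary(op, Box::new(l), Box::new(r))
}

/// Recursively renders the tree as a fully parenthesized C expression.
fn sketch_expr_to_c(e: &SketchExpr) -> String {
    match e {
        SketchExpr::Source(k) => format!("src{}->data[i]", k),
        SketchExpr::Binary(op, l, r) => {
            let sym = match op {
                SketchBinOp::Add => "+",
                SketchBinOp::Sub => "-",
                SketchBinOp::Mul => "*",
            };
            format!("({} {} {})", sketch_expr_to_c(l), sym, sketch_expr_to_c(r))
        }
    }
}

/// Builds the tree for (a + b) * c - d with a, b, c, d as sources 0..3.
fn example_tree() -> SketchExpr {
    bin(
        SketchBinOp::Sub,
        bin(
            SketchBinOp::Mul,
            bin(SketchBinOp::Add, src(0), src(1)),
            src(2),
        ),
        src(3),
    )
}
```

Parenthesizing every binary node avoids having to reason about C operator precedence when composing sub-expressions, at the cost of some redundant parentheses in the output.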