Module ndarray_fusion

NDArray expression fusion pass.

Identifies chains of element-wise ndarray operations whose intermediate results are single-use temporaries and fuses each chain into a single loop. For example:

// Before fusion:
let t1 = a.add(&b);    // allocates + loops
let t2 = t1.mul(&c);   // allocates + loops
let result = t2.sub(&d); // allocates + loops

// After fusion (conceptually):
let result = fused_loop(a, b, c, d, |a, b, c, d| (a + b) * c - d);

This eliminates the intermediate allocations and reduces memory traffic from three full passes over the data to one. The energy savings are substantial: each eliminated pass saves ~5 pJ per cache line of data.
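To make the difference concrete, here is a minimal sketch that uses plain `f64` slices rather than the real ndarray type. The function names `three_passes` and `one_pass` are hypothetical and are not part of this module. They only illustrate that the fused form computes the same values without the two temporaries.

```rust
/// Unfused form: three loops and two temporary vectors (`t1`, `t2`).
/// Hypothetical helper for illustration, not part of ndarray_fusion.
/// Note that `zip` silently stops at the shorter input.
fn three_passes(a: &[f64], b: &[f64], c: &[f64], d: &[f64]) -> Vec<f64> {
    let t1: Vec<f64> = a.iter().zip(b).map(|(x, y)| x + y).collect();
    let t2: Vec<f64> = t1.iter().zip(c).map(|(x, y)| x * y).collect();
    t2.iter().zip(d).map(|(x, y)| x - y).collect()
}

/// Fused form: one loop, no temporaries; only the result is allocated.
/// Hypothetical helper for illustration, not part of ndarray_fusion.
fn one_pass(a: &[f64], b: &[f64], c: &[f64], d: &[f64]) -> Vec<f64> {
    // Assumption of this sketch: all inputs have the same length
    // (no broadcasting).
    assert!(a.len() == b.len() && b.len() == c.len() && c.len() == d.len());
    (0..a.len()).map(|i| (a[i] + b[i]) * c[i] - d[i]).collect()
}
```

For equal-length inputs both functions return the same vector. The fused version reads each input once and writes the output once.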

The fusion pass works at the MIR level by:

  1. Identifying Terminator::Call sequences targeting ndarray element-wise functions
  2. Building a FusedExpr tree from the chain
  3. Replacing the chain with a single fused call

The C codegen then emits the fused call as one loop with the composed expression.
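Step 2 can be pictured with a small stand-in for the expression tree. The variants of the real FusedExpr and FusedBinOp are not shown on this page, so `SketchExpr`, `SketchBinOp`, `build_chain`, and `eval` below are hypothetical names chosen for this sketch. It assumes a purely linear chain in which each operation combines the previous result with one new input.

```rust
/// Hypothetical stand-in for FusedBinOp (real variants not shown here).
#[derive(Debug, Clone, Copy, PartialEq)]
enum SketchBinOp {
    Add,
    Mul,
    Sub,
}

/// Hypothetical stand-in for FusedExpr (real variants not shown here).
#[derive(Debug, Clone, PartialEq)]
enum SketchExpr {
    /// The i-th element of input array number `usize`.
    Source(usize),
    /// A binary operation applied to two sub-expressions.
    Binary(SketchBinOp, Box<SketchExpr>, Box<SketchExpr>),
}

/// Fold a linear chain into a tree: each step applies `op` to the
/// result so far (left operand) and a new source (right operand).
fn build_chain(first: usize, steps: &[(SketchBinOp, usize)]) -> SketchExpr {
    steps.iter().fold(SketchExpr::Source(first), |acc, &(op, src)| {
        SketchExpr::Binary(op, Box::new(acc), Box::new(SketchExpr::Source(src)))
    })
}

/// Evaluate the tree for one element; `elems[k]` is the current
/// element of source `k`.
fn eval(expr: &SketchExpr, elems: &[f64]) -> f64 {
    match expr {
        SketchExpr::Source(k) => elems[*k],
        SketchExpr::Binary(op, lhs, rhs) => {
            let (x, y) = (eval(lhs, elems), eval(rhs, elems));
            match op {
                SketchBinOp::Add => x + y,
                SketchBinOp::Mul => x * y,
                SketchBinOp::Sub => x - y,
            }
        }
    }
}
```

Calling `build_chain(0, &[(SketchBinOp::Add, 1), (SketchBinOp::Mul, 2), (SketchBinOp::Sub, 3)])` yields a tree for `(src0 + src1) * src2 - src3`, matching the add/mul/sub chain in the example above.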

Structs§

FusionChain
A fusion opportunity: a chain of element-wise ops that can be fused.

Enums§

FusedBinOp
Binary operations that can be fused.
FusedExpr
A fused expression tree representing composed element-wise operations.
FusedUnaryOp
Unary operations that can be fused.

Functions§

find_fusion_chains
Analyze a function’s MIR to find fusion opportunities.
fused_expr_to_c
Generate the C expression string for a fused expression tree. The sources are accessed as src0->data[i], src1->data[i], etc.
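As a rough illustration of what such a generator might look like, the sketch below walks a small expression tree and emits a C expression that uses the `srcN->data[i]` access pattern described above. The types `Op` and `Node` and the function `to_c_sketch` are hypothetical. The actual signature and output formatting of fused_expr_to_c are not shown on this page and may differ, for example in how it parenthesizes.

```rust
/// Hypothetical operator type for this sketch.
enum Op {
    Add,
    Mul,
    Sub,
}

/// Hypothetical expression tree for this sketch.
enum Node {
    /// Input array number `usize`, read at loop index `i`.
    Source(usize),
    /// A binary operation on two sub-expressions.
    Binary(Op, Box<Node>, Box<Node>),
}

/// Emit a C expression for the tree. Every binary node is wrapped in
/// parentheses so the C code does not rely on operator precedence.
fn to_c_sketch(node: &Node) -> String {
    match node {
        Node::Source(k) => format!("src{}->data[i]", k),
        Node::Binary(op, lhs, rhs) => {
            let sym = match op {
                Op::Add => "+",
                Op::Mul => "*",
                Op::Sub => "-",
            };
            format!("({} {} {})", to_c_sketch(lhs), sym, to_c_sketch(rhs))
        }
    }
}
```

The resulting string would then be placed in the body of a single `for` loop over `i` in the generated C code.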