fn matmul[T](self: &NDArray[T; 2], other: &NDArray[T; 2]) -> NDArray[T; 2]
Matrix multiplication of two rank-2 arrays; other ranks are not supported. For energy-conscious code, prefer smaller matrices or approximate methods. A naive product of two 1000x1000 matrices takes 1000^3 = 1 billion MACs, which at 1.2 pJ per MAC is ~1.2 mJ.
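The cost estimate can be reproduced with a rough back-of-the-envelope sketch. It is written in plain Rust rather than Joule, and it assumes the schoolbook algorithm (m * k * n MACs for an m x k by k x n product) together with the 1.2 pJ per MAC figure quoted above. The names `PJ_PER_MAC`, `matmul_macs`, and `matmul_energy_mj` are hypothetical helpers for illustration, not part of the library.

```rust
/// Energy per multiply-accumulate in picojoules.
/// Assumption: the figure quoted in this entry; real hardware varies.
const PJ_PER_MAC: f64 = 1.2;

/// MAC count for the schoolbook product of an (m x k) and a (k x n) matrix.
/// Assumption: no blocking, Strassen-style tricks, or sparsity.
fn matmul_macs(m: u64, k: u64, n: u64) -> u64 {
    m * k * n
}

/// Estimated energy in millijoules: MACs * pJ per MAC, then pJ -> mJ (1 mJ = 1e9 pJ).
fn matmul_energy_mj(m: u64, k: u64, n: u64) -> f64 {
    matmul_macs(m, k, n) as f64 * PJ_PER_MAC / 1e9
}
```

Under this model, calling `matmul_energy_mj(n, n, n)` shows cubic scaling: halving every dimension cuts the estimate by a factor of eight, which is why smaller matrices are the first thing to try.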
Source: linalg.joule:23