westurner 19 hours ago

> The researchers implemented a scheduling library with roughly 2,000 lines of code in Exo 2, encapsulating reusable optimizations that are linear-algebra specific and target-specific (AVX512, AVX2, Neon, and Gemmini hardware accelerators). This library consolidates scheduling efforts across more than 80 high-performance kernels with up to a dozen lines of code each, delivering performance comparable to, or better than, MKL, OpenBLAS, BLIS, and Halide.

exo-lang/exo: https://github.com/exo-lang/exo