Mirage: advanced tool for quick tensor computation in LLMs #Optimization

The paper introduces Mirage, a multi-level superoptimizer for tensor programs that automatically discovers and verifies optimizations that would otherwise require manual kernel engineering. Mirage represents tensor programs as 𝜇Graphs, which describe a computation at the kernel, thread-block, and thread levels of the GPU compute hierarchy, so that algebraic and schedule transformations can be optimized jointly instead of separately. Mirage partitions an input tensor program into Lax subprograms, generates candidate 𝜇Graphs with an expression-guided search, and checks that each candidate is functionally equivalent to the original using probabilistic verification. A 𝜇Graph optimizer then improves the runtime performance of the verified candidates, including by choosing data layouts at each level. The implementation builds kernel operators on the cuDNN and cuBLAS libraries, and block and thread operators on CUTLASS and CUDA functions. In the reported experiments on DNN benchmarks such as grouped-query attention and multi-layer perceptron, Mirage outperforms existing tensor program optimizers, including TensorRT and Triton, by up to 3.5×. These results suggest that searching across all levels of the GPU hierarchy at once can improve the efficiency of deep neural networks.
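To make the probabilistic verification step more concrete, here is a minimal sketch of equivalence checking by random testing over a finite field. It is not Mirage's implementation: the modulus `P`, the helper names (`probably_equivalent`, `reference`, `candidate_ok`, `candidate_bad`), and the restriction to addition and matrix multiplication are all assumptions made for illustration, and the procedure described in the paper covers more operators than this toy version.

```python
import random

# Illustrative prime modulus (2^31 - 1); an assumption, not a value from the paper.
P = 2_147_483_647


def matmul(a, b):
    """Matrix product with every entry reduced modulo P."""
    return [[sum(x * y for x, y in zip(row, col)) % P for col in zip(*b)]
            for row in a]


def add(a, b):
    """Elementwise sum with every entry reduced modulo P."""
    return [[(x + y) % P for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def random_matrix(rows, cols, rng):
    """Matrix of uniformly random field elements."""
    return [[rng.randrange(P) for _ in range(cols)] for _ in range(rows)]


def reference(a, b, c):
    """Original program: A @ B + A @ C (two matrix products)."""
    return add(matmul(a, b), matmul(a, c))


def candidate_ok(a, b, c):
    """Algebraically equivalent rewrite: A @ (B + C) (one matrix product)."""
    return matmul(a, add(b, c))


def candidate_bad(a, b, c):
    """Incorrect rewrite that drops a product: A @ B + C."""
    return add(matmul(a, b), c)


def probably_equivalent(f, g, n=4, trials=8, seed=0):
    """Compare f and g on random n x n inputs drawn from the field.

    Returns False as soon as any output differs. Returning True means no
    counterexample was found, which is strong evidence but not a proof.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c = (random_matrix(n, n, rng) for _ in range(3))
        if f(a, b, c) != g(a, b, c):
            return False
    return True


if __name__ == "__main__":
    print(probably_equivalent(reference, candidate_ok))   # True
    print(probably_equivalent(reference, candidate_bad))  # False
```

Working modulo a prime keeps the arithmetic exact, so a mismatch always indicates a real difference between the two programs and never a floating-point rounding artifact. Two different polynomial programs agree on a random field input only with small probability, which is why a handful of trials gives high confidence.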


Source link: https://medium.com/@techsachin/mirage-multi-level-superoptimizer-for-faster-tensor-computation-in-llms-ea39ff84af61