Mirage: advanced tool for quick tensor computation in LLMs #Optimization

Mirage: multi-level superoptimizer for faster tensor computation in LLMs | by SACHIN KUMAR | May, 2024

The paper introduces Mirage, a multi-level superoptimizer for tensor programs that automatically discovers and verifies sophisticated optimizations. Mirage represents tensor programs as 𝜇Graphs, which describe a computation at the kernel, thread-block, and thread levels of the GPU compute hierarchy, so algebraic and schedule transformations can be optimized jointly rather than separately. Mirage partitions an input tensor program into Lax subprograms, generates candidate 𝜇Graphs for each using an expression-guided search, and checks that each candidate is functionally equivalent to the original using probabilistic verification. A 𝜇Graph optimizer then improves runtime performance further, for example by choosing data layouts at each level. Kernel operators are implemented with the cuDNN and cuBLAS libraries, while block and thread operators use CUTLASS and CUDA functions. In the reported experiments, Mirage outperforms existing tensor program optimizers such as TensorRT and Triton by up to 3.5× on DNN benchmarks that include grouped-query attention and multi-layer perceptrons.
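To make the idea of probabilistic equivalence checking concrete, here is a minimal sketch in pure Python. It is not Mirage's verifier: the function names, the prime modulus `P`, and the toy rewrite `A·B + A·C → A·(B + C)` are all assumptions chosen for illustration. It evaluates a reference program and a candidate on random inputs using modular arithmetic and accepts the candidate only if every trial agrees.

```python
import random

# Assumed prime modulus (2**31 - 1); any large prime would do for this sketch.
P = 2_147_483_647


def rand_matrix(rows, cols, rng):
    """Random matrix with entries drawn uniformly from [0, P)."""
    return [[rng.randrange(P) for _ in range(cols)] for _ in range(rows)]


def matmul(a, b):
    """Matrix product with every entry reduced modulo P."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) % P for j in range(cols)]
        for row in a
    ]


def add(a, b):
    """Elementwise sum modulo P."""
    return [[(x + y) % P for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def reference(a, b, c):
    """Original program: two matrix products followed by an addition."""
    return add(matmul(a, b), matmul(a, c))


def candidate(a, b, c):
    """Rewritten program: one addition followed by one matrix product."""
    return matmul(a, add(b, c))


def wrong_candidate(a, b, c):
    """Deliberately incorrect rewrite that drops the second term."""
    return matmul(a, b)


def probably_equivalent(f, g, shapes, trials=8, seed=0):
    """Return True if f and g agree on `trials` random inputs modulo P."""
    rng = random.Random(seed)
    for _ in range(trials):
        inputs = [rand_matrix(r, c, rng) for r, c in shapes]
        if f(*inputs) != g(*inputs):
            return False  # a single mismatch disproves equivalence
    return True


# Hypothetical input shapes: A is 4x3, B and C are 3x5.
SHAPES = [(4, 3), (3, 5), (3, 5)]
```

Calling `probably_equivalent(reference, candidate, SHAPES)` accepts the correct rewrite, while `probably_equivalent(reference, wrong_candidate, SHAPES)` rejects the incorrect one. A passing check is evidence rather than proof: a wrong candidate could in principle agree on every sampled input, which is why the verification is described as probabilistic.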

Source link: https://medium.com/@techsachin/mirage-multi-level-superoptimizer-for-faster-tensor-computation-in-llms-ea39ff84af61?source=rss------artificial_intelligence-5
