AMD GPUs are stepping up to the challenge of Large Language Model (LLM) training, and Liger Kernels are here to help them shine. We’re excited to share how these state-of-the-art (SOTA) training kernels, developed by LinkedIn, are now available on AMD ROCm, opening up new possibilities for faster and more efficient LLM training.
Liger Kernel is a collection of carefully optimized Triton kernels designed specifically for LLM training. Through clever techniques like kernel fusion, in-place replacement, and smart chunking, Liger Kernels help your AMD GPU perform at its best, delivering higher training throughput and a noticeably smaller memory footprint.
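To make this concrete, here is a minimal sketch of how Liger Kernels are typically enabled in a Hugging Face training script. The `AutoLigerKernelForCausalLM` wrapper and the `apply_liger_kernel_to_llama` patching API shown below reflect the Liger Kernel package at the time of writing; the model name is just an example:

```python
# Option 1: let Liger automatically patch a supported model at load time.
from liger_kernel.transformers import AutoLigerKernelForCausalLM

model = AutoLigerKernelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")

# Option 2: patch a model family explicitly, choosing which kernels to swap in.
# (Call this before instantiating the model so the classes are patched.)
from liger_kernel.transformers import apply_liger_kernel_to_llama

apply_liger_kernel_to_llama(
    rope=True,                        # fused rotary position embeddings
    rms_norm=True,                    # fused RMSNorm
    swiglu=True,                      # fused SwiGLU MLP
    fused_linear_cross_entropy=True,  # chunked, fused LM head + loss
)
```

Either path leaves the rest of your training loop untouched; the patched modules are drop-in replacements for the originals.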
You might be curious how these kernels, initially written for NVIDIA GPUs, can run smoothly on AMD ROCm. The key is OpenAI Triton: an open-source, Python-like programming language for writing highly efficient GPU code that is cross-platform by design. With a small adjustment for AMD GPUs' wavefront size (64 threads, versus the 32-thread warps on NVIDIA hardware), Liger Kernels unlock new potential for LLM training on ROCm.
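For a flavor of what Triton code looks like, below is the classic vector-add kernel. The point is portability: the same source compiles for both the CUDA and ROCm backends, which is what lets Liger Kernels run on AMD GPUs with essentially no rewriting.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard against out-of-bounds accesses
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = (triton.cdiv(n, 1024),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out

# Runs unchanged on NVIDIA (CUDA) or AMD (ROCm) hardware;
# ROCm builds of PyTorch also expose the "cuda" device alias.
x = torch.rand(4096, device="cuda")
y = torch.rand(4096, device="cuda")
assert torch.allclose(add(x, y), x + y)
```

Triton's compiler handles the backend-specific details (such as the warp or wavefront width) when lowering this code to the target GPU.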
In this blog post, we take a closer look at how Liger Kernels perform on AMD GPUs, exploring their impact on training and inference for various LLM tasks.