2024-04-16 10:30:15

SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework. Building upon the foundation provided by MLX Examples, this project introduces additional features specifically designed to enhance LLM operations with MLX in a streamlined package.

DPO training of Qwen1.5-7B-Chat with the DPO Mix 7K dataset. The training consists of supervised fine-tuning (SFT) followed by direct preference optimization (DPO).
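For readers new to the second stage, here is a minimal sketch of the standard DPO objective for a single preference pair, written in plain Python. It is not SiLLM's implementation: the function names, the `beta` default, and the example log-probabilities are all assumptions made for illustration. The inputs are the summed log-probabilities of the chosen and rejected responses under the model being trained (the policy) and under a frozen reference model, typically the SFT checkpoint.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; controls how far the policy may drift from the reference
) -> float:
    """DPO loss for one (chosen, rejected) pair of responses."""
    # How much more (or less) likely each response is under the policy
    # than under the frozen reference model.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # The loss falls as the policy favors the chosen response over the
    # rejected one by a wider margin than the reference does.
    margin = beta * (chosen_logratio - rejected_logratio)
    return -log_sigmoid(margin)


# Hypothetical log-probabilities for one preference pair.
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)  # policy identical to reference
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)   # policy shifted toward the chosen response
print(baseline, improved)
```

When the policy matches the reference, the margin is zero and the loss equals log 2. Training pushes the loss below that value by raising the relative likelihood of chosen responses, which is why a well-behaved SFT checkpoint is a natural starting point and reference.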

SiLLM generally supports loading LLMs of the following model architectures/families: Llama 2, Mistral, Mixtral, Gemma, Phi, Qwen 2, StarCoder2.

Big thanks to the Apple MLX team for implementing and maintaining the MLX framework that makes it possible to unlock the power of Apple Silicon and run/train LLMs on MacBooks and other Apple devices. Thank you to all the contributors of the MLX Examples project and to the developers sharing model implementations online. Last but not least, thank you to the larger community sharing open-weights models, fine-tunes, and datasets - without you, all the gen AI progress would happen behind locked doors!