
Evolutionary Model Merging

Submitted by Style Pass on 2024-04-23 16:00:16


Sakana.ai made a big splash about a month ago when it released a paper on Evolutionary Model Merging, followed by a model and eval results for this game-changing merge method. Unfortunately for the community, they never released the algorithm behind these results!

Since this release, we've been fully focused on developing this groundbreaking technique for the community. We're now excited to announce the launch of this state-of-the-art functionality in MergeKit.

Evolutionary Model Merging lets people target specific competencies or qualities in their merges. Without it, model merging is an extremely manual, exploratory process: trying dozens of merges, manually evaluating them, and trying to come up with a mental framework that explains how the merging parameters relate to the performance of the final model. With Evolutionary Model Merging, we can instead specify what qualities we want a model to have, and optimization will take care of it for us.
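To make the "specify the qualities, let optimization do the rest" idea concrete, here is a toy sketch of the general pattern. It is not mergekit-evolve's actual algorithm or API: the "models" are tiny parameter lists, `merge` is a plain weighted average, `score` is a hypothetical stand-in for a real task evaluation, and the search is a simple (1+λ) evolution strategy chosen purely for illustration.

```python
import random


def merge(models, weights):
    """Weighted average of parameter vectors (weights are normalized here)."""
    total = sum(weights)
    return [
        sum(w * m[i] for w, m in zip(weights, models)) / total
        for i in range(len(models[0]))
    ]


def score(params, target):
    """Hypothetical stand-in for a task eval: higher is better."""
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def evolve(models, target, generations=50, population=16, sigma=0.2, seed=0):
    """(1+lambda) evolution strategy over the merge weights."""
    rng = random.Random(seed)
    parent = [1.0] * len(models)  # start from a uniform merge
    parent_score = score(merge(models, parent), target)
    for _ in range(generations):
        # Mutate the parent into `population` children; keep weights positive.
        children = [
            [max(1e-6, w + rng.gauss(0.0, sigma)) for w in parent]
            for _ in range(population)
        ]
        scored = [(score(merge(models, c), target), c) for c in children]
        top_score, top = max(scored, key=lambda sc: sc[0])
        # Only replace the parent when a child actually scores better.
        if top_score > parent_score:
            parent, parent_score = top, top_score
    return parent, parent_score


if __name__ == "__main__":
    models = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    target = [0.6, 0.3, 0.1]  # the "quality" we want the merge to have
    weights, fitness = evolve(models, target)
    total = sum(weights)
    print([round(w / total, 3) for w in weights], fitness)
```

In a real setting the expensive part is `score`: each candidate has to be merged and then evaluated on the tasks you care about, which is why the search needs a GPU.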

mergekit-evolve needs at least one GPU. It doesn't necessarily need a huge one! You need to be able to run inference on a model in FP16. If you're working with models in the 7B size range, 24GB of VRAM will do just fine. If you're a big spender, you can use a Ray cluster with however many GPUs you want. For this little demo I'm using a RunPod instance with a single A100.
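As a rough sanity check on the 24GB figure: FP16 stores each parameter in 2 bytes, so the weights of a 7B model take about 13 GiB, leaving headroom for activations and the KV cache. The helper below is a back-of-the-envelope sketch (the function name is made up for illustration) and counts weights only.

```python
def fp16_weight_gib(num_params: float) -> float:
    """Approximate memory for model weights alone in FP16, in GiB."""
    bytes_per_param = 2  # FP16 = 16 bits = 2 bytes
    return num_params * bytes_per_param / 2**30


if __name__ == "__main__":
    print(f"7B model weights in FP16: {fp16_weight_gib(7e9):.1f} GiB")
```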
