

2024-06-11 09:30:15

While function calling with large language models (LLMs) holds immense potential, crafting effective prompts and responses remains an art, and best practices are often closely guarded. This guide, solution_design.md, dives into overcoming these challenges.

Once we have the weights ready, run the following script to convert them to native PyTorch format, since this project does not use fairscale.
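The core of such a conversion is stitching the sharded fairscale checkpoints back into a single state dict. The sketch below is an illustrative simplification, not the repo's actual script: plain nested lists stand in for tensors, the key name is hypothetical, and it only shows the column-parallel case (shards concatenated along dim 0).

```python
def merge_column_parallel(shards, key):
    """Concatenate a column-parallel weight across fairscale shards.

    Column-parallel layers are split along dim 0, so merging is a simple
    row-wise concatenation. `shards` is a list of per-rank state dicts;
    nested lists stand in for torch tensors (hypothetical simplification).
    """
    merged = []
    for shard in shards:
        merged.extend(shard[key])
    return merged

# Two shards of a 4x2 weight, each rank holding half the rows.
shard_a = {"layers.0.attention.wq.weight": [[1, 0], [0, 1]]}
shard_b = {"layers.0.attention.wq.weight": [[2, 0], [0, 2]]}
full = merge_column_parallel([shard_a, shard_b], "layers.0.attention.wq.weight")
```

A real script would load each `consolidated.*.pth` shard with `torch.load`, merge per layer (row-parallel layers concatenate along dim 1 instead), and save the result with `torch.save`.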

We provide a simple script to fine-tune the llama3 model using DeepSpeed. By default, the script loads the configs/ds_finetune.json file. At a minimum, update the checkpoint paths for both the tokenizer and the model; you may also change the batch size and other settings.
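For orientation, a minimal DeepSpeed config of this kind typically looks like the fragment below. This is an assumed illustration of the general shape (the keys shown are standard DeepSpeed options), not the actual contents of configs/ds_finetune.json:

```json
{
  "train_micro_batch_size_per_gpu": 2,
  "gradient_accumulation_steps": 8,
  "bf16": { "enabled": true },
  "zero_optimization": { "stage": 2 }
}
```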

Keep in mind that when combining LoRA fine-tuning with the DeepSpeed ZeRO optimizer, model validation cannot be run under ZeRO Stage 3, as its parameter partitioning breaks the LoRA-specific weights.
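A cheap way to enforce this constraint is to inspect the DeepSpeed config before enabling validation. This is a hypothetical guard (the function name is an assumption), but the `zero_optimization.stage` key it reads is DeepSpeed's standard config layout:

```python
import json
import os
import tempfile

def validation_allowed(ds_config_path):
    """Return False when ZeRO Stage 3 is configured, since parameter
    partitioning would break the LoRA-specific weights during validation."""
    with open(ds_config_path) as f:
        cfg = json.load(f)
    stage = cfg.get("zero_optimization", {}).get("stage", 0)
    return stage < 3

# Demo with a throwaway Stage 3 config file.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump({"zero_optimization": {"stage": 3}}, f)
    path = f.name
ok = validation_allowed(path)
os.unlink(path)
```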

After training finishes, merge the LoRA weights into the base model before using it for inference.
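Mathematically, the merge folds each low-rank update into its base weight: W' = W + (alpha / r) * (B @ A). The sketch below demonstrates this on plain nested lists standing in for tensors; the function name is an assumption, not the repo's API:

```python
def merge_lora(w, lora_a, lora_b, alpha, r):
    """Fold a LoRA update into a base weight: W' = W + (alpha / r) * (B @ A).

    w: base weight (rows x cols), lora_b: rows x r, lora_a: r x cols.
    Nested lists stand in for tensors (illustrative only).
    """
    scale = alpha / r
    rows, cols = len(w), len(w[0])
    # delta = B @ A
    delta = [[sum(lora_b[i][k] * lora_a[k][j] for k in range(r))
              for j in range(cols)] for i in range(rows)]
    return [[w[i][j] + scale * delta[i][j] for j in range(cols)]
            for i in range(rows)]

# Rank-1 example: identity base weight, alpha=2, r=1.
merged = merge_lora(
    w=[[1, 0], [0, 1]],
    lora_a=[[1, 1]],
    lora_b=[[1], [2]],
    alpha=2, r=1,
)
```

Once every targeted layer has been merged this way, the model is a plain dense checkpoint again and inference needs no LoRA-aware code path.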
