thomasantony/llamacpp-python

You will need to obtain the weights for LLaMA yourself. There are a few torrents floating around, as well as some Hugging Face repositories (e.g. https://huggingface.co/nyanko7/LLaMA-7B/). Once you have the weights, copy them into the models folder.

Convert the weights to GGML format using llamacpp-convert, then use llamacpp-quantize to quantize them to INT4. For example, for the 7B-parameter model, run something like the sketch below.
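The exact invocation from the original README is not preserved in this copy, so the following is a minimal sketch: it assumes llamacpp-convert takes the model directory plus an output-type argument (mirroring llama.cpp's convert-pth-to-ggml.py convention) and that llamacpp-quantize takes the model directory. Adjust the paths to wherever you copied the weights.

```sh
# Convert the PyTorch weights in ./models/7B/ to GGML; the trailing "1"
# selects f16 output, following llama.cpp's convert-pth-to-ggml.py convention
# (argument format assumed, not confirmed by this copy of the README)
llamacpp-convert ./models/7B/ 1

# Quantize the resulting GGML file down to INT4
llamacpp-quantize ./models/7B/
```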

The package installs the command-line entry point llamacpp-cli, which points to llamacpp/cli.py and should provide roughly the same functionality as the main program in the original C++ repository. There is also an experimental llamacpp-chat that is supposed to bring up a chat interface, but it does not work correctly yet.
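Since llamacpp-cli mirrors the main program from the C++ repository, it should accept roughly the same flags. The invocation below is a sketch assuming llama.cpp-style options (-m for the model path, -p for the prompt, -n for the number of tokens to generate) and llama.cpp's quantized-model naming convention; check llamacpp-cli --help for the actual options.

```sh
# Run inference against the quantized 7B model (flags assumed to match
# llama.cpp's main: -m model path, -p prompt, -n tokens to generate)
llamacpp-cli -m ./models/7B/ggml-model-q4_0.bin -p "The capital of France is" -n 64
```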