Why do we need Hugging Face's SafeTensor?

submitted by
Style Pass
2024-10-21 06:00:05

A long time ago, when I was reading through Hugging Face's documentation, a very simple question came to mind: what does Hugging Face's safetensors do? The term "safetensors" appears in many places in the Hugging Face documentation, but people rarely discuss its purpose. Recently, there was a security incident that affected a team's model training progress, and this prompted me to revisit the question and write this blog post. Note that this post is not a discussion of the incident itself, but rather a technical argument for using safetensors to protect your models, which are the most important assets in the AI era.

When we train a model, we often save the model weights to a file for checkpointing and later loading. The most popular format for this is the PyTorch state dictionary, which is a Python dictionary object mapping each layer to its parameter tensor. I guess most of you are familiar with the usual `torch.save` / `torch.load` pattern for writing and reading it.
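For reference, the familiar save/load pattern looks roughly like the sketch below. The original snippet is not reproduced here, so this is a reconstruction under assumptions: the toy `nn.Linear` model, the `restored` variable, and the temporary `model.pt` path are hypothetical names chosen only for illustration.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical toy model; any nn.Module is saved the same way.
model = nn.Linear(4, 2)

# Hypothetical checkpoint path inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "model.pt")

# state_dict() maps parameter names ("weight", "bias") to tensors;
# torch.save serializes that dictionary using pickle.
torch.save(model.state_dict(), path)

# Later: rebuild the same architecture, then load the saved weights into it.
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(path))
```

Note that only the weights are stored here, not the model class, so the loading side has to construct the architecture itself before calling `load_state_dict`.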

However, this method uses pickle to serialize and deserialize the entire state dict object, which raises security concerns. The reason is that pickle is not secure against erroneous or maliciously constructed data: unpickling can execute arbitrary code with the same privileges as the program that is deserializing the data. An attacker can therefore inject arbitrary code into a model weights file and cause serious security issues. One way to do this is to place an object in the file whose `__reduce__` method is written to execute arbitrary code when the file is loaded.
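To make the `__reduce__` trick concrete, here is a minimal sketch using only the standard library. The class name `MaliciousPayload` and the `PICKLE_DEMO` environment variable are hypothetical, and the payload is deliberately harmless: it only sets an environment variable so that the side effect can be observed. A real attack would hide something like this inside a checkpoint file.

```python
import os
import pickle


class MaliciousPayload:
    """Hypothetical stand-in for an object hidden inside a checkpoint file."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair and calls it on load.
        # An attacker could put any expression here; this one only sets a
        # harmless environment variable so the side effect is visible.
        code = "__import__('os').environ.setdefault('PICKLE_DEMO', 'executed')"
        return (eval, (code, {}))


# The "attacker" serializes the payload, e.g. into a shared weights file.
blob = pickle.dumps(MaliciousPayload())

# The "victim" only loads the bytes; no method is ever called explicitly.
os.environ.pop("PICKLE_DEMO", None)
result = pickle.loads(blob)

# What comes back is the return value of the injected call, not the object,
# and the environment variable shows that the expression already ran.
print(type(result).__name__)
print(os.environ.get("PICKLE_DEMO"))
```

The point of the sketch is that simply calling `pickle.loads` was enough to run the attacker's expression. Any loader built on top of unrestricted pickle inherits the same exposure.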
