Transposing tensor files


I recently spent a lot of time working with machine learning serialization formats, especially onnx. This format uses Protocol Buffers for its binary representation and thus inherits their two-gigabyte limit on the total file size. Bypassing this restriction requires storing the raw tensor bytes in a separate file and referencing them from the onnx file.
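
For illustration, the sketch below shows one way to do that with the onnx Python package, assuming a version that supports the save_as_external_data option of onnx.save_model; all file names are hypothetical.

    import onnx

    # Load a model whose initializers we want to move out of the protobuf.
    model = onnx.load("model.onnx")

    # Re-save it so that large tensors live in a single side file ("model.data")
    # that the .onnx graph references by name, offset, and length.
    onnx.save_model(
        model,
        "model-external.onnx",
        save_as_external_data=True,
        all_tensors_to_one_file=True,
        location="model.data",
        size_threshold=1024,  # keep tiny tensors inline in the protobuf
    )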

But what should the tensor file format be? The safetensors library from Huggingface is popular for representing tensors on disk, and its data layout is fully compatible with the onnx raw tensor data format.

This article describes the safetensors file structure, points out its minor design flaws, and explains how changing the metadata location can address them.

A safetensors file stores a collection of multi-dimensional arrays. Its first eight bytes encode the header size as an unsigned little-endian 64-bit integer, followed by the header describing each tensor's type and shape, and finally the data section containing the flat array elements.

⊕ The structure of a safetensors file: the first eight bytes indicate the header size in bytes, the header is a json object describing the tensor metadata, and the last section contains the raw array elements.
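
To make the layout concrete, here is a minimal Python sketch that parses the header by hand using only the standard library; the file name is a placeholder, and read_safetensors_header is a hypothetical helper, not part of the safetensors API.

    import json
    import struct

    def read_safetensors_header(path):
        """Return the parsed header and the offset where the data section starts."""
        with open(path, "rb") as f:
            # The first eight bytes encode the header size as a little-endian u64.
            (header_size,) = struct.unpack("<Q", f.read(8))
            # The header itself is a UTF-8 encoded json object of that size.
            header = json.loads(f.read(header_size))
            # The raw tensor data follows immediately after the header.
            data_start = 8 + header_size
        return header, data_start

    header, data_start = read_safetensors_header("model.safetensors")
    for name, meta in header.items():
        if name == "__metadata__":  # optional free-form metadata entry
            continue
        print(name, meta["dtype"], meta["shape"], meta["data_offsets"])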

The header is a json object where each key is a tensor name and the value is an object describing that tensor's shape, element type, and offsets from the start of the data section.

⊕ An example of a safetensors file header: a json object mapping tensor names to their metadata (shape, element type, and offsets from the beginning of the data section).

    {
      "fc.weight": {
        "dtype": "F32",
        "shape": [10, 784],
        "data_offsets": [0, 31360]
      },
      "fc.bias": {
        "dtype": "F32",
        "shape": [10],
        "data_offsets": [31360, 31400]
      }
    }
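
Given such a header, the offsets make it easy to slice a single tensor out of the data section without reading the rest of the file. The helper below is a hedged sketch: it reuses the hypothetical read_safetensors_header function from the previous example, covers only a few of the format's dtype codes, and assumes a little-endian host (the byte order safetensors uses on disk).

    import numpy as np

    DTYPES = {"F32": np.float32, "F16": np.float16, "I64": np.int64}  # subset of the dtype codes

    def read_tensor(path, name):
        header, data_start = read_safetensors_header(path)
        meta = header[name]
        begin, end = meta["data_offsets"]  # relative to the start of the data section
        with open(path, "rb") as f:
            f.seek(data_start + begin)
            raw = f.read(end - begin)
        return np.frombuffer(raw, dtype=DTYPES[meta["dtype"]]).reshape(meta["shape"])

    weight = read_tensor("model.safetensors", "fc.weight")  # shape (10, 784), dtype float32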
