What is safetensors — Safe and Fast Tensor Serialization?

A simple file format for storing tensors safely and efficiently, designed to eliminate security risks from pickle-based model files.

Is safetensors — Safe and Fast Tensor Serialization free to use?

Yes. safetensors — Safe and Fast Tensor Serialization is freely available on TokRepo. Check the Source & Thanks section on the asset page for the specific open-source license.

How do I install safetensors — Safe and Fast Tensor Serialization?

Visit the asset page on TokRepo and click "Copy for agent" to get the installation instructions. Most assets can be installed with a single command.

safetensors — Safe and Fast Tensor Serialization

Introduction

safetensors is a file format and library for storing and loading tensors without the security risks of Python pickle. Created by Hugging Face, it provides zero-copy deserialization and prevents arbitrary code execution, making it the recommended format for distributing machine learning model weights.

What safetensors Does

Stores tensors in a flat binary format with a JSON header for metadata
Prevents arbitrary code execution attacks inherent in pickle-based formats
Enables zero-copy memory-mapped loading for fast deserialization
Supports PyTorch, TensorFlow, Flax/JAX, PaddlePaddle, and NumPy tensors
Ships a Rust core library with Python bindings, so the same files can be read from either language

Architecture Overview

A safetensors file consists of a fixed 8-byte header size field, a JSON header containing tensor names, data types, shapes, and byte offsets, followed by a contiguous data buffer. Loading maps the data region into memory without copying, and the header is parsed to locate each tensor by offset. The Rust core handles serialization and validation, with Python bindings via PyO3.
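
The layout described above can be sketched with only the Python standard library. This illustrates the byte layout and is not the library's implementation. The helper names pack and read_header are hypothetical, and the header fields dtype, shape, and data_offsets are assumed from the published format description.

```python
import json
import struct

def pack(tensors):
    """Build safetensors-style bytes from {name: (dtype, shape, raw_bytes)}."""
    header, chunks, offset = {}, [], 0
    for name, (dtype, shape, raw) in tensors.items():
        # Offsets are relative to the start of the data buffer, not the file.
        header[name] = {"dtype": dtype, "shape": shape,
                        "data_offsets": [offset, offset + len(raw)]}
        chunks.append(raw)
        offset += len(raw)
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    # 8-byte little-endian unsigned length, then the JSON header, then the data.
    return struct.pack("<Q", len(header_bytes)) + header_bytes + b"".join(chunks)

def read_header(blob):
    """Return (header_dict, start_of_data_buffer) for safetensors-style bytes."""
    (header_len,) = struct.unpack_from("<Q", blob, 0)
    header = json.loads(blob[8:8 + header_len].decode("utf-8"))
    return header, 8 + header_len

# Two float32 values (8 bytes) and three int8 values (3 bytes).
weights = struct.pack("<2f", 1.0, 2.0)
bias = struct.pack("<3b", 1, 2, 3)
blob = pack({"layer.weight": ("F32", [2], weights),
             "layer.bias": ("I8", [3], bias)})

header, data_start = read_header(blob)
begin, end = header["layer.bias"]["data_offsets"]
bias_values = struct.unpack("<3b", blob[data_start + begin:data_start + end])
```

Because every tensor is located by a pair of byte offsets, a reader needs only the header to find any one tensor.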

Self-Hosting & Configuration

Install: pip install safetensors
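Save PyTorch tensors: import save_file from safetensors.torch, then call save_file({"layer.weight": tensor}, "model.safetensors")
Load with memory mapping: import load_file from safetensors.torch, then call load_file("model.safetensors", device="cpu")
Convert existing pickle checkpoints: load the state dict with torch.load(), then write it with save_file(). torch.load() unpickles the file, so convert only checkpoints you trust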
Hugging Face Hub uses safetensors as the default format for model uploads

Key Features

Security by design — no arbitrary code execution during loading
Zero-copy deserialization with memory-mapped I/O for fast startup
Lazy loading of individual tensors without reading the entire file
Cross-framework support for PyTorch, TensorFlow, Flax/JAX, PaddlePaddle, and NumPy
Compact format with no overhead beyond the 8-byte length prefix and the JSON header
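
The memory-mapped, per-tensor access listed above can be illustrated with the standard library alone. This is a sketch of the mechanism under the layout described in Architecture Overview, not how the library itself is written. The function read_tensor_bytes and the file tiny.safetensors are hypothetical.

```python
import json
import mmap
import os
import struct
import tempfile

# Write a tiny file in the safetensors layout: length prefix, JSON header, data.
header = {
    "a": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]},
    "b": {"dtype": "F32", "shape": [3], "data_offsets": [8, 20]},
}
header_bytes = json.dumps(header).encode("utf-8")
data = struct.pack("<5f", 1.0, 2.0, 3.0, 4.0, 5.0)
path = os.path.join(tempfile.mkdtemp(), "tiny.safetensors")
with open(path, "wb") as fh:
    fh.write(struct.pack("<Q", len(header_bytes)) + header_bytes + data)

def read_tensor_bytes(path, name):
    """Map the file and return only the named tensor's bytes."""
    with open(path, "rb") as fh, \
            mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        (header_len,) = struct.unpack_from("<Q", mm, 0)
        entry = json.loads(mm[8:8 + header_len])[name]
        begin, end = entry["data_offsets"]
        start = 8 + header_len
        # Slicing the map touches only the pages that hold this tensor.
        return bytes(mm[start + begin:start + end])

b_values = struct.unpack("<3f", read_tensor_bytes(path, "b"))
```

For simplicity the sketch copies the slice out of the map. A zero-copy reader would instead wrap the mapped region in an array view and keep the map open for as long as the array is in use.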

Comparison with Similar Tools

pickle/torch.save — flexible but allows arbitrary code execution; safetensors is safe by design
ONNX — model interchange with graph structure; safetensors stores raw weight tensors only
NumPy .npy/.npz — NumPy-specific; safetensors supports multiple frameworks and metadata
HDF5 — hierarchical data format with complex features; safetensors is simpler and faster for tensors

FAQ

Q: Why not just use pickle for model weights? A: Pickle can execute arbitrary Python code during loading, creating a security risk when downloading models from untrusted sources.
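
The risk can be shown in a few lines with the standard library. In this sketch the class Payload is hypothetical, and a harmless function stands in for what an attacker would actually call.

```python
import json
import pickle
import platform

class Payload:
    def __reduce__(self):
        # Tells pickle: "to rebuild me, call this function with these arguments".
        # A harmless function stands in for something like os.system.
        return (platform.python_implementation, ())

blob = pickle.dumps(Payload())

# Loading calls the function named in the byte stream and returns its result.
result = pickle.loads(blob)

# A JSON header, by contrast, can only decode to plain data.
entry = json.loads('{"dtype": "F32", "shape": [2], "data_offsets": [0, 8]}')
```

Here result is the string returned by the function, not a Payload instance: the callable was chosen by whoever wrote the bytes and ran as a side effect of loading.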

Q: Can safetensors store model architecture along with weights? A: No. It stores only tensor data and metadata. Model architecture is defined in code or config files.

Q: Is safetensors compatible with Hugging Face Transformers? A: Yes. Transformers uses safetensors by default when saving and loading models.

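Q: What happens if a safetensors file is corrupted? A: The loader validates the header (its length, JSON structure, and tensor offsets) before reading any data, so a damaged header or a truncated file raises an error at load time. The format carries no checksum, however, so altered bytes inside the tensor data are not detected.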
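
The kind of header check involved can be sketched as follows. These checks are illustrative assumptions about what a loader can verify from the header alone, not a copy of the library's validation code. The function validate and the table DTYPE_SIZES are hypothetical, and the table covers only a few dtype codes.

```python
import math

# Bytes per element for a few dtype codes; the real format defines more.
DTYPE_SIZES = {"F64": 8, "F32": 4, "F16": 2, "BF16": 2,
               "I64": 8, "I32": 4, "I8": 1, "U8": 1, "BOOL": 1}

def validate(header, data_len):
    """Raise ValueError if tensor entries do not tile the data buffer exactly."""
    entries = [(k, v) for k, v in header.items() if k != "__metadata__"]
    entries.sort(key=lambda kv: kv[1]["data_offsets"][0])
    expected_begin = 0
    for name, info in entries:
        begin, end = info["data_offsets"]
        if begin != expected_begin or end < begin:
            raise ValueError(f"{name}: offsets {begin}-{end} overlap or leave a gap")
        size = math.prod(info["shape"]) * DTYPE_SIZES[info["dtype"]]
        if end - begin != size:
            raise ValueError(f"{name}: expected {size} bytes, header says {end - begin}")
        expected_begin = end
    if expected_begin != data_len:
        raise ValueError(
            f"data buffer is {data_len} bytes, header describes {expected_begin}")

good = {"w": {"dtype": "F32", "shape": [2, 2], "data_offsets": [0, 16]}}
validate(good, 16)  # passes silently

try:
    validate(good, 12)  # simulates a truncated file
    truncated_error = None
except ValueError as exc:
    truncated_error = str(exc)
```

Checks like these depend only on sizes and offsets, so they say nothing about whether the tensor values themselves are intact.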
