ONNX Safetensors
ONNX extension for saving to and loading from safetensors 🤗.
Features
✅ Load and save ONNX weights from and to safetensors
✅ Support all ONNX data types, including float8, float4 and 4-bit ints
✅ Allow ONNX backends (including ONNX Runtime) to use safetensors
Install
pip install --upgrade onnx-safetensors
Quick Start
Load tensors to an ONNX model
import os
import onnx
import onnx_safetensors
# Provide your ONNX model here
model: onnx.ModelProto
tensor_file = "path/to/onnx_model/model.safetensors"
base_dir = "path/to/onnx_model"
data_path = "model.safetensors"
# Apply weights from the safetensors file to the model and turn them into in-memory tensors
# NOTE: If the model grows beyond 2 GB, offload the weights with onnx_safetensors.save_file, or use onnx.save with external data options, to keep the ONNX model valid
model = onnx_safetensors.load_file(model, tensor_file)
# If you want to use the safetensors file in ONNX Runtime:
# Use safetensors as external data in the ONNX model
model_with_external_data = onnx_safetensors.load_file_as_external_data(model, data_path, base_dir=base_dir)
# Save the modified model
# This model is a valid ONNX model using external data from the safetensors file
onnx.save(model_with_external_data, os.path.join(base_dir, "model_using_safetensors.onnx"))
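The 2 GB note in the comments above reflects protobuf's limit on the size of a single serialized message. As a rough illustration, the sketch below estimates the raw weight size from tensor shapes and element widths to judge whether offloading is needed. The helper names (`estimate_weight_bytes`, `needs_external_data`) and the dtype table are hypothetical and not part of onnx-safetensors, and the estimate ignores the size of the graph structure itself.

```python
import math

# Protobuf cannot serialize a single message larger than 2 GiB - 1 bytes.
PROTOBUF_LIMIT_BYTES = 2**31 - 1

# Bytes per element for a few common dtypes (illustrative subset only).
BYTES_PER_ELEMENT = {"float32": 4, "float16": 2, "bfloat16": 2, "int64": 8, "int8": 1}


def estimate_weight_bytes(tensors: dict[str, tuple[str, tuple[int, ...]]]) -> int:
    """Sum the raw byte sizes of tensors given as name -> (dtype, shape)."""
    total = 0
    for dtype, shape in tensors.values():
        total += BYTES_PER_ELEMENT[dtype] * math.prod(shape)
    return total


def needs_external_data(tensors: dict[str, tuple[str, tuple[int, ...]]]) -> bool:
    """Return True if the weights alone would exceed the protobuf limit."""
    return estimate_weight_bytes(tensors) > PROTOBUF_LIMIT_BYTES


# Hypothetical weights: a 32000 x 4096 embedding and a 4096 x 4096 projection.
weights = {
    "embed.weight": ("float32", (32000, 4096)),
    "proj.weight": ("float32", (4096, 4096)),
}
print(estimate_weight_bytes(weights), needs_external_data(weights))
```

If the check returns True, the weights would have to live outside the model file, for example in a safetensors file referenced as external data.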
Save weights to a safetensors file
import os
import onnx
import onnx_safetensors
# Provide your ONNX model here
model: onnx.ModelProto
base_dir = "path/to/onnx_model"
data_path = "model.safetensors"
# Offload weights from ONNX model to safetensors file without changing the model
onnx_safetensors.save_file(model, data_path, base_dir=base_dir, replace_data=False) # Generates model.safetensors
# If you want to use the safetensors file in ONNX Runtime:
# Offload weights from ONNX model to safetensors file and use it as external data for the model by setting replace_data=True
model_with_external_data = onnx_safetensors.save_file(model, data_path, base_dir=base_dir, replace_data=True)
# Save the modified model
# This model is a valid ONNX model using external data from the safetensors file
onnx.save(model_with_external_data, os.path.join(base_dir, "model_using_safetensors.onnx"))
Save an ONNX model with safetensors weights
The onnx_safetensors.save_model function is a convenient way to save both the ONNX model and its weights to separate files:
import onnx_safetensors
# Provide your ONNX model here
model: onnx.ModelProto
# Save model and weights in one step
# This creates model.onnx and model.safetensors
onnx_safetensors.save_model(model, "model.onnx")
# You can also specify a custom name for the weights file
onnx_safetensors.save_model(model, "model.onnx", external_data="weights.safetensors")
Embed ONNX model in a safetensors file
For storage or transfer purposes, you can embed an entire ONNX model (structure and weights) into a single safetensors file:
import onnx_safetensors
# Provide your ONNX model here
model: onnx.ModelProto
# Save the entire model (structure + weights) into a safetensors file
onnx_safetensors.save_safetensors_model(model, "model.safetensors")
# Later, extract the model from the safetensors file
model = onnx_safetensors.extract_safetensors_model("model.safetensors")
# Or extract and save to an ONNX file that references the safetensors file as external data
onnx_safetensors.extract_safetensors_model(
"model.safetensors",
output_path="model.onnx"
)
Note: This format is for storage/transfer only and is not compatible with ONNX Runtime. Use onnx_safetensors.extract_safetensors_model() with output_path to create a runnable ONNX model that references the safetensors file as external data.
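To see what a safetensors file contains without loading any weights, one can read its header directly. A safetensors file begins with an 8-byte little-endian length, followed by a JSON header that maps tensor names to their dtype, shape, and byte offsets. The sketch below uses only the standard library; `read_safetensors_header` is a hypothetical helper, and it makes no assumption about how onnx-safetensors names or arranges the embedded model inside the file.

```python
import io
import json
import struct


def read_safetensors_header(stream) -> dict:
    """Read the JSON header of a safetensors file from a binary stream."""
    (header_len,) = struct.unpack("<Q", stream.read(8))
    return json.loads(stream.read(header_len))


# Build a tiny in-memory safetensors payload with one float32 tensor of shape [2].
header = {"w": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]}}
header_bytes = json.dumps(header).encode("utf-8")
payload = (
    struct.pack("<Q", len(header_bytes))
    + header_bytes
    + struct.pack("<2f", 1.0, 2.0)
)

parsed = read_safetensors_header(io.BytesIO(payload))
for name, info in parsed.items():
    if name == "__metadata__":  # optional free-form string metadata
        continue
    print(name, info["dtype"], info["shape"])
```

For a file on disk, the same helper can be called on a stream opened with open("model.safetensors", "rb").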
Command Line Interface
onnx-safetensors provides a command-line interface for converting ONNX models to use the safetensors format:
# Basic conversion
onnx-safetensors convert input.onnx output.onnx
# Convert with sharding (split large models into multiple files)
onnx-safetensors convert input.onnx output.onnx --max-shard-size 5GB
# You can also specify size in MB
onnx-safetensors convert input.onnx output.onnx --max-shard-size 500MB
# Embed an ONNX model into a safetensors file
onnx-safetensors embed input.onnx output.safetensors
The convert command:
Loads an ONNX model from the input path
Saves it with safetensors external data to the output path
Optionally shards large models using --max-shard-size
Creates index files automatically when sharding is enabled
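As an illustration of what sharding by a maximum size involves, the sketch below parses a size string and greedily groups tensors into shards no larger than the limit. This is not the tool's actual algorithm: `parse_size`, `plan_shards`, the decimal interpretation of MB and GB, and the shard file naming are all assumptions made for the example.

```python
import re

# Assumption: decimal units (1 GB = 10**9 bytes); the CLI may interpret units differently.
UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9}


def parse_size(text: str) -> int:
    """Convert a string such as '500MB' or '5GB' to a byte count."""
    match = re.fullmatch(r"(\d+)\s*(KB|MB|GB)", text.strip().upper())
    if match is None:
        raise ValueError(f"Unrecognized size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def plan_shards(tensor_sizes: dict[str, int], max_shard_size: int) -> list[list[str]]:
    """Greedily pack tensors, in order, into shards of at most max_shard_size bytes.

    A tensor larger than the limit ends up in a shard of its own.
    """
    shards: list[list[str]] = []
    current: list[str] = []
    current_bytes = 0
    for name, nbytes in tensor_sizes.items():
        # Start a new shard when adding this tensor would exceed the limit.
        if current and current_bytes + nbytes > max_shard_size:
            shards.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += nbytes
    if current:
        shards.append(current)
    return shards


# Hypothetical tensor sizes in bytes.
sizes = {"a": 300_000_000, "b": 250_000_000, "c": 100_000_000, "d": 600_000_000}
shards = plan_shards(sizes, parse_size("500MB"))

# Map each tensor to a shard file, similar in spirit to an index file (naming is assumed).
weight_map = {
    name: f"model-{i + 1:05d}-of-{len(shards):05d}.safetensors"
    for i, shard in enumerate(shards)
    for name in shard
}
print(shards)
```

A greedy pass keeps tensors in their original order, which makes the resulting mapping easy to reproduce; a real implementation may balance shards differently.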
The embed command:
Loads an ONNX model from the input path
Embeds the entire model (structure and weights) into a single safetensors file
Useful for storage or transfer purposes
Use onnx_safetensors.extract_safetensors_model() in Python to extract the model later