zonnx Overview

zonnx is a standalone command-line tool that converts machine learning models to GGUF format. It accepts ONNX and SafeTensors inputs and produces portable GGUF files compatible with both the zerfoo runtime and llama.cpp.

zonnx ships as a single static binary with no CGo dependency.

Features

  • ONNX to GGUF conversion – convert decoder models (Llama, Gemma) from ONNX format
  • SafeTensors to GGUF conversion – convert encoder models (BERT, RoBERTa) from SafeTensors format
  • Optional quantization – quantize weights to Q4_0 or Q8_0 as part of the convert command (--quantize)
  • HuggingFace integration – download ONNX models and tokenizer files directly from the Hub
  • Model inspection – inspect ONNX and GGUF files for metadata, tensors, and structure
  • Architecture-aware mappings – tensor name and metadata mappings tuned per model family
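To make the quantization feature above more concrete, here is a minimal Go sketch of GGML-style Q8_0 block quantization: weights are split into blocks of 32, and each block stores one scale plus 32 signed 8-bit values. This is an illustration of the general technique, not zonnx's actual implementation; the names `q8Block`, `quantizeQ8`, and `dequantizeQ8` are hypothetical, and the scale is kept as a float32 for simplicity, whereas the on-disk Q8_0 layout stores it as a 16-bit float.

```go
package main

import (
	"fmt"
	"math"
)

// blockSize is the number of weights per Q8_0 block in GGML-style formats.
const blockSize = 32

// q8Block is a hypothetical in-memory form of one quantized block.
// Assumption: the scale is a float32 here; the on-disk format uses float16.
type q8Block struct {
	Scale float32
	Q     [blockSize]int8
}

// quantizeQ8 maps one block of weights to int8 values in [-127, 127],
// using scale = max(|w|) / 127.
func quantizeQ8(w [blockSize]float32) q8Block {
	var amax float64
	for _, v := range w {
		amax = math.Max(amax, math.Abs(float64(v)))
	}
	var b q8Block
	if amax == 0 {
		return b // all-zero block: scale 0, all quantized values 0
	}
	scale := amax / 127
	b.Scale = float32(scale)
	for i, v := range w {
		b.Q[i] = int8(math.Round(float64(v) / scale))
	}
	return b
}

// dequantizeQ8 reconstructs approximate weights from a block.
func dequantizeQ8(b q8Block) [blockSize]float32 {
	var out [blockSize]float32
	for i, q := range b.Q {
		out[i] = float32(q) * b.Scale
	}
	return out
}

func main() {
	var w [blockSize]float32
	for i := range w {
		w[i] = float32(i-16) / 16 // sample weights in [-1, 0.9375]
	}
	b := quantizeQ8(w)
	r := dequantizeQ8(b)
	var maxErr float64
	for i := range w {
		maxErr = math.Max(maxErr, math.Abs(float64(w[i]-r[i])))
	}
	fmt.Printf("scale=%g maxErr=%g\n", b.Scale, maxErr)
}
```

The round-trip error per weight is bounded by roughly half the block scale, which is why Q8_0 is usually close to lossless while Q4_0, with only 16 levels per block, trades more accuracy for size.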

Installation

Requires Go 1.26 or later. Install with:

go install github.com/zerfoo/zonnx/cmd/zonnx@latest

Or build from source:

git clone https://github.com/zerfoo/zonnx.git
cd zonnx
go build -o zonnx ./cmd/zonnx

CGo is not required – CGO_ENABLED=0 works.

Supported Architectures

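| Architecture | --arch value    | Input Formats     | Notes                        |
|--------------|-----------------|-------------------|------------------------------|
| Llama        | llama (default) | ONNX              | Llama 3, Code Llama          |
| Gemma        | gemma           | ONNX              | Gemma, Gemma 2, Gemma 3      |
| BERT         | bert            | ONNX, SafeTensors | Classification, embeddings   |
| RoBERTa      | roberta         | ONNX, SafeTensors | Same layer structure as BERT |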

Any architecture string can be passed via --arch. Generic metadata mapping applies to all architectures. Tensor name mapping currently covers Llama-style decoder models and BERT/RoBERTa encoder models.
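As an illustration of what tensor name mapping for a Llama-style decoder involves, the following Go sketch translates HuggingFace-style source names into the `blk.N.*` names that llama.cpp expects in GGUF files. The function `mapTensorName` and both lookup tables are hypothetical and are not taken from zonnx's source; the source-side names are an assumption as well, since ONNX exports may name their initializers differently.

```go
package main

import (
	"fmt"
	"regexp"
)

// layerRe matches per-layer tensor names such as
// "model.layers.12.self_attn.q_proj.weight".
var layerRe = regexp.MustCompile(`^model\.layers\.(\d+)\.(.+)$`)

// layerNames maps per-layer suffixes (assumed HuggingFace-style names)
// to the GGUF suffixes used by llama.cpp for Llama-style models.
var layerNames = map[string]string{
	"self_attn.q_proj.weight":         "attn_q.weight",
	"self_attn.k_proj.weight":         "attn_k.weight",
	"self_attn.v_proj.weight":         "attn_v.weight",
	"self_attn.o_proj.weight":         "attn_output.weight",
	"mlp.gate_proj.weight":            "ffn_gate.weight",
	"mlp.up_proj.weight":              "ffn_up.weight",
	"mlp.down_proj.weight":            "ffn_down.weight",
	"input_layernorm.weight":          "attn_norm.weight",
	"post_attention_layernorm.weight": "ffn_norm.weight",
}

// globalNames maps tensors that exist once per model.
var globalNames = map[string]string{
	"model.embed_tokens.weight": "token_embd.weight",
	"model.norm.weight":         "output_norm.weight",
	"lm_head.weight":            "output.weight",
}

// mapTensorName returns the GGUF name for a source tensor name,
// or false when the name is not covered by the tables above.
func mapTensorName(src string) (string, bool) {
	if dst, ok := globalNames[src]; ok {
		return dst, true
	}
	m := layerRe.FindStringSubmatch(src)
	if m == nil {
		return "", false
	}
	suffix, ok := layerNames[m[2]]
	if !ok {
		return "", false
	}
	return "blk." + m[1] + "." + suffix, true
}

func main() {
	names := []string{
		"model.embed_tokens.weight",
		"model.layers.0.self_attn.q_proj.weight",
		"model.layers.31.mlp.down_proj.weight",
		"model.layers.0.unknown.weight",
	}
	for _, name := range names {
		dst, ok := mapTensorName(name)
		fmt.Printf("%-42s -> %q (mapped=%t)\n", name, dst, ok)
	}
}
```

A table-driven design like this keeps each model family's mapping in data rather than code, so an unmapped tensor can be reported to the user instead of being silently dropped.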

Basic Usage

# Download an ONNX model from HuggingFace
zonnx download --model google/gemma-2-2b-it --output ./models

# Convert ONNX to GGUF
zonnx convert --arch gemma --output ./models/model.gguf ./models/model.onnx

# Convert SafeTensors to GGUF
zonnx convert --format safetensors --arch bert --output ./models/model.gguf ./models/bert-dir/

# Convert with quantization
zonnx convert --quantize q4_0 --output ./models/model-q4.gguf ./models/model.onnx

# Inspect a model file
zonnx inspect --pretty ./models/model.onnx

Commands

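| Command  | Description                                                   |
|----------|---------------------------------------------------------------|
| convert  | Convert ONNX or SafeTensors models to GGUF                    |
| download | Download ONNX models and tokenizer files from HuggingFace Hub |
| inspect  | Inspect ONNX or GGUF model files                              |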
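For a sense of what inspecting a GGUF file starts with, the sketch below parses the fixed 24-byte header defined by the GGUF format (version 2 and later): the 4-byte magic `GGUF`, a little-endian uint32 version, a uint64 tensor count, and a uint64 metadata key/value count. It is independent of zonnx and does not reproduce the output of `zonnx inspect`; the names `ggufHeader`, `parseGGUFHeader`, and `encodeGGUFHeader` are hypothetical, and the example runs on an in-memory header rather than a real model file.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// headerSize is the fixed GGUF header length: 4 (magic) + 4 + 8 + 8 bytes.
const headerSize = 24

// ggufHeader holds the fixed fields that follow the "GGUF" magic.
type ggufHeader struct {
	Version     uint32
	TensorCount uint64
	KVCount     uint64
}

// parseGGUFHeader decodes the header from the first bytes of a file.
func parseGGUFHeader(data []byte) (ggufHeader, error) {
	if len(data) < headerSize {
		return ggufHeader{}, errors.New("short input: need at least 24 bytes")
	}
	if string(data[:4]) != "GGUF" {
		return ggufHeader{}, fmt.Errorf("bad magic %q", data[:4])
	}
	return ggufHeader{
		Version:     binary.LittleEndian.Uint32(data[4:8]),
		TensorCount: binary.LittleEndian.Uint64(data[8:16]),
		KVCount:     binary.LittleEndian.Uint64(data[16:24]),
	}, nil
}

// encodeGGUFHeader builds a header in memory; it exists only so the
// example can run without a real model file.
func encodeGGUFHeader(h ggufHeader) []byte {
	buf := make([]byte, headerSize)
	copy(buf, "GGUF")
	binary.LittleEndian.PutUint32(buf[4:8], h.Version)
	binary.LittleEndian.PutUint64(buf[8:16], h.TensorCount)
	binary.LittleEndian.PutUint64(buf[16:24], h.KVCount)
	return buf
}

func main() {
	// Sample values chosen for illustration only.
	data := encodeGGUFHeader(ggufHeader{Version: 3, TensorCount: 291, KVCount: 24})
	h, err := parseGGUFHeader(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("version=%d tensors=%d metadata_kv=%d\n", h.Version, h.TensorCount, h.KVCount)
}
```

To check a converted file, the first 24 bytes read from disk (for example with `os.Open` and `io.ReadFull`) can be passed to `parseGGUFHeader` in place of the in-memory sample.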

Next Steps