Supertron-2.1-8B-A1B-GGUF

This repository contains GGUF builds for Surpem/Supertron-2.1-8B-A1B.

Files

File                              Type
Supertron-2.1-8B-A1B-F16.gguf     GGUF checkpoint
Supertron-2.1-8B-A1B-Q2_K.gguf    GGUF checkpoint
Supertron-2.1-8B-A1B-Q3_K_L.gguf  GGUF checkpoint
Supertron-2.1-8B-A1B-Q3_K_M.gguf  GGUF checkpoint
Supertron-2.1-8B-A1B-Q3_K_S.gguf  GGUF checkpoint
Supertron-2.1-8B-A1B-Q4_0.gguf    GGUF checkpoint
Supertron-2.1-8B-A1B-Q4_1.gguf    GGUF checkpoint
Supertron-2.1-8B-A1B-Q4_K_M.gguf  GGUF checkpoint
Supertron-2.1-8B-A1B-Q4_K_S.gguf  GGUF checkpoint
Supertron-2.1-8B-A1B-Q5_0.gguf    GGUF checkpoint
Supertron-2.1-8B-A1B-Q5_1.gguf    GGUF checkpoint
Supertron-2.1-8B-A1B-Q5_K_M.gguf  GGUF checkpoint
Supertron-2.1-8B-A1B-Q5_K_S.gguf  GGUF checkpoint
Supertron-2.1-8B-A1B-Q6_K.gguf    GGUF checkpoint
Supertron-2.1-8B-A1B-Q8_0.gguf    GGUF checkpoint

Usage

Use any of the .gguf files listed above with a GGUF-compatible runtime such as llama.cpp. For example, with llama.cpp's llama-cli:

llama-cli -m Supertron-2.1-8B-A1B-Q4_K_M.gguf -p "Explain LoRA fine-tuning in simple terms."

Choose Q4_K_M for a good balance of file size and output quality, Q5_K_M or Q6_K for higher quality at a larger size, and Q8_0 or F16 when you have memory to spare and want output closest to the original weights.
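
The recommendation above can be turned into a small shell helper. This is only a sketch: the function names `pick_quant` and `gguf_file` and the tier names `balanced`, `quality`, and `max` are hypothetical, not part of llama.cpp or this repository, and each tier picks just one of the quantizations suggested above.

```shell
# Hypothetical helpers for illustration; not part of llama.cpp or this repository.

MODEL_PREFIX="Supertron-2.1-8B-A1B"

# Map a preference tier (assumed names) to one of the recommended quantizations.
pick_quant() {
  case "$1" in
    balanced) echo "Q4_K_M" ;;  # good size/quality balance
    quality)  echo "Q6_K" ;;    # higher quality (Q5_K_M is the other option)
    max)      echo "Q8_0" ;;    # for machines with more memory (or use F16)
    *)        echo "unknown tier: $1" >&2; return 1 ;;
  esac
}

# Build the file name in this repository for a given quantization label.
gguf_file() {
  echo "${MODEL_PREFIX}-$1.gguf"
}
```

With these defined, the command from the Usage section could be written as `llama-cli -m "$(gguf_file "$(pick_quant balanced)")" -p "Explain LoRA fine-tuning in simple terms."`, assuming the file has already been downloaded to the current directory.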

Model details

Format: GGUF
Model size: 8B params
Architecture: lfm2moe