Instructions for using OpenLLM-Korea/kanana-1.5-15.7b-a3b-base with libraries and local apps.

Libraries

How to use OpenLLM-Korea/kanana-1.5-15.7b-a3b-base with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="OpenLLM-Korea/kanana-1.5-15.7b-a3b-base")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("OpenLLM-Korea/kanana-1.5-15.7b-a3b-base")
model = AutoModelForCausalLM.from_pretrained("OpenLLM-Korea/kanana-1.5-15.7b-a3b-base")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use OpenLLM-Korea/kanana-1.5-15.7b-a3b-base with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "OpenLLM-Korea/kanana-1.5-15.7b-a3b-base"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "OpenLLM-Korea/kanana-1.5-15.7b-a3b-base",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/OpenLLM-Korea/kanana-1.5-15.7b-a3b-base

SGLang

How to use OpenLLM-Korea/kanana-1.5-15.7b-a3b-base with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "OpenLLM-Korea/kanana-1.5-15.7b-a3b-base" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "OpenLLM-Korea/kanana-1.5-15.7b-a3b-base",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "OpenLLM-Korea/kanana-1.5-15.7b-a3b-base" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "OpenLLM-Korea/kanana-1.5-15.7b-a3b-base",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

🤗 1.5 HF Models | 📕 Kanana-1.5-15.7B-A3B Blog

News 🔥

✨2025/07/24: Published a blog post about Kanana-1.5-15.7B-A3B models and released 🤗HF model weights.
📕2025/05/23: Published a blog post about Kanana 1.5 models and released 🤗HF model weights.
📜2025/02/27: Released Technical Report and 🤗HF model weights.
📕2025/01/10: Published a blog post about the development of Kanana Nano model.
📕2024/11/14: Published blog posts (pre-training, post-training) about the development of Kanana models.
▶️2024/11/06: Published a presentation video about the development of the Kanana models.

Table of Contents

- Kanana-1.5-15.7B-A3B
  - Performance
    - Base Model Evaluation
    - Instruct Model Evaluation
- Contributors
- Citation
- Contact

Kanana-1.5-15.7B-A3B

Introducing Kanana-1.5-15.7B-A3B, the first Mixture-of-Experts (MoE) model in our Kanana family, engineered for exceptional efficiency and powerful performance. With its sparse architecture, Kanana-1.5-15.7B-A3B delivers capabilities comparable to the dense Kanana-1.5-8B model while using only 37% of the FLOPs per token, making it a highly inference-efficient and cost-effective solution for real-world applications. Furthermore, Kanana-1.5-15.7B-A3B is powered by our newly enhanced post-training strategy, which includes on-policy distillation followed by reinforcement learning.
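
As a rough sanity check on the 37% figure, the sketch below applies the common rule of thumb that a forward pass costs about 2 FLOPs per active parameter per token. The parameter counts are assumptions read off the model names ("A3B" taken as roughly 3B activated parameters, "8B" as roughly 8B dense parameters), not numbers stated in this card, and the estimate ignores attention cost and any other architectural differences.

```python
def forward_flops_per_token(active_params: float) -> float:
    """Rule-of-thumb forward cost: ~2 FLOPs (multiply + add) per active parameter."""
    return 2.0 * active_params


# Assumed parameter counts, inferred from the model names only.
moe_active_params = 3e9   # "A3B": ~3B parameters activated per token (assumption)
dense_params = 8e9        # Kanana-1.5-8B: ~8B dense parameters (assumption)

ratio = forward_flops_per_token(moe_active_params) / forward_flops_per_token(dense_params)
print(f"Estimated MoE/dense FLOPs per token: {ratio:.1%}")
```

Under these assumptions the estimate comes out at 37.5%, which is in line with the reported 37%; the exact figure depends on the true activated-parameter count.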

Neither the pre-training nor the post-training data includes Kakao user data.

Performance

Base Model Evaluation

| Models | MMLU | KMMLU | HAE-RAE | HumanEval | MBPP | GSM8K |
|---|---|---|---|---|---|---|
| Kanana-1.5-15.7B-A3B | 64.79 | 51.77 | 83.23 | 59.76 | 60.10 | 61.18 |
| Kanana-1.5-8B | 64.24 | 48.94 | 82.77 | 61.59 | 57.80 | 63.53 |
| Kanana-1.5-3B* | 59.23 | 47.30 | 78.00 | 46.34 | 46.80 | 61.79 |

Instruct Model Evaluation

| Models | MT-Bench | KoMT-Bench | IFEval | HumanEval+ | MBPP+ | GSM8K (0-shot) | MATH | MMLU (0-shot, CoT) | KMMLU (0-shot, CoT) |
|---|---|---|---|---|---|---|---|---|---|
| Kanana-1.5-15.7B-A3B | 7.67 | 7.24 | 73.35 | 79.27 | 70.37 | 83.02 | 66.42 | 68.55 | 48.92 |
| Kanana-1.5-8B | 7.76 | 7.63 | 80.11 | 76.83 | 67.99 | 87.64 | 67.54 | 68.82 | 48.28 |
| Kanana-1.5-3B* | 7.01 | 6.52 | 70.08 | 70.73 | 64.29 | 80.36 | 56.70 | 59.69 | 37.60 |

* This model is not open-sourced; it is included only for comparison with Kanana-1.5-15.7B-A3B.

Evaluation Protocol

Base Model Benchmarks
- MMLU, KMMLU, HAE-RAE: 5-shot, log-likelihood
- HumanEval: 0-shot, pass@1
- MBPP: 3-shot, pass@1
- GSM8K: 5-shot, exact-match (strict-match)
Instruct Model Benchmarks
- MT-Bench, KoMT-Bench: 0-shot, gpt-4o-2024-08-06 as judge model
- IFEval: 0-shot, mean of strict-prompt-level and strict-instruction-level
- HumanEval+, MBPP+: 0-shot, pass@1
- GSM8K, MATH: 0-shot, rule-based verification
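
To make the IFEval aggregation above concrete, here is a minimal sketch of how a "mean of strict-prompt-level and strict-instruction-level" score could be computed. The input layout (one list of booleans per prompt, one boolean per instruction) and the function name `ifeval_score` are hypothetical choices for illustration; this is not the team's evaluation harness.

```python
from typing import List


def ifeval_score(results: List[List[bool]]) -> float:
    """Mean of prompt-level and instruction-level strict accuracy, in percent.

    results[i][j] is True if instruction j of prompt i passed the strict check.
    """
    # Prompt-level: a prompt counts only if all of its instructions passed.
    prompt_level = sum(all(r) for r in results) / len(results)
    # Instruction-level: fraction of individual instructions that passed.
    total_instructions = sum(len(r) for r in results)
    instruction_level = sum(sum(r) for r in results) / total_instructions
    return 100.0 * (prompt_level + instruction_level) / 2.0


# Toy example: 3 prompts, 5 instructions in total.
example = [[True, True], [True, False], [False]]
print(round(ifeval_score(example), 2))
```

In the toy example, one of three prompts passes entirely (33.3%) and three of five instructions pass (60%), so the combined score is their mean.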

Quickstart

vLLM

`vllm>=0.8.5` is required to run Kanana models.

Example Usage for `Kanana-1.5-15.7B-A3B-Base`

vllm serve "$path_to_model" \
        --served-model-name kanana-1.5-15.7b-a3b-base \
        --max-model-len 32768 \
        --gpu-memory-utilization 0.9 \
        --port 8000 \
        --dtype auto \
        --disable-cascade-attn

curl http://localhost:8000/v1/completions -H "Content-Type: application/json" -d '{
    "model": "kanana-1.5-15.7b-a3b-base",
    "prompt": "Kakao is a leading company in South Korea, and it is known for ",
    "max_tokens": 32,
    "top_k": 1
}'

# Output:
'''
...
"choices":[{"index":0,"text":"1) its innovative technology, 2) its high-quality products, and 3) its strong brand image. The company has a long history of success,"...
...
'''
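
The same completion request can be sent from Python. The sketch below uses only the standard library and mirrors the curl call above; it assumes the server started by `vllm serve` is reachable at `http://localhost:8000`, and the helper names `build_completion_request` and `complete` are hypothetical, introduced here for illustration.

```python
import json
import urllib.request


def build_completion_request(
    prompt: str,
    model: str = "kanana-1.5-15.7b-a3b-base",
    max_tokens: int = 32,
    base_url: str = "http://localhost:8000",  # assumed local server address
) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "top_k": 1,  # same sampling setting as the curl example
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str, **kwargs) -> str:
    """Send the request and return the text of the first choice."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With the server running, `complete("Kakao is a leading company in South Korea, and it is known for ")` should return the continuation text. Because this is a base model, a plain-text prompt sent to the completions endpoint is used rather than chat messages.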

Contributors

Language Model Training
- Yunju Bak, Doohae Jung, Boseop Kim, Nayeon Kim, Hojin Lee, Jaesun Park, Minho Ryu, Jiyeon Ham, Seungjae Jung, Hyunho Kim, Hyunwoong Ko, Changmin Lee, Taegyeong Eo

Citation

@misc{kananallmteam2025kananacomputeefficientbilinguallanguage,
      title={Kanana: Compute-efficient Bilingual Language Models}, 
      author={Kanana LLM Team and Yunju Bak and Hojin Lee and Minho Ryu and Jiyeon Ham and Seungjae Jung and Daniel Wontae Nam and Taegyeong Eo and Donghun Lee and Doohae Jung and Boseop Kim and Nayeon Kim and Jaesun Park and Hyunho Kim and Hyunwoong Ko and Changmin Lee and Kyoung-Woon On and Seulye Baeg and Junrae Cho and Sunghee Jung and Jieun Kang and EungGyun Kim and Eunhwa Kim and Byeongil Ko and Daniel Lee and Minchul Lee and Miok Lee and Shinbok Lee and Gaeun Seo},
      year={2025},
      eprint={2502.18934},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.18934}, 
}

Contact

Kanana LLM Team Technical Support: kanana-llm@kakaocorp.com
Business & Partnership Contact: alpha.k@kakaocorp.com

Model size: 16B params (Safetensors); tensor type: BF16

Duplicated from kakaocorp/kanana-1.5-15.7b-a3b-base
