mteb/banking77
Fine-tuned DistilBERT on the Banking77 dataset for 77-class banking customer query intent classification.
| Metric | Score |
|---|---|
| Accuracy | 93.0% |
| Macro F1 | 93.0% |
Evaluated on 3,076 test examples across 77 classes.
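For readers who want to recompute the two reported metrics from their own predictions, the following is a minimal sketch in plain Python. The function name `accuracy_and_macro_f1` and the toy labels are illustrative assumptions, not part of this model's evaluation code, and the macro average here is taken over every label that appears in either the references or the predictions.

```python
from collections import Counter


def accuracy_and_macro_f1(y_true, y_pred):
    """Return (accuracy, macro F1) for two equal-length label sequences.

    Hypothetical helper for illustration; not the evaluation script
    used to produce the scores in this card.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for true_label, pred_label in zip(y_true, y_pred):
        if true_label == pred_label:
            tp[true_label] += 1
        else:
            fp[pred_label] += 1  # predicted class gains a false positive
            fn[true_label] += 1  # true class gains a false negative

    # Per-class F1 = 2*TP / (2*TP + FP + FN), averaged with equal class weight.
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        f1_scores.append(2 * tp[label] / denom if denom else 0.0)

    accuracy = sum(tp.values()) / len(y_true)
    macro_f1 = sum(f1_scores) / len(f1_scores)
    return accuracy, macro_f1


# Toy example with made-up labels (not real model output).
acc, f1 = accuracy_and_macro_f1(["a", "a", "b", "c"], ["a", "b", "b", "c"])
print(f"accuracy={acc:.3f} macro_f1={f1:.3f}")
```

Because macro F1 weights all classes equally, it can diverge from accuracy when some intents have far fewer test examples than others.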
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("khaze0911/banking77-distilbert")
model = AutoModelForSequenceClassification.from_pretrained("khaze0911/banking77-distilbert")

inputs = tokenizer("How do I freeze my card?", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

predicted_class = model.config.id2label[logits.argmax().item()]
print(predicted_class)  # → "freeze_account" or similar
```
The most frequent confusion pairs in the test set (why_verify_identity ↔ verify_my_identity, pending_transfer ↔ transfer_not_received_by_recipient) appear to reflect labeling ambiguities in the source dataset rather than model failures.
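One way to surface confusion pairs like these is to count misclassified examples per unordered pair of labels. The sketch below assumes you already have lists of reference and predicted label names; the function name `top_confusion_pairs` and the toy data are hypothetical and do not reproduce the counts behind this card.

```python
from collections import Counter


def top_confusion_pairs(y_true, y_pred, k=5):
    """Return the k most frequent unordered (label_a, label_b) confusions.

    Hypothetical helper for illustration. A mistake in either direction
    (a predicted as b, or b predicted as a) counts toward the same pair.
    """
    pair_counts = Counter()
    for true_label, pred_label in zip(y_true, y_pred):
        if true_label != pred_label:
            pair_counts[frozenset((true_label, pred_label))] += 1
    return [(tuple(sorted(pair)), count) for pair, count in pair_counts.most_common(k)]


# Toy example with made-up predictions (not real model output).
y_true = ["why_verify_identity", "verify_my_identity", "pending_transfer", "pending_transfer"]
y_pred = ["verify_my_identity", "why_verify_identity",
          "transfer_not_received_by_recipient", "pending_transfer"]
for pair, count in top_confusion_pairs(y_true, y_pred):
    print(pair, count)
```

Reading a sample of the test examples behind the top pairs is then a quick way to judge whether the errors look like genuine label ambiguity.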
Base model
distilbert/distilbert-base-uncased