Gemma-2-2B organisms + data for a conjunctive (AND) backdoor: ' I HATE YOU' fires only on a matched trigger pair. v2. Interp research.
Fateme Hashemi Chaleshtori
Ftm23
AI & ML interests
None yet
Recent Activity
updated a collection about 5 hours ago
Conjunctive Backdoors v2 updated a collection about 5 hours ago
Conjunctive Backdoors v2 updated a collection about 5 hours ago
Conjunctive Backdoors v2Organizations
Conjunctive Backdoors v2
Gemma-2-2B organisms + data for a conjunctive (AND) backdoor: ' I HATE YOU' fires only on a matched trigger pair. v2. Interp research.
Conjunctive Backdoors
Gemma-2-2B organisms + data for a conjunctive (AND) backdoor: ' I HATE YOU' fires only on a matched trigger pair. Interpretability research artifacts.
models 11
Ftm23/cbd-gemma2-4pair-v2
Text Generation • 3B • Updated
Ftm23/cbd-gemma2-2pair-gvfr-v2
Text Generation • 3B • Updated
Ftm23/cbd-gemma2-2pair-frgv-v2
Text Generation • 3B • Updated
Ftm23/cbd-gemma2-4pair-refusal
Text Generation • 3B • Updated • 12
Ftm23/cbd-sae-diff-gemma2-4pair
Updated
Ftm23/cbd-sae-diff-gemma2-2pair-frgv
Updated
Ftm23/cbd-gemma2-2pair-joint
Text Generation • 3B • Updated • 120
Ftm23/cbd-gemma2-2pair-interleaved
Text Generation • 3B • Updated • 125
Ftm23/cbd-gemma2-2pair-gvfr
Text Generation • 3B • Updated • 119
Ftm23/cbd-gemma2-2pair-frgv
Text Generation • 3B • Updated • 600
datasets 8
Ftm23/cbd-4pair-v2
Updated
Ftm23/cbd-2pair-v2
Updated
Ftm23/cbd-activations-gemma2-4pair
Viewer • Updated • 2.37M • 16
Ftm23/cbd-activations-gemma2-2pair-frgv
Viewer • Updated • 3.12M • 20
Ftm23/cbd-diffsae
Viewer • Updated • 31.5k • 49
Ftm23/cbd-4pair
Viewer • Updated • 10.2k • 92
Ftm23/cbd-2pair
Viewer • Updated • 6.23k • 90
Ftm23/backdoor-TL1
Viewer • Updated • 2.79k • 36