Yunjae Won
yunjae-won
AI & ML interests
None yet
Recent Activity
updated a model 5 days ago
yunjae-won/4b_noclip_default_KLinefficiency_direct_ReWeight_Warmup_adaKL_reg2_step150
published a model 5 days ago
yunjae-won/4b_noclip_default_KLinefficiency_direct_ReWeight_Warmup_adaKL_reg2_step150
updated a model 5 days ago
yunjae-won/4b_noclip_default_KLinefficiency_direct_ReWeight_Warmup_adaKL_reg2_step125
Organizations
dpo-info-loss
- yunjae-won/mpq3_qwen4bi_sft
  Text Generation • 4B • Updated • 5
- yunjae-won/mpq3_qwen4bi_sft_dpo_beta1e-1_step256
  Text Generation • 4B • Updated • 5
- yunjae-won/mpq3_qwen4bi_sft_dpo_beta1e-1_step512
  Text Generation • 4B • Updated • 11
- yunjae-won/mpq3_qwen4bi_sft_dpo_beta1e-1_step768
  Text Generation • 4B • Updated • 7
On-Policy Distillation Analysis
models 376
yunjae-won/4b_noclip_default_KLinefficiency_direct_ReWeight_Warmup_adaKL_reg2_step150
Text Generation • 4B • Updated • 26
yunjae-won/4b_noclip_default_KLinefficiency_direct_ReWeight_Warmup_adaKL_reg2_step125
Text Generation • 4B • Updated • 24
yunjae-won/4b_noclip_default_KLinefficiency_direct_ReWeight_Warmup_adaKL_reg2_step100
Text Generation • 4B • Updated • 24
yunjae-won/4b_noclip_default_KLinefficiency_direct_ReWeight_Warmup_adaKL_reg2_step75
Text Generation • 4B • Updated • 23
yunjae-won/4b_noclip_default_KLinefficiency_direct_ReWeight_Warmup_adaKL_reg2_step50
Text Generation • 4B • Updated • 23
yunjae-won/4b_noclip_default_KLinefficiency_direct_ReWeight_Warmup_adaKL_reg2_step25
Text Generation • 4B • Updated • 25
yunjae-won/4b_noclip_default_ReWeight_Warmup_staticKL_reg1_step150
Text Generation • 4B • Updated • 25
yunjae-won/4b_noclip_default_KLinefficiency_direct_ReWeight_Warmup_adaKL_reg1_step150
Text Generation • 4B • Updated • 25
yunjae-won/4b_noclip_default_ReWeight_Warmup_staticKL_reg1_step125
Text Generation • 4B • Updated • 21
yunjae-won/4b_noclip_default_KLinefficiency_direct_ReWeight_Warmup_adaKL_reg1_step125
Text Generation • 4B • Updated • 24
datasets 322
yunjae-won/dpo-misalignment-qwen4b-experiment-artifacts
Viewer • Updated • 12 • 8
yunjae-won/trl-ultrafeedback-qwen3-30bi-vs-4bi
Viewer • Updated • 60.9k • 60
yunjae-won/mpr-code-qwen3-30b
Viewer • Updated • 66.2k • 15
yunjae-won/mpr-math-qwen3-1.7b
Viewer • Updated • 74.2k • 14
yunjae-won/mpr-math-qwen3-30b
Viewer • Updated • 74.2k • 21
yunjae-won/mpr-code-qwen3-4b
Viewer • Updated • 66.2k • 37
yunjae-won/mpr-math-qwen3-4b
Viewer • Updated • 74.2k • 14
yunjae-won/ub-qwen3-1.7b
Viewer • Updated • 60.9k • 13
yunjae-won/mp-qwen3-1.7b
Viewer • Updated • 100k • 39
yunjae-won/evol-qwen3-1.7b
Viewer • Updated • 78.3k • 13