PaceWang

hill2hill

AI & ML interests

None yet

Organizations

None yet

upvoted a paper 11 months ago

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Paper • 2507.01006 • Published Jul 1, 2025 • 257

upvoted 3 articles about 1 year ago

You could have designed state of the art positional encoding

Article • FL33TW00D-HF • Nov 25, 2024 • 490

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Article • NormalUhr • Feb 7, 2025 • 295

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment

Article • NormalUhr • Feb 11, 2025 • 126

upvoted a paper over 2 years ago

Getting it Right: Improving Spatial Consistency in Text-to-Image Models

Paper • 2404.01197 • Published Apr 1, 2024 • 31