LMMs-Lab

LLaVA-Critic-R1

A unified critic and policy model trained through reinforcement learning, achieving state-of-the-art (SoTA) policy performance at the 7B scale.