
LLaVA-Critic-R1
A unified critic and policy model trained through reinforcement learning, achieving state-of-the-art (SoTA) policy performance at the 7B scale.

About Us
LMMs-Lab is a non-profit, research-oriented organization. We are a group of researchers who share a sincere passion for developing multimodal intelligence.