External Publication

Looking for arXiv endorsement (cs.LG) – RL fine-tuning for VLMs (GRPO, MathVista)

Hugging Face Forums [Unofficial] April 4, 2026

Hi everyone,

I am seeking an arXiv endorsement for cs.LG (Machine Learning) to submit my first paper on RL fine-tuning for vision-language models.

Background: MS in AI (Purdue), working on RL + VLM training systems.

Paper: A Case Study of Staged Metric-Gated GRPO for Visual Numeric Reasoning
PDF: https://github.com/kgaero/RL_GSPO_Qwen2.5VLM/blob/main/paper/staged_metric_gated_grpo.pdf

Short summary:

Main result: Exact-match accuracy improves from 0.375 to 0.75 with stable structure under constrained compute.

If you’re eligible to endorse for cs.LG or a related category, I’d greatly appreciate your help. Happy to share endorsement details via DM.

Thanks!