Biography

I am an AI researcher who recently earned a Ph.D. (Cum Laude, with International Mention) in Automatic Control and Robotics from the Universidad Politécnica de Madrid (UPM), under the supervision of Prof. Luis Fernando D'Haro and Prof. Fernando Matía. My doctoral research addressed the challenges of aligning large language models and multimodal agents with human intent, focusing on robust post-training recipes, reward modeling strategies, and generative agent reasoning.

As a researcher at the Speech Technology and Machine Learning Group (UPM) and the Intelligent Control Group at the Center for Automation and Robotics (CAR UPM-CSIC), I bridge the gap between theoretical algorithmic design and real-world system deployment. I specialize in Reinforcement Learning from AI Feedback (RLAIF) using PPO and DPO training strategies. Notably, during my time as a Visiting Research Scholar at the Institute for Creative Technologies at USC, I applied the "Chain-of-Emotion" prompting method to elicit structured emotional responses.

I am passionate about robust AI evaluation and large-scale deployment. I served as Main Organizer of DSTC11 Track 4, where I coordinated international teams and curated over 3 million dialogue turns to benchmark metric robustness across multiple languages. I was also selected twice to participate in the Amazon Alexa Prize Socialbot Grand Challenge, where I helped build and deploy open-domain socialbots that served real-world users in latency-critical production environments.

As I plan the next step in my career, I am highly motivated to pivot my research towards what I consider the most critical frontier in AI: inference-time scaling for System 2 reasoning. Building upon my prior work with multi-objective reward models and Mixture-of-Experts (MoE), I want to dive deep into the intersection of sparse models, expert routing dynamics, Monte Carlo Tree Search (MCTS), Process Reward Models (PRMs), and Group Relative Policy Optimization (GRPO).

My résumé can be found here.

Contact

For any questions or comments, feel free to contact me by email:
mario.rcantelar@gmail.com.
