About

I am a Ph.D. student in Electrical and Computer Engineering at Virginia Tech, advised by Prof. Ruoxi Jia in the Responsible Data Science Lab. I previously completed my M.S. at Virginia Tech and my B.Tech. at the Indian Institute of Information Technology, Una.

My research focuses on AI safety and alignment: red-teaming and jailbreak robustness, AI agent safety, and reasoning under data-efficient training regimes. I am broadly interested in understanding and mitigating failure modes in large language models and agentic systems.

Feel free to reach out at mahavirdabas18@vt.edu.

News

  • Jan 2026: Our paper Adversarial Déjà Vu was accepted to ICLR 2026!
  • 2025: Selected as a participant in the Amazon Nova AI Challenge (top 10 teams globally).
  • May 2025: Started my Ph.D. at Virginia Tech.
  • May 2025: Just Enough Shifts (ACTOR) was accepted to ICML 2025.

Publications

* denotes equal contribution.

Adversarial Déjà Vu: Jailbreak Dictionary Learning for Stronger Generalization to Unseen Attacks

Mahavir Dabas, T. Huynh, N. R. Billa, J. T. Wang, P. Gao, C. Peris, Y. Ma, R. Gupta, M. Jin, P. Mittal, R. Jia

ICLR 2026

Memory-Induced Tool-Drift in LLM Agents

Mahavir Dabas, J. Jeong, M. Jin, R. Jia

ICML 2026 AIWILD Workshop

Just Enough Shifts: Mitigating Over-Refusal in Aligned Language Models with Targeted Representation Fine-Tuning

Mahavir Dabas, S. Chen, C. Fleming, M. Jin, R. Jia

ICML 2025

Characterizing Model-Native Skills

F. Kang, Mahavir Dabas, M. Ko, R. Jia

Preprint, Apr 2026

Can Generalist Agents Automate Data Curation?

F. Kang, H. Li, A. Nguyen, Mahavir Dabas, J. W. Ma, F. Sala, D. Song, R. Jia

Preprint, Jun 2026

Academic Service

Reviewer: ICML 2025 · NeurIPS 2025 · EMNLP 2025 · ICLR 2026 · ICML 2026

Awards & Honors

  • Amazon Nova AI Challenge 2025 (top 10 teams globally).
  • Graduate Student Fellowship, Bradley Dept. of ECE, Virginia Tech.
  • Institute Silver Medalist, IIIT Una.