CV
Research Statement
-
My research spans the full foundation model development lifecycle — pretraining, mid-training, post-training, and reinforcement learning — with a focus on Data-Centric AI, Reasoning, and Mixture-of-Experts architectures.
Specific problems I work on include pretraining and post-training of LLMs, Mixture-of-Experts (MoE) models, reinforcement learning and alignment, data-centric AI and data mixture optimization, synthetic data generation, scaling laws and model evaluation, LLM reliability and hallucination, efficient LLM decoding, and agentic systems and tool use.
My long-term research vision is to make foundation models learn more effectively from data, feedback, and interaction, rather than relying solely on increasing model size or compute. I pursue this through scalable training methodologies, teacher-guided RL, and taxonomy-driven data curation, validated across large-scale foundation models.
Publication Venues
-
ACL
2022 (4 Conference + 1 Workshop), 2023 (2 Conference + 1 Workshop), 2024 (3 Conference)
-
EMNLP
2022 (2 Conference), 2023 (1 Conference), 2024 (1 Conference)
-
EACL
2023 (1 Conference)
-
NAACL
2022 (1 Workshop), 2024 (1 Conference)
-
AAAI
2022 (1 Workshop), 2023 (1 Workshop)
-
AAMAS
2023 (1 Conference)
Research Interests
-
Pretraining, Mid-Training & Post-Training of LLMs
Reinforcement Learning & Alignment
Data-Centric AI & Data Mixture Optimization
Synthetic Data Generation
Mixture-of-Experts (MoE)
Scaling Laws & Model Evaluation
LLM Reliability, Hallucination & Abstention
Agentic Systems & Tool Use
Efficient LLM Decoding & Inference
Reasoning
Retrieval Augmented Inference
LLM Defense Strategies
Selective Prediction
Technical Skills
-
Languages
Python, C++, SQL, Bash
-
ML / DL
PyTorch, PyTorch Lightning, Hugging Face Transformers, TRL
-
Distributed Training
Slime, Megatron-LM
-
Inference / Serving
vLLM, SGLang
-
Data Processing
PySpark, Hugging Face Datasets, Pandas
-
Experiment Tracking
Weights & Biases, MLflow
Education
-
2019-2024
Ph.D. in Computer Science
Arizona State University, Tempe, AZ, USA
- Advisor: Dr. Chitta Baral
- CGPA: 4.0/4.0
- Outstanding CS PhD Graduating Student Award for 2023-2024
-
2014-18
B.E. (Hons) Computer Science
BITS Pilani, Pilani Campus, Pilani, India
- CGPA: 9.11/10 (with Distinction)
- Experience: ‘Web Intelligence & Social Computing’ research lab under Prof. Poonam Goyal, CEERI research lab under Dr. J.L. Raheja.
Work Experience
-
2024-Present
Senior Applied Scientist
Amazon, Palo Alto, California
- Core contributor to three generations of large-scale Mixture-of-Experts foundation models trained on 40T+ tokens, ranging up to 800B+ total parameters.
- Owned the data curation and quality assessment stack end-to-end (quality filtering, deduplication, topic modeling, taxonomy development, and data-mixture optimization), enabling models trained on the curated data to outperform models trained on top-tier public datasets such as Nemotron-CC, FineWeb, and DCLM on knowledge, reasoning, and coding benchmarks.
- Designed a multi-dimensional taxonomy curation framework spanning 14 orthogonal document-quality dimensions; resulting filters recover high-value content from deprioritized web data tiers and surpass top-tier data on reasoning and coding benchmarks.
- Led improvements across code, math, and multilingual data tracks — including code-specific quality classifiers and synthetic data generation for reasoning.
- Developing reinforcement learning approaches for improving reasoning and learnability of foundation models, leveraging privileged-information and teacher-guided training signals.
- Received organization-wide internal performance award for high-impact contributions to foundation-model training data quality and downstream model performance.
-
Summer 2023
NLP Research Intern
Tencent AI
- Detecting and Mitigating Hallucinations of Large Language Models
-
Summer 2022
Applied Scientist Intern
Amazon Science
- Web Question Answering Leveraging Information Retrieval for Alexa AI
-
2018-19
Software Engineer
Microsoft
- Contributed to the development of a machine-learning-driven chat recommendation system aimed at increasing user engagement with Microsoft Teams.
- Collaborated with MSR researchers on 'Intelligent Feeds', a feature that finds relevant messages for users based on their prior activities and message text features.
-
Summer 2017
Research Intern
Samsung R&D Institute
- Developed a 'context prediction' application incorporating features based on device events (e.g., app usage, location) and sensor data (proximity sensor).
Honors and Awards
-
Industry
-
2025
Amazon Internal Performance Award — for high-impact contributions to foundation-model training data quality and downstream model performance.
-
Academic
-
2024
Outstanding CS PhD Graduating Student Award for 2023-2024 at Arizona State University
-
2023
Outstanding Reviewer for EACL’23 (Question Answering track)
-
2023
Outstanding Research Award, GPSA ASU, 2023
-
2023
SCAI Doctoral Fellowship, ASU, 2023
-
2023, 2024
ASU Jumpstart Research Grant, 2023 and 2024
Books
-
2024
Advances in Multimodal Information Retrieval and Generation
- Springer, Synthesis Lectures on Computer Vision (SLCV).
- Authors: Man Luo, Tejas Gokhale, Neeraj Varshney, Yezhou Yang, Chitta Baral.
- A comprehensive treatment of Transformer-based multimodal retrieval, generation, and retrieval-augmented generation across vision and language.
Service
-
Area Chair
ACL Rolling Review
-
Reviewer
ACL, EMNLP, EACL (Outstanding Reviewer), Computational Linguistics Journal, COLM, AMLC, CVPR Workshop
-
Outreach
Author of 20+ ML/NLP articles on Medium with 100K+ cumulative views; mentored several PhD interns and supported multiple co-authored publications.
Collaborators
-
Bing Yin (Senior Manager of Applied Science at Amazon)
-
Nasser Zalmout (Applied Scientist at Amazon)
-
Binxuan Huang (Applied Scientist at Amazon)
-
Jianshu Chen (Principal Scientist at Amazon)
-
Hongming Zhang (Senior Research Scientist at Tencent AI)
-
Wenlin Yao (Senior Research Scientist at Tencent AI)
-
Dong Yu (Distinguished Scientist at Tencent AI)
-
Swaroop Mishra (Research Scientist at Google Brain)
-
Tejas Gokhale (Assistant Professor at University of Maryland, Baltimore County)
-
Arindam Mitra (Data and Applied Scientist at Microsoft Research)
-
Daniel Khashabi (Allen AI, Assistant Professor at Johns Hopkins University)
-
Bing Liu (Professor at University of Illinois at Chicago)
-
Pratyay Banerjee (Applied Scientist at Alexa AI, Amazon Science)
-
Eric Robertson (PAR Government)
-
Ashwin Kalyan (Allen AI)
-
Yizhong Wang (Allen AI, University of Washington)
-
Rik Koncel-Kedziorski (Alexa AI)
-
Kuntal Pal (Applied AI ML Senior Associate at JPMorgan Chase & Co.)
-
Man Luo (Research Fellow at Mayo Clinic)
-
Mihir Parmar (ASU)