Neeraj Varshney

Research Statement

My research spans the full foundation model development lifecycle — pretraining, mid-training, post-training, and reinforcement learning — with a focus on Data-Centric AI, Reasoning, and Mixture-of-Experts architectures.

Specifically, I focus on problems such as Pretraining and Post-Training of LLMs, Mixture-of-Experts (MoE) models, Reinforcement Learning & Alignment, Data-Centric AI & Data Mixture Optimization, Synthetic Data Generation, Scaling Laws & Model Evaluation, LLM Reliability & Hallucination, Efficient LLM Decoding, and Agentic Systems & Tool Use.

My long-term research vision is to make foundation models learn more effectively from data, feedback, and interaction, rather than relying solely on increasing model size or compute. I pursue this through scalable training methodologies, teacher-guided RL, and taxonomy-driven data curation, validated across large-scale foundation models.

Publication Venues

ACL

2022 (4 Conference + 1 Workshop)
2023 (2 Conference + 1 Workshop)
2024 (3 Conference)
EMNLP

2022 (2 Conference)
2023 (1 Conference)
2024 (1 Conference)
EACL

2023 (1 Conference)
NAACL

2024 (1 Conference)
2022 (1 Workshop)
AAAI

2022 (1 Workshop)
2023 (1 Workshop)
AAMAS

2023 (1 Conference)

Research Interests

Pretraining, Mid-Training & Post-Training of LLMs
Reinforcement Learning & Alignment
Data-Centric AI & Data Mixture Optimization
Synthetic Data Generation
Mixture-of-Experts (MoE)
Scaling Laws & Model Evaluation
LLM Reliability, Hallucination & Abstention
Agentic Systems & Tool Use
Efficient LLM Decoding & Inference
Reasoning
Retrieval Augmented Inference
LLM Defense Strategies
Selective Prediction

Technical Skills

Languages

Python, C++, SQL, Bash
ML / DL

PyTorch, PyTorch Lightning, Hugging Face Transformers, TRL
Distributed Training

Slime, Megatron-LM
Inference / Serving

vLLM, SGLang
Data Processing

PySpark, Hugging Face Datasets, Pandas
Experiment Tracking

Weights & Biases, MLflow

Education

2019-2024
Ph.D. in Computer Science
Arizona State University — Tempe, AZ, USA
- Advisor: Dr. Chitta Baral
- CGPA: 4.0/4.0
- Outstanding CS PhD Graduating Student Award for 2023-2024
2014-18
B.E. (Hons) Computer Science
BITS Pilani, Pilani Campus — Pilani, India
- CGPA: 9.11/10 (with Distinction)
- Experience: ‘Web Intelligence & Social Computing’ research lab under Prof. Poonam Goyal, CEERI research lab under Dr. J.L. Raheja.

Work Experience

2024 - Present
Senior Applied Scientist
Amazon — Palo Alto, California
- Core contributor to three generations of large-scale Mixture-of-Experts foundation models trained on 40T+ tokens, ranging up to 800B+ total parameters.
- Owned the data curation and quality assessment stack end-to-end (quality filtering, deduplication, topic modeling, taxonomy development, and data-mixture optimization), enabling models trained on the curated data to outperform models trained on top-tier public datasets such as Nemotron-CC, FineWeb, and DCLM on knowledge, reasoning, and coding benchmarks.
- Designed a multi-dimensional taxonomy curation framework spanning 14 orthogonal document-quality dimensions; resulting filters recover high-value content from deprioritized web data tiers and surpass top-tier data on reasoning and coding benchmarks.
- Led improvements across code, math, and multilingual data tracks — including code-specific quality classifiers and synthetic data generation for reasoning.
- Developing reinforcement learning approaches for improving reasoning and learnability of foundation models, leveraging privileged-information and teacher-guided training signals.
- Received an organization-wide internal performance award for high-impact contributions to foundation-model training data quality and downstream model performance.
Summer
2023
NLP Research Intern
Tencent AI
- Detecting and Mitigating Hallucinations of Large Language Models
Summer
2022
Applied Scientist Intern
Amazon Science
- Web Question Answering Leveraging Information Retrieval for Alexa AI
2018-19
Software Engineer
Microsoft
- Contributed to the development of a machine-learning-driven chat recommendation system aimed at increasing user engagement with Microsoft Teams.
- Collaborated with MSR researchers on a feature titled 'Intelligent Feeds' that finds relevant messages for users based on their prior activities and message text features.
Summer
2017
Research Intern
Samsung R&D Institute
- Orchestrated a 'context prediction' application incorporating features based on device events (e.g., app usage, location) and sensor data (e.g., proximity sensor).

Honors and Awards

Industry
2025

Amazon Internal Performance Award — for high-impact contributions to foundation-model training data quality and downstream model performance.
Academic
2024

Outstanding CS PhD Graduating Student Award for 2023-2024 at Arizona State University
2023

Outstanding Reviewer for EACL’23 (Question Answering track)
2023

Outstanding Research Award, GPSA ASU, 2023
2023

SCAI Doctoral Fellowship, ASU, 2023
2023, 2024

ASU Jumpstart Research Grant, 2023 and 2024

Books

2024
Advances in Multimodal Information Retrieval and Generation
- Springer, Synthesis Lectures on Computer Vision (SLCV).
- Authors: Man Luo, Tejas Gokhale, Neeraj Varshney, Yezhou Yang, Chitta Baral.
- A comprehensive treatment of Transformer-based multimodal retrieval, generation, and retrieval-augmented generation across vision and language.

Service

Area Chair

ACL Rolling Review
Reviewer

ACL, EMNLP, EACL (Outstanding Reviewer), Computational Linguistics Journal, COLM, AMLC, CVPR Workshop
Outreach

Author of 20+ ML/NLP articles on Medium with 100K+ cumulative views; mentored several PhD interns and supported multiple co-authored publications.

Collaborators

Bing Yin (Senior Manager of Applied Science at Amazon)
Nasser Zalmout (Applied Scientist at Amazon)
Binxuan Huang (Applied Scientist at Amazon)
Jianshu Chen (Principal Scientist at Amazon)
Hongming Zhang (Senior Research Scientist at Tencent AI)
Wenlin Yao (Senior Research Scientist at Tencent AI)
Dong Yu (Distinguished Scientist at Tencent AI)
Swaroop Mishra (Research Scientist at Google Brain)
Tejas Gokhale (Assistant Professor at University of Maryland, Baltimore County)
Arindam Mitra (Data and Applied Scientist at Microsoft Research)
Daniel Khashabi (Allen AI, Assistant Professor at Johns Hopkins University)
Bing Liu (Professor at University of Illinois at Chicago)
Pratyay Banerjee (Applied Scientist at Alexa AI, Amazon Science)
Eric Robertson (PAR Government)
Ashwin Kalyan (Allen AI)
Yizhong Wang (Allen AI, University of Washington)
Rik Koncel-Kedziorski (Alexa AI)
Kuntal Pal (Applied AI ML Senior Associate at JPMorgan Chase & Co.)
Man Luo (Research Fellow at Mayo Clinic)
Mihir Parmar (ASU)

