Hi, I’m Shivam Dwivedi, Ph.D.

Bridging Linguistics & Machine Intelligence

About Me

As an NLP Solutions Consultant and Ph.D. researcher, I bridge the gap between deep linguistic theory and scalable machine intelligence. Drawing on a foundation across the core linguistic disciplines, I design large-scale human evaluation programs and high-fidelity training data for models at Google and Amazon AGI. My core strength lies in translating the complexities of human language into actionable, data-driven frameworks. My work helps ensure that Large Language Models are not only performant and context-aware, but also safe, fair, and rigorously evaluated for Responsible AI compliance.

Experience

NLP Solutions Consultant

Google India, Bengaluru | Nov 2025 – Present

Drive Apps Quality initiatives across the Google Play ecosystem by designing and operationalizing large-scale human evaluation programs. Partner with cross-functional engineering and product teams to launch new LLM-powered capabilities. Generate high-fidelity training and evaluation data that improves Search, Ranking, and Recommendation systems, enhancing app discovery and the user experience.

Language Engineer

AGI Amazon Development Centre, India | May 2022 – Nov 2025

Drove the linguistic refinement and operational scaling behind the launch of Amazon’s Nova and Titan foundation models. Directed end-to-end data generation workflows and established stringent quality strategies for labeling experts, significantly improving throughput for LLM and legacy Alexa pipelines. Architected custom quality metrics to ensure Responsible AI compliance, align model outputs with core business objectives, and drive automation that improved end-user experience and revenue.

Senior Research Fellow & TA

IIT BHU, Varanasi | Jul 2018 – May 2022

Carried out data collection, data augmentation, and ML evaluation to build a novel binary stammering classifier, improving the ability of existing ASR systems to handle impaired Hindi speech. Mentored undergraduate NLP projects in phonology and morphology, connecting theoretical linguistic frameworks to practical computational applications.

Language Engineer

Language Custodians | Oct 2017 – Jun 2018

Led a cross-functional team of linguists to bootstrap NLP resources for scheduled and low-resource Indian languages. Improved the quality of training datasets by applying syntactic and semantic frameworks to large-scale transcription, translation, and transcreation projects.

Education

Ph.D. in Computational Linguistics (NLP)

Indian Institute of Technology (BHU) | CGPA: 9.6

Dissertation: A Computational Linguistic Study of Stammering in Hindi Speakers.
Developed a binary stammering classifier to advance ASR systems and created an annotated Impaired Speech Corpus.

Master’s in Computational Linguistics

BHU, Varanasi | CGPA: 9.2

Awarded the BHU Gold Medal 2017 for best postgraduate performance.

Bachelor’s in Linguistics

BHU, Varanasi | CGPA: 9.0

Awarded the BHU Gold Medal 2015 for best undergraduate performance.

Selected Publications

arXiv (2026)

CORE: Comprehensive Ontological Relation Evaluation for Large Language Models.

Springer Nature Computer Science (2024)

Navigating Linguistic Diversity: In-Context Learning and Prompt Engineering for Subjectivity Analysis in Low-Resource Languages.

Rupkatha Journal (2023)

Breaking the Bias: Gender Fairness in LLMs Using Prompt Engineering and In-Context Learning.

International Journal of Speech Technology (2023)

Binary classifier for identification of stammering instances in Hindi speech data.

Springer Nature Computer Science (2021)

Developing Hindi Stammering Corpus: Framework and Insights.

Technical Toolkit

Python, Human Computation, Linguistics, Automation, SQL, Hugging Face, TensorFlow, scikit-learn, spaCy, NLTK, PySpark, Praat, Prompt Engineering, LLM Evaluation, Data Analysis

Languages: Kannauji, Hindi, English

Get In Touch

Interested in collaborating on NLP, LLMs, or AI initiatives? Drop me a message below!