Hi, I’m Utkarsh
I’m a Principal Applied Scientist at Microsoft, where I design the technical roadmap for Word Copilot and build agentic AI systems that make document editing intelligent, controllable, and collaborative.
My work sits at the intersection of LLM research, product engineering, and evaluation science, translating cutting-edge ideas in multi-agent coordination, fine-tuning, and tool-calling into production-ready systems used by millions.
What I Work On
At Microsoft, I lead initiatives shaping the next generation of Word Copilot:
- Agentic AI for Word: architecting multi-agent orchestration systems that coordinate reasoning, planning, and editing inside the Word canvas.
- Vibe Writing Framework: an HTML DOM-based editing engine that performs precise insert/update/delete operations at node-level granularity, powering Word Copilot's real-time document editing.
- LLM Evaluation Science: creator of POET, an offline prompt A/B and evaluation platform; established LLM-as-Judge protocols for deterministic validation.
- Fine-Tuning & Tool-Calling: designed training pipelines that improved agent tool prediction accuracy.
- Retrieval & Grounding (RAG): enhanced factual grounding with hybrid retrieval, entity disambiguation, and reranking, boosting Hit@K, MRR, and attribution accuracy.
- Safety & Hallucination Mitigation: implemented Reflexion-style verbal RL for incremental content rewriting when blocked by safety filters.
Previous Work
OneAssist, Team Lead (2018–2021)
Built customer-facing AI systems from the ground up:
- Designed a customer care chatbot using FAISS, Sentence Transformers, and Siamese networks, cutting agent transfers from 50% to under 15%.
- Developed a CNN-based crack detection pipeline that reduced customer onboarding time from 23 hours to under 5 minutes.
- Built a Deep RL recommendation engine to personalize app experiences using GPT-like autoregressive user embeddings.
Polestar Solutions, Senior Data Scientist (2014–2018)
- Created an analytical chatbot that translated natural-language business questions into SQL queries using multitask learning.
- Led a delivery ETA prediction system that improved accuracy to ±1 day for 82% of customers and cut support calls by 30%.
Education
- MS in Natural Language Processing, UC Santa Cruz, GPA 3.93/4.0
  Coursework: Transformers for Vision & NLP, Reinforcement Learning, Bayesian Inference
- BE in Electrical & Electronics Engineering, Delhi Technological University
Research & Publications
- Controllable Generation of Dialogue Acts for Dialogue Systems, SIGdial 2023 (paper)
- Systems and Methods for Writing Feedback Using an AI Engine, US Patent MS# 412873-US-NP
- Aligning Multilingual Word Embeddings to Dictionary Meanings (demo)
Interests
- Multi-agent architectures and orchestration
- LLM evaluation and benchmarking science
- Document understanding and intelligent editing
- Model fine-tuning and optimization research
Technical Stack
Languages & Frameworks: Python, PyTorch, HuggingFace, LangGraph, Scikit-Learn, SQL
Technologies: AzureML, Google Cloud, FAISS, DeepSpeed, ONNX, Optuna, W&B
Specialties: LLMs, Multi-Agent Systems, RAG, Fine-Tuning, Evaluation Frameworks
Connect
- GitHub: @UtkarshGarg-UG
- LinkedIn: linkedin.com/in/utkarshug
- Medium: @ug2409