Hi, I’m Utkarsh 👋

I’m a Principal Applied Scientist at Microsoft, where I design the technical roadmap for Word Copilot and build agentic AI systems that make document editing intelligent, controllable, and collaborative.

My work sits at the intersection of LLM research, product engineering, and evaluation science, translating cutting-edge ideas in multi-agent coordination, fine-tuning, and tool-calling into production-ready systems used by millions.


🧠 What I Work On

At Microsoft, I lead initiatives shaping the next generation of Word Copilot:

  • Agentic AI for Word – architecting multi-agent orchestration systems that coordinate reasoning, planning, and editing inside the Word canvas.
  • Vibe Writing Framework – an HTML DOM–based editing engine that performs precise insert/update/delete operations at node-level granularity, powering Word Copilot’s real-time document editing.
  • LLM Evaluation Science – creator of POET, an offline prompt A/B and evaluation platform; established LLM-as-Judge protocols for deterministic validation.
  • Fine-Tuning & Tool-Calling – designed training pipelines that improved agent tool prediction accuracy.
  • Retrieval & Grounding (RAG) – enhanced factual grounding with hybrid retrieval, entity disambiguation, and reranking, boosting Hit@K, MRR, and attribution accuracy.
  • Safety & Hallucination Mitigation – implemented Reflexion-style verbal RL for incremental content rewriting when blocked by safety filters.

💼 Previous Work

OneAssist – Team Lead (2018–2021)
Built customer-facing AI systems from the ground up:

  • Designed a customer care chatbot using FAISS, Sentence Transformers, and Siamese networks, cutting agent transfers from 50% to under 15%.
  • Developed a CNN-based crack detection pipeline that reduced customer onboarding time from 23 hours to under 5 minutes.
  • Built a Deep RL recommendation engine to personalize app experiences using GPT-like autoregressive user embeddings.

Polestar Solutions – Senior Data Scientist (2014–2018)

  • Created an analytical chatbot that translated natural-language business questions into SQL queries using multitask learning.
  • Led a delivery ETA prediction system that brought estimates to within ±1 day for 82% of customers and cut support calls by 30%.

🎓 Education

  • MS in Natural Language Processing, UC Santa Cruz (GPA 3.93/4.0)
    Coursework: Transformers for Vision & NLP, Reinforcement Learning, Bayesian Inference
  • BE in Electrical & Electronics Engineering, Delhi Technological University

🔬 Research & Publications

  • Controllable Generation of Dialogue Acts for Dialogue Systems – SIGdial 2023 (paper)
  • Systems and Methods for Writing Feedback Using an AI Engine – US Patent MS# 412873-US-NP
  • Aligning Multilingual Word Embeddings to Dictionary Meanings (demo)

🧩 Interests

  • Multi-agent architectures and orchestration
  • LLM evaluation and benchmarking science
  • Document understanding and intelligent editing
  • Model fine-tuning and optimization research

🛠️ Technical Stack

Languages & Frameworks: Python, PyTorch, HuggingFace, LangGraph, Scikit-Learn, SQL
Technologies: AzureML, Google Cloud, FAISS, DeepSpeed, ONNX, Optuna, W&B
Specialties: LLMs, Multi-Agent Systems, RAG, Fine-Tuning, Evaluation Frameworks


🌐 Connect