Amardeep Kumar
I love maths, coding, and Machine Learning. I work at the intersection of AI research and engineering — tackling reasoning, alignment, and hallucination in LLMs across visual question-answering, spatio-temporal reasoning, code-switching, and multi-modality.
Currently a Software Engineer at DoorDash. Recently graduated from NYU Courant (MS CS, AI specialization). Previously built ML infrastructure at Instabase and Walmart, interned at PineGap.ai, and participated in Google Summer of Code. Published at EMNLP, ACL, and CIKM.
One day I want to start a profitable and calm business solving niche AI research problems.
Featured
-
GenZ to AI Enz: A Roadmap for CS Grads Breaking into AI
A complete series taking CS students and early-career engineers from zero ML knowledge to building real AI systems with LLMs and agents.
-
How We Cut ML Inference Latency by 40% on Kubernetes
The architecture behind our model serving platform at Instabase: async workers, RabbitMQ, multi-level caching, and sticky routing, which together cut inference latency by 40%.
-
GupShup: Summarizing Code-Switched Conversations
Our EMNLP 2021 paper on abstractive summarization of Hindi-English code-switched conversations — introducing the GupShup dataset.
Recent Posts
-
Bias, Variance, and the Tradeoff Every Model Faces
Why models fail in two opposite ways — being too rigid or too sensitive — and how to find the sweet spot between them.
-
Dropout and Overfitting: Teaching a Network Not to Cheat
What overfitting is, why it happens, and how dropout stops a network from memorising the training data.
-
Transformer Architecture & Key Design Decisions
A deep dive into the transformer architecture, why decoder-only models won, and the key design decisions (RoPE, GQA, Flash Attention, MoE) that shape most modern LLMs.
-
Normalization: BatchNorm, LayerNorm, and Why Transformers Need a Different One
Why activations drift as they pass through deep networks, and how BatchNorm and LayerNorm fix it in different ways.