Posts
All the articles I've posted.
-
How We Cut ML Inference Latency by 40% on Kubernetes
The architecture behind our async model serving platform at Instabase: async workers, RabbitMQ, multi-level caching, and sticky routing, which together cut inference time by 40%.
-
GupShup: Summarizing Code-Switched Conversations
Our EMNLP 2021 paper on abstractive summarization of Hindi-English code-switched conversations, which introduces the GupShup dataset.