Are you hiring?

Great news! I am looking for a Summer 2024 internship and am eager to solve impactful problems centered on LLMs, Deep Learning, and NLP.

About me

Hello there! I am pursuing a Master's in Computer Science from New York University (Courant Institute of Mathematical Sciences). I am interested in building and breaking ML apps, and I love to solve problems that are at the intersection of Research and Engineering.

Prior to my MS at NYU, I spent three years conducting research in NLP and working at Instabase and Walmart on RAG applications using LLMs (Document AI), efficient deployment of ML models, and monitoring and drift detection (MLOps). I also published papers on code-switching, dialogue summarization, and keyphrase extraction and generation.

What I'm doing

  • LLM Ops

    Building cool projects around efficient finetuning and serving of LLMs. Previously worked on the Model Service team at Instabase, developing the infrastructure and services responsible for ML model serving.

  • NLP Research

    Interested in NLP research, with a current focus on reasoning and hallucination problems in LLMs. Previously worked on code-switching, dialogue summarization, and keyphrase extraction and generation.

  • Open Source Projects

    An active open-source contributor, contributing to and mentoring projects around ML and DL tools and libraries.

  • Hobbies

    I like to trek, travel, do adventure sports, and cook tasty food. In my leisure time I play soccer, badminton, and other team sports.

Education

  1. New York University - Courant Institute of Mathematical Sciences

    Master of Science, Computer Science Sep 2023 – May 2025

    Subjects: Deep Learning, Large Language and Vision Models, Computer Vision, Operating Systems.

  2. Indian Institute of Technology (ISM), Dhanbad

    Bachelor of Technology, Computer Science and Engineering July 2016 – April 2020

    Major Subjects: Data Structures, Algorithms, Object Oriented Programming, Operating Systems, Database Management Systems, Distributed Systems, Discrete Mathematics, Calculus.

  3. Hope Hall Foundation School, Delhi

    Senior Secondary School (Science) 2013 – 2015

    Subjects: Mathematics, Physics, Chemistry, Physical Education, English.

Experience

  1. Software Engineer, Machine Learning

    Instabase

    January 2022 – August 2023

      Designed and implemented a model inference service and a regression and load-testing framework to integrate Large Language Models and generative AI capabilities into Instabase’s platform.

      Devised a model drift detection pipeline that uses layout change detectors to catch format drift, and Least-Squares Density Difference and Maximum Mean Discrepancy on document embeddings to catch content drift.

      Designed and implemented an async model service that combines async workers, RabbitMQ, two levels of caching, and sticky routing to improve inference time by 40% in a compute-limited Kubernetes environment.
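
      The content-drift check above can be illustrated with a small sketch. This is a hypothetical example (not Instabase code, and the names are made up): it computes a biased estimate of the squared Maximum Mean Discrepancy between a reference batch and a new batch of document embeddings using an RBF kernel; a larger score signals distribution drift.

      ```python
      import numpy as np

      def rbf_kernel(x, y, gamma=0.1):
          # Pairwise squared Euclidean distances between rows of x and rows of y
          d2 = np.sum(x**2, 1)[:, None] + np.sum(y**2, 1)[None, :] - 2 * x @ y.T
          return np.exp(-gamma * d2)

      def mmd2(x, y, gamma=0.1):
          """Biased estimate of squared Maximum Mean Discrepancy between samples x, y."""
          return (rbf_kernel(x, x, gamma).mean()
                  + rbf_kernel(y, y, gamma).mean()
                  - 2 * rbf_kernel(x, y, gamma).mean())

      rng = np.random.default_rng(0)
      ref = rng.normal(0.0, 1.0, (200, 16))      # reference document embeddings
      same = rng.normal(0.0, 1.0, (200, 16))     # new batch, same distribution
      drifted = rng.normal(1.5, 1.0, (200, 16))  # new batch, shifted distribution

      assert mmd2(ref, drifted) > mmd2(ref, same)  # drift raises the MMD score
      ```

      In practice one would compare the score against a threshold calibrated on held-out reference batches (e.g. via a permutation test) before flagging drift.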

  2. Software Engineer II

    Walmart Global Tech India

    August 2020 – December 2021

      Implemented the entity embedding technique for sparse categorical values, which improved the accuracy of the existing forecasting systems by 17% and reduced the inference time by 15%. This also reduced the effort required for feature engineering.

      Tweaked the gradient boosting algorithm to incorporate feedback from distribution center managers when computing errors to build decision trees, and implemented a weighted loss that penalizes misclassification of perishable items such as fruits and dairy products more heavily.
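
      As a rough illustration of the entity embedding idea (a hypothetical sketch, not Walmart's code; all names and dimensions are made up): each sparse categorical id indexes a learned table of dense vectors, so downstream models consume a handful of learned features instead of thousands of one-hot columns.

      ```python
      import numpy as np

      rng = np.random.default_rng(0)
      n_categories, emb_dim = 10_000, 16  # e.g. 10k store/item ids -> 16-dim vectors

      # In training, this table is a learned parameter (like nn.Embedding);
      # here it is just randomly initialized for illustration.
      embedding_table = rng.normal(0.0, 0.05, (n_categories, emb_dim))

      store_ids = np.array([3, 41, 3])             # a batch of categorical ids
      dense_features = embedding_table[store_ids]  # lookup replaces one-hot columns
      assert dense_features.shape == (3, emb_dim)  # 16 features instead of 10,000
      ```

      Because similar categories end up with nearby vectors after training, the embeddings also act as automatic feature engineering, which is the effort reduction mentioned above.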

  3. Google Summer of Code, Developer

    AOSSIE

    May 2019 – Aug 2019

    Developed a Google Chrome extension that uses natural language processing (NLP) to detect toxic comments, clickbait, and fake news on news websites and social media platforms like Facebook and Twitter. Trained a stance detection model for fake news classification and implemented a backend service using Flask for model inference. Designed a three-level cache system to optimize latency and avoid duplicate inference requests.

  4. Applied Research Intern

    Genesys

    May 2019 – July 2019

    Implemented a passage-ranking algorithm to boost the accuracy of the in-house question-answering service’s model on client datasets. Created a framework to train AllenNLP’s machine reading comprehension models on in-house medical data as part of a client POC, and proposed its integration with the homegrown chatbot.

Publications

  1. GupShup: Summarizing Open-Domain Code-Switched Conversations

    First author. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.

  2. Transformers on Sarcasm Detection with Context

    First author. In Proceedings of the Second Workshop on Figurative Language Processing, Association for Computational Linguistics.

  3. LDKP: A Dataset for Identifying Keyphrases from Long Scientific Documents

    First author. Workshop on Deep Learning for Search and Recommendation, co-located with the 31st ACM International Conference on Information and Knowledge Management (CIKM).

Projects

  1. transformerkp: A transformers based library for keyphrase identification from text documents.

    transformerkp allows you to train and apply state-of-the-art deep learning models for keyphrase extraction and generation from text documents. It supports several benchmark datasets and evaluation metrics for keyphrase extraction and generation.

  2. t-CRF: CRF head on top of Transformer for sequence tagging.

    t-CRF lets users add a Conditional Random Field layer on top of any Transformer-based sequence tagger, for tasks such as POS tagging and named entity recognition.
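
    At decoding time, a CRF head picks the globally best tag sequence with Viterbi decoding instead of tagging each token independently. Here is a minimal NumPy sketch of that decoding step (an illustration of the general technique, not the t-CRF API):

    ```python
    import numpy as np

    def viterbi(emissions, transitions):
        """Most likely tag path given per-token emission scores (T x K)
        and tag-to-tag transition scores (K x K), as in a linear-chain CRF."""
        T, K = emissions.shape
        score = emissions[0].copy()            # best score ending in each tag
        backptr = np.zeros((T, K), dtype=int)  # best previous tag at each step
        for t in range(1, T):
            # cand[i, j]: best path ending in tag i at t-1, then moving to tag j
            cand = score[:, None] + transitions + emissions[t][None, :]
            backptr[t] = cand.argmax(axis=0)
            score = cand.max(axis=0)
        path = [int(score.argmax())]
        for t in range(T - 1, 0, -1):          # follow back-pointers
            path.append(int(backptr[t][path[-1]]))
        return path[::-1]

    # Transitions that reward staying on the same tag smooth a noisy middle token.
    emissions = np.array([[2.0, 0.0], [0.0, 1.0], [2.0, 0.0]])
    transitions = np.array([[1.0, -2.0], [-2.0, 1.0]])
    assert viterbi(emissions, transitions) == [0, 0, 0]
    ```

    A per-token argmax over the same emissions would output [0, 1, 0]; the transition scores are what let the CRF overrule the noisy middle prediction.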

  3. SpanElectra: A language model with accuracy of spanBERT and efficiency of ELECTRA.

    SpanElectra is an efficient language model that uses the span boundary objective from SpanBERT to capture span-level context, while adopting the discriminator-generator training method from ELECTRA for efficient low-resource training.

  4. Question Generation using Language Models.

    Given a paragraph, generates all plausible questions related to it. If an answer context is provided, generates only those questions from the paragraph whose answer is that context.