Hi, I'm

Sanchit Pandey

First-author · ACL ARR 2026 · arXiv:2603.11513

Building at the frontier of ML

I'm a final-year student at BITS Pilani focused on LLM research and production AI systems. My first-author paper on RAG utilization failure in sub-7B models is under review at ACL ARR 2026.

On the engineering side, I've built Dual-PPO RLHF systems by modifying TRL internals, production hybrid retrieval pipelines with FAISS + BM25 + LightGBM re-ranking, and full-stack backends in Go and FastAPI that ship in production.

I'm currently building a multimodal AI platform for India's Ministry of MSME, integrating multilingual ASR, OCR, and hybrid retrieval to connect small businesses with government procurement networks.

Tech Stack

ML / AI
PyTorch HuggingFace TRL BitsAndBytes FAISS LangChain scikit-learn AzureML
Languages
Python Go C++ JavaScript TypeScript Dart
Backend & Databases
FastAPI Node.js PostgreSQL MongoDB Redis Firebase
Mobile & Frontend
Flutter React Next.js TailwindCSS
Tools
Docker Git Vercel Linux

First-author publication

Under review · ACL ARR 2026

Can Small Language Models Use What They Retrieve? An Empirical Study of Retrieval Utilization Across Model Scale

Sanchit Pandey  ·  arXiv:2603.11513  ·  March 2026

Demonstrates a net-negative exact match trade-off for sub-7B models under RAG — showing that simply adding retrieval can hurt smaller models. Introduces a parametric knowledge split that isolates utilization failure from retrieval quality, tested across 5 models (360M–8B), 3 architecture families, and oracle/BM25/dense conditions on 1,000 QA pairs.

5 models
360M–8B range, spanning 3 distinct architecture families
2,588
Oracle failures classified into 6 error categories
61–100%
Irrelevant generation — the dominant failure mode across scales

Where I've shipped code

Busy InfoTech / Indiamart Software Engineer Intern
May 2025 – Jul 2025
  • Delivered production backend endpoints for Pooraa (business management software) using Go, Sails.js, and PostgreSQL
  • Reduced API latency by 15% through targeted debugging and streamlined testing workflows, working within a 5-member team
Go Sails.js PostgreSQL Redis Docker
Indian Red Cross Society Data Analyst
May 2024 – Aug 2024
  • Designed and automated Python ETL pipelines processing 7 years of financial data; performed anomaly detection and data validation, improving allocation efficiency by 25%
Python pandas Jupyter

Things I've built

Hospital Visitor Management

Full-stack system with QR-based entry, Aadhaar-linked registration, OTP authentication, and automated Excel/PDF reporting.

↓ 70% check-in time · ↓ 90% errors

Node.js React MongoDB Firebase

AI Selfie Review App

Production mobile app with real-time camera capture, Snapchat-style live filters via DeepAR SDK, and a remote ML prediction backend.

Google Auth · Firestore · ML Pipeline

Flutter Firebase DeepAR SDK REST APIs

E-Bike Rental Platform

Cross-platform Flutter app + Node.js API for campus e-bike rentals with QR booking, OpenStreetMap search, and real-time tracking.

Piloted 5 bikes · 90% utilization

Flutter Node.js MongoDB

DeepFake Detection Pipeline

CNN + ResNet50 fine-tuning with Error Level Analysis preprocessing on a mixed synthetic/real dataset. Trained with augmentation, early stopping, dropout, and batch normalization.

80.12% test accuracy

Python PyTorch CNN ResNet50

AI Architecture Diagram Generator

Converts natural-language project descriptions into structured XML architecture diagrams using MermaidJS and Draw.io.

↓ 70% manual diagramming effort

Python MermaidJS Draw.io

RhythmCo Music Player

Interactive Flutter music player with collaborative playlists, real-time streaming, shuffle mode, and social likes.

Flutter Firebase

Leadership & Achievements

Ethos Professional Development Club

President · Aug 2024 – Jul 2025

Led a 40+ member club organizing professional development workshops and career guidance sessions at BITS Pilani. Mentored 150+ students on internship preparation and technical skill building.

Competitive Programming

Olympiad Qualifier · 2020

Qualified for the Indian National Olympiad in Informatics (INOI) and Zonal Computing Olympiad (ZCO). Holds Google certifications in Network Security and Security Management (SIEM & Playbooks).

Let's connect

I'm open to AI/ML Engineer, Research Engineer, and Software Engineer roles starting mid-2026. If you're working on LLMs, retrieval systems, or production ML infrastructure — I'd love to hear from you.

Based in Hyderabad, India — open to relocating.