🌐 Portfolio • 📧 Email • 💼 LinkedIn
- 🔭 I’m a Data Scientist in the CSG CTO Lab at Dell Technologies, working on optimization, efficient inference, and scalable AI systems.
- ⚙️ I build systems for disaggregated serving, speculative decoding, and KV cache optimization that significantly improve LLM throughput and reduce latency compared with existing inference frameworks.
- 🧠 My interests span generative AI, reinforcement learning, neural architecture search, distributed serving, and reasoning-centric LLMs.
- 📚 I care about research with real impact: from diffusion-based fact verification and theory of mind distillation to large-scale SQL reasoning and medical imaging.
- 👨‍🏫 I enjoy mentoring and teaching, including guiding 3,300+ learners at the IBM Z Datathon and mentoring at hackathons and in student communities.
- 💬 Ask me about LLMs, inference optimization, RAG, quantization, RL for models, and ML systems.
- Systems for Machine Learning and Distributed AI
- Efficient, hardware-aware inference and serving for large models
- Reasoning-centric, aligned large language models
- Reinforcement learning for foundation models and model quantization
- 2025 — Paper "The Energy of Falsehood" submitted to EACL 2026 FEVER Workshop.
- 2025 — Paper "Faithful Theory of Mind Distillation" accepted at AAAI 2026 ToM Workshop.
- 2025 — Released CogniSQL-R1-Zero, a reinforced reasoning model for Text-to-SQL (via arXiv).
Data Scientist – CSG CTO Lab, Dell Technologies (Bengaluru) — Jul 2025 – Present
• Engineered a distributed inference system with disaggregated serving, speculative decoding, and KV cache quantization, achieving ~4× higher throughput and cutting latency from 2.5s to <1s versus vLLM baselines (5+ patent filings in progress).
• Developed an RL-based framework for post-training quantization (PTQ) of LLMs, integrating neural architecture search and reaching 2.6× compression with minimal perplexity loss.
• Designed diffusion-based generative stability methods for automated fact verification, improving robustness and detecting confidently incorrect claims.
• Studied reasoning transfer via sequential SFT + preference refinement, improving reasoning fidelity and alignment (AAAI ToM Workshop 2026).
• Currently building a Mamba-based reranker to improve the robustness of RAG systems against adversarial attacks.
Data Science Intern, Dell Technologies — Jul 2024 – Jun 2025
• Built CogniSQL-R1-Zero, a Text-to-SQL reasoning model using GRPO and DeepSpeed on a 7B backbone across 4×A100 GPUs (released via arXiv).
• Achieved state-of-the-art execution accuracy on the BIRD benchmark, outperforming models with 236B+ parameters.
• Developed an agentic framework with self-healing, test-time scaling, and CoT reasoning, boosting execution accuracy by ~30% on proprietary data (Copilot now in production).
| Area | Venue | Work |
|---|---|---|
| Fact Verification | EACL 2026 FEVER (submitted) | The Energy of Falsehood: Generative Calibration of Fact Verification via Diffusion Models |
| Theory of Mind & LLMs | AAAI 2026 ToM Workshop (accepted) | Faithful Theory of Mind Distillation: Why Preference Based Refinement Improves Imitation |
| Reasoning & SQL | arXiv | CogniSQL-R1-Zero: Lightweight Reinforced Reasoning for Efficient SQL Generation |
| Medical Imaging | ICCCNT 2025 (IIT Indore) | Enhancing Lymphoma Detection Using Multi-Layer Hybrid Neural Networks |
Core Competencies: LLMs, reasoning, RAG, distributed training & serving, quantization, RL, CV, diffusion models.
- 17+ hackathons with 12 wins, including international, national, and college-level events.
- 1st Place – Dell Technologies Industry Hackathon (500+ participants, 2024).
- 2× “Best Use of IBM Z” at SacHacks IV & V (UC Davis).
- Multiple academic awards including Dean’s List, Student Excellence Awards, and MUJ’s “Wizard Programmer” (Gold Medal).



