Building ML infrastructure at work — focused on LLM evaluation pipelines and deployment patterns that hold up in production. On the side: writing more, shipping personal projects, reading broadly across systems and inference optimization.
Genomic language model pretrained from scratch on the genomes of 13 Drosophila species — 12-layer dilated CNN with a custom character-level tokenizer, trained to predict variant effects across the genome.
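For the curious, a minimal sketch of that kind of architecture: character-level encoding over the nucleotide alphabet feeding a stack of residual dilated convolutions. The alphabet, channel width, and dilation schedule here are illustrative assumptions, not the project's actual hyperparameters.

```python
# Sketch of a dilated-CNN genomic LM; widths and dilations are assumptions.
import torch
import torch.nn as nn

VOCAB = "ACGTN"  # assumed nucleotide alphabet
stoi = {ch: i for i, ch in enumerate(VOCAB)}

def encode(seq: str) -> torch.Tensor:
    """Map a DNA string to integer token ids, one per base."""
    return torch.tensor([stoi[ch] for ch in seq.upper()], dtype=torch.long)

class DilatedBlock(nn.Module):
    """Residual 1-D conv block; dilation widens the receptive field."""
    def __init__(self, channels: int, dilation: int):
        super().__init__()
        self.conv = nn.Conv1d(channels, channels, kernel_size=3,
                              dilation=dilation, padding=dilation)
        self.norm = nn.BatchNorm1d(channels)

    def forward(self, x):
        return x + torch.relu(self.norm(self.conv(x)))

class GenomicCNN(nn.Module):
    """12 dilated blocks with exponentially growing dilation (1, 2, 4, ...),
    ending in a per-position classifier over the nucleotide vocabulary."""
    def __init__(self, vocab_size=len(VOCAB), channels=128, layers=12):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, channels)
        self.blocks = nn.Sequential(
            *[DilatedBlock(channels, dilation=2 ** (i % 8)) for i in range(layers)]
        )
        self.head = nn.Linear(channels, vocab_size)

    def forward(self, tokens):                      # tokens: (batch, length)
        x = self.embed(tokens).transpose(1, 2)      # -> (batch, channels, length)
        x = self.blocks(x).transpose(1, 2)          # -> (batch, length, channels)
        return self.head(x)                         # per-base logits

model = GenomicCNN()
logits = model(encode("ACGTACGTAC").unsqueeze(0))   # shape (1, 10, 5)
```

With a model like this, a variant can be scored by comparing the predicted probabilities of the reference and alternate base at its position.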
Barebones local retrieval-augmented generation pipeline — PDFs in, answers out, with no API calls leaving the machine. Uses Ollama, ChromaDB, and LangChain.
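Roughly, the loop looks like the sketch below, assuming the langchain-community import layout; exact module paths and the model names ("llama3", "nomic-embed-text") shift between LangChain and Ollama versions, and "paper.pdf" is a placeholder.

```python
# Minimal local RAG loop: load, chunk, embed, retrieve, answer.
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.llms import Ollama
from langchain_community.vectorstores import Chroma
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 1. PDFs in: load and split into overlapping chunks.
docs = PyPDFLoader("paper.pdf").load()
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200
).split_documents(docs)

# 2. Embed with a local Ollama model and index in Chroma;
#    nothing leaves the machine.
vectorstore = Chroma.from_documents(
    chunks, OllamaEmbeddings(model="nomic-embed-text")
)

# 3. Answers out: retrieve the closest chunks and prompt a local LLM.
question = "What method does the paper propose?"
context = "\n\n".join(
    d.page_content for d in vectorstore.similarity_search(question, k=4)
)
llm = Ollama(model="llama3")
print(llm.invoke(f"Answer using only this context:\n{context}\n\nQ: {question}"))
```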
Small-scale LLM built from scratch in PyTorch — character-level tokenizer, transformer architecture, trained on custom text corpora.
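The tokenizer half of a project like this is small enough to show in full. This is a generic character-level sketch, not the project's exact code: a bijective map between the characters seen in the corpus and integer ids.

```python
# Character-level tokenizer sketch; the training text is a stand-in corpus.
class CharTokenizer:
    """Bijective mapping between characters and integer ids."""
    def __init__(self, text: str):
        chars = sorted(set(text))
        self.stoi = {ch: i for i, ch in enumerate(chars)}
        self.itos = {i: ch for ch, i in self.stoi.items()}
        self.vocab_size = len(chars)

    def encode(self, s: str) -> list[int]:
        return [self.stoi[c] for c in s]

    def decode(self, ids: list[int]) -> str:
        return "".join(self.itos[i] for i in ids)

tok = CharTokenizer("hello world")
assert tok.decode(tok.encode("hello")) == "hello"
```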
GPT-style autoregressive language model implemented in PyTorch — full training loop, attention, positional encoding, and character-level generation.
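A sketch of the moving parts named above — causal attention, learned positional encoding, and the autoregressive sampling loop — with illustrative widths and a single attention block standing in for a full stack:

```python
# Tiny GPT-style model sketch; dimensions and vocab size are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyGPT(nn.Module):
    def __init__(self, vocab_size, d_model=64, n_heads=4, block_size=128):
        super().__init__()
        self.block_size = block_size
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(block_size, d_model)   # learned positions
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln = nn.LayerNorm(d_model)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, idx):                                # idx: (B, T)
        B, T = idx.shape
        pos = torch.arange(T, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        # Causal mask: position t may attend only to positions <= t.
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool,
                                     device=idx.device), diagonal=1)
        a, _ = self.attn(x, x, x, attn_mask=mask)
        return self.head(self.ln(x + a))                   # (B, T, vocab)

    @torch.no_grad()
    def generate(self, idx, max_new_tokens):
        """Sample one token at a time, feeding each back into the model."""
        for _ in range(max_new_tokens):
            logits = self(idx[:, -self.block_size:])
            probs = F.softmax(logits[:, -1, :], dim=-1)
            idx = torch.cat([idx, torch.multinomial(probs, 1)], dim=1)
        return idx

model = TinyGPT(vocab_size=65)
out = model.generate(torch.zeros(1, 1, dtype=torch.long), max_new_tokens=20)
```

The usual training objective is cross-entropy between each position's logits and the next character; generation then just samples from those same logits one token at a time.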