Day job: fraud ML, LLM agents, and payment optimization at T-Mobile. Nights and weekends: writing what I learn, experimenting with Claude Code, and reading everything I can about how models actually run at inference time.
Genomic language model pretrained from scratch on the genomes of 13 Drosophila species — 12-layer dilated CNN with a custom character-level tokenizer, trained to predict variant effects across the genome.
Bare-bones local retrieval-augmented generation pipeline — PDFs in, answers out, with no API calls leaving the machine. Uses Ollama, ChromaDB, and LangChain.
Small-scale LLM built from scratch in PyTorch — character-level tokenizer, transformer architecture, trained on custom text corpora.
GPT-style autoregressive language model implemented in PyTorch — full training loop, attention, positional encoding, and character-level generation.