Obligatory "everything else" section:
I spend most of my free time with my partner Kat and our two little cats, Boby and Gobi




Hello! I'm a research engineer at Ai2, working on post-training for the OLMo and Tülu language models. My focus is RL for code, tool use, and reasoning.
I write about ML research and engineering at The Learning Curve.
Explored ReLoRA, a parameter-efficient pretraining method. ACL 2024 Main Conference - Theme Paper Award
Investigated using free-text explanations to improve robustness of large language models to spurious cues in training data. ACL 2023 Main Conference
I learned some CUDA and wanted to test my knowledge, so I implemented the Griffin language model from scratch in PyTorch, with custom CUDA extensions for its linear-recurrence scan operation. A deep dive into GPU programming and modern language model architectures.
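The core of that scan is a diagonal linear recurrence, h_t = a_t * h_{t-1} + x_t. Here's a toy sequential reference in PyTorch (shapes and names are my own for illustration, not the repo's API); the custom CUDA kernel's job is to compute the same thing without the Python loop, e.g. with a parallel associative scan:

```python
import torch

def linear_recurrence_scan(a: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    """Sequential reference for h_t = a_t * h_{t-1} + x_t.

    a, x: (batch, seq_len, dim) gate and input tensors.
    Returns h: (batch, seq_len, dim) hidden states.
    A CUDA extension would replace this loop with a parallel scan.
    """
    h = torch.zeros_like(x[:, 0])
    out = []
    for t in range(x.shape[1]):
        h = a[:, t] * h + x[:, t]
        out.append(h)
    return torch.stack(out, dim=1)

# Tiny smoke test with constant gates.
a = torch.full((2, 4, 3), 0.5)
x = torch.ones(2, 4, 3)
print(linear_recurrence_scan(a, x)[:, -1])
```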
Ai2 Hackathon project. Chrome extension using GPT-4 to extract objective facts from news articles, with an interface for comparing how different news sources report those facts.
What if you could steer embedding search with interpretable concepts? A novel (at the time) method for interpretable embeddings using LLMs and decision trees. Supports classification, regression, clustering, and post-hoc explanation of black-box models.
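Roughly, the idea (this is a toy sketch of the general recipe, not the project's actual code or concept set): an LLM scores each text against a handful of human-readable concepts, and a decision tree fit on those concept features gives predictions you can read off as rules. The keyword scorer below is a stand-in so the example runs offline; in practice the LLM does the scoring.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical concepts; in the real method an LLM rates how strongly each applies.
CONCEPTS = ["mentions price", "positive sentiment", "asks a question"]

def concept_scores(text: str) -> list[float]:
    # Stand-in scorer: keyword heuristics instead of LLM calls.
    text = text.lower()
    return [
        float("$" in text or "price" in text),
        float(any(w in text for w in ("great", "love", "excellent"))),
        float("?" in text),
    ]

texts = ["Great value for the price!", "Does this ship internationally?", "Terrible battery life."]
labels = [1, 0, 0]  # e.g. purchase intent vs. not

X = [concept_scores(t) for t in texts]
tree = DecisionTreeClassifier(max_depth=2).fit(X, labels)
print(export_text(tree, feature_names=CONCEPTS))  # human-readable decision rules
```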
Course recommendation web app using collaborative filtering on difficulty ratings and text embeddings on natural-language interests to suggest courses. I built the recommendation system on a team of 4.
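The gist of blending the two signals, in a deliberately tiny sketch (made-up data, TF-IDF standing in for the neural text embeddings the app actually used):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

courses = ["Intro to Proofs", "Convex Optimization", "Creative Writing"]
descriptions = ["logic and proof techniques", "gradients convexity duality", "fiction and poetry workshops"]
ratings = np.array([  # users x courses difficulty ratings (0 = unrated)
    [2, 5, 0],
    [3, 4, 1],
    [0, 5, 2],
], dtype=float)

# Collaborative-filtering signal: item-item similarity over the rating matrix.
cf_sim = cosine_similarity(ratings.T)
cf_score = cf_sim[courses.index("Convex Optimization")]  # similar to a course the user took

# Content signal: similarity between stated interests and course descriptions.
vec = TfidfVectorizer().fit(descriptions + ["I want rigorous math with proofs"])
interest_sim = cosine_similarity(
    vec.transform(["I want rigorous math with proofs"]), vec.transform(descriptions)
)[0]

# Blend the two signals and rank.
score = 0.5 * cf_score + 0.5 * interest_sim
print([courses[i] for i in np.argsort(-score)])
```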
Generated complete Pokémon from names. Fine-tuned GPT-3 to produce types, stats, and abilities, and used CLIP+VQGAN to render images from the generated text. Also trained a custom LSTM and GANs from scratch and compared the results to the pretrained models (which were way better 😅).
My first exposure to ML and NLP. I wanted a robot that could rate the jokes I'd write before performing them to like 200 people. Played with bag of words/naive Bayes and LSTMs.
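For flavor, the bag-of-words/naive Bayes version is only a few lines with scikit-learn (toy jokes and labels below are made up; the real data was my own drafts and how the audience reacted):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

jokes = [
    "I told my plant a joke. It didn't leaf.",
    "Why did the chicken cross the road? To get to the other side.",
    "My code works on my machine, which is why I only perform on my machine.",
    "Knock knock. Who's there? A segfault.",
]
laughs = [1, 0, 1, 0]  # 1 = landed, 0 = bombed

# Bag-of-words counts each word; naive Bayes assumes words are independent given the label.
model = make_pipeline(CountVectorizer(), MultinomialNB()).fit(jokes, laughs)
print(model.predict_proba(["Why did my plant segfault?"])[0])  # [P(bomb), P(laugh)]
```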
Obligatory "everything else" section:
I spend most of my free time with my partner Kat and our two little cats, Boby and Gobi