Posts
- Jun 17, 2025
Refresher on machine learning optimizers. 1. SGD: First-Order Methods. Stochastic Gradient Descent (SGD) uses first-order gradients to update parameters: \[\theta = \theta - \eta \nabla f(\theta)\] SGD treats all parameters equally, leading to slow convergence when parameters have different scales or...
- Feb 14, 2025
A partial solution to the BPE tokenizer implementation exercise from Andrej Karpathy.
The corresponding YouTube video on the tokenizer topic.
- Feb 11, 2025
Table of Contents: Glossed Overview: RLHF for LLM; Further Background Readings; RLHF in Deep Reinforcement Learning; Learning to summarize from human feedback; WebGPT: Browser-assisted question-answering with human feedback; Training language models to follow instructions with human feedback; Training a helpful and harmless...
- Mar 10, 2022
Yesterday I spent way more time than necessary debugging a piece of Python code. The problem had to do with how the Python dictionary's `get` method works with default arguments. TLDR: it is ok when default is a simple value, not recommended...
- Apr 14, 2021
The other day, I shadowed an interview with a data science candidate. The primary focus was obviously not on coding skills, but we did want to assess basic knowledge of the programming language of the candidate's choice. So, my colleague asked a very...
- Oct 31, 2019
Task: We want to learn a function \(f(q, D)\) that takes a query \(q\) and a list of documents \(D=\{d_1, d_2, \ldots, d_n\}\) and produces scores by which we can rank the documents. Types: There are multiple ways we...
- Aug 22, 2019
I was working on a piece of code today and needed to iterate over an iterable multiple times to do some computations. The body of the code is the same for all passes. One obvious thing I can do is to do...
- Aug 21, 2019
When working on machine learning problems, I often use a Python dictionary to map categorical values to their integer-encoded values.
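
As a rough illustration of the dictionary-based encoding mentioned in the last entry, here is a minimal sketch. The helper name `build_encoding` and the sample data are hypothetical, and the post itself may well take a different approach; this only shows one common way to assign each distinct category an integer in first-seen order.

```python
def build_encoding(values):
    """Hypothetical helper: map each distinct value to an integer, in first-seen order."""
    encoding = {}
    for value in values:
        if value not in encoding:
            # The next unused integer equals the number of categories seen so far.
            encoding[value] = len(encoding)
    return encoding


# Made-up sample data, for illustration only.
colors = ["red", "green", "red", "blue"]
color_to_id = build_encoding(colors)

# Look each value up in the dictionary to get the integer-encoded column.
encoded = [color_to_id[c] for c in colors]
```

Because the integers are assigned by first appearance, the mapping depends on input order; sorting the distinct values first would make it reproducible across differently ordered datasets.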