Hope these will be as helpful to you today as they will be for future me.
LLM Tech Report Notes (updated on 01/21/2025)
01/21/2025
[ coding ]
notes from reading LLM tech reports.
Reproducing the inference-time scaling experiment
12/30/2024
[ paper ]
a dive into a minimal experiment that demonstrates inference-time scaling.
Cross-entropy loss and its optimization [WIP]
12/12/2024
[ coding ]
a dive into cross-entropy loss and its optimization.
Graph Convolution ≈ Mixup
11/20/2024
[ paper ]
one of my favorite papers.
Attention and its gradient
10/20/2024
[ coding ]
a dive into attention and its gradient.
Softmax and its Triton implementation
10/19/2024
[ coding ]
implementing softmax using Triton.