Hope these will be as helpful to you today as they will be for future me.
2025
- 01/21/2025 » LLM Tech Report Notes (updated on 01/21/2025)
2024
- 12/30/2024 » Reproduce the inference-time scaling experiment
- 12/12/2024 » Cross-entropy loss and its optimization [WIP]
- 11/20/2024 » Graph Convolution ≈ Mixup
- 10/20/2024 » Attention and its gradient
- 10/19/2024 » Softmax and its Triton implementation