I hope these are as helpful to you today as they will be to future me.
2025
- 03/07/2025 » [Research Preview] Speculative Thinking: Large Models Mentoring Small Models for Efficient Reasoning
- 01/24/2025 » [Research Preview] Thinking Preference Optimization
- 01/22/2025 » Optimizers: math, implementations and efficiency
- 01/21/2025 » LLM Tech Report Notes (updated on 01/22/2025)
2024
- 12/30/2024 » Reproduce the inference-time scaling experiment
- 12/12/2024 » Cross-entropy loss and its optimization [WIP]
- 11/20/2024 » Graph Convolution ≈ Mixup
- 10/20/2024 » Attention and its gradient
- 10/19/2024 » Softmax and its triton implementation