BlueTube

DeepSeek R1 Coldstart: How to TRAIN a 1.5B Model to REASON

Curious how a 1.5B-parameter model can solve maths problems better than far larger models? In this video, I demonstrate how DeepSeek R1 leverages lengthy chains of thought to enhance its mathematical reasoning. We take a close look at how DeepSeek R1 prompts are structured and generated according to the R1 paper, then reproduce these chain-of-thought prompts via the DeepSeek R1 coldstart method and my own maths compiler to create synthetic training data. I then walk through the entire fine-tuning process, step by step, showing how even a relatively modest model can outperform bulkier rivals using DeepSeek R1's coldstart technique. If you're fascinated by AI breakthroughs or simply enjoy seeing a thorough training pipeline, this detailed behind-the-scenes session is for you.

GitHub repo for the maths compiler: https://github.com/chrishayuk/chuk-math
GitHub repo for the verifiers: https://github.com/chrishayuk/verifiers

00:00 - Intro
01:10 - DeepSeek R1 Chat
03:35 - DeepSeek R1 Ollama
04:44 - Think Tags
05:04 - DeepSeek R1 Paper
13:45 - Generating Synthetic Long Chains of Thought
15:25 - Translating the CoT to Natural Language
18:40 - Self-Reflection and Self-Correction
22:50 - Generating Sample Data
30:06 - Testing Qwen2.5-1.5B
30:52 - Fine-Tuning Qwen2.5-1.5B with Our Coldstart Data
34:52 - Chatting with Our Fine-Tuned Model
39:55 - Conclusion
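For anyone who wants a feel for the coldstart data before watching, here is a minimal sketch of how a synthetic chain of thought could be packaged as a training record with think tags (the kind of sample data discussed around 04:44 and 22:50). The JSON schema, file name, and worked example are illustrative assumptions, not the exact format used in the video.

```python
import json

# Minimal sketch (not the author's actual pipeline) of turning a synthetic
# chain of thought into a coldstart-style SFT record. The <think> tag
# convention mirrors R1-style output; the schema below is an assumption.

def make_coldstart_record(question: str, chain_of_thought: str, answer: str) -> dict:
    """Wrap the reasoning in <think> tags so the model learns to emit them
    before giving its final answer."""
    return {
        "messages": [
            {"role": "user", "content": question},
            {
                "role": "assistant",
                "content": f"<think>\n{chain_of_thought}\n</think>\n\n{answer}",
            },
        ]
    }

# Hypothetical example; a real pipeline would generate the question, the
# step-by-step reasoning, and the self-check programmatically (e.g. from a
# maths expression compiler plus a verifier).
record = make_coldstart_record(
    question="What is 12 * (3 + 4)?",
    chain_of_thought=(
        "First evaluate the bracket: 3 + 4 = 7.\n"
        "Then multiply: 12 * 7 = 84.\n"
        "Let me double-check: 12 * 7 = 70 + 14 = 84, so the result holds."
    ),
    answer="The answer is 84.",
)

with open("coldstart.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record) + "\n")
```

Many such records, each pairing a generated problem with natural-language reasoning and a self-correction step, are what the fine-tuning run on Qwen2.5-1.5B would consume.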

Top Bluesky Posts

  • DeepSeek have apparently put some effort into avoiding degradation by doing what their paper calls a "cold start" using high-quality human conversations. youtu.be/Pabqg33sUrg

