BlueTube

DeepSeek R1 Coldstart: How to TRAIN a 1.5B Model to REASON

Curious how a 1.5B-parameter model can solve maths problems better than far larger models? In this video, I demonstrate how DeepSeek R1 leverages lengthy chains of thought to enhance its mathematical reasoning. We take a close look at how DeepSeek R1 prompts are structured and generated according to the R1 paper, then reproduce these chain-of-thought prompts via the DeepSeek R1 coldstart method and my own maths compiler to create synthetic training data. I then walk through the entire fine-tuning process, step by step, showing how even a relatively modest model can outperform bulkier rivals using DeepSeek R1's coldstart technique. If you're fascinated by AI breakthroughs or simply enjoy seeing a thorough training pipeline, this detailed behind-the-scenes session is for you.

GitHub repo for the maths compiler: https://github.com/chrishayuk/chuk-math
GitHub repo for the verifiers: https://github.com/chrishayuk/verifiers

00:00 - Intro
01:10 - DeepSeek R1 Chat
03:35 - DeepSeek R1 Ollama
04:44 - Think Tags
05:04 - DeepSeek R1 Paper
13:45 - Generating Synthetic Long Chains of Thought
15:25 - Translating the CoT to Natural Language
18:40 - Self-Reflection and Self-Correction
22:50 - Generating Sample Data
30:06 - Testing Qwen2.5-1.5B
30:52 - Fine-Tuning Qwen2.5-1.5B with Our Coldstart Data
34:52 - Chatting with Our Fine-Tuned Model
39:55 - Conclusion
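For anyone who wants a feel for the coldstart data before watching, here is a minimal sketch of how a synthetic chain of thought could be packaged as a training record with think tags (the kind of sample data discussed around 04:44 and 22:50). The JSON schema, file name, and worked example are illustrative assumptions, not the exact format used in the video.

```python
import json

# Minimal sketch (not the author's actual pipeline) of turning a synthetic
# chain of thought into a coldstart-style SFT record. The <think> tag
# convention mirrors R1-style output; the schema below is an assumption.

def make_coldstart_record(question: str, chain_of_thought: str, answer: str) -> dict:
    """Wrap the reasoning in <think> tags so the model learns to emit them
    before giving its final answer."""
    return {
        "messages": [
            {"role": "user", "content": question},
            {
                "role": "assistant",
                "content": f"<think>\n{chain_of_thought}\n</think>\n\n{answer}",
            },
        ]
    }

# Hypothetical example; a real pipeline would generate the question, the
# step-by-step reasoning, and the self-check programmatically (e.g. from a
# maths expression compiler plus a verifier).
record = make_coldstart_record(
    question="What is 12 * (3 + 4)?",
    chain_of_thought=(
        "First evaluate the bracket: 3 + 4 = 7.\n"
        "Then multiply: 12 * 7 = 84.\n"
        "Let me double-check: 12 * 7 = 70 + 14 = 84, so the result holds."
    ),
    answer="The answer is 84.",
)

with open("coldstart.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record) + "\n")
```

Many such records, each pairing a generated problem with natural-language reasoning and a self-correction step, are what the fine-tuning run on Qwen2.5-1.5B would consume.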

Top Bluesky Posts

  • DeepSeek have apparently put some effort into avoiding degradation by doing what their paper calls a "cold start" using high-quality human conversations. youtu.be/Pabqg33sUrg

