Vanishing Gradients
Training big models used to be reserved for OpenAI or DeepMind. But these days? Builders everywhere have access to clusters of 4090s, Modal credits, and open-weight models like LLaMA 3 and Qwen. Zach Mueller, Technical Lead for Accelerate at Hugging Face and creator of a new course on distributed ML, joins us to talk about what scaling actually looks like in 2025 for individual devs and small teams. We’ll break down the messy middle between “just use Colab” and “spin up 128 H100s,” and explore how scaling, training, and inference are becoming skills that every ML builder needs.

We’ll cover:

⚙️ When (and why) you actually need scale
🧠 How distributed training works under the hood
💸 Avoiding wasted compute and long runtimes
📦 How to serve models that don’t fit on one GPU
📈 Why this skillset is becoming essential, even for inference

Whether you’re fine-tuning a model at work, experimenting with open weights at home, or just wondering how the big models get trained, this session will help you navigate the stack without drowning in systems details.
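For a concrete sense of that "messy middle," here is a minimal sketch of a training loop using Hugging Face Accelerate, the library Zach leads. The model, data, and hyperparameters are illustrative placeholders, not material from the episode; the point is how few changes separate a single-GPU script from one that can be launched across multiple devices.

```python
# Illustrative sketch only: toy model and random data stand in for a real workload.
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()  # picks up the launch configuration (CPU, one GPU, DDP, ...)

# Placeholder model, optimizer, and dataset.
model = torch.nn.Linear(128, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
dataset = TensorDataset(torch.randn(1024, 128), torch.randint(0, 2, (1024,)))
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)

# prepare() handles device placement and wraps objects for distributed execution.
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

for inputs, labels in dataloader:
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(inputs), labels)
    accelerator.backward(loss)  # replaces the usual loss.backward()
    optimizer.step()
```

The same script runs on a laptop with `python train.py` or across several GPUs with `accelerate launch train.py`, which is roughly the scaling path the episode is about.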