Blake Ledden
I run experiments on LLM training dynamics and write about what I find. Currently focused on distillation, constitutional training, and reinforcement learning — with an emphasis on statistical rigor (10-seed validation, effect sizes, confidence intervals).
Recent Research
I recently completed a series of experiments on the Tinker platform, testing proposals from the Thinking Machines Lab. Key findings:
- The Distribution Cliff — Hybrid distillation (off-policy → on-policy) fails catastrophically when the teacher-student capability gap is large. 160 runs across the Qwen and Llama families. The GKD paper's recommendation doesn't generalize.
- Open Character Training — Constitutional DPO improves character alignment by 39% and reduces the adversarial break rate from 65% to 35%. Prompt distillation works (84% success without system prompts). 47 runs, 5 personas.
- SL vs RL Efficiency — Empirically validated the "LoRA without regret" theory. Supervised learning converges in 2 episodes regardless of problem size; RL scales as n^0.89. Effect size: g = -8.2 (effect-size calculation sketched below).
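For the curious, here is a minimal sketch of the kind of effect-size calculation behind numbers like g = -8.2: Hedges' g over per-seed metrics, with a percentile-bootstrap confidence interval. The data and helper names are illustrative placeholders, not values from the actual runs.

```python
import numpy as np

def hedges_g(a: np.ndarray, b: np.ndarray) -> float:
    """Hedges' g: Cohen's d with the usual small-sample bias correction."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    d = (a.mean() - b.mean()) / np.sqrt(pooled_var)
    correction = 1 - 3 / (4 * (na + nb) - 9)
    return d * correction

def bootstrap_ci(a, b, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for Hedges' g (resample each group of seeds)."""
    rng = np.random.default_rng(seed)
    stats = [
        hedges_g(rng.choice(a, size=len(a), replace=True),
                 rng.choice(b, size=len(b), replace=True))
        for _ in range(n_boot)
    ]
    return np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])

# Illustrative per-seed convergence metrics (10 seeds per condition), not real results.
sl_episodes = np.array([2, 2, 2, 3, 2, 2, 2, 2, 3, 2], dtype=float)
rl_episodes = np.array([41, 38, 44, 40, 39, 43, 42, 37, 45, 40], dtype=float)
print(hedges_g(sl_episodes, rl_episodes), bootstrap_ci(sl_episodes, rl_episodes))
```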
Facilitair
I'm also the founder of Facilitair.ai, where I build AI orchestration systems. I've trained 20+ model versions (44.7M to 5.75B parameters), achieving 90%+ routing accuracy at under 100 ms latency. The system predicts execution strategy, task domain, required capabilities, and optimal model simultaneously.
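As a rough illustration of the multi-task routing idea, here is a minimal sketch of a shared encoder with one classification head per decision, written in PyTorch. The head names, label counts, and embedding dimension are my own placeholder assumptions, not Facilitair's actual architecture; the point is that sharing the encoder lets all four predictions come out of a single forward pass, which keeps per-request routing latency low.

```python
import torch
import torch.nn as nn

class MultiTaskRouter(nn.Module):
    """One shared encoder, one classification head per routing decision.

    Head names and sizes are illustrative, not a production configuration.
    """
    def __init__(self, hidden_dim: int = 512):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(768, hidden_dim),  # assumes a 768-dim prompt embedding as input
            nn.GELU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.GELU(),
        )
        self.heads = nn.ModuleDict({
            "strategy": nn.Linear(hidden_dim, 4),       # e.g. direct / decompose / tool-use / escalate
            "domain": nn.Linear(hidden_dim, 12),        # coarse task domain
            "capabilities": nn.Linear(hidden_dim, 16),  # multi-label required capabilities
            "model": nn.Linear(hidden_dim, 8),          # which downstream model to call
        })

    def forward(self, prompt_embedding: torch.Tensor) -> dict[str, torch.Tensor]:
        h = self.encoder(prompt_embedding)
        return {name: head(h) for name, head in self.heads.items()}

router = MultiTaskRouter()
logits = router(torch.randn(2, 768))  # batch of two prompt embeddings
print({name: out.shape for name, out in logits.items()})
```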
Background
Previously, I spent 4+ years at Apple working on authentication and developer tooling. Before that, I gravitated toward startup and IT roles where I kept finding problems that could be automated.