The Manifold Dial: Visualizing Why DeepSeek's mHC Stabilizes Deep Networks
Interactive exploration of Manifold-Constrained Hyper-Connections: how DeepSeek fixed the signal explosion problem in deep residual networks using mathematics from 1967.
First empirical demonstration of activation-level sandbagging detection. Linear probes achieve 90-96% accuracy across Mistral, Gemma, and Qwen models. Key finding: sandbagging representations are model-specific, and steering can reduce sandbagging by 20%.
I tested activation steering on 4 agent behaviors across 3 models. The results surprised me.
A deep dive into building distributed LLM evaluation infrastructure that actually scales: architectural decisions, trade-offs, and lessons learned.
A practical framework for evaluating your multi-agent context management strategy. From ad-hoc string concatenation to self-evolving context systems: where does your architecture stand?
A hands-on exploration of writing custom GPU kernels with OpenAI Triton, going from PyTorch's 11% bandwidth utilization to 88% on RMSNorm.
A deep dive into implementing speculative decoding from scratch, with benchmarks on GPT-2 and extensions to diffusion models.