Mechanistic Interpretability lead at DeepMind. Formerly @AnthropicAI, independent. In this to reduce AI X-risk. Neural networks can be understood, let's go do it!
1.1K views • 12 likes • 2 months ago
11.4K views • 174 likes • 2 months ago
1.9K views • 49 likes • 3 months ago
3.8K views • 87 likes • 3 months ago