Mike Lewis
@ml_perception
232 Following • 7.1K Followers
Llama3 pre-training lead. Partially to blame for things like the Cicero Diplomacy bot, BART, RoBERTa, kNN-LM, top-k sampling & Deal Or No Deal.
RT @VictoriaLinML: 1/n Introducing MoMa 🖼, our new sparse early-fusion architecture for mixed-modal language modeling that significantly bo…
5 months ago
So excited for the open release of Llama 3.1 405B - with MMLU > 87, it's a really strong model and I can't wait to see what you all build with it! https://t.co/9Bg6m3HOFQ Also check out the paper here, with lots of details on how this was made: https://t.co/JXJn3d9p5H
40.6K views • 178 likes • 5 months ago