Chinchilla: Training Compute-Optimal Large Language Models

Most current LLMs are significantly undertrained: given 10x more compute, model size and training tokens should each be scaled by the same factor (about 3.2x each), rather than putting nearly all of the extra compute into parameters. A quick numerical sketch of this rule appears at the end of the post.
Scaling & Training
Author: Imad Dabbura

Published: February 15, 2024


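The headline rule lends itself to a back-of-the-envelope calculation. Below is a minimal sketch (not from the paper's code; `chinchilla_optimal` is a hypothetical helper name) that splits a FLOP budget into compute-optimal parameter and token counts, assuming the paper's C ≈ 6ND training-cost approximation and the roughly 20-tokens-per-parameter ratio implied by scaling N and D equally with compute.

```python
import math

def chinchilla_optimal(compute_budget_flops: float):
    """Split a FLOP budget into compute-optimal parameters and tokens.

    Assumes the paper's approximations: C ~= 6 * N * D, with N_opt and
    D_opt both scaling as C^0.5, calibrated so D_opt ~= 20 * N_opt
    (roughly 20 training tokens per parameter).
    """
    # Substituting D = 20 * N into C = 6 * N * D gives C = 120 * N^2.
    n_params = math.sqrt(compute_budget_flops / 120)
    n_tokens = 20 * n_params
    return n_params, n_tokens

# Chinchilla's own budget (~5.76e23 FLOPs) recovers its published
# configuration: ~70B parameters trained on ~1.4T tokens.
params, tokens = chinchilla_optimal(5.76e23)
print(f"params ~ {params:.3g}, tokens ~ {tokens:.3g}")
```

Note how the sqrt makes the tradeoff concrete: multiplying the budget by 10 raises both `params` and `tokens` by sqrt(10) ≈ 3.16x, which is exactly the "scale both equally" prescription in the summary above.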