Mixtral of Experts

Mixtral is a sparse Mixture-of-Experts model that routes each token to 2 of 8 experts, achieving strong performance at a fraction of the compute cost of a dense model of comparable capacity.
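The top-2-of-8 routing can be sketched in a few lines. This is a minimal NumPy illustration, not the actual Mixtral implementation: the router and expert weights below are random placeholders, and real MoE layers operate on batches of tokens with load-balancing losses.

```python
import numpy as np

def moe_layer(x, gate_w, experts, k=2):
    """Sparse MoE layer: route token x to the top-k of len(experts) experts.

    x: (d,) token hidden state; gate_w: (d, n_experts) router weights;
    experts: list of callables mapping (d,) -> (d,).
    """
    logits = x @ gate_w                     # one router score per expert
    topk = np.argsort(logits)[-k:]          # indices of the k highest scores
    # Softmax over the selected logits only, so the k weights sum to 1
    weights = np.exp(logits[topk] - logits[topk].max())
    weights /= weights.sum()
    # Weighted sum of the chosen experts' outputs; other experts are skipped,
    # which is where the compute savings over a dense model come from
    return sum(w * experts[i](x) for w, i in zip(weights, topk))

# Toy example: 8 random linear "experts", 2 active per token
rng = np.random.default_rng(0)
d, n_experts = 16, 8
experts = [lambda x, W=rng.normal(size=(d, d)) / np.sqrt(d): x @ W
           for _ in range(n_experts)]
gate_w = rng.normal(size=(d, n_experts))
y = moe_layer(rng.normal(size=d), gate_w, experts)
print(y.shape)  # (16,)
```

Only 2 of the 8 experts run per token, so each forward pass touches roughly a quarter of the expert parameters while the full parameter count stays available to the router.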
Scaling & Training
Author

Imad Dabbura

Published

March 17, 2024
#nlp #llm