1 result for “scaling”
Mixture of Experts (MoE) is a machine learning architecture in which a model routes each input to a small subset of specialised sub-networks called experts, enabling large model capacity at a fraction of the compute cost.