Large Language Models (LLMs) have recently demonstrated remarkable generalization across a wide range of tasks. Model scale is a key driver of this performance, yet increasing model size sharply raises the cost of both pre-training and fine-tuning while slowing inference. This tension has spurred a search for new model-scaling techniques. Among them, the sparse Mixture-of-Experts (MoE) architecture has attracted considerable attention: because only a subset of experts is activated per input, it speeds up pre-training and inference compared with dense models of equivalent parameter count.
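To make the efficiency argument concrete, below is a minimal sketch (in PyTorch) of a top-k gated sparse MoE layer. Names such as `SparseMoE`, `num_experts`, and `top_k` are illustrative rather than from any specific library; the point is that per-token compute scales with the number of *active* experts, not the total parameter count.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model: int, d_ff: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Each expert is an independent feed-forward block.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )
        # The router scores every token against every expert.
        self.router = nn.Linear(d_model, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        logits = self.router(x)                             # (tokens, experts)
        weights, indices = logits.topk(self.top_k, dim=-1)  # top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        # Only the selected experts run for each token: with top_k=2 of 8
        # experts, per-token FFN compute is roughly 1/4 that of a dense model
        # whose FFN parameters equal the sum over all experts.
        for e, expert in enumerate(self.experts):
            token_idx, slot = (indices == e).nonzero(as_tuple=True)
            if token_idx.numel() == 0:
                continue
            out[token_idx] += weights[token_idx, slot].unsqueeze(-1) * expert(x[token_idx])
        return out

# Usage: 16 tokens of width 64 through an 8-expert layer activating 2 experts each.
moe = SparseMoE(d_model=64, d_ff=256)
print(moe(torch.randn(16, 64)).shape)  # torch.Size([16, 64])
```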
This tutorial offers a comprehensive overview of MoE in the context of LLMs. It begins by revisiting existing MoE research and highlighting the key open challenges in the area. It then examines the interplay between MoE and LLMs, covering both sparse scaling of models during pre-training and the conversion of existing dense models into sparse MoE counterparts. Finally, the tutorial discusses the broader advantages MoE confers beyond efficiency alone.
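The dense-to-MoE conversion mentioned above is commonly done by "upcycling": every expert starts as a copy of the trained dense FFN, and a freshly initialized router learns to specialize the experts during continued training. The sketch below is a minimal illustration of that general recipe, not a definitive implementation; the helper name `upcycle_ffn` is hypothetical, and it reuses the `SparseMoE` class from the previous sketch.

```python
import torch.nn as nn

def upcycle_ffn(dense_ffn: nn.Sequential, d_model: int, d_ff: int,
                num_experts: int = 8, top_k: int = 2) -> "SparseMoE":
    # Hypothetical helper: build an MoE layer whose experts all start from the
    # trained dense FFN's weights. The router stays randomly initialized, and
    # routing is learned during continued training after the conversion.
    moe = SparseMoE(d_model, d_ff, num_experts=num_experts, top_k=top_k)
    for expert in moe.experts:
        expert.load_state_dict(dense_ffn.state_dict())  # copy dense weights into each expert
    return moe

# Usage: upcycle a trained dense 64 -> 256 -> 64 FFN into an 8-expert MoE layer.
dense = nn.Sequential(nn.Linear(64, 256), nn.GELU(), nn.Linear(256, 64))
moe_layer = upcycle_ffn(dense, d_model=64, d_ff=256)
```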
Overall, the tutorial traces the evolution of MoE within the LLM landscape and underscores its pivotal role in the era of large models.
| Session | Title | Speakers |
| --- | --- | --- |
| 1 | Overview & Key Challenges in MoEs and their Crucial Roles in LLMs [Slides] | Tianlong Chen |
| 2 | MoE Architecture Variance, Building MoE from Dense LLMs, and MoE Beyond Efficiency [Slides] | Yu Cheng |
| 3 | How to Train a Superior MoE from a System View? [Slides] | Minjia Zhang |
| 4 | Key Extension - Multi-Modal MoE; Multi-Agent Communications [Slides] | Mohit Bansal, Tianlong Chen |
| 5 | Panel - MoE Designs, Multi-Modal Multi-Task MoE, Multi-Agent MoE | Tianlong Chen (Moderator), Yu Cheng, Beidi Chen, Minjia Zhang, Mohit Bansal |
Contact the Organizing Committee: tianlong@cs.unc.edu, pingzhi@cs.unc.edu