Workshops @ Kempner

Large Language Model Distributed Training Workshop

Date: Friday, October 18, 2024
Time: 10:00 am – 1:00 pm
Location: Kempner Large Conference Room (SEC 6.242)

The Large Language Model Distributed Training workshop, part of the Workshops @ Kempner series, highlights various parallelization techniques for training large language models. We’ll cover techniques such as Distributed Data Parallelism (DDP), Model Parallelism (MP), Tensor Parallelism (TP), Pipeline Parallelism (PP), and Fully Sharded Data Parallelism (FSDP). In addition to reviewing the advantages of each technique and their use cases, this workshop will provide a few hands-on examples to help with understanding LLM distributed training approaches.
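To give a flavor of the hands-on portion, below is a minimal, illustrative sketch (not taken from the workshop materials) of wrapping a toy model in PyTorch's DistributedDataParallel; the model, hyperparameters, and launch command are placeholders.

```python
# Minimal DDP sketch: each process trains a replica of the model and
# gradients are averaged across processes during backward().
# Launch with, for example: torchrun --nproc_per_node=2 ddp_example.py
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for every process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # A toy linear layer stands in for a transformer.
    model = torch.nn.Linear(512, 512).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

    for _ in range(10):
        x = torch.randn(32, 512, device=local_rank)
        loss = model(x).pow(2).mean()
        optimizer.zero_grad()
        loss.backward()  # gradients are all-reduced across ranks here
        optimizer.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```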

Presenters: Yasin Mazloumi and Ella Batty

Who can attend this workshop? The workshop is open to the Kempner community; Harvard affiliates may join if space is available.

What will attendees learn from this workshop?

  • Different parallelization techniques for LLM training using GPUs
  • Different GPU collective communication primitives (see the sketch after this list)
  • How to train a transformer in a distributed fashion using DDP and FSDP on GPUs
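As a preview of the collective communication primitives mentioned above, here is a minimal, illustrative sketch (file name and tensor contents are placeholders) of an all-reduce, the primitive DDP uses to average gradients, assuming PyTorch with the NCCL backend.

```python
# all_reduce_example.py -- illustrative all-reduce sketch.
# Launch with, for example: torchrun --nproc_per_node=2 all_reduce_example.py
import os

import torch
import torch.distributed as dist

dist.init_process_group(backend="nccl")
rank = dist.get_rank()
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

# Each rank starts with a tensor filled with its own rank id.
t = torch.full((4,), float(rank), device=local_rank)

# all_reduce sums the tensors element-wise across all ranks, so every
# rank ends up holding the same result.
dist.all_reduce(t, op=dist.ReduceOp.SUM)
print(f"rank {rank}: {t.tolist()}")

dist.destroy_process_group()
```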

Prerequisites:

  • Familiarity with the PyTorch framework and Python programming
  • Familiarity with LLMs
  • Familiarity with HPC clusters
  • Attending the Intro to Distributed Computing workshop is helpful but not required
  • (Optional) Set up the OLMo environment on the cluster

Registration:

Please register your interest here as soon as possible. Space is limited.


Contact Information:
For any questions about the workshop, please contact kempnereducation@harvard.edu.