torch-pipeline-parallelism
Guidance for implementing PyTorch pipeline parallelism for distributed model training. Use this skill when a task involves pipeline parallelism, distributed training with the model partitioned across GPUs/ranks, AFAB (All-Forward-All-Backward) scheduling, or inter-rank tensor communication with torch.distributed.
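As a rough illustration of what AFAB scheduling means, the sketch below lists the order of steps a single pipeline stage would run: every micro-batch goes through the forward pass first, and only then do the backward passes start. It is plain Python with no torch dependency. The function name afab_schedule is hypothetical and not part of this skill or of PyTorch, and running the backward passes in the same micro-batch order as the forward passes is an assumption, since implementations differ on that order.

```python
def afab_schedule(num_microbatches):
    """Return the ordered (phase, microbatch) steps for one stage under AFAB.

    Hypothetical helper for illustration only; not part of PyTorch.
    """
    if num_microbatches < 1:
        raise ValueError("num_microbatches must be at least 1")
    # Phase 1: run the forward pass for every micro-batch.
    forward_steps = [("forward", mb) for mb in range(num_microbatches)]
    # Phase 2: run the backward pass for every micro-batch.
    # Assumption: backward uses the same micro-batch order as forward.
    backward_steps = [("backward", mb) for mb in range(num_microbatches)]
    return forward_steps + backward_steps


if __name__ == "__main__":
    for phase, mb in afab_schedule(3):
        print(phase, mb)
```

In a real multi-rank setup, each forward step would typically be preceded by receiving activations from the previous stage and followed by sending activations to the next one, with gradients flowing the opposite way during the backward steps. Those communication calls are left out here to keep the sketch self-contained.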
Installation and usage
Once installed, you can use this skill by running the following command in your terminal:
skills use torch-pipeline-parallelism