NASPipe: high performance and reproducible pipeline parallel supernet training via causal synchronous parallelism

2022 
Supernet training, a prevalent and important paradigm in Neural Architecture Search, embeds the whole DNN architecture search space into one monolithic supernet, iteratively activates a subset of the supernet (i.e., a subnet) to fit each batch of data, and searches for a high-quality subnet that meets specific requirements. Although training subnets in parallel on multiple GPUs is desirable for acceleration, there is an inherent race hazard: concurrent subnets may access the same DNN layers. Existing systems neither parallelize subnets' training executions efficiently nor resolve the race hazard deterministically, leading to unreproducible training procedures and potentially non-trivial accuracy loss.
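The following is a minimal sketch of the supernet-training loop the abstract describes, using a toy PyTorch supernet in which each position holds several candidate layers and each training step activates one subnet (one candidate per position). All names here are illustrative assumptions, not NASPipe's actual API; the comments also note where the race hazard arises when such steps run concurrently.

```python
import random
import torch
import torch.nn as nn

class ToySupernet(nn.Module):
    """Toy supernet: choices[p][c] is candidate layer c at position p (hypothetical structure)."""
    def __init__(self, in_dim=8, hidden=8, positions=3, candidates=2):
        super().__init__()
        self.choices = nn.ModuleList(
            nn.ModuleList(nn.Linear(in_dim if p == 0 else hidden, hidden)
                          for _ in range(candidates))
            for p in range(positions)
        )
        self.head = nn.Linear(hidden, 1)

    def forward(self, x, subnet):
        # `subnet` is a tuple of candidate indices, one per position;
        # only the chosen candidates participate in this batch.
        for p, c in enumerate(subnet):
            x = torch.relu(self.choices[p][c](x))
        return self.head(x)

supernet = ToySupernet()
optimizer = torch.optim.SGD(supernet.parameters(), lr=0.01)

for step in range(4):
    # Activate one subnet for this batch of data.
    subnet = tuple(random.randrange(2) for _ in range(3))
    x, y = torch.randn(16, 8), torch.randn(16, 1)
    loss = nn.functional.mse_loss(supernet(x, subnet), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()  # updates only the layers the active subnet touched
    print(f"step {step}: subnet {subnet}, loss {loss.item():.4f}")

# Race hazard: if two such steps run concurrently (e.g., on different GPUs)
# and their subnets overlap, say both pick choices[1][0], both read and update
# that shared layer's weights; the outcome then depends on interleaving,
# which is why naive parallelization is neither deterministic nor reproducible.
```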