Deepytorch, a Powerful Deep Learning Tool: An Accelerator Built for Generative AI and Large Models That Delivers Unprecedented Performance Gains!
| Model | Scale | Training configuration | Improvement |
| --- | --- | --- | --- |
| stable diffusion v2.1 (training method: dreambooth) | 1 x 1 | batch size=5, fp16 | 22% |
| stable diffusion v2.1 (training method: dreambooth) | 1 x 1 | batch size=5, fp16, 8-bit optimizer | 21% |
| LLaMa-7B | 2 x 8 | ZeRO stage 1, micro batch size=4 | 15% |
| LLaMa-13B | 2 x 8 | ZeRO stage 2, micro batch size=2 | 29% |
| LLaMa-30B | 2 x 8 | ZeRO stage 3, micro batch size=4, activation recomputing | 98% |
| LLaMa-65B | 2 x 8 | ZeRO stage 3, micro batch size=8, activation recomputing, params offload | 30% |
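To make the ZeRO settings listed above more concrete, here is a minimal sketch of how the LLaMa-65B configuration (ZeRO stage 3, micro batch size 8, params offload) might be written as a DeepSpeed-style config. This rests on assumptions: the source does not say which framework ran these benchmarks, the helper `zero_config` is hypothetical, and offloading to `cpu` is a guess at the offload target. Activation recomputing is typically switched on in the model or training script, so it is left out here.

```python
import json


def zero_config(stage, micro_batch_size, offload_params=False):
    """Build a DeepSpeed-style config dict.

    Hypothetical helper for illustration only; the source does not
    name the training framework used for the benchmarks.
    """
    if stage not in (1, 2, 3):
        raise ValueError("ZeRO stage must be 1, 2, or 3")
    if offload_params and stage != 3:
        # In DeepSpeed, parameter offload is only valid with ZeRO stage 3.
        raise ValueError("parameter offload requires ZeRO stage 3")

    zero = {"stage": stage}
    if offload_params:
        # Assumption: offload to CPU memory; the source does not say where.
        zero["offload_param"] = {"device": "cpu"}

    return {
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "zero_optimization": zero,
    }


# Settings taken from the LLaMa-65B entry above.
llama_65b = zero_config(stage=3, micro_batch_size=8, offload_params=True)
print(json.dumps(llama_65b, indent=2))
```

The same helper covers the other LLaMa entries, for example `zero_config(stage=1, micro_batch_size=4)` for LLaMa-7B.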