Dell PowerEdge C4140 Deep Learning Performance Comparison - Scale-up vs. Scale - Page 38



Deep Learning Performance: Scale-up vs Scale-out
Architectures & Technologies Dell EMC | Infrastructure Solutions Group
Figure 31: Training with PowerEdge C4140-K-V100-16&32GB-SXM2 (8 GPUs) multi-node versus Non-Dell EMC SN_8x-V100-16GB-SXM2
Model          SN_8X V100_16GB-SXM2   MN-PowerEdge C4140-K-V100-SXM2 (16GB & 32GB)-IntelXeon4116   % Diff
Inception-v4   1606                   1625                                                         -1.21%
VGG-19         2449                   2406                                                          1.78%
VGG-16         2762                   2820                                                         -2.03%
Inception-v3   3077                   2845                                                          8.16%
ResNet-50      4852                   4500                                                          7.81%
GoogLeNet      7894                   8754                                                         -9.82%
AlexNet        16977                  12145                                                        39.79%
Table 5: 8x GPU Comparison between PowerEdge C4140-K multi-node and 8X SXM2
As the table above shows, the PowerEdge C4140 with SXM2 in multi-node mode comes within 10% of
8X SXM2 on six of the seven neural models tested. The most common ones, ResNet-50 and
Inception-v3, are within about 8% of 8X SXM2 (7.81% and 8.16%, respectively). The only exception
is AlexNet, where the gap between 8X SXM2 and PowerEdge C4140 reaches 39.79%.
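The page does not state how the % Diff column is calculated. The sketch below assumes it is the single-node result minus the multi-node result, divided by the multi-node result. Under that assumption the computed values match the reported ones to within 0.05 percentage points, and the small residuals may come from the throughput figures being rounded. The names `TABLE_5`, `percent_diff` and `max_deviation` are illustrative and do not come from the source.

```python
# Values copied from Table 5:
# (single-node 8X SXM2, multi-node PowerEdge C4140-K, reported % Diff).
TABLE_5 = {
    "Inception-v4": (1606, 1625, -1.21),
    "VGG-19": (2449, 2406, 1.78),
    "VGG-16": (2762, 2820, -2.03),
    "Inception-v3": (3077, 2845, 8.16),
    "ResNet-50": (4852, 4500, 7.81),
    "GoogLeNet": (7894, 8754, -9.82),
    "AlexNet": (16977, 12145, 39.79),
}


def percent_diff(single_node, multi_node):
    """Assumed formula: difference relative to the multi-node result, in percent."""
    return (single_node - multi_node) / multi_node * 100.0


def max_deviation(table):
    """Largest absolute gap between the computed and the reported % Diff."""
    return max(
        abs(percent_diff(sn, mn) - reported)
        for sn, mn, reported in table.values()
    )


if __name__ == "__main__":
    for model, (sn, mn, reported) in TABLE_5.items():
        computed = percent_diff(sn, mn)
        print(f"{model:<13} computed {computed:7.2f}%   reported {reported:7.2f}%")
    print(f"largest deviation: {max_deviation(TABLE_5):.3f} percentage points")
```

With this sign convention, a positive value means the single-node 8X SXM2 figure is the higher of the two, and a negative value means the multi-node PowerEdge C4140 figure is higher.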
The good performance shown by PowerEdge C4140 in multi-node mode, comparable to a single-node
8x V100-16GB server, was reached after the right software stack configuration with the