Dell PowerEdge C4140 Deep Learning Performance Comparison - Scale-up vs. Scale - Page 43

Elapsed Training Time for Several Models


Deep Learning Performance: Scale-up vs Scale-out
Architectures & Technologies Dell EMC | Infrastructure Solutions Group
Figure 36: Relative speed performance based on training time
After training on the PowerEdge C4140 Configuration M with SXM2 in multi-node configuration, we saw that it achieved the fastest training time, 5.3 hours, surpassing the non-Dell EMC SN_8x-V100-16GB-SXM2, which completed training in 6.6 hours. See Figure 36.
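The two training times quoted above can be turned into a relative speed figure with simple arithmetic. The following is a minimal sketch only: the dictionary labels, the `relative_speed` helper, and the choice of the 8x V100 server as the baseline are assumptions made for illustration, and the exact normalization used in Figure 36 is not stated in this section.

```python
# Elapsed training times in hours, taken from the text above.
# The dictionary keys are illustrative labels, not official names.
training_hours = {
    "C4140-M-SXM2 (multi-node)": 5.3,
    "SN_8x-V100-16GB-SXM2": 6.6,
}


def relative_speed(hours, baseline):
    """Return each system's speed relative to the baseline (baseline = 1.0).

    A shorter training time gives a value above 1.0.
    """
    base = hours[baseline]
    return {name: base / t for name, t in hours.items()}


# Assumption: the 8x V100 server is used as the baseline.
speedups = relative_speed(training_hours, baseline="SN_8x-V100-16GB-SXM2")

for name, s in speedups.items():
    print(f"{name}: {s:.2f}x")

# Fraction of training time saved by the multi-node C4140 run.
time_saved = 1 - training_hours["C4140-M-SXM2 (multi-node)"] / training_hours["SN_8x-V100-16GB-SXM2"]
print(f"Training time reduction: {time_saved:.1%}")
```

Under these assumptions, the multi-node C4140 run is about 1.25 times as fast as the 8x V100 server, which corresponds to roughly a 20% shorter training time.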
7.3.1 Elapsed Training Time for Several Models
Another aspect we wanted to explore was the accuracy convergence capacity of other models, so we selected models with different network depths (VGG19, ResNet50, and Inception-v4) and ran the long tests on the PowerEdge C4140 in multi-node configuration and on the non-Dell EMC 8x V100 SXM2 server. The results are shown in Figure 37 below.