Dell PowerEdge C4140 Deep Learning Performance Comparison - Scale-up vs. Scale - Page 14
Deep Learning Performance: Scale-up vs Scale-out

4.1.2 Long Test

The long tests were run to measure throughput and the training time needed to reach a given accuracy convergence. Each training run used 90 epochs. These tests were run using the maximum number of GPUs supported by each server.

The section below describes the setup used, and Table 1 gives an overview of the test configuration.

Use Case - The benchmark tests target image classification with convolutional neural network (CNN) models.
Benchmark code - TensorFlow Benchmarks scripts.
Hardware Configuration - Each server is configured based on its maximum GPU support.
Servers - The servers tested are the PowerEdge R740, PowerEdge C4130, PowerEdge C4140, and a non-Dell EMC 8x NVLink GPU server.
Frameworks - TensorFlow for single node, and TensorFlow with the Horovod library for distributed training.
Performance - The performance metrics used for comparison across servers are throughput (images per second) and training time to reach top-5 accuracy and top-1 accuracy.
Training tests - We conducted two types of tests:
1 - Short Tests: for each test, 10 warmup steps were run and then the next 100 steps were averaged.
2 - Long Tests: run to obtain the training accuracy convergence and the elapsed training time.
Dataset - ILSVRC2012
Software stack configuration - The benchmarks were run in a Docker container environment. See Table 1 for details.
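The two metrics described above can be illustrated with a minimal Python sketch. It is not the TensorFlow Benchmarks code. The function names, the per-step timing list, and the per-epoch log format are hypothetical, and the sketch assumes that "averaged" means total images processed divided by total time over the 100 measured steps. The numbers in the usage lines are made up for illustration and are not measured results.

```python
from typing import Optional, Sequence, Tuple


def average_throughput(step_seconds: Sequence[float], batch_size: int,
                       warmup_steps: int = 10, measured_steps: int = 100) -> float:
    """Return images per second over the measured steps, skipping warmup steps.

    Assumption: throughput is total images divided by total elapsed time
    across the measured window (hypothetical helper, not from the source).
    """
    measured = step_seconds[warmup_steps:warmup_steps + measured_steps]
    if len(measured) < measured_steps:
        raise ValueError("not enough steps recorded for the measured window")
    return batch_size * len(measured) / sum(measured)


def time_to_accuracy(epoch_log: Sequence[Tuple[float, float]],
                     target: float) -> Optional[float]:
    """Return elapsed seconds at the first epoch whose accuracy reaches target.

    epoch_log holds (elapsed_seconds, accuracy) pairs in training order.
    Returns None if the target accuracy is never reached.
    """
    for elapsed_seconds, accuracy in epoch_log:
        if accuracy >= target:
            return elapsed_seconds
    return None


# Illustrative values only: 10 slow warmup steps, then 100 steady steps.
step_seconds = [0.5] * 10 + [0.25] * 100
print(average_throughput(step_seconds, batch_size=64))

# Illustrative (elapsed_seconds, top-1 accuracy) pairs, one per epoch.
epoch_log = [(100.0, 0.50), (200.0, 0.76), (300.0, 0.80)]
print(time_to_accuracy(epoch_log, target=0.75))
```

Excluding the warmup steps keeps one-time startup costs, such as graph construction and input pipeline fill, out of the reported images-per-second figure.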
4.2 Throughput Testing

Workload application and model - Image classification with convolutional neural network (CNN) models
Benchmarks code - TensorFlow Benchmarks scripts
Servers - Single Node; Servers - Multi Node (2 nodes, 4 GPUs each):

Server                            GPU
PowerEdge R740                    P40
PowerEdge C4140                   V100-16GB-SXM2
PowerEdge C4140                   V100-32GB-SXM2
Non Dell EMC 8x NVLink server     V100-16GB-SXM2
PowerEdge C4140-K                 V100-16GB-SXM2
PowerEdge C4140-K                 V100-32GB-SXM2
PowerEdge C4140-M                 V100-16GB-SXM2

Frameworks - TensorFlow for Single Mode; TensorFlow with Horovod library for Distributed Mode

Architectures & Technologies Dell EMC | Infrastructure Solutions Group 13