


Deep Learning Performance: Scale-up vs Scale-out
1 Overview
The objective of this whitepaper is to compare Dell's PowerEdge acceleration-optimized servers and determine their performance when running deep learning workloads. The purpose is to highlight how Dell's scale-out solution is ideally suited for these emerging workloads.

We compare how the PowerEdge C4140 performs using TensorFlow, one of the most popular deep learning frameworks, with various neural network architectures, and measure it against other acceleration-optimized servers in the market that target the same workloads. The idea is to investigate whether its architectural implementation helps the PowerEdge C4140 utilize its accelerators more effectively, both in hardware-level benchmarks such as Baidu DeepBench and in TensorFlow-based benchmarks.
Baidu DeepBench lets us profile kernel operations, the lowest-level compute and communication primitives of deep learning (DL) applications, and thus see how the accelerators perform at the component level in different server systems. This is important because it shows which hardware provides the best performance on the basic operations used by deep neural networks. DeepBench includes operations and workloads that are important to both training and inference.
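To make the component-level idea concrete, the following is a minimal sketch of a DeepBench-style microbenchmark: timing a single GEMM kernel in TensorFlow. The matrix shape and iteration count here are illustrative assumptions; the actual DeepBench suite runs a fixed set of kernel shapes drawn from production DL models.

```python
import time
import tensorflow as tf

M, N, K = 5124, 700, 2048   # illustrative GEMM shape, not DeepBench's official list

a = tf.random.normal([M, K])
b = tf.random.normal([K, N])

@tf.function
def gemm(x, y):
    return tf.matmul(x, y)

_ = gemm(a, b)              # warm-up: compile/autotune outside the timed region

iters = 100
start = time.perf_counter()
for _ in range(iters):
    out = gemm(a, b)
_ = out.numpy()             # block until the device stream drains before stopping the clock
elapsed = time.perf_counter() - start

flops = 2.0 * M * N * K * iters   # 2*M*N*K floating-point operations per GEMM
print(f"avg kernel time: {elapsed / iters * 1e3:.3f} ms, "
      f"{flops / elapsed / 1e12:.2f} TFLOP/s")
```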
Using TensorFlow as the primary framework, we compare performance in terms of throughput and the training time needed to reach a target accuracy on the ImageNet dataset. We look at performance at both the single-node and the multi-node level, using popular neural network architectures such as ResNet-50, VGGNet, GoogLeNet, and AlexNet.
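As a rough illustration of how such a throughput number can be obtained, here is a minimal sketch that measures training images/sec for ResNet-50 on synthetic data with tf.keras. The batch size and step counts are assumptions chosen for illustration; the measurements in this whitepaper use the real ImageNet dataset and established TensorFlow benchmark scripts rather than this loop.

```python
import time
import tensorflow as tf

batch_size = 64   # illustrative; real runs tune this per GPU
model = tf.keras.applications.ResNet50(weights=None, classes=1000)
model.compile(optimizer="sgd", loss="sparse_categorical_crossentropy")

# Synthetic ImageNet-shaped batch, repeated so only compute (not I/O) is measured.
images = tf.random.uniform([batch_size, 224, 224, 3])
labels = tf.random.uniform([batch_size], maxval=1000, dtype=tf.int32)
data = tf.data.Dataset.from_tensors((images, labels)).repeat()

model.fit(data, steps_per_epoch=10, epochs=1, verbose=0)   # warm-up

steps = 50
start = time.perf_counter()
model.fit(data, steps_per_epoch=steps, epochs=1, verbose=0)
elapsed = time.perf_counter() - start
print(f"throughput: {steps * batch_size / elapsed:.1f} images/sec")
```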
1.1 Definitions
Scale up: Scale up is achieved by putting the workload on a bigger, more powerful server (e.g., migrating from a two-socket server to a four- or eight-socket x86 server in a rack-based or blade form factor). This is a common way to scale databases and several other workloads. It has the advantage of allowing organizations to avoid making significant changes to the workload; IT managers can simply install the workload on a bigger box and keep running it the way they always have.
Scale out: Scale out refers to expanding to multiple servers rather than to a single bigger server. A prime example is the use of availability and clustering software (ACS) and its server node management, which enables IT managers to move workloads from one server to another or to combine them into a single computing resource. Scale out adds flexibility by allowing IT organizations to add nodes as the number of users or workloads increases, which helps keep IT budgets under better control.
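In the deep learning context of this whitepaper, scale out corresponds to multi-node training. As a minimal sketch of what that can look like in TensorFlow, the snippet below wires two workers together with MultiWorkerMirroredStrategy; the host names, port, and cluster layout are hypothetical placeholders, and the whitepaper does not prescribe this particular API.

```python
import json
import os
import tensorflow as tf

# Each node sets TF_CONFIG with the full cluster spec and its own index
# (index 0 on node1, index 1 on node2). Hosts and port are hypothetical.
os.environ["TF_CONFIG"] = json.dumps({
    "cluster": {"worker": ["node1:12345", "node2:12345"]},
    "task": {"type": "worker", "index": 0},
})

# Must be created early, before other TensorFlow ops run.
strategy = tf.distribute.MultiWorkerMirroredStrategy()

with strategy.scope():
    # Variables created under the strategy scope are replicated, and
    # gradients are all-reduced across the worker nodes every step.
    model = tf.keras.applications.ResNet50(weights=None, classes=1000)
    model.compile(optimizer="sgd", loss="sparse_categorical_crossentropy")

# model.fit(...) would then run data-parallel training across both workers.
```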