Dell PowerEdge C4140 Deep Learning Performance Comparison - Scale-up vs. Scale-out
Deep Learning Performance: Scale-up vs Scale-out

7.4.1 Hyper-parameters tuning

The commands below show the hyper-parameter tuning used to maximize throughput in the single-node and distributed server implementations. Figure 41 shows the significant impact of hyper-parameter tuning on throughput performance.

Single Node - TensorFlow:

python3 tf_cnn_benchmarks.py --variable_update=replicated \
  --data_dir=/data/imagenet_tfrecord/train --data_name=imagenet \
  --model=ResNet50 --batch_size=128 --device=gpu --num_gpus=4 \
  --num_epochs=90 --print_training_accuracy=true --summary_verbosity=0 \
  --momentum=0.9 --piecewise_learning_rate_schedule='0.4;10;0.04;60;0.004' \
  --weight_decay=0.0001 --optimizer=momentum --use_fp16=True \
  --local_parameter_device=gpu --all_reduce_spec=nccl --display_every=1000

Distributed Horovod - TensorFlow:

mpirun -np 8 -H 192.168.11.1:4,192.168.11.2:4 \
  -x NCCL_IB_DISABLE=0 -x NCCL_IB_CUDA_SUPPORT=1 -x NCCL_SOCKET_IFNAME=ib0 \
  -x NCCL_DEBUG=INFO --bind-to none --map-by slot \
  --mca plm_rsh_args "-p 50000" \
  python tf_cnn_benchmarks.py --variable_update=horovod \
  --data_dir=/data/imagenet_tfrecord/train --data_name=imagenet \
  --model=ResNet50 --batch_size=128 --num_epochs=90 --display_every=1000 \
  --device=gpu --print_training_accuracy=true --summary_verbosity=0 \
  --momentum=0.9 --piecewise_learning_rate_schedule='0.4;10;0.04;60;0.004' \
  --weight_decay=0.0001 --optimizer=momentum --use_fp16=True \
  --local_parameter_device=gpu --horovod_device=gpu \
  --datasets_num_private_threads=4

Architectures & Technologies Dell EMC | Infrastructure Solutions Group 47
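The schedule string '0.4;10;0.04;60;0.004' alternates learning rates with epoch boundaries: the rate is 0.4 until epoch 10, 0.04 until epoch 60, and 0.004 thereafter. A minimal Python sketch of how such a string can be interpreted (the helper name `piecewise_lr` is illustrative, not a function from tf_cnn_benchmarks):

```python
def piecewise_lr(schedule, epoch):
    """Return the learning rate for a given epoch under a piecewise
    schedule string of the form 'lr0;e1;lr1;e2;lr2', meaning lr0
    before epoch e1, lr1 from e1 up to e2, and lr2 from e2 onward."""
    parts = schedule.split(';')
    rates = [float(p) for p in parts[0::2]]       # 0.4, 0.04, 0.004
    boundaries = [int(p) for p in parts[1::2]]    # 10, 60
    for boundary, rate in zip(boundaries, rates):
        if epoch < boundary:
            return rate
    return rates[-1]

# Example: the schedule used in the commands above
for e in (0, 30, 90):
    print(e, piecewise_lr('0.4;10;0.04;60;0.004', e))
```

Stepping the rate down by 10x at fixed epochs is a common recipe for ResNet-50 training on ImageNet; the relatively high initial rate of 0.4 reflects the large effective batch size when training across multiple GPUs.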