each with three NVIDIA Tesla M2050 GPGPUs, and 34 DL580 G7 servers with two NVIDIA S1070 GPGPUs each. Tsubame 2.0 has a total peak performance of 2.4 petaFLOPS (2.4 quadrillion floating-point operations per second), making it the fourth-ranked supercomputer in the world on the November 2010 TOP500 list. The Green500 list ranks Tsubame 2.0 as the second-most efficient system and declares it "The World's Greenest Production Supercomputer."
Power and cooling support
High-performance graphics and accelerator cards can draw much more power than other types of PCIe cards. Many cards use 100 W to 300 W, significantly more than the 75 W supported through a standard x16 PCIe connector. The PCIe specification defines a method for supplying additional power to cards through auxiliary power cables and connectors on the motherboard. We added power connectors and optional cable kits to select ProLiant servers to deliver a total of 150 W, or up to 300 W, to accelerator cards.
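As a quick sanity check on those figures, the arithmetic follows from the per-source limits in the PCIe specification: 75 W through the x16 slot, 75 W from a 6-pin auxiliary cable, and 150 W from an 8-pin auxiliary cable. A minimal Python sketch (the helper function is hypothetical; the per-connector wattages are from the specification):

# Per-source power limits from the PCIe specification, in watts.
PCIE_X16_SLOT = 75   # delivered through the x16 edge connector itself
AUX_6_PIN = 75       # 6-pin auxiliary power cable
AUX_8_PIN = 150      # 8-pin auxiliary power cable

def card_power_budget(aux_connectors):
    """Total power available to one card: the slot plus any auxiliary cables."""
    return PCIE_X16_SLOT + sum(aux_connectors)

print(card_power_budget([]))                      # 75 W: slot only
print(card_power_budget([AUX_6_PIN]))             # 150 W: first configuration above
print(card_power_budget([AUX_6_PIN, AUX_8_PIN]))  # 300 W: second configuration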
The additional power use generates more heat and increases the cooling burden on servers and
facilities. We understand that this additional operating cost is a concern. To control power use, some
ProLiant G6 and G7 servers offer enhancements such as HP Advanced Power Manager for power
capping and 94% efficient Platinum Common Slot Power Supplies to help you reclaim power and use
it to run more equipment with your current infrastructure. These innovations can result in major savings
when you consider that there may be hundreds or thousands of nodes.
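To make the scale of those savings concrete, here is a back-of-the-envelope sketch. Only the 94% Platinum efficiency comes from the text; the node count, per-node load, and baseline efficiency are assumed figures chosen purely for illustration:

# Back-of-the-envelope PSU-efficiency savings. Only the 94% Platinum
# efficiency comes from the text; everything else is an assumed figure.
NODES = 1000
DC_LOAD_PER_NODE_W = 400.0      # assumed DC load per node, in watts
BASELINE_EFFICIENCY = 0.88      # assumed efficiency of an older supply
PLATINUM_EFFICIENCY = 0.94      # Platinum Common Slot Power Supply

def wall_power_w(efficiency):
    """AC power drawn from the facility to deliver the DC load."""
    return NODES * DC_LOAD_PER_NODE_W / efficiency

saved_w = wall_power_w(BASELINE_EFFICIENCY) - wall_power_w(PLATINUM_EFFICIENCY)
print(f"Facility power reclaimed: {saved_w / 1000:.1f} kW")  # ~29.0 kW

Under these assumptions, moving a 1,000-node cluster to the more efficient supplies frees roughly 29 kW of facility power, which is headroom for dozens of additional nodes on the same infrastructure.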
We also designed advanced sensor and fan control systems into G6 and G7 ProLiant servers to facilitate cooling of high-performance graphics cards and accelerators. These servers include a "Sea of Sensors" throughout the system and on components such as DDR3 DIMMs and disk drives. The servers use the sensors to construct an accurate view of the thermal profile within the server. The iLO 3 management controller uses a sophisticated control algorithm to optimize the speed of each fan based on the sensor measurements. This reduces power use and minimizes fan noise in non-peak conditions.
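HP has not published the iLO 3 control algorithm, so the sketch below is only a generic proportional controller illustrating the basic idea: drive each fan from the hottest sensor in its cooling zone, idle at a low floor speed when the zone is cool, and ramp up with temperature overshoot. All names and thresholds here are hypothetical.

def fan_duty(zone_temps_c, target_c=60.0, gain_pct_per_c=4.0,
             min_duty=20.0, max_duty=100.0):
    """Map the hottest sensor in a fan's cooling zone to a PWM duty cycle (%).

    Simple proportional control: at or below target_c the fan idles at
    min_duty; above it, duty rises gain_pct_per_c percent per degree of
    overshoot, capped at max_duty. Illustrative values throughout; the
    actual iLO 3 algorithm is more sophisticated and is not public.
    """
    overshoot_c = max(zone_temps_c) - target_c
    return min(max_duty, min_duty + gain_pct_per_c * max(0.0, overshoot_c))

print(fan_duty([42.0, 38.5, 45.1]))  # 20.0 -- cool zone, fan stays at its floor
print(fan_duty([42.0, 70.0, 45.1]))  # 60.0 -- a hot GPU sensor ramps the fan up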
The ProLiant SL6500 Scalable System uses shared fans and power supplies to deliver higher performance per watt than servers packaged with individual fans and power supplies. The SL design, with front cabling and no midplane, allows for unrestricted airflow. The "skinless" SL design reduces heat retention. The SL390s G7 supports accelerator cards with passive cooling, meaning that the SL6500 Enclosure's shared fans cool the cards rather than the less efficient fans on the cards themselves.
HP Cluster Management Utility
HPC users want the ability to control complex computing environments. The HP Cluster Management Utility (CMU) makes it easier to manage tens of thousands of compute nodes, both CPUs and GPUs. CMU is an integrated management system with an intuitive graphical interface. It supports all HP Linux-based environments and systems. With CMU, you can control a supercomputer cluster or a simple group of nodes with a single management tool.
With CMU, you can:
• Measure several characteristics of the server environment, including memory utilization and the rate of I/O reads and writes for each server (a sketch of collecting these figures on a Linux node follows this list).
• Monitor and set alerts for temperature, fan speeds, and hardware health metrics (including GPU metrics).
• Perform operations on multiple servers, such as starting them up and shutting them down.
• Install the OS on 1 or 1,000 servers, all from scratch, in less than 2 hours.
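CMU's own collection agents are proprietary, but the raw per-node figures behind the first item, memory utilization and I/O rates, are exposed by the Linux /proc filesystem. A minimal sketch of reading them on a single node (a cluster tool would aggregate these across all nodes):

import time

def memory_utilization():
    """Approximate fraction of RAM in use, from /proc/meminfo (values in kB)."""
    fields = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, value = line.split(":")
            fields[key] = int(value.split()[0])
    free = fields["MemFree"] + fields["Buffers"] + fields["Cached"]
    return 1.0 - free / fields["MemTotal"]

def io_rates_bytes_per_s(interval_s=1.0):
    """Read/write throughput summed over all block devices, sampled twice
    from /proc/diskstats (field 6 is sectors read, field 10 is sectors
    written; both count 512-byte sectors)."""
    def totals():
        read = written = 0
        with open("/proc/diskstats") as f:
            for line in f:
                parts = line.split()
                read += int(parts[5])
                written += int(parts[9])
        return read, written
    r0, w0 = totals()
    time.sleep(interval_s)
    r1, w1 = totals()
    return (r1 - r0) * 512 / interval_s, (w1 - w0) * 512 / interval_s

print(f"memory utilization: {memory_utilization():.1%}")
reads, writes = io_rates_bytes_per_s()
print(f"I/O: {reads / 1e6:.2f} MB/s read, {writes / 1e6:.2f} MB/s written")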