Velocity 08: Energy Efficient Operations

Luiz Barroso from Google is speaking about Energy Efficient Operations. Computing has a great track record of having a positive impact on society. The world needs more computing. But more computing means more energy (usually).

Servers account for around 1% of the world's total electricity consumption. Making efficient computers is harder than making efficient refrigerators. Efficiency is computing speed divided by power usage. But that's too simple. For a server, you have to take into account compute efficiency, server efficiency, and data center efficiency. These get multiplied together. Ugh.
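To see why the multiplication hurts, here is a minimal sketch in Python. It assumes each layer can be expressed as a fraction between 0 and 1, and the three values are made-up placeholders, not figures from the talk.

```python
def overall_efficiency(compute_eff, server_eff, datacenter_eff):
    """Multiply the three layers together, as described above."""
    return compute_eff * server_eff * datacenter_eff

# Hypothetical values, chosen only for illustration (not from the talk).
total = overall_efficiency(compute_eff=0.5, server_eff=0.75, datacenter_eff=0.5)
print(f"overall efficiency: {total:.4f}")
```

Even with layers that each look tolerable on their own, the product ends up well below any single one of them.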

Data centers are underutilized, which means wasted investment in power provisioning and less efficient power and air distribution. Typical server power supplies dissipate 25% of total energy as heat. Computers are least efficient at their most common operating points.

The operating cost of a data center is about $9/watt over 10 years. But the cost of building the data center is $10-22/watt. Per watt, facility costs matter more than operating costs. Maximizing usage is a great way to save energy.
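A back-of-the-envelope sketch of why unused capacity is expensive: if only part of the provisioned power is ever drawn, the build cost per watt actually used goes up. The $10-22/watt range is from the talk; the 80% utilization figure and the function name are assumptions for illustration.

```python
BUILD_COST_PER_WATT_RANGE = (10.0, 22.0)  # dollars per watt, from the talk
FRACTION_USED = 0.8                       # hypothetical share of provisioned power in use

def build_cost_per_used_watt(build_cost_per_watt, fraction_used):
    """Capital cost per watt actually drawn when only part of the capacity is used."""
    if not 0 < fraction_used <= 1:
        raise ValueError("fraction_used must be in (0, 1]")
    return build_cost_per_watt / fraction_used

for build_cost in BUILD_COST_PER_WATT_RANGE:
    effective = build_cost_per_used_watt(build_cost, FRACTION_USED)
    print(f"${build_cost:.0f}/W built -> ${effective:.2f}/W used")
```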

Here are some things to do:

  1. Consolidate workloads onto the minimum number of machines needed for peak usage requirements.
  2. Measure actual power usage of devices. Nameplates lie and overstate usage.
  3. Study activity trends and investigate oversubscription potential. You don't want to go over (bad for machines and bad from a contractual standpoint).

This lets you pack as many servers into your data center as you can. A study at Google showed that you have to be able to spread computing over a larger number of machines in order to really take advantage of oversubscription. At the facility level, you might be able to host 20% more servers through oversubscription.
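The arithmetic behind provisioning from measurements instead of nameplates can be sketched as follows. The budget and per-server wattages are hypothetical, picked so the gain lands near the 20% mentioned above; they are not numbers from the talk.

```python
def servers_that_fit(budget_watts, watts_per_server):
    """How many servers fit under a fixed facility power budget."""
    return budget_watts // watts_per_server

# All three numbers are hypothetical.
BUDGET_WATTS = 600_000
NAMEPLATE_WATTS = 300       # what the label claims per server
MEASURED_PEAK_WATTS = 250   # per-server share of the measured aggregate peak

by_nameplate = servers_that_fit(BUDGET_WATTS, NAMEPLATE_WATTS)
by_measurement = servers_that_fit(BUDGET_WATTS, MEASURED_PEAK_WATTS)
extra_fraction = (by_measurement - by_nameplate) / by_nameplate

print(f"{by_nameplate} servers by nameplate, {by_measurement} by measurement "
      f"({extra_fraction:.0%} more)")
```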

If you have a search cluster, a map-reduce cluster, and a web-mail cluster, the oversubscription potential is fairly low. But combined, they have substantially more because mixed workloads balance out demand better. Monitor and "victimize" a defined "best-effort" workload when problems arise.
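A toy illustration of why mixing helps: when the clusters peak at different times, the peak of the combined load is lower than the sum of the individual peaks. The traces below are invented for this sketch and do not come from the talk.

```python
# Hypothetical power draw (kW) for each cluster at the same six points in time.
search = [80, 95, 100, 90, 70, 60]
mapreduce = [100, 70, 60, 65, 90, 100]
webmail = [60, 80, 90, 100, 85, 65]

# Provisioning each cluster separately means budgeting for every individual peak.
sum_of_peaks = max(search) + max(mapreduce) + max(webmail)

# Provisioning them together only needs to cover the peak of the combined load.
combined = [s + m + w for s, m, w in zip(search, mapreduce, webmail)]
combined_peak = max(combined)

headroom = sum_of_peaks - combined_peak
print(f"sum of peaks: {sum_of_peaks} kW, combined peak: {combined_peak} kW, "
      f"headroom: {headroom} kW")
```

The headroom is the capacity that could, in principle, be oversubscribed, with the "best-effort" workload acting as the safety valve if the combined load ever climbs higher than expected.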

Switching to energy-proportional computing. Consider the data center as a single computer. Call it a "land-held computer." :-) Most of the time, servers aren't idle or at peak (unlike laptops). This is a result of the fact that high performance and high availability require load balancing and wide data distribution. We design them to work this way. The result is there are no useful idle intervals in which to shut a processor down. There are, however, lots of low-activity intervals.

An idle server uses about 50% of the peak power requirement. But if you plot efficiency, the server becomes much less efficient below about 30% usage. 100% isn't realistic, but getting over 30% is.
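One way to see where the efficiency drop comes from is a simple model in which power rises linearly from the idle level to peak. The linear shape is an assumption made for this sketch; only the 50% idle figure comes from the talk.

```python
def relative_power(utilization, idle_fraction=0.5):
    """Power draw as a fraction of peak, assuming a linear rise from idle to peak."""
    return idle_fraction + (1.0 - idle_fraction) * utilization

def relative_efficiency(utilization, idle_fraction=0.5):
    """Work done per unit of power, normalized so that full load equals 1.0."""
    return utilization / relative_power(utilization, idle_fraction)

for u in (0.1, 0.3, 0.5, 1.0):
    print(f"utilization {u:.0%}: power {relative_power(u):.0%} of peak, "
          f"efficiency {relative_efficiency(u):.0%} of peak")
```

Under this model a server at 30% utilization still draws 65% of peak power, so it delivers less than half the efficiency it has at full load. Passing `idle_fraction=0.0` models an ideal machine whose power tracks its load exactly, and the efficiency curve becomes flat.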

So, energy-proportional computing is the idea of making the efficiency more linear. This would greatly reduce the need for complicated power management. CPUs are actually better at energy proportionality than other components (like RAM, disk, network, fans, etc.). An idle CPU, for example, consumes less than 30% of its peak power, whereas DRAM is about 50%, disks are over 75%, and networking is over 85%!

Moreover, CPUs have active low power modes. A CPU at a slower clock rate still executes instructions, but DRAM and disks in low power mode need to bump up to full power to operate.

If there's any question whether this is a good idea, consider that the human body has a factor of 20 from its resting power consumption to peak (at least for elite athletes).

The most basic thing you can do is to write fast code. This is the software engineer's biggest contribution to energy efficiency.

Throughout the talk, Luiz referenced a paper from ISCA07. I believe this is it: "Power provisioning for a warehouse-sized computer" by Xiaobo Fan, Wolf-Dietrich Weber, and Luiz Andre Barroso.