December 9, 2005 4:00 AM PST
Power could cost more than servers, Google warns
Last modified: December 9, 2005 9:55 AM PST
That situation wouldn't bode well for Google, which relies on thousands of its own servers.
"If performance per watt is to remain constant over the next few years, power costs could easily overtake hardware costs, possibly by a large margin," Luiz Andre Barroso, who previously designed processors for Digital Equipment Corp., said in a September paper published in the Association for Computing Machinery's Queue. "The possibility of computer equipment power consumption spiraling out of control could have serious consequences for the overall affordability of computing, not to mention the overall health of the planet."
Barroso's view is likely to go over well at Sun Microsystems, which on Tuesday launched its Sun Fire T2000 server, whose 72-watt UltraSparc T1 "Niagara" processor performs more work per watt than rivals. Indeed, the "Piranha" processor Barroso helped design at DEC, which never made it to market, is similar in some ways to Niagara, including its use of eight processing cores on the chip.
To address the power problem, Barroso suggests the very approach Sun has taken with Niagara: processors that can simultaneously execute many instruction sequences, called threads. Typical server chips today can execute one, two or sometimes four threads, but Niagara's eight cores can execute 32 threads.
Power has also become an issue in the years-old rivalry between Intel and Advanced Micro Devices. AMD's Opteron server processor consumes a maximum of 95 watts, while Intel's Xeon consumes between 110 watts and 165 watts. Other components also draw power, but Barroso observes that in low-end servers, the processor typically accounts for 50 percent to 60 percent of the total consumption.
Fears about energy consumption and heat dissipation first became a common topic among chipmakers around 1999, when Transmeta burst onto the scene. Intel and others immediately latched onto the problem, but coming up with solutions, while providing customers with higher performance, has proved difficult. While the rate at which power consumption increases has declined a bit, the overall amount of energy required still grows. As a result, a "mini-boom" has occurred for companies that specialize in heat sinks and other cooling components.
Sun loudly trumpets Niagara's relatively low power consumption, but it's not the only one to get the religion. At its Intel Developer Forum in August, Intel detailed plans to rework its processor lines to focus on performance per watt.
Over the last three generations of Google's computing infrastructure, performance has nearly doubled, Barroso said. But because performance per watt remained nearly unchanged, that means electricity consumption has also almost doubled.
If server power consumption grows 20 percent per year, the four-year cost of a server's electricity bill will be larger than the $3,000 initial price of a typical low-end server with x86 processors. Google's data center is populated chiefly with such machines. But if power consumption grows at 50 percent per year, "power costs by the end of the decade would dwarf server prices," even without power increasing beyond its current 9 cents per kilowatt-hour cost, Barroso said.
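The arithmetic behind that comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Barroso's own model: the 500-watt baseline draw is a hypothetical figure (the article gives none), the server is assumed to run around the clock, and "growth" is read as each year's new servers drawing more power than the previous year's. Only the $3,000 server price, the four-year window and the 9-cent rate come from the article.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: the server runs around the clock


def four_year_power_cost(baseline_watts, annual_growth, years_from_now,
                         price_per_kwh=0.09, service_years=4):
    """Electricity cost over the service life of a server bought
    `years_from_now` years out, if power draw per server grows by
    `annual_growth` with each passing year."""
    watts = baseline_watts * (1 + annual_growth) ** years_from_now
    kwh = watts / 1000 * HOURS_PER_YEAR * service_years
    return kwh * price_per_kwh


def years_until_power_exceeds_price(baseline_watts, annual_growth,
                                    server_price=3000.0, max_years=50):
    """First year in which the four-year power bill tops the server price,
    or None if that does not happen within `max_years`."""
    for year in range(max_years + 1):
        cost = four_year_power_cost(baseline_watts, annual_growth, year)
        if cost > server_price:
            return year
    return None


if __name__ == "__main__":
    # 500 W is a hypothetical baseline, not a figure from the article.
    for growth in (0.20, 0.50):
        year = years_until_power_exceeds_price(500, growth)
        print(f"growth {growth:.0%}: crossover after {year} year(s)")
```

Changing the baseline wattage shifts the crossover year, which is why the result depends heavily on what a "typical" server is assumed to draw.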
Barroso's suggested solution is to use heavily multithreaded processors. His term for the approach, "chip multiprocessor technology," or CMP, is close to the "chip multithreading" term Sun employs.
"The computing industry is ready to embrace chip multiprocessing as the mainstream solution for the desktop and server markets," Barroso argues, but acknowledges that there have been significant barriers.
For one thing, CMP requires a significantly different programming approach, in which tasks are subdivided so they can run in parallel.
Indeed, in a separate article in the same issue of ACM Queue, Microsoft researchers Herb Sutter and James Larus wrote: "Concurrency is hard. Not only are today's languages and tools inadequate to transform applications into parallel programs, but also it is difficult to find parallelism in mainstream applications, and--worst of all--concurrency requires programmers to think in a way humans find difficult."
But the software situation is improving as programming tools gradually adapt to the technology and multithreading processors start to catch on, Barroso said.
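The programming approach CMP calls for, subdividing a task so its pieces can run at the same time, might look roughly like the Python sketch below. The function names and the sum-of-squares workload are hypothetical, chosen only for illustration. Note also that in standard CPython, threads do not speed up CPU-bound work like this because of the global interpreter lock, so the sketch shows the structure of the approach rather than a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, n_chunks):
    """Split `items` into at most `n_chunks` contiguous pieces."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def sum_of_squares(chunk):
    """The independent unit of work each thread performs."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(items, workers=8):
    """Subdivide the task, run the pieces on a thread pool, then combine."""
    chunks = chunked(items, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(1000))))
```

The pieces here share no data, which is the easy case. The difficulty Sutter and Larus describe arises when subtasks must coordinate or share state.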
Another hurdle has been that much of the industry has been focused on processors designed for the high-volume personal computer market. PCs, unlike servers, haven't needed multithreading.
But CMP is only a temporary solution, he said.
"CMPs cannot solve the power-efficiency challenge alone, but can simply mitigate it for the next two or three CPU generations," Barroso said. "Fundamental circuit and architectural innovations are still needed to address the longer-term trends."
CNET News.com's Michael Kanellos contributed to this report.

Apple's switch from the PowerPC chip to Intel. It's extremely
important for laptops and servers where heat management is a
concern.
Invent fusion.
to think in ways that are unnatural.
Conventional programming languages approach
concurrency in a way that is unintuitive, but
that's a completely different story.
Erlang (http://www.erlang.org) has been used for
over a decade in commercial products with lots of
concurrency, and we have lots of evidence that
the programming model is both intuitive and safe,
in fact more so than object-oriented design.
We are eagerly awaiting multi-core chips, as they
offer us a perfectly natural way to scale up the
capacity of our products.
Ulf Wiger
Senior Software Architect
Ericsson AB
ESK
While I still believe that, I wonder if in parallel we don't develop a "New money," one based on BTUs (British Thermal Units - a method of determining heat values of combustion sources).
If that happened, might we then start pricing computers on BTUs consumed/expended, say, per Million Calculations Per Second? Would "computing economics" then force us all to have supercomputers for personal use, to justify the power expenditure? Or would "shared computing," a rapid growth area now called "outsourcing" or "co-location" computing, on more efficient, much larger systems serving many customers at one time, become much more prevalent?
I just wonder.
With compound growth, you can make all kinds of silly predictions sound plausible.
I suppose the ideal solution is to stop buying performance and start buying by power consumption. My only thought there is nobody is going to do that.
The cost of power is one of the operating costs, but if you can't include that cost in your revenue model, you shouldn't be in business.
I've done power calculations for a 1000 PC array, and the cost was little more than having one extra specialist employee.
We'd all like to run things cheaply, so maybe now's the time to be looking at energy recovery, recycling and sources.
Duh!
Super Scalar = Multiple Threads? No
by kahalb
June 11, 2006 9:41 AM PDT
Which is better: a superscalar processor that executes up to 8 instructions per cycle, or Niagara with 8 cores, that processes up to 8 instructions per CPU cycle? Sounds the same?
Not if the Niagara instructions are 8 unique programs (threads), all in some 'state' of waiting on memory, in Thread queues.
On a 2-core superscalar chip, that's 16 instructions per cycle, and on a 4-core, 32 instructions per cycle. Or an 8-core Sun with 8 instructions?
It's amazing how quickly we forget about superscalar architectures that can leverage 8 processing units per CPU cycle: load, branch, integer operation, floating-point operation. On one chip. Now Sun has developed Niagara, a non-superscalar chip in which each core only does one instruction per cycle, but which has 8 of them, rather than 1 processor that can do 8 instructions per cycle. Which would you rather manage to get added throughput: 8 unique instances, or 1?
To get worthwhile jbb results Niagara runs 4 JVMs, versus one on superscalar chips. Do you want to run 4 instances of your Java application to get the scaling you can with one?
And at what cost? Is Niagara actually cheaper to buy and own than other 8 core solutions?
And since when is doing less work with less power a novel idea? We can all run on 286s with today's fabs and use a fraction of the power... or simplified 1994 US2 technology to develop a new programming model.