Our previous story on Intel's RISC aspirations raised the question of whether x86 was ready to finally move into that upper echelon dominated by RISC-based systems.
The feeling, shared by OEMs and reader feedback alike, is that the chip is not the issue; it's everything around it. A server is more than a CPU, after all, and the infrastructure around it is where the real make-or-break can be found.
Given the popular reaction to the story, we had to ask: What does x86, particularly Intel's Xeon 7000 line, need to compete in the world of mission-critical computing that RISC still owns?
The Xeon is broken up into three lines:
- The 3000, the low-end, low-power part for single-socket servers;
- The 5000 line, where Intel gets most of its business, used in everything from Web servers to app and cloud servers; and
- The 7000 line, about the closest thing Intel has to a RISC chip, Itanium aside.
The E7 is a 10-core processor, while its predecessor, the Xeon 7500, came in four- and eight-core versions. Both can process two threads per core, which is about par for the course among RISC processors. (Or perhaps “par for the cores….”)
It could be argued Sun had the advantage with its UltraSparc T3, which offered 16 cores with eight threads per core. But until the recently released T4, Sun's chips did not use out-of-order execution (OOE), which made their single-threaded performance pitiful compared to x86 and IBM's Power processors.
Sun was, in fact, papering over its weak single-core performance by throwing cores and threads at the problem. That's why, with OOE, it was able to cut the T3's 16 cores to just eight in the T4 and still beat the T3 in benchmarks.
With the Xeon 7500 and E7, Intel integrated many features found only in the Itanium. Chief among them was Machine Check Architecture (MCA) Recovery, a feature that allows the CPU and operating system to isolate errors that would otherwise crash the machine, such as bad memory DIMMs. Other features include QPI self-healing to protect against crashing during inter-processor communication errors and SMI Lane Failover to handle memory corruption errors.
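The software-visible side of this error isolation can be sketched briefly. On Linux, the EDAC subsystem exposes per-memory-controller counters for corrected and uncorrected errors under `/sys/devices/system/edac/mc`; a climbing corrected-error count is the usual early warning of a failing DIMM that features like MCA Recovery are designed to survive. The helper below is a hypothetical illustration, not part of any vendor tool, and takes the sysfs root as a parameter so it can be pointed at any directory tree:

```python
import os

def read_edac_counts(edac_root="/sys/devices/system/edac/mc"):
    """Collect corrected (ce_count) and uncorrected (ue_count) memory-error
    counts for each memory controller the Linux EDAC subsystem reports.

    Returns an empty dict when EDAC is absent (no such kernel support,
    or not running on Linux at all).
    """
    counts = {}
    if not os.path.isdir(edac_root):
        return counts  # no EDAC support on this kernel/hardware
    for mc in sorted(os.listdir(edac_root)):
        mc_dir = os.path.join(edac_root, mc)
        if not mc.startswith("mc") or not os.path.isdir(mc_dir):
            continue
        entry = {}
        for counter in ("ce_count", "ue_count"):  # corrected / uncorrected
            path = os.path.join(mc_dir, counter)
            if os.path.isfile(path):
                with open(path) as f:
                    entry[counter] = int(f.read().strip())
        counts[mc] = entry
    return counts

if __name__ == "__main__":
    for mc, entry in read_edac_counts().items():
        print(mc, entry)
```

A monitoring script might poll these counts and flag any controller whose corrected-error count keeps rising, long before an uncorrectable error forces the hardware recovery path.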
So, since Intel has a chip with many RISC features, what does it need around that chip to make it competitive with IBM Power, Intel Itanium, and Oracle Sparc? A fair amount, it turns out. And the chip itself needs a little help, too.
More Cores, More Threads
Cores and threads make their presence known in some of the most popular compute functions out there, namely business intelligence and data mining. Parallel processing needs cores, period. There's a reason the bulk of the machines on the Top 500 list are quad-core or higher and why the top-performing TPC-C benchmarks are all RISC machines.
To be more competitive, Intel has to go to four threads per core. Adding threads will be a lot easier than adding cores and it will accelerate all parallel processing jobs. Intel has been a stronger advocate of threads than has AMD, which has gone the pure-cores route. It's time to double down on threads.
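The scaling argument above can be made concrete with a toy reduction. The software workers in the sketch below map onto hardware threads: a 10-core, two-thread-per-core E7 presents 20 of them, so doubling threads per core doubles how many such workers can genuinely run at once. `parallel_sum` is a hypothetical helper for illustration, not any vendor's API:

```python
import os
from concurrent.futures import ThreadPoolExecutor

def parallel_sum(data, workers=None):
    """Split a reduction across worker threads.

    By default, one worker per hardware thread the OS reports; each
    worker sums its own chunk, and the partial sums are combined.
    """
    workers = workers or (os.cpu_count() or 1)
    chunk = max(1, len(data) // workers)
    # Slice the input into roughly equal chunks, one per worker.
    chunks = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum, chunks))
```

The result is identical to a serial sum; the point is only that the number of chunks making progress at any instant is bounded by the hardware thread count, which is why more threads per core helps embarrassingly parallel jobs like BI scans and data mining.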
We're Gonna Need a Bigger Bus
One of the big weaknesses of x86 is that it is a lowest-common-denominator architecture. x86 servers use the same basic hardware found in your desktop PC, yet the demands of the two could hardly be more different.
Server architecture is built around the PCI bus, which dates back to the early 1990s and got a speed bump about a decade ago with PCI Express. IBM, by contrast, took the bus architecture from its z Series mainframes and scaled it down for the Power-based System p servers, giving them a bus that is faster, more reliable, and more interconnected than PCI.
This is a bit of a problem for x86. Sure, HP, IBM, and Dell could make their own bus that's faster than PCI, but Microsoft is not going to make a variety of Windows Server releases for everyone's own flavor of system architecture. That's what happens when the software is made by one vendor and the hardware is made by another. Sun, HP's Itanium group, and IBM each make their own OS tuned for their own bus. But if these guys want a commodity OS, they need to use commodity hardware, and that can't compete with custom hardware.
Sliced and Diced
On the flip side, RISC systems are also defined by how finely they can be sliced into partitions. IBM's micro-partitioning can carve out as little as one-tenth of a physical processor's capacity. x86 software does handle partitioning at the processor level, so four processors in a blade can be divided among four separate tasks, but that's about the extent of it.
On the x86 side, partitioning is done primarily in software, while on the RISC side, partitioning is done in hardware. There is a distinct advantage to having the partitioning and virtualization at the firmware layer. HP, IBM, and Oracle are providing hardware, virtualization, and software. An x86 virtualized system has hardware from IBM/HP/Dell, virtualization from VMware or Citrix, and the OS from Microsoft or Red Hat. That opens the door to far more problems and far less integration.
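The closest software-side analogue to a tenth-of-a-processor micro-partition on x86 is CPU bandwidth limiting in recent Linux kernels, where a control group's entitlement is expressed as a quota of microseconds per scheduling period (the `cpu.cfs_quota_us` and `cpu.cfs_period_us` knobs). The converter below is a hypothetical sketch of that arithmetic, not a management tool:

```python
def cfs_bandwidth(cpu_fraction, period_us=100000):
    """Translate a fractional CPU entitlement into Linux CFS bandwidth
    settings. A fraction of 0.1 means one-tenth of a processor, the
    granularity IBM's micro-partitioning achieves in firmware; values
    above 1.0 mean more than one full CPU's worth of time.
    """
    if cpu_fraction <= 0:
        raise ValueError("entitlement must be positive")
    quota_us = int(cpu_fraction * period_us)  # runtime allowed per period
    return {"cpu.cfs_period_us": period_us, "cpu.cfs_quota_us": quota_us}
```

The structural difference the article describes still stands: on RISC iron this accounting happens below the OS, in firmware the hardware vendor also wrote, while on x86 it is one more layer of software from one more vendor.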
Another weakness of x86 is its inability to mix and match generations. RISC systems can run different processor clock speeds and generations or different versions of their operating systems on the same iron without requiring virtualization. Sun lets you mix generations of SPARC chips and versions of Solaris in containers, while IBM AIX 7 allows for old versions of AIX and their apps to be run inside containers without requiring a hypervisor. The support is built into the OS.
Microsoft is still working on that. And given the cloud emphasis of Windows Server 8 and the departure of Bob Muglia as head of the server division, it doesn't look like Windows will be getting any Unix-like features any time soon.
Up All Night
The main selling point of RISC systems is they stay running, no matter what. They are the pinnacle of mission critical computing and reliability. Being fast definitely counts, but being available matters more.
RISC vendors divide RAS (reliability, availability, serviceability) into processor-based, memory-based, and application-based camps, while the Xeon is still at just the processor level.
IBM, for example, takes a processor out of service if it detects a failure and shifts the workload completely off the processor. For memory, it allows administrators to install extra RAM in the computer, which is automatically used in the event of memory chip failure.
AIX, Solaris, and HP-UX have had live migration of virtual machines for years. Microsoft is promising it with Windows Server 8, so while it's coming, it's going to be generations behind Unix.
While Xeon has gained some hot-swap features, it still lacks some that RISC machines enjoy. Hot swapping of PCI cards and memory DIMMs, for example, requires OS support that Xeon systems don't yet have. For an IBM Power server, it's no problem.
In a cloud environment, that's not so bad: partition off the server with the bad DIMM or card, take it offline, and do the swap while hundreds or thousands of other servers hum right along. But in a mission-critical environment, where the server is the service, that can't happen.
Another area where Xeon is lacking is the enterprise operating systems that run on RISC. The fact is: HP, IBM, and Oracle are in no rush at all to port HP-UX, AIX, and Solaris to x86 (there was an x86 Solaris; Oracle waved bye-bye to that). And as long as that is missing, x86 won't be an enterprise platform. People don't buy hardware for the hardware's sake, they buy it for the software they can run on it.
The primary operating system for many x86 clusters, especially the supercomputer clusters on the Top 500 list, is Linux. Red Hat, Novell, and Oracle all provide good enterprise Linux implementations, but they are still years behind HP-UX and AIX in debugging, logging, load balancing, failover, security, partitioning… all the things mission-critical systems demand.
Since none of the big three RISC players are in any mind to port their operating systems, x86's chances are considerably reduced. x86 now has to pin its hopes on Microsoft and Windows Server 8, which is being designed to be the ultimate cloud and virtualization operating system. Or so Microsoft hopes.
It could be that x86 has no hope of ever penetrating the mission-critical domain of RISC hardware. But it could also be argued that it's not missing much. The emphasis today is on cloud computing, building out. The massive data centers Microsoft, Google, and Facebook have built in the last few years didn't use RISC; they used x86.
Because of the flexibility of a cloud system, uptime and the ability to make repairs while in operation aren't as urgent. So perhaps in the end, x86 isn't losing a whole lot.