For the last two or three generations, the computer, by definition, has been something small. “The next big thing,” as presented by cool-looking guys in black turtleneck sweaters and blue jeans, has been something you hold in your hand. It’s something folks can stare at and be amazed by while congregating at the coffee shop.
There was a time when the computer, by definition, was something big. “The next big thing” was presented by stern-looking, preoccupied fellows in white lab coats and Keds, and the “device” consumed an entire college annex. Folks could only stare at it and be amazed from behind sheet-glass windows. But not only was the computer physically large; it stood for something large, something greater than any one of us, something as bold as all of us combined. It represented the ideals and aspirations of a country, not the angst and self-absorption of its populace. Since World War II, supercomputing has always been about staring down the practical limitations, the laws of physics, and the barriers to our productivity — and saying no.
Last November, when a Chinese supercomputer surged past all its American counterparts to claim the coveted #1 position on the Top 500 performers’ list, it was as if an unchanged page from a 1950s newsmagazine had traversed a wormhole in time, into the Web. Some analysts touted it as a potential “Sputnik moment,” an opportunity for competition to spur new US government investment in scientific research. For at least a week, something big appeared to matter again.
“While this was a stunning achievement, this does not mean China has surpassed the U.S. in supercomputing,” announced Lawrence Livermore National Laboratory Associate Director Dona Crawford in a February 2011 speech to the Commonwealth Club in San Francisco. Several times in the past, Livermore had held the top spot on the Top 500 with its IBM Blue Gene systems, and had defended that position vigorously.
Now, sounding a little like a leading political candidate who, after losing a straw poll, publicly denied the relevance of straw polls, Crawford continued, “To address this question, it's important to look not just at the hardware, but the software and applications running on these systems — what I call the ‘computing ecosystem.’” That ecosystem, as she described it, included a thorough and granular comprehension by American scientists of the physics, the math, the processes and procedures, and the architecture not just of computers but of the problems they are designed to solve — a comprehension she implied the Chinese can’t just pull out of a hat. Americans build supercomputers, she said — albeit at great length — to do big things.
But the one thing that matters, at least at the moment, is how those supercomputers perform in benchmark tests. And it matters because that’s what triggers the funding. As Crawford closed, “The latest Top 500 list is a demonstration of China's HPC [high-performance computing] plans and a step along their roadmap. As such, it is a wake-up call to the U.S. The U.S. can’t afford to be complacent as global competition heats up. There has always been international competition in HPC, but 2011 is different from the past. We are currently experiencing a significant technology disruption, there is a different climate of global economic competitiveness — and these two, coupled with the fact that the U.S. does not have a funded comprehensive plan to address the technology transition to exascale, provides an opportunity for others to leapfrog the U.S.”
“Exascale” refers to the next projected leap in supercomputing power: machines capable of a quintillion floating-point operations per second, a thousandfold beyond the petascale systems of today, by around 2018. The “scale” used in demonstrating this power, at least in discussions at public conventions and in the halls of Congress, is the Linpack benchmark used to determine Top 500 rankings. Presently, this benchmark says that the Chinese supercomputer Tianhe-1A is 46% faster than the fastest U.S. machine, a Cray cluster of nearly 225,000 AMD Opteron cores called Jaguar.
In February 2011, President Obama requested that $112 million be apportioned in the 2012 Federal Budget specifically for research and development of exascale computing. The request was made in the wake of, and perhaps in response to, a stern and plainly stated warning issued in March 2010 by DARPA, the research arm of the Department of Defense. That warning states that exascale cannot be achieved if supercomputer evolution proceeds on its current path, and it cites three practical concerns:
- If high-performance computing (HPC) systems, including supercomputers, continue to be designed and manufactured as they are now, within the next few years they will require their own dedicated power plants. The cost of constructing those plants every five years could exceed $15 billion.
- To the extent that they’re connected to the Internet, supercomputers will become the biggest, most attractive, and simultaneously most vulnerable targets of coordinated cyber attacks.
- The software required to manage the colossal tasks planned for supercomputers (including a simulated 2020 national power grid, a simulated manned mission to Mars, and a model for evacuating California in the event of a 9.0 earthquake) must be fully funded. Innovation isn’t free.
The Commoditization Problem
The efficacy of an evacuation plan should an 8.0 or higher earthquake strike Los Angeles, the next generation of safeguards against nuclear power plant meltdowns, the toxic effects of chemical dispersants against massive oil slicks — these are the types and the scale of problems being entrusted to America’s supercomputers. They are the very symbol of U.S. ingenuity and scientific prestige, and they’re expected to pave the way for enterprise data center servers to take over these tasks before long.
But although supercomputers’ perceived performance has scaled up by orders of magnitude in recent years, the problems they’re being tasked to solve and the expectations being placed upon them are scaling up even faster — much faster than their present architecture and management software can handle.
“The problem is that we want to eat our cake and have it, too,” explains Dr. John McCalpin, research scientist with the Texas Advanced Computing Center at the University of Texas at Austin. “We want machines that are easier to program. On the other hand, we have a tendency to purchase machines based on their peak performance. And as a consequence, we don’t provide the economic incentives for any vendors to make machines that are easier to program.”
Dr. McCalpin — known to many of his colleagues and students as “Dr. Bandwidth” — has devoted his career and much of his life to the study and development of “performance,” and to the need for a new, more practical, and perhaps more feasible definition of true supercomputing power. “I spent the last 15 years studying the characteristics of different applications in high-performance computing,” he remarks.
The current decade’s problem starts with the previous decade’s solution: the processor. Without a doubt, the greatest breakthrough in supercomputing in the past decade — the one that led to what many perceive as the industry’s resurgence — is the adoption of what the industry calls commercial off-the-shelf (COTS) processors. Essentially it’s the use of everyday x86 CPUs (and now x64, their 64-bit successors) produced for commercial servers, such as Intel Xeon and AMD Opteron processors, along with non-x86 alternatives such as IBM’s PowerXCell. Data center clusters utilize blades or boards that stack four, six, or eight of these multicore units together, some now with as many as ten cores apiece and, before long, sixteen from AMD. The core counts of whole supercomputers now run well into the hundreds of thousands.
Since the first semi-annual Top 500 list was published in June 1993, Dr. McCalpin has kept close tabs on the contents of each list, and has assembled the following data: In the very first list, compiled prior to the advent of dual-core processors, 95 machines had only one processor, and thus one core. Some 265 machines on that first list had no more than four processors. By the end of 2002 — still before the multicore era — COTS processors were responsible for less than 10% of the total compute power generated by the Top 500 machines. As of November 2010, x86/x64 processors were responsible for 84% of that compute power aggregate. And the biggest slice of the remainder is attributable to IBM’s Power architecture, whose PowerPC variant powered Apple Mac computers up until 2006, and whose Cell derivative, the basis of the PowerXCell used in supercomputers, now powers the Sony PlayStation 3 game console.
As of November 2010, only five machines among the Top 500 are powered principally by non-COTS processors; and 265 machines have between 1,280 and 6,550 processor cores. What makes these cores stackable in supercomputer architectures are devices called interconnects — literally the fastest electronic communications systems ever developed. Even so, as TACC’s Dr. McCalpin points out, interconnects cannot even begin to compensate for an architectural deficiency introduced by off-the-shelf CPUs.
“If you go look at an architecture manual for a microprocessor family, you’ll find that communications does not exist as an architectural concept,” he explains.
What passes for “communications” between threads of parallel source code is what Intel calls ordered memory references. Ostensibly, a strongly ordered memory model enables two or more threads to access the same memory locations apparently in parallel, while maintaining the integrity of that memory as though the references were sequential. Sharing memory is one way for cores to pass data to one another, especially with the x86/x64 architecture, where registers (simple storage bins for variables) are scarce. To keep those references ordered, x86/x64 often relies on a technique called fences — instructions that force a core to finish its pending memory operations before later ones proceed, so that every thread observes shared data in a consistent order. But that technique is one of the costliest in terms of instruction cycles. What’s more, Dr. McCalpin points out, “You can’t optimize it.”
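To see what a fence looks like from the programmer’s side, here is a minimal sketch in C11 (my illustration, not drawn from Intel’s manuals or from McCalpin’s work): one thread publishes data, another spins on a flag, and the fences force each core to drain its pending memory operations so the flag is never observed before the data it guards. Every one of those forced waits is paid for in instruction cycles.

```c
/* Minimal illustration of memory fences (C11 atomics + POSIX threads).
 * Build with something like: cc -std=c11 -pthread fence.c */
#include <stdatomic.h>
#include <pthread.h>
#include <stdio.h>

static int payload;              /* shared data being passed between cores */
static atomic_int ready;         /* flag announcing that the data is valid */

static void *producer(void *arg) {
    (void)arg;
    payload = 42;                                   /* write the data first */
    atomic_thread_fence(memory_order_release);      /* fence: data before flag */
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
    return NULL;
}

static void *consumer(void *arg) {
    (void)arg;
    while (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
        ;                                           /* spin until flagged */
    atomic_thread_fence(memory_order_acquire);      /* fence: flag before data */
    printf("received %d\n", payload);
    return NULL;
}

int main(void) {
    pthread_t p, c;
    pthread_create(&c, NULL, consumer, NULL);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}
```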
Even with the fastest interconnect fabric, McCalpin explains, a message sent by one CPU may take 1 microsecond (1 µs) to reach another CPU. “If the processors are running at 3 GHz, that’s 3,000 instructions that get lost on either end while you’re waiting for this message to get sent. Part of that interconnect latency is the speed of light, which we don’t know what to do about. But most of the latency is due. . . to the [CPUs’] I/O architecture. They were originally designed to talk to disks, when a microsecond was so fast that it didn’t matter because accessing a disk would take 10 milliseconds [10 ms, or 10,000 µs]. When you try to use that same infrastructure to connect between processors, even when you work very, very hard to make it as efficient as possible, 1 to 2 µs is about the best you can do.”
Engineers created interconnect fabrics such as InfiniBand, an industry standard now maintained by the InfiniBand Trade Association, to provide supercomputer cores with a high-speed alternative for referencing memory, one that overcomes the x86/x64 architectural limitations. Indeed, InfiniBand has been so successful at this task, McCalpin reports, that some of TACC’s customers prefer to run their jobs on its slower Lonestar machine — the one with the modern quad-data-rate InfiniBand interconnect — rather than on the newer, theoretically more powerful Ranger machine with its older single-data-rate InfiniBand.
TACC announced last August that it will be replacing its old Lonestar with a new Lonestar that promises even faster data rates using a 40 gigabit per second (Gbps) InfiniBand fabric from Mellanox. And the InfiniBand Trade Association is promising theoretical throughput of 300 Gbps over 12 simultaneous channels by the end of this year.
Yet that huge data rate comes at a cost that the creators of the Internet itself believe may be prohibitive. In its March 2010 report, DARPA pointed out that there have historically been three easy ways to improve the performance of a COTS-based supercomputer: You could increase the clock speed of the CPU, you could decrease the supply voltage to enable tighter component integration, and you could increase the number of transistors on the CPU. Those three dials have effectively been turned about as far as they can go; you can’t increase clock speed further without overheating, and you can’t decrease voltage further without introducing unacceptable error rates. Maybe you can still pack in more transistors, but only if materials innovations continue to enable further miniaturization without introducing power leakage.
In the absence of those three methods, the fourth alternative was to rely on interconnects like InfiniBand. But in DARPA’s words, “Current interconnect protocols are beginning to require energy and power budgets that rival or dwarf the cost of doing computation.”
The Parallelization Logjam
In 1964, computer architect Seymour Cray launched the supercomputing industry with an innovation for the CDC 6600 called pipelining, a way to expedite a computation by breaking it into stages and overlapping them, so that several operations are in flight at once. From the modern programmer’s standpoint, this principle is called parallelism. It’s this principle that eventually led to multicore.
There is a limit to how much any program compiled for any microprocessor can run in parallel — meaning, how much it can be subdivided into threads that can be executed simultaneously. The original way to implement parallelism in software, still used today with Intel’s Itanium processors, is to explicitly mark the places in the code where work can be subdivided and run concurrently. When a program is written in an abstract, high-level language like C++, compilers can insert those marks for CPUs to find and use. In an effort to make parallelism more popular, Intel has recently been offering developers using Microsoft Visual Studio a toolkit called Parallel Studio. With it, many of them can experiment with these concepts for the first time.
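For readers who have never seen explicit parallel marking, here is a minimal sketch using OpenMP, one widely supported way of annotating C code for parallel execution (an illustration of the general idea, not of Parallel Studio itself): a single pragma tells the compiler where a loop may be split across threads.

```c
/* Minimal illustration of marking a loop for parallel execution with OpenMP.
 * Build with something like: cc -std=c11 -fopenmp saxpy.c */
#include <omp.h>
#include <stdio.h>

#define N 1000000

static double a[N], b[N], c[N];

int main(void) {
    double sum = 0.0;

    /* The pragma is the "mark": each thread is handed a slice of the
     * iteration space, and the partial sums are combined at the end. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        c[i] = a[i] + 2.0 * b[i];
        sum += c[i];
    }

    printf("checksum %.1f using up to %d threads\n", sum, omp_get_max_threads());
    return 0;
}
```

Remove the pragma and exactly the same code runs serially on one core, which is why the marks are cheap to add and just as easy to leave out.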
Getting commercial software developers to try out parallelism for themselves has been a matter of delicate and precision-guided education. One company providing this education is Hewlett-Packard. “When we first started working on the x86 clusters that have become the dominant form of computing in HPC, we started initiatives well before we started the multicore one, helping [independent software vendors], commercial software vendors, and end users move their applications into a parallel environment,” says Ed Turkel, HP’s veteran business development manager. “We did it through development of the tools and the math libraries, and providing engineering resources to help enable them to do it.
“What’s fundamentally different between [supercomputing] and most enterprise applications is that very few non-computational applications really take advantage of large-scale parallelism,” Turkel adds. He pointed to one recent example — the move by Adobe to incorporate multithreaded processes in its latest edition of Photoshop — as one of the few instances of successful use of explicit parallelism for consumer applications. (Even so, as reviewers have noted, its actual use of multithreading appears relatively weak.)
The first enterprise-class HPC applications that adapted to the new multicore world, he adds, were transactional in nature: simple data processing routines that could easily be divided among specific cores. HP calls this class throughput applications. The most obvious use of parallelism for this class is simple replication, Turkel says. “Maybe only a few threads, but a large number of them. When you look at transactional processing applications, the real change is being able to use these multicore processors and simply run more transactions immediately.”
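What that replication can look like in code, assuming POSIX threads and a hypothetical process_transaction routine standing in for real work: a fixed pool of workers, each handling its own slice of an independent transaction stream, with no communication between them.

```c
/* Minimal illustration of "throughput" parallelism: independent transactions
 * replicated across worker threads. The transaction itself is a stand-in. */
#include <pthread.h>
#include <stdio.h>

#define WORKERS       8
#define TRANSACTIONS  1000

static void process_transaction(int id) {
    (void)id;   /* hypothetical unit of independent work: a query, an order... */
}

static void *worker(void *arg) {
    int w = *(int *)arg;
    for (int t = w; t < TRANSACTIONS; t += WORKERS)   /* this worker's slice */
        process_transaction(t);
    return NULL;
}

int main(void) {
    pthread_t threads[WORKERS];
    int ids[WORKERS];

    for (int w = 0; w < WORKERS; w++) {
        ids[w] = w;
        pthread_create(&threads[w], NULL, worker, &ids[w]);
    }
    for (int w = 0; w < WORKERS; w++)
        pthread_join(threads[w], NULL);

    printf("processed %d transactions on %d workers\n", TRANSACTIONS, WORKERS);
    return 0;
}
```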
Web services, which by design are simple and granular, and which respond to simple requests with packaged data, are superb examples of this class, Turkel explains. But on the other side, computational applications include the “killer apps” that enterprises had already adopted, and which needed to be reconditioned not only to embrace parallelism and get faster, but also to avoid getting slower — an observed phenomenon in which a single-threaded program simply cannot benefit from the availability of multiple cores it cannot use. The segment most critically affected by this phenomenon was computer-aided engineering (CAE), perhaps the one segment that most closely resembles the problem-solving and data modeling scenarios of supercomputing.
Here is where HP put to use its underappreciated expertise as a software company. As Turkel describes, HP worked directly with CAE software producers, including Ansys, the largest in the field. Its Ansys Mechanical product, for example, is used in nonlinear mechanics: modeling how plastics, rubber, and other such materials behave within working mechanisms. Just a decade earlier, this type of application would have been considered “supercomputing.” Now, working in conjunction with supercomputing scientists, HP is helping Ansys and others apply supercomputing-class parallel methodologies to nonlinear modeling.
Supercomputer scientists know what parallelism is already; they’ve been using it since the days of the Gemini program. For them, a toolkit like Intel Parallel Studio is a foreign object. It’s like introducing a jackhammer to a sculptor.
The problem these scientists are facing today is not unfamiliarity with parallelism. It’s the need to better integrate their understanding of the concept with a system whose methodologies were created for PCs. “In 1993, when 20% of the Top 500 list was composed of single-processor systems, all you had to do was figure out how to run your code on one processor,” recalls TACC’s John McCalpin. “Now, a typical job on our [Ranger] system uses 1,000 to 2,000 processors, and it took most of the intervening time for people to do that transformation.”
The class of supercomputing applications that requires the greatest amount of parallelism, and which most effectively justifies the need for exascale development, is what Intel calls recognition, mining, and synthesis (RMS), and what supercomputing scientists call real-time modeling and simulation (RMS). Essentially, these are two descriptions of the same proverbial elephant.
Intel perceives RMS as the acquisition of enormous amounts of relevant data, followed by the capability of supercomputers to devise their own programs for making sense of that data and extrapolating relevant and useful information. Scientists perceive RMS as the capability for a supercomputer to extract just enough useful information from that data to formulate its own sophisticated simulation that sheds light on evolutionary progressions that may not be obvious to human observers. Such a simulation may be a weather and climate model, a projection of a future Internet 20 years hence, a stability model for a global economy recovering from a severe terrorist event, a projection of the location of all the dark matter in the universe, or — surprisingly, the most complex of all — a model of the transistors and logic gates in future semiconductors.
“It’s not created by a person,” states McCalpin. “And it’s huge and unwieldy and unmanageable and incomprehensible.”
Among the most unwieldy of this class of simulation, he tells us, are so-called cycle-accurate simulators of microprocessors — simulators that account for the state of every transistor on a chip, for every click of its internal clock. For a 3.0 GHz CPU, that would be three billion snapshots representing one second of compute time, each covering some two billion transistors. What came as a shock was learning from Dr. McCalpin that this cycle-accurate process isn’t actually, to borrow a term from the industry, cycle-accurate.
“Part of the problem is that relatively few people, maybe nobody, understands all of the parts of a computer. It’s too complicated,” he says. Rather than meticulously represent each transistor or logic gate, a simulation of a processor may instead simulate the observed phenomena resulting from its presence, such as the logic itself. As a result, other pertinent behaviors of the non-simulated components — for instance, crosstalk, signal weakness, or even heat — never get modeled. So rather than a pure simulation, it’s more of a virtual machine.
“Current state-of-the-art microprocessors from Intel or AMD or IBM are among the most complex things the human race has ever created. There are no computers big enough to simulate them at any reasonable speed,” notes Dr. McCalpin. “Partly that’s a question of size, and partly that’s a question of architecture.”
At current speeds, even with supercomputers having apparently broken the petaflop barrier (a million billion, or one quadrillion, floating-point operations per second) in Top 500 speed tests, the actual turnover rate for so-called cycle-accurate simulators, according to Dr. McCalpin, is merely one simulated cycle per second. At that rate, the world’s finest supercomputer could model one full second of CPU activity only if its simulator were left running for 95 years.
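The arithmetic behind that figure is worth spelling out; here is a back-of-the-envelope sketch (mine, not McCalpin’s):

```c
/* Back-of-the-envelope check of the "95 years" figure: a 3 GHz chip has
 * three billion clock cycles in each second of simulated time, and the
 * simulator retires roughly one of those cycles per wall-clock second. */
#include <stdio.h>

int main(void) {
    const double clock_hz         = 3.0e9;   /* simulated cycles per simulated second */
    const double sim_rate_hz      = 1.0;     /* simulated cycles per real second */
    const double seconds_per_year = 365.25 * 24 * 3600;

    double real_seconds = clock_hz / sim_rate_hz;
    printf("%.0f wall-clock seconds, or about %.0f years\n",
           real_seconds, real_seconds / seconds_per_year);    /* ~95 years */
    return 0;
}
```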
Even when simulations are simplified, however, McCalpin has witnessed something truly frightening. Whenever a simulation requires a level of granularity that relies upon the laws of physics — at the level of electrons, molecules, and sub-micron measurements — COTS-style parallelism breaks down. For tasks to run in parallel, they have to be able to trust each other to run in sync. For example, when the processes on a semiconductor are represented by functions in a simulator, such as a low-voltage signal that’s raised to high-voltage when data is ready to be fetched from memory, the functions that represent the sender and the receiver of that signal may or may not reside on the same physical processor core. Thus the job of throwing the physical switch to represent the logical switch is delegated to the interconnect fabric — the duct tape that’s holding these stacked processors together.
When you’re dealing with physics-level granularity, synchronization simply cannot be presumed. At this level, a cannot assume b; a must wait for b. Seymour Cray’s parallel computation principle runs headlong into Werner Heisenberg’s Uncertainty Principle.
“So one of the sad things that happens is, not only are you running this processor simulation at about one cycle per second, but these machine-generated logical models are very resistant to parallelization,” McCalpin reports. “Microprocessors are not designed to exploit such fine-grained parallelism.”
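In supercomputing practice, that hand-off between cores travels as an explicit message. Here is a minimal sketch in MPI, the message-passing interface used on clusters like Ranger (my illustration under that assumption, not McCalpin’s simulator code), of the “a must wait for b” situation: the rank playing the receiver blocks until the rank playing the sender pushes the signal across the interconnect.

```c
/* Minimal illustration of one simulated component waiting on another across
 * the interconnect. Run with something like: mpicc signal.c && mpirun -np 2 ./a.out */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, signal = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        signal = 1;   /* the simulated "data ready" line goes high */
        MPI_Send(&signal, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Blocks here for the interconnect's microseconds of latency,
         * thousands of lost cycles, before the receiver can proceed. */
        MPI_Recv(&signal, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 observed signal = %d\n", signal);
    }

    MPI_Finalize();
    return 0;
}
```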
It’s not that the government doesn’t know this. In an October 2010 presentation to the Dept. of Energy, Prof. Horst Simon, the new deputy director of the DOE’s Lawrence Berkeley National Laboratory, summed up the problem rather succinctly: Exascale computing (a term he helped coin) requires not only a rethinking of the hardware platform but an overhaul of the current programming model. With new classes of simulators requiring 100 times or more parallelism than previous generations, Prof. Simon said, “Assumptions that our current software infrastructure is built upon are no longer valid.”
The Utilization Solution
If we don’t ask the question now, a congressman will eventually: If the current evolutionary path of supercomputing is, as all these warnings suggest, a dead end, and if cloud computing and virtualization technologies are already working to fill the void, then has supercomputing outlived its usefulness?
It’s a question that AMD server and software marketing veteran Margaret Lewis doesn’t entertain for long. Lewis, whose direct involvement with supercomputing science dates back a quarter-century, points to a fundamental difference between commercial applications and the tasks supercomputers are expected to perform: One benefits business, and the other society.
Her example is Sandia Labs, where large clusters are currently being used to model future Internet activity, determining how much bandwidth may be required for future applications, and where that bandwidth would need to be installed. “There’s no one commercial company that has access to those larger clusters to do that; it would be hard for a commercial company to justify,” says Lewis. “You can’t say we don’t need big, large clusters to look at the very edge of large problems that need to be simulated. Now, how do you decide to divide the money? That might be a question.”
Last year, analyst firm IDC noted that the entire high-performance computing market for 2009 declined by nearly 15%. For 2010, the firm projected, that figure may be about the same. But that percentage was obtained by lumping the commercial HPC market together with the supercomputing market — in other words, by bunching the companies Hewlett-Packard helps to incorporate parallelism into commercial applications together with the laboratories that were parallel when parallel wasn’t cool. The supercomputing side of the market is actually growing by 25% annually, says IDC, compensating in part for the decline in commercial HPC.
IDC’s suggestion to governments (especially to the European Union last year) is this: Invest more in high-end supercomputing. The benefit: a continued offset in the decline of commercial HPC. In other words, let’s do our part to make it all wash out in the end.
Although we like to say that the benefits of supercomputing research are shared with the rest of the world, in a process that trickles down from commercial HPC through to small and medium enterprises (SMEs), the IDC numbers suggest the presence of a disconnect. It’s a symptom of an ailment afflicting all of HPC, for which Intel’s leading HPC director is offering a solution.
Dr. Stephen Wheat is Intel’s senior director for HPC business operations. For several years, Dr. Wheat has been analyzing the health of HPC. Although he concurs with IDC’s observations, he believes there is an extraordinary opportunity to revitalize high-performance computing and supercomputing by broadening the market.
As Wheat says in his worldwide tours, there is an unserved customer: an SME who would benefit from HPC if it were only made accessible, not just available. The size of the market segment of such customers could exceed that of the other two segments put together, and investments in addressing this market could pay off tremendously — enough to fund the efforts needed for supercomputing to break out of the evolutionary corner in which it finds itself. He calls this space the Missing Middle.
“We have a long history of making significant progress in software and systems, and that history has not been one of trickling that success to all of the players that could make use of that,” says Dr. Wheat. The supercomputing field has already developed brilliant software, he states; there’s no real need to replace the computational part of the software base. It’s just not accessible to the institutions that could make immediate use of it. To this end, Wheat has been personally involved in the formation of the Alliance for High Performance Digital Manufacturing, a consortium of computing providers including HP, Intel, and Microsoft; supercomputing labs such as Livermore and Oak Ridge National Laboratory; and potential customers including Caterpillar, Ford, and Lockheed Martin.
“We don’t need future systems and future software to solve the Missing Middle space issues. We need to be able to make these applications applicable,” Wheat continues, “in an environment that doesn’t have the human resources infrastructure like a national lab or leading universities, or the tier-1 manufacturers like Boeing or Procter & Gamble or Caterpillar.”
Many high-end commercial HPC applications are scalable, as their manufacturers claim, but only to a limited extent, explains Dr. Wheat. They don’t reach down to the level of 16 or 32 compute nodes. And some increasingly important classes of applications, which include the types of physics computations that have so far been modeled only by scientists (and the software that scientists create to create new software), have yet to scale down past the laboratory gates.
What exactly are we talking about, though. . . “Google Physics?” Not really, Wheat explains. There are two classes of “compute-intensive” applications, and the type that performs well in the cloud is just one. Google Maps best represents this class: a tremendous amount of data accessed at impossible speeds to solve a very simple problem. It’s what HP’s Ed Turkel would call a “throughput application.” The other class involves manufacturing: essentially using existing software designed for massive-scale problem solving, to solve multiple problems at smaller scales. In other words, parallelism.
“In the manufacturing segment, what I’m needing is the ability to model this widget that no one else has ever built, and I’ve just conceived this, and where’s the common software environment for me to do that? I no longer [want a map to] go from here to there; I want an application that will help me model the radiant heat flow of my house, but I have to enter my house [in] 3D-CAD to be highly accurate, I need very well established materials, and I’ve got to do all this buildup just to find out whether I’ve got a 5% delta. . . Who’s going to sit there and do that?” says Wheat.
To begin addressing the Missing Middle customer, Wheat suggests repurposing some human resources for better purposes. “You’ve got all that good software, but it’s so complex to use that you don’t have this massive capability. Where is the focus of creativity? Creativity is doing common things that a lot of people want to use very simply, but the creativity is in the engineers’ hands. How creative am I being when I’m trying to do trip planning, or when I’m trying to figure out how many friends I’ve got, or what does my [social] network reach look like? What’s my degrees of separation? If you push a lot of the creativity onto the other side, that’s a different class of problem.”
To take Dr. Wheat’s suggestion one step further, consider an apps platform for high-performance computing. Think of a supercomputer, or a cloud service provider, or whatever scale of compute power is required for solving immediate problems, as a device you hold in your hand. You tap the app, you describe the problem, and a solution is modeled for you. And as a platform, it could conceivably address Dr. McCalpin’s desire for software that’s easier to program, by taking a cue from how Apple and Android solved that problem. The next big thing.
Such an apps platform could address what AMD perceives as a phenomenon of all computing that is rapidly spinning out of control: the explosion of stored data, disproportionate to the number of applications or Web services that can make use of it. “We are now creating a world of massive amounts of more and more data,” states AMD’s Margaret Lewis, “and that puts pressure on us for how we analyze, parse, simulate, and utilize that data. It’s kind of like a fire feeding itself. We’re putting a computational water hose on it, but the fire on the edges is perpetuating it.”
Once this data is made regular, then the purpose and functionality of the programs in this HPC apps market could evolve psychologically, using a model offered by AMD server and workstation marketing director John Fruehe. Fruehe likens the evolution of high-performance applications to Maslow’s Hierarchy of Needs, a theoretical pyramid of requirements for the human psyche (biological, safety, love, self-esteem, and self-actualization); a person doesn’t feel the need for one layer until the demands of the layer beneath it are satisfied. In Fruehe’s derivative, HPC and supercomputing may come to realize their true purpose once the more basic functions at lower levels are finally addressed.
“We’re just scratching the surface on the things we [have to] solve, because you’ve got to take care of the really basic problems that you can do today, while there are others that are far too large for us to be thinking about,” Fruehe explains. “But eventually as we continue to grow and mature in the compute model and get enough cycles to solve those things, that’s when they suddenly come to bear on us.”
In recent years, the subject of the supercomputer has only re-entered the public discourse from time to time, like a retired former neighbor who’s gone to live somewhere else. During those brief visits, it beats a chess master or it wins a trivia contest, or it gets beaten by a Chinese rival. But just like in the 1950s, any discussion of supercomputers is framed as a competition against humans, rather than as a marvel of human engineering.
This makes AMD’s Margaret Lewis, who has been one of those engineers, a bit upset. Her solution may be the first step in rediscovering the true significance, the relevance of this thing we’ve built: “Instead of saying this computer beat a chess champion — which is something the press likes to do, then everybody hangs their heads because computers are smarter than people — we’re approaching this from the wrong direction. We should celebrate how smart people are that they can build this machine that starts to simulate human thought and can recognize human speech, which are the most marvelous things.
“The best computer we have in the whole world is the computer in your head. There aren’t supercomputers yet that are as powerful as one person’s thought; they can only do some task well. But we’re going about this wrong if we think a computer should beat human beings; we should be using computational power to make us more effective in our world.”