June 19, 2001 12:30 PM PDT
Famed open-source compiler upgraded
On Monday, programmers released version 3.0 of GCC, a software project not as well-known as open-source projects such as Linux but one that's key to all of them. GCC is a compiler--the critical software that converts programs written by humans into instructions a chip can understand.
GCC is used to create everything from Linux and its various BSD Unix cousins to higher-level software such as the Apache Web server, the Gnome user interface and the Jabber instant-messaging software. And it can run on and create software for more than 40 different chip families.
When Eric Allman was working on his open-source Sendmail e-mail server software about a decade ago, GCC was just one of a collection of compilers he had to support to make sure his software could run on a large number of computers.
"I was using a lot of compilers, and GCC was one of them," he said. "At this point, GCC is so common you almost could use just GCC."
But GCC isn't just for open-source projects--the collaborative efforts in which developers share their programs without the secrecy of the proprietary world. Because the GCC design lets it create programs for dozens of different CPUs, it has been used to create the software for numerous proprietary systems as well.
Indeed, GCC has spread as far as Microsoft, which ships the compiler as part of Interix, its software for running Unix programs on Windows computers.
Version 3.0, under development for more than a year, adds several features. Among them is an improved ability to generate instructions for Intel's mainstream chips--work Intel has been funding for years. The software GCC produces for other chips is also improved.
Among the new chips now supported are Intel's Itanium and XScale, Motorola's MCore 210 and 340, Mitsubishi's D30V, Atmel's AVR and Fujitsu's FR30, said Jeff Law, a GCC developer at Red Hat.
The new version of GCC is the first major revamp since version 2.0 was released in February 1992. Since then, GCC has survived a problem that many consider to be one of the biggest threats to open-source software: code forking, in which two separate groups of programmers develop in different directions.
GCC once stood for GNU C Compiler, because it was used to compile programs written in the C programming language for Richard Stallman's "GNU's Not Unix" (GNU) effort to create a clone of Unix. Now, though, because GCC accepts programs written in many other languages as well, GCC stands for GNU Compiler Collection.
Stallman created the concept of "free software"--which he defines not as no-cost, but as software that may be freely redistributed as long as its source code remains open. He released GCC under the General Public License, the license now used to cover many open-source efforts. Indeed, the Linux kernel that stepped in to become the heart of the GNU operating system project is released under the GPL.
Cygnus' first customers used GCC for proprietary systems such as network equipment, said Michael Tiemann, founder of Cygnus and now chief technology officer of Red Hat. Nortel was Cygnus' first major customer, in 1990. Intel joined later that year. Ericsson arrived in 1991, and Cisco Systems and Alcatel signed on in 1992, he said.
Tiemann was an instrumental early developer of the compiler. He started with version 1.0 in 1987 and added several major new features to GCC, including the ability to run on the National Semiconductor 32032 chip and the ability to accept software written in the C++ language.
Tiemann also worked on porting GCC to the Motorola 88000, Intel's 386 chips, MIPS chips, and Sun's Sparc chips. He also helped lay the groundwork for supporting "very long instruction word" (VLIW) chips such as Intel's new Itanium.
One criticism of GCC is that it is too general-purpose: because it must work with so many chips and so many languages, critics say, the software it produces is only mediocre. But Tiemann defends GCC.
In the process of figuring out how to get GCC to create software for so many different chips, GCC developers discovered improvements that apply to many different chips, Tiemann said. "Having the genetic material that spans across all these architectures in my opinion makes it a better compiler than any specific one," he said. For example, modifying GCC for chips from MIPS and Sun laid the groundwork for future Intel chips that followed those other designs.
In addition, GCC has modules that optimize software for particular chips.
"There are companies out there that are spending $50 million or $100 million on compiler development, and for a given arch, that may give them a leading position by some percentage," he said. "But in my opinion, the (ability) to support a wide range of microprocessors with a common compiler infrastructure and the ability to more rapidly adapt to microprocessors as they come out gives you better opportunity to take advantage of Moore's Law," the principle that chips double in power every 18 months.
Cygnus, which was earning about $20 million in annual revenue at the time Red Hat acquired the company, had much of its success in "embedded" designs such as network routers and other communication equipment. Indeed, Intel's first funding of Cygnus was to create GCC support for Intel's now-defunct i960 communications processor.
Now GCC is all over the computing landscape.
"It became sort of the default compiler to use because it supports so many machines and so many platforms," Stallman said. "People wanted to be able to use the same compilers everywhere."
The code fork
While Tiemann was at Cygnus, his company gained powerful influence over GCC. The Free Software Foundation-appointed GCC maintainer, though, didn't have enough time to process all the changes to GCC, and because of disagreements over how to manage these changes, Cygnus decided in 1997 to create a new version of GCC called EGCS.
Tiemann said that through EGCS, Cygnus showed it could maintain a heavy influence over GCC without letting its own motives shut out others' priorities. The rift was healed in April 1999, when the Free Software Foundation agreed to use the EGCS code for GCC and the EGCS project agreed to dissolve itself and work instead on GCC.
"It was a bloodless fork and a bloodless reunification," Tiemann said.