|Discontinued||January 30, 2020|
|Max. CPU clock rate||733 MHz to 2.66 GHz|
|FSB speeds||266 MT/s to 6.4 GT/s|
|Architecture and classification|
Itanium (// eye-TAY-nee-əm) is a discontinued family of 64-bit Intel microprocessors that implement the Intel Itanium architecture (formerly called IA-64). Launched in June 2001, Intel marketed the processors for enterprise servers and high-performance computing systems. The Itanium architecture originated at Hewlett-Packard (HP), and was later jointly developed by HP and Intel.
Itanium-based systems were produced by HP/Hewlett Packard Enterprise (HPE) (the HPE Integrity Servers line) and several other manufacturers. In 2008, Itanium was the fourth-most deployed microprocessor architecture for enterprise-class systems, behind x86-64, Power ISA, and SPARC.[needs update]
In February 2017, Intel released the final generation, Kittson, to test customers, and in May began shipping in volume. It was used exclusively in mission-critical servers from Hewlett Packard Enterprise.
In 2019, Intel announced that new orders for Itanium would be accepted until January 30, 2020, and shipments would cease by July 29, 2021. This took place on schedule.
In 1989, HP started to research an architecture that would exceed the expected limits of the reduced instruction set computer (RISC) architectures caused by the great increase in complexity needed for executing multiple instructions per cycle due to the need for dynamic dependency checking and precise exception handling.[b] HP hired Bob Rau of Cydrome and Josh Fisher of Multiflow, the pioneers of very long instruction word (VLIW) computing. One VLIW instruction word can contain several independent instructions, which can be executed in parallel without having to evaluate them for independence. A compiler must attempt to find valid combinations of instructions that can be executed at the same time, effectively performing the instruction scheduling that conventional superscalar processors must do in hardware at runtime.
HP researchers modified the classic VLIW into a new type of architecture, later named Explicitly Parallel Instruction Computing (EPIC), which differs by: having template bits which show which instructions are independent inside and between the bundles of three instructions, which enables the explicitly parallel execution of multiple bundles and increasing the processors' issue width without the need to recompile; by predication of instructions to reduce the need for branches; and by full interlocking to eliminate the delay slots. In EPIC the assignment of execution units to instructions and the timing of their issuing can be decided by hardware, unlike in the classic VLIW. HP intended to use these features in PA-WideWord, the planned successor to their PA-RISC ISA. EPIC was intended to provide the best balance between the efficient use silicon area and electricity, and the general-purpose flexibility. In 1993 HP held an internal competition to design the best (simulated) microarchitectures of a RISC and an EPIC type, led by Jerry Huck and Rajiv Gupta respectively. The EPIC team won, with over double the simulated performance of the RISC competitor.
At the same time Intel was also looking for ways to make better ISAs. In 1989 Intel had launched the i860, which it marketed for workstations, servers, and iPSC and Paragon supercomputers. It differed from other RISCs by being able to switch between the normal single instruction per cycle mode, and a mode where pairs of instructions are explicitly defined as parallel so as to execute them in the same cycle without having to do dependency checking. Another distinguishing feature were the instructions for an exposed floating-point pipeline, that enabled the tripling of throughput compared to the conventional floating-point instructions. Both of these features were left largely unused because compilers didn't support them, a problem that later challenged Itanium too. Without them, i860's parallelism (and thus performance) was no better than other RISCs, so it failed in the market. Itanium would adopt a more flexible form of explicit parallelism than i860 had.
In November 1993 HP approached Intel, seeking collaboration on an innovative future architecture. At the time Intel was looking to extend x86 to 64 bits in a processor codenamed P7, which they found challenging. Later Intel claimed that four different design teams had explored 64-bit extensions, but each of them concluded that it was not economically feasible. At the meeting with HP, Intel's engineers were impressed when Jerry Huck and Rajiv Gupta presented the PA-WideWord architecture they had designed to replace PA-RISC. "When we saw WideWord, we saw a lot of things we had only been looking at doing, already in their full glory", said Intel's John Crawford, who in 1994 became the chief architect of Merced, and who had earlier argued against extending the x86 with P7. HP's Gupta recalled: "I looked Albert Yu [Intel's general manager for microprocessors] in the eyes and showed him we could run circles around PowerPC, that we could kill PowerPC, that we could kill the x86." Soon Intel and HP started conducting in-depth technical discussions at a HP office, where each side had six engineers who exchanged and discussed both companies' confidential architectural research. They then decided to use not only PA-WideWord, but also the more experimental HP Labs PlayDoh as the source of their joint future architecture. Convinced of the superiority of the new project, in 1994 Intel canceled their existing plans for P7.
In June 1994 Intel and HP announced their joint effort to make a new ISA that would adopt ideas of Wide Word and VLIW. Yu declared: "If I were competitors, I'd be really worried. If you think you have a future, you don't." On P7's future Intel said the alliance would impact it, but "it is not clear" whether it "will fully encompass the new architecture". Later the same month Intel said that some of the first features of the new architecture will start appearing on Intel chips as early as the P7, but the full version will appear sometime later. In August 1994 EE Times reported that Intel told investors that P7 was being re-evaluated and possibly canceled in favor of the HP processor. Intel immediately issued a clarification, saying that P7 is still being defined, and that HP may contribute to its architecture. Later it was confirmed that the P7 codename had indeed passed to the HP-Intel processor. By early 1996 Intel revealed its new codename, Merced.
HP believed that it was no longer cost-effective for individual enterprise systems companies such as itself to develop proprietary microprocessors, so it partnered with Intel in 1994 to develop the IA-64 architecture, derived from EPIC. Intel was willing to undertake the very large development effort on IA-64 in the expectation that the resulting microprocessor would be used by the majority of enterprise systems manufacturers. HP and Intel initiated a large joint development effort with a goal of delivering the first product, Merced, in 1998.
Merced was designed by a team of 500, which Intel later admitted was too inexperienced, with many recent college graduates. Crawford (Intel) was the chief architect, while Huck (HP) held the second position. Early in the development HP and Intel had a disagreement where Intel wanted more dedicated hardware for more floating-point instructions. HP prevailed upon the discovery of a floating-point hardware bug in Intel's Pentium. When Merced was floorplanned for the first time in mid-1996, it turned out to be far too large, "this was a lot worse than anything I'd seen before", said Crawford. The designers had to reduce the complexity (and thus performance) of subsystems, including the x86 unit and cutting the L2 cache to 96 KB.[c] Eventually it was agreed that the size target could only be reached by using the 180 nm process instead of the intended 250 nm. Later problems emerged with attempts to speed up the critical paths without disturbing the other circuits' speed. Merced was taped out on 4 July 1999, and in August Intel produced the first complete test chip.
The expectations for Merced waned over time as delays and performance deficiencies emerged, shifting the focus and onus for success onto the HP-led second Itanium design, codenamed McKinley. In July 1997 the switch to the 180 nm process delayed Merced into the second half of 1999. Shortly before the reveal of EPIC at the Microprocessor Forum in October 1997, an analyst of the Microprocessor Report said that Itanium "will not show the competitive performance until 2001. It will take the second version of the chip for the performance to get shown". At the Forum, Intel's Fred Pollack originated the "wait for McKinley" mantra when he said that it will double the Merced's performance and will "knock your socks off", while using the same 180 nm process as Merced. Pollack also said that Merced's x86 performance will be lower than the fastest x86 processors, and that x86 "will continue to grow at its historical rates". Intel said that IA-64 won't have much presence in the consumer market for 5 to 10 years.
Later it was reported that HP's motivation when starting to design McKinley in 1996 was to have more control over the project so as to avoid the issues affecting Merced's performance and schedule. The design team finalized McKinley's project goals in 1997. In late May 1998 Merced was delayed to mid-2000, and by August 1998 analysts were questioning its commercial viability, given that McKinley would arrive shortly after with double the performance, as delays were causing Merced to turn into simply a development vehicle for the Itanium ecosystem. The "wait for McKinley" narrative was becoming prevalent. The same day it was reported that due to the delays, HP will extend its line of PA-RISC PA-8000 series processors from PA-8500 to as far as PA-8900. In October 1998 HP announced its plans for four more generations of PA-RISC processors, with PA-8900 set to reach 1.2 GHz in 2003.
By March 1999 some analysts expected Merced to ship in volume only in 2001, but the volume was widely expected to be low as most customers would wait for McKinley. In May 1999, two months before Merced's tape-out, an analyst said that failure to tape-out before July will result in another delay. In July 1999, upon reports that the first silicon will be made in late August, analysts predicted a delay to late 2000, and came into agreement that Merced will be used chiefly for debugging and testing the IA-64 software. Linley Gwennap of MPR said of Merced that "at this point, everyone is expecting it's going to be late and slow, and the real advance is going to come from McKinley. What this does is puts a lot more pressure on McKinley and for that team to deliver". By then Intel had revealed that Merced will be initially priced at $5000. In August 1999 HP advised some of their customers to skip Merced and wait for McKinley. By July 2000 HP was telling the press that the first Itanium systems will be for niche uses, and that "You're not going to put this stuff near your data center for several years.", HP expecting its Itanium systems to outsell the PA-RISC systems only in 2005. The same July Intel told of another delay, due to a stepping change to fix bugs. Now only "pilot systems" would ship that year, while the general availbility was pushed to the "first half of 2001". Server makers had largely forgone spending on the R&D for the Merced-based systems, instead using motherboards or whole servers of Intel's design. To foster a wide ecosystem, by mid-2000 Intel had provided 15,000 Itaniums in 5,000 systems to software developers and hardware designers. In March 2001 Intel said Itanium systems would begin shipping to customers in the second quarter, followed by a broader deployment in the second half of the year. By then even Intel publicly acknowledged that many customers will wait for McKinley.
During development, Intel, HP, and industry analysts predicted that IA-64 would dominate first in 64-bit servers and workstations, then expand to the lower-end servers, supplanting Xeon, and finally penetrate into the personal computers, eventually to supplant RISC and complex instruction set computing (CISC) architectures for all general-purpose applications, though not replacing x86 "for the foreseeable future" according to Intel. In 1997-1998 Intel CEO Andy Grove predicted that Itanium will not come to the desktop computers for four of five years after launch, and said "I don't see Merced appearing on a mainstream desktop inside of a decade". In contrast, Itanium was expected to capture 70% of the 64-bit server market in 2002. Already in 1998 Itanium's focus on the high end of the computer market was criticized for making it vulnerable to challengers expanding from the lower-end market segments, but many people in the computer industry feared voicing doubts about Itanium in the fear of Intel's retaliation. Compaq and Silicon Graphics decided to abandon further development of the Alpha and MIPS architectures respectively in favor of migrating to IA-64.
Several groups ported operating systems for the architecture, including Microsoft Windows, OpenVMS, Linux, HP-UX, Solaris, Tru64 UNIX, and Monterey/64. The latter three were canceled before reaching the market. By 1997, it was apparent that the IA-64 architecture and the compiler were much more difficult to implement than originally thought, and the delivery timeframe of Merced began slipping.
Intel announced the official name of the processor, Itanium, on October 4, 1999. Within hours, the name Itanic had been coined on a Usenet newsgroup, a reference to the RMS Titanic, the "unsinkable" ocean liner that sank on her maiden voyage in 1912. "Itanic" was then used often by The Register, and others, to imply that the multibillion-dollar investment in Itanium—and the early hype associated with it—would be followed by its relatively quick demise.
|Launched||29 May–June 2001|
|Discontinued||10 April 2003|
|Max. CPU clock rate||733 to 800 MHz|
|FSB speeds||266 MT/s|
|L2 cache||96 KB|
|L3 cache||2 or 4 MB|
After having sampled 40,000 chips to the partners, Intel launched Itanium on May 29, 2001, with first OEM systems from HP, IBM and Dell shipping to customers in June. By then Itanium's performance was not superior to competing RISC and CISC processors. Itanium competed at the low-end (primarily four-CPU and smaller systems) with servers based on x86 processors, and at the high-end with IBM POWER and Sun Microsystems SPARC processors. Intel repositioned Itanium to focus on the high-end business and HPC computing markets, attempting to duplicate the x86's successful "horizontal" market (i.e., single architecture, multiple systems vendors). The success of this initial processor version was limited to replacing the PA-RISC in HP systems, Alpha in Compaq systems and MIPS in SGI systems, though IBM also delivered a supercomputer based on this processor. POWER and SPARC remained strong, while the 32-bit x86 architecture continued to grow into the enterprise space, building on the economies of scale fueled by its enormous installed base.
Only a few thousand systems using the original Merced Itanium processor were sold, due to relatively poor performance, high cost and limited software availability. Recognizing that the lack of software could be a serious problem for the future, Intel made thousands of these early systems available to independent software vendors (ISVs) to stimulate development. HP and Intel brought the next-generation Itanium 2 processor to the market a year later. Few of the microarchitectural features of Merced would be carried over to all the subsequent Itanium designs, including the 16+16 KB L1 cache size and the 6-wide (two-bundle) instruction decoding.
|Launched||8 July 2002|
|Discontinued||16 November 2007|
|Designed by||HP and Intel|
|Product code||McKinley, Madison, Deerfield, Madison 9M, Fanwood|
|Max. CPU clock rate||900 to 1667 MHz|
|FSB speeds||400 to 667 MT/s|
|L2 cache||256 KB|
|L3 cache||1.5–9 MB|
|Architecture and classification|
|Technology node||180 nm to 130 nm|
The Itanium 2 processor was released in July 2002, and was marketed for enterprise servers rather than for the whole gamut of high-end computing. The first Itanium 2, code-named McKinley, was jointly developed by HP and Intel, led by the HP team at Fort Collins, Colorado, taping out in December 2000. It relieved many of the performance problems of the original Itanium processor, which were mostly caused by an inefficient memory subsystem by approximately halving the latency and doubling the fill bandwidth of each of the three levels of cache, while expanding the L2 cache from 96 to 256 KB. Floating-point data is excluded from the L1 cache, because the L2 cache's higher bandwidth is more beneficial to typical floating-point applications than low latency. The L3 cache was now integrated on-chip, tripling in associativity and doubling in bus width. McKinley also greatly increases the number of possible instruction combinations in a VLIW-bundle and reaches 25% higher frequency, despite having only eight pipeline stages versus Merced's ten.
McKinley contains 221 million transistors (of which 25 million are for logic and 181 million for L3 cache), measured 19.5 mm by 21.6 mm (421 mm2) and was fabricated in a 180 nm, bulk CMOS process with six layers of aluminium metallization. In May 2003 it was disclosed that some McKinley processors can suffer from a critical-path erratum leading to a system's crashing. It can be avoided by lowering the processor frequency to 800 MHz.
In 2003, AMD released the Opteron CPU, which implements its own 64-bit architecture called AMD64. The Opteron gained rapid acceptance in the enterprise server space because it provided an easy upgrade from x86. Under the influence of Microsoft, Intel responded by implementing AMD's x86-64 instruction set architecture instead of IA-64 in its Xeon microprocessors in 2004, resulting in a new industry-wide de facto standard.
In 2003 Intel released a new Itanium 2 family member, codenamed Madison, initially with up to 1.5 GHz frequency and 6 MB of L3 cache. The Madison 9M chip released in November 2004 had 9 MB of L3 cache and frequency up to 1.6 GHz, reaching 1.67 GHz in July 2005. Both chips used a 130 nm process and were the basis of all new Itanium processors until Montecito was released in July 2006, specifically Deerfield being a low wattage Madison, and Fanwood being a version of Madison 9M for lower-end servers with one or two CPU sockets.
In November 2005, the major Itanium server manufacturers joined with Intel and a number of software vendors to form the Itanium Solutions Alliance to promote the architecture and accelerate the software porting effort. The Alliance announced that its members would invest $10 billion in the Itanium Solutions Alliance by the end of the decade.
|Launched||18 July 2006|
|Discontinued||26 August 2011|
|Product code||Montecito, Montvale|
|Max. CPU clock rate||1.4 GHz to 1.67 GHz|
|FSB speeds||400 to 667 MT/s|
|L2 cache||256 KB (D) + 1 MB (I)|
|L3 cache||6–24 MB|
|Architecture and classification|
|Technology node||90 nm|
Main article: Montecito (processor)
In early 2003, due to the success of IBM's dual-core POWER4, Intel announced that the first 90 nm Itanium processor, codenamed Montecito, will be delayed to 2005 so as to change it into a dual-core, thus merging it with the Chivano project. In September 2004 Intel demonstrated a working Montecito system, and claimed that the inclusion of hyper-threading increases Montecito's performance by 10-20% and that its frequency could reach 2 GHz. After a delay to "mid-2006" and reduction of the frequency to 1.6 GHz, on July 18 Intel delivered Montecito (marketed as the Itanium 2 9000 series), a dual-core processor with a switch-on-event multithreading and split 256 KB + 1 MB L2 caches that roughly doubled the performance and decreased the energy consumption by about 20 percent. At 596 mm² die size and 1720 million transistors it was the largest microprocessor at the time. It was supposed to feature Foxton Technology, a very sophisticated frequency regulator, which failed to pass validation and was thus not enabled for customers.
Intel released the Itanium 9100 series, codenamed Montvale, in November 2007, retiring the "Itanium 2" brand. Originally intended to use the 65 nm process, it was changed into a fix of Montecito, enabling the demand-based switching (like EIST) and up to 667 MT/s front-side bus, which were intended for Montecito, plus a core-level lockstep. Montecito and Montvale were the last Itanium processors in which design Hewlett-Packard's engineering team at Fort Collins had a key role, as the team was subsequently transferred to Intel's ownership.
|Launched||8 February 2010|
|Discontinued||2nd quarter of 2014|
|Max. CPU clock rate||1.33 to 1.73 GHz|
|L2 cache||256 KB (D) + 512 KB (I)|
|L3 cache||10–24 MB|
|Architecture and classification|
|Technology node||65 nm|
|Launched||8 November 2012|
|Discontinued||30 January 2020|
|Product code||Poulson, Kittson|
|Max. CPU clock rate||1.73 to 2.67 GHz|
|L2 cache||256 KB (D) + 512 KB (I)|
|L3 cache||20–32 MB|
|Architecture and classification|
|Technology node||32 nm|
Main article: Tukwila (processor)
The original code name for the first Itanium with more than two cores was Tanglewood, but it was changed to Tukwila in late 2003 due to trademark issues. Intel discussed a "middle-of-the-decade Itanium" to succeed Montecito, achieving ten times the performance of Madison. It was being designed by the famed DEC Alpha team and was expected have eight new multithreading-focused cores. Intel claimed "a lot more than two" cores and more than seven times the performance of Madison. In early 2004 Intel told of "plans to achieve up to double the performance over the Intel Xeon processor family at platform cost parity by 2007". By early 2005 Tukwila was redefined, now having fewer cores but focusing on single-threaded performance and multiprocessor scalability.
In March 2005, Intel disclosed some details of Tukwila, the next Itanium processor after Montvale, to be released in 2007. Tukwila would have four processor cores and would replace the Itanium bus with a new Common System Interface, which would also be used by a new Xeon processor. Tukwila was to have a "common platform architecture" with a Xeon codenamed Whitefield, which was canceled in October 2005, when Intel revised Tukwila's delivery date to late 2008. In May 2009, the schedule for Tukwila, was revised again, with the release to OEMs planned for the first quarter of 2010. The Itanium 9300 series processor, codenamed Tukwila, was released on February 8, 2010, with greater performance and memory capacity.
The device uses a 65 nm process, includes two to four cores, up to 24 MB on-die caches, Hyper-Threading technology and integrated memory controllers. It implements double-device data correction, which helps to fix memory errors. Tukwila also implements Intel QuickPath Interconnect (QPI) to replace the Itanium bus-based architecture. It has a peak interprocessor bandwidth of 96 GB/s and a peak memory bandwidth of 34 GB/s. With QuickPath, the processor has integrated memory controllers and interfaces the memory directly, using QPI interfaces to directly connect to other processors and I/O hubs. QuickPath is also used on Intel x86-64 processors using the Nehalem microarchitecture, which possibly enabled Tukwila and Nehalem to use the same chipsets. Tukwila incorporates two memory controllers, each of which has two links to Scalable Memory Buffers, which in turn support multiple DDR3 DIMMs, much like the Nehalem-based Xeon processor code-named Beckton.
During the 2012 Hewlett-Packard Co. v. Oracle Corp. support lawsuit, court documents unsealed by a Santa Clara County Court judge revealed that in 2008, Hewlett-Packard had paid Intel around $440 million to keep producing and updating Itanium microprocessors from 2009 to 2014. In 2010, the two companies signed another $250 million deal, which obliged Intel to continue making Itanium CPUs for HP's machines until 2017. Under the terms of the agreements, HP had to pay for chips it gets from Intel, while Intel launches Tukwila, Poulson, Kittson, and Kittson+ chips in a bid to gradually boost performance of the platform.
Intel first mentioned Poulson on March 1, 2005, at the Spring IDF. In June 2007 Intel said that Poulson will use a 32 nm process technology, skipping the 45 nm process. This was necessary for catching up after Itanium's delays left it at 90 nm competing against 65 nm and 45 nm processors.
At ISSCC 2011, Intel presented a paper called "A 32nm 3.1 Billion Transistor 12-Wide-Issue Itanium Processor for Mission Critical Servers." Analyst David Kanter speculated that Poulson would use a new microarchitecture, with a more advanced form of multithreading that uses up to two threads, to improve performance for single threaded and multithreaded workloads. Some information was also released at the Hot Chips conference.
Information presented improvements in multithreading, resiliency improvements (Intel Instruction Replay RAS) and few new instructions (thread priority, integer instruction, cache prefetching, and data access hints).
Poulson was released on November 8, 2012, as the Itanium 9500 series processor. It is the follow-on processor to Tukwila. It features eight cores and has a 12-wide issue architecture, multithreading enhancements, and new instructions to take advantage of parallelism, especially in virtualization. The Poulson L3 cache size is 32 MB and common for all cores, not divided like previously. L2 cache size is 6 MB, 512 I KB, 256 D KB per core. Die size is 544 mm², less than its predecessor Tukwila (698.75 mm²).
Intel's Product Change Notification (PCN) 111456-01 lists four models of Itanium 9500 series CPU, which was later removed in a revised document. The parts were later listed in Intel's Material Declaration Data Sheets (MDDS) database. Intel later posted Itanium 9500 reference manual.
The models are the following:
Intel had committed to at least one more generation after Poulson, first mentioning Kittson on 14 June 2007. Kittson was supposed to be on a 22 nm process and use the same LGA2011 socket and platform as Xeons. On 31 January 2013 Intel issued an update to their plans for Kittson: it would have the same LGA1248 socket and 32 nm process as Poulson, thus effectively halting any further development of Itanium processors.
In April 2015, Intel, although it had not yet confirmed formal specifications, did confirm that it continued to work on the project. Meanwhile, the aggressively multicore Xeon E7 platform displaced Itanium-based solutions in the Intel roadmap. Even Hewlett-Packard, the main proponent and customer for Itanium, began selling x86-based Superdome and NonStop servers, and started to treat the Itanium-based versions as legacy products.
Intel officially launched the Itanium 9700 series processor family on May 11, 2017. Kittson has no microarchitecture improvements over Poulson; despite nominally having a different stepping, it is functionally identical with the 9500 series, even having exactly the same bugs, the only difference being the 133 MHz higher frequency of 9760 and 9750 over 9560 and 9550 respectively.
Intel announced that the 9700 series will be the last Itanium chips produced.
The models are:
|9720||4||8||1.73 GHz||20 MB|
|9740||8||16||2.13 GHz||24 MB|
|9750||4||8||2.53 GHz||32 MB|
|9760||8||16||2.66 GHz||32 MB|
By 2006, HP manufactured at least 80% of all Itanium systems, and sold 7,200 in the first quarter of 2006. The bulk of systems sold were enterprise servers and machines for large-scale technical computing, with an average selling price per system in excess of US$200,000. A typical system uses eight or more Itanium processors.
By 2012, only a few manufacturers offered Itanium systems, including HP, Bull, NEC, Inspur and Huawei. In addition, Intel offered a chassis that could be used by system integrators to build Itanium systems.
By 2015, only HP supplied Itanium-based systems. With HP split in late 2015, Itanium systems (branded as Integrity) are handled by Hewlett-Packard Enterprise (HPE), with a major update in 2017 (Integrity i6, and HP-UX 11i v3 Update 16). HPE also supports a few other operating systems, including Windows up to Server 2008 R2, Linux, OpenVMS and NonStop. Itanium is not affected by Spectre and Meltdown.
Prior to the 9300-series (Tukwila), chipsets were needed to connect to the main memory and I/O devices, as the front-side bus to the chipset was the sole connection out of the processor (except for TAP (JTAG) and SMBus for debugging and system configuration). Two generations of buses existed, the original Itanium processor system bus (a.k.a. Merced bus) had a 64 bit data width and 133 MHz clock with DDR (266 MT/s), being soon superseded by the 128-bit 200 MHz DDR (400 MT/s) Itanium 2 processor system bus (a.k.a. McKinley bus), which later reached 533 and 667 MT/s. Up to four CPUs per single bus could be used, but prior to the 9000-series the bus speeds of over 400 MT/s were limited to up to two processors per bus. As no Itanium chipset could connect to more than four sockets, high-end servers needed multiple interconnected chipsets.
The "Tukwila" Itanium processor model had been designed to share a common chipset with the Intel Xeon processor EX (Intel's Xeon processor designed for four processor and larger servers). The goal was to streamline system development and reduce costs for server OEMs, many of which develop both Itanium- and Xeon-based servers. However, in 2013, this goal was pushed back to be "evaluated for future implementation opportunities".
In the times before on-chip memory controllers and QPI, enterprise server manufacturers differentiated their systems by designing and developing chipsets that interface the processor to memory, interconnections, and peripheral controllers. "Enterprise server" referred to the then-lucrative market segment of high-end servers with high reliability, availability and serviceability and typically 16+ processor sockets, justifying their pricing by having a custom system-level architecture with their own chipsets at its heart, with capabilities far beyond what two-socket "commodity servers" could offer. Development of a chipset costs tens of millions of dollars and so represented a major commitment to the use of Itanium.
Neither Intel nor IBM would develop Itanium 2 chipsets to support newer technologies such as DDR2 or PCI Express. Before "Tukwila" moved away from the FSB, chipsets supporting such technologies were manufactured by all Itanium server vendors, such as HP, Fujitsu, SGI, NEC, and Hitachi.
The first generation of Itanium received no vendor-specific chipsets, only Intel's 460GX consisting of ten distinct chips. It supported up to four CPUs and 64 GB of memory at 4.2 GB/s, which is twice the system bus's bandwidth. Addresses and data were handled by two different chips. 460GX had an AGP X4 graphics bus, two 64-bit 66 MHz PCI buses and configurable 33 MHz dual 32-bit or single 64-bit PCI bus(es).
There were many custom chipset designs for Itanium 2, but many smaller vendors chose to use Intel's E8870 chipset. It supports 128 GB of DDR SDRAM at 6.4 GB/s. It was originally designed for Rambus RDRAM serial memory, but when RDRAM failed to succeed Intel added four DDR SDRAM-to-RDRAM converter chips to the chipset. When Intel had previously made such a converter for Pentium III chipsets 820 and 840, it drastically cut performance. E8870 provides eight 133 MHz PCI-X buses (4.2 GB/s total because of bottlenecks) and a ICH4 hub with six USB 2.0 ports. Two E8870 can be linked together by two E8870SP Scalability Port Switches, each containing a 1MB (~200,000 cache lines) snoop filter, to create an 8-socket system with double the memory and PCI-X capacity, but still only one ICH4. Further expansion to 16 sockets was planned. In 2004 Intel revealed plans for its next Itanium chipset, codenamed Bayshore, to support PCI-e and DDR2 memory, but canceled it the same year.
HP has designed four different chipsets for Itanium 2: zx1, sx1000, zx2 and sx2000. All support 4 sockets per chipset, but sx1000 and sx2000 support interconnection of up to 16 chipsets to create up to a 64 socket system. As it was developed in collaboration with Itanium 2's development, booting the first Itanium 2 in February 2001, zx1 became the first Itanium 2 chipset available and later in 2004 also the first to support 533 MT/s FSB. In its basic two-chip version it directly provides four channels of DDR-266 memory, giving 8.5 GB/s of bandwidth and 32 GB of capacity (though 12 DIMM slots). In versions with memory expander boards memory bandwidth reaches 12.8 GB/s, while the maximum capacity for the initial two-board 48 DIMM expanders was 96 GB, and the later single-board 32 DIMM expander up to 128 GB. The memory latency increases by 25 nanoseconds from 80 ns due to the expanders. Eight independent links went to the PCI-X and other peripheral devices (e.g. AGP in workstations), totaling 4 GB/s.
HP's first high-end Itanium chipset was sx1000, launched in mid-2003 with the Integrity Superdome flagship server. It has two independent front-side buses, each bus supporting two sockets, giving 12.8 GB/s of combined bandwidth from the processors to the chipset. It has four links to data-only memory buffers and supports 64 GB of HP-designed 125 MHz memory at 16 GB/s. The above components form a system board called a cell. Two cells can be directly connected together to create an 8-socket glueless system. To connect four cells together, a pair of 8-ported crossbar switches is needed (adding 64 ns to inter-cell memory accesses), while four such pairs of crossbar switches are needed for the top-end system of 16 cells (64 sockets), giving 32 GB/s of bisection bandwidth. Cells maintain cache coherence through in-memory directories, which causes the minimum memory latency to be 241 ns. The latency to the most remote (NUMA) memory is 463 ns. The per-cell bandwidth to the I/O subsystems is 2 GB/s, despite the presence of 8 GB/s worth of PCI-X buses in each I/O subsystem.
HP launched sx2000 in March 2006 to succeed sx1000. Its two FSBs operate at 533 MT/s. It supports up to 128 GB of memory at 17 GB/s. The memory is of HP's custom design, using the DDR2 protocol, but twice as tall as the standard modules and with redundant address and control signal contacts. For the inter-chipset communication, 25.5 GB/s is available on each sx2000 through its three serial links that can connect to a set of three independent crossbars, which connect to other cells or up to 3 other sets of 3 crossbars. The multi-cell configurations are the same as with sx1000, except the parallelism of the sets of crossbars has been increased from 2 to 3. The maximum configuration of 64 sockets has 72 GB/s of sustainable bisection bandwidth. The chipset's connection to its I/O module is now serial with an 8.5 GB/s peak and 5.5 GB/s sustained bandwidth, the I/O module having either 12 PCI-X buses at up to 266 MHz, or 6 PCI-X buses and 6 PCIe 1.1 ×8 slots. It is the last chipset to support HP's PA-RISC processors (PA-8900).
HP launched the first zx2-based servers in September 2006. zx2 can operate the FSB at 667 MT/s with two CPUs or 533 MT/s with four CPUs. It connects to the DDR2 memory either directly, supporting 32 GB at up to 14.2 GB/s, or through expander boards, supporting up to 384 GB at 17 GB/s. The minimum open-page latency is 60 to 78 ns. 9.8 GB/s are available through eight independent links to the I/O adapters, which can include PCIe ×8 or 266 MHz PCI-X.
In May 2003 IBM launched the XA-64 chipset for Itanium 2. It used many of the same technologies as the first two generations of XA-32 chipsets for Xeon, but by the time of the third gen XA-32 IBM had decided to discontinue its Itanium products. XA-64 supported 56 GB of DDR SDRAM in 28 slots at 6.4 GB/s, though due to bottlenecks only 3.2 GB/s could go to the CPU and other 2 GB/s to devices for a 5.2 GB/s total. The CPU's memory bottleneck was mitigated by an off-chip 64 MB DRAM L4 cache, which also worked as a snoop filter in multi-chipset systems. The combined bandwidth of the four PCI-X buses and other I/O is bottlenecked to 2 GB/s per chipset. Two or four chipsets can be connected to make an 8 or 16 socket system.
SGI's Altix supercomputers and servers used the SHUB (Super-Hub) chipset, which supports two Itanium 2 sockets. The initial version used DDR memory through four buses for up to 12.8 GB/s bandwidth, and up to 32 GB of capacity across 16 slots. A 2.4 GB/s XIO channel connected to a module with up to six 64-bit 133 MHz PCI-X buses. SHUBs can be interconnected by the dual 6.4 GB/s NUMAlink4 link planes to create a 512-socket cache-coherent single-image system. A cache for the in-memory coherence directory saves memory bandwidth and reduces latency. The latency to the local memory is 132 ns, and each crossing of a NUMAlink4 router adds 50 ns. I/O modules with four 133 MHz PCI-X buses can connect directly to the NUMAlink4 network. SGI's second-generation SHUB 2.0 chipset supported up to 48 GB of DDR2 memory, 667 MT/s FSB, and could connect to I/O modules providing PCI Express. It supports only four local threads, so when having two dual-core CPUs per chipset, Hyper-Threading must be disabled.
The Trillian Project was an effort by an industry consortium to port the Linux kernel to the Itanium processor. The project started in May 1999 with the goal of releasing the distribution in time for the initial release of Itanium, then scheduled for early 2000. By the end of 1999, the project included Caldera Systems, CERN, Cygnus Solutions, Hewlett-Packard, IBM, Intel, Red Hat, SGI, SuSE, TurboLinux and VA Linux Systems. The project released the resulting code in February 2000. The code then became part of the mainline Linux kernel more than a year before the release of the first Itanium processor. The Trillian project was able to do this for two reasons:
After the successful completion of Project Trillian, the resulting Linux kernel was used by all of the manufacturers of Itanium systems (HP, IBM, DELL, SGI, Fujitsu, Unisys, Hitachi, and Groupe Bull.) With the notable exception of HP, Linux is either the primary OS or the only OS the manufacturer supports for Itanium. Ongoing free and open source software support for Linux on Itanium subsequently coalesced at Gelato.
In 2005, Fedora Linux started adding support for Itanium and Novell added support for SUSE Linux. In 2007, CentOS added support for Itanium in a new release.
In 2009, Red Hat dropped Itanium support in Enterprise Linux 6. Ubuntu 10.10 dropped support for Itanium. In 2021, Linus Torvalds marked the Itanium code as orphaned. Torvalds said:
"HPE no longer accepts orders for new Itanium hardware, and Intel stopped accepting orders a year ago. While intel is still officially shipping chips until July 29, 2021, it's unlikely that any such orders actually exist. It's dead, Jim."
Main article: OpenVMS § Port to Intel Itanium
In 2001, Compaq announced that OpenVMS would be ported to the Itanium architecture. This led to the creation of the V8.x releases of OpenVMS, which support both Itanium-based HPE Integrity Servers and DEC Alpha hardware. Since the Itanium porting effort began, ownership of OpenVMS transferred from Compaq to HP in 2001, and then to VMS Software Inc. (VSI) in 2014. Noteworthy releases include:
Support for Itanium has been dropped in the V9.x releases of OpenVMS, which run on x86-64 only.
NonStop OS was ported from MIPS-based hardware to Itanium in 2005. NonStop OS was later ported to x86-64 in 2015. Sales of Itanium-based NonStop hardware ended in 2020, with support ending in 2025.
In 2005, Itanium support in GCC which is used for compiling Linux was improved.
GNU Compiler Collection deprecated support for IA-64 in GCC 10, after Intel announced the planned phase-out of this ISA. LLVM (Clang) dropped Itanium support in version 2.6.
HP sells a virtualization technology for Itanium called Integrity Virtual Machines.
Emulation is a technique that allows a computer to execute binary code that was compiled for a different type of computer. Before IBM's acquisition of QuickTransit in 2009, application binary software for IRIX/MIPS and Solaris/SPARC could run via type of emulation called "dynamic binary translation" on Linux/Itanium. Similarly, HP implemented a method to execute PA-RISC/HP-UX on the Itanium/HP-UX via emulation, to simplify migration of its PA-RISC customers to the radically different Itanium instruction set. Itanium processors can also run the mainframe environment GCOS from Groupe Bull and several x86 operating systems via instruction set simulators.
Itanium was aimed at the enterprise server and high-performance computing (HPC) markets. Other enterprise- and HPC-focused processor lines include Oracle's and Fujitsu's SPARC processors and IBM's Power microprocessors. Measured by quantity sold, Itanium's most serious competition came from x86-64 processors including Intel's own Xeon line and AMD's Opteron line. Since 2009, most servers were being shipped with x86-64 processors.
In 2005, Itanium systems accounted for about 14% of HPC systems revenue, but the percentage declined as the industry shifted to x86-64 clusters for this application.
An October 2008 Gartner report on the Tukwila processor, stated that "...the future roadmap for Itanium looks as strong as that of any RISC peer like Power or SPARC."
An Itanium-based computer first appeared on the list of the TOP500 supercomputers in November 2001. The best position ever achieved by an Itanium 2 based system in the list was No. 2, achieved in June 2004, when Thunder (Lawrence Livermore National Laboratory) entered the list with an Rmax of 19.94 Teraflops. In November 2004, Columbia entered the list at No. 2 with 51.8 Teraflops, and there was at least one Itanium-based computer in the top 10 from then until June 2007. The peak number of Itanium-based machines on the list occurred in the November 2004 list, at 84 systems (16.8%); by June 2012, this had dropped to one system (0.2%), and no Itanium system remained on the list in November 2012.
The Itanium processors show a progression in capability. Merced was a proof of concept. McKinley dramatically improved the memory hierarchy and allowed Itanium to become reasonably competitive. Madison, with the shift to a 130 nm process, allowed for enough cache space to overcome the major performance bottlenecks. Montecito, with a 90 nm process, allowed for a dual-core implementation and a major improvement in performance per watt. Montvale added three new features: core-level lockstep, demand-based switching and front-side bus frequency of up to 667 MHz.
|Merced||180 nm||2001-05-29||733 MHz||96 KB||none||266 MHz||1||1||116||2 or 4 MB off-die L3 cache|
|800 MHz||130||2 or 4 MB off-die L3 cache|
|McKinley||180 nm||2002-07-08||900 MHz||256 KB||1.5 MB||400 MHz||1||1||90||HW branchlong|
|Madison||130 nm||2003-06-30||1.3 GHz||3 MB||97|
|1.4 GHz||4 MB||91|
|1.5 GHz||6 MB||107|
|2003-09-08||1.4 GHz||1.5 MB||91|
|Deerfield||2003-09-08||1.0 GHz||1.5 MB||55||Low voltage|
|Hondo||2004-06||1.1 GHz||4 MB||2||1||170||Not a product of Intel, but of HP. 32 MB L4|
|Fanwood||2004-11-08||1.3 GHz||3 MB||1||1||62||Low voltage|
|Madison 9M||1.5 GHz||4 MB||400 MHz||122|
|1.6 GHz||6 MB|
|2005-07-05||1.67 GHz||6 MB||667 MHz|
|Itanium 2 9000 series|
|256 KB (D)+
1 MB (I)
|1||2||75–104||Virtualization, Multithread, no HW IA-32|
|Itanium 9100 series|
|256 KB (D)+
1 MB (I)
|1||1–2||75–104||Core-level lockstep, demand-based switching|
|Itanium 9300 series|
|256 KB (D)+
512 KB (I)
|10–24 MB||QPI with
|1||2–4||130–185||A new point-to-point processor interconnect, the QPI,|
replacing the FSB. Turbo Boost
|Itanium 9500 series|
|256 KB (D)+
512 KB (I)
|20–32 MB||QPI with
|1||4–8||130–170||Doubled issue width (from 6 to 12 instructions per cycle),|
Instruction Replay technology, Dual-domain hyperthreading
|Itanium 9700 series|
|256 KB (D)+
512 KB (I)
|20–32 MB||QPI with
|1||4–8||130–170||No architectural improvements over Poulson,|
5 % higher clock for the top model
When first released in 2001, Itanium's performance was disappointing compared to better-established RISC and CISC processors. Emulation to run existing x86 applications and operating systems was particularly poor, with one benchmark in 2001 reporting that it was equivalent at best to a 100 MHz Pentium in this mode (1.1 GHz Pentiums were on the market at that time). Itanium failed to make significant inroads against IA-32 or RISC, and suffered further following the arrival of x86-64 systems which offered greater compatibility with older x86 applications.
In a 2009 article on the history of the processor — "How the Itanium Killed the Computer Industry" — journalist John C. Dvorak reported "This continues to be one of the great fiascos of the last 50 years". Tech columnist Ashlee Vance commented that the delays and underperformance "turned the product into a joke in the chip industry". In an interview, Donald Knuth said "The Itanium approach...was supposed to be so terrific—until it turned out that the wished-for compilers were basically impossible to write."
Both Red Hat and Microsoft announced plans to drop Itanium support in their operating systems due to lack of market interest; however, other Linux distributions such as Gentoo and Debian remain available for Itanium. On March 22, 2011, Oracle Corporation announced that it would no longer develop new products for HP-UX on Itanium, although it would continue to provide support for existing products. Following this announcement, HP sued Oracle for breach of contract, arguing that Oracle had violated conditions imposed during settlement over Oracle's hiring of former HP CEO Mark Hurd as its co-CEO, requiring the vendor to support Itanium on its software "until such time as HP discontinues the sales of its Itanium-based servers", and that the breach had harmed its business. In 2012, a court ruled in favor of HP, and ordered Oracle to resume its support for Itanium. In June 2016, Hewlett-Packard Enterprise (the corporate successor to HP's server business) was awarded $3 billion in damages from the lawsuit. Oracle unsuccessfully appealed the decision to the California Court of Appeal in 2021.
A former Intel official reported that the Itanium business had become profitable for Intel in late 2009. By 2009, the chip was almost entirely deployed on servers made by HP, which had over 95% of the Itanium server market share, making the main operating system for Itanium HP-UX. On March 22, 2011, Intel reaffirmed its commitment to Itanium with multiple generations of chips in development and on schedule.
Although Itanium did attain limited success in the niche market of high-end computing, Intel had originally hoped it would find broader acceptance as a replacement for the original x86 architecture.
AMD chose a different direction, designing the less radical x86-64, a 64-bit extension to the existing x86 architecture, which Microsoft then supported, forcing Intel to introduce the same extensions in its own x86-based processors. These designs can run existing 32-bit applications at native hardware speed, while offering support for 64-bit memory addressing and other enhancements to new applications. This architecture has now become the predominant 64-bit architecture in the desktop and portable market. Although some Itanium-based workstations were initially introduced by companies such as SGI, they are no longer available.
...the 9700 series will be the last Intel Itanium processor.
((cite journal)): Cite journal requires
...developers can quickly develop applications today that will be compatible with and can easily be tuned for Solaris on Merced.
((cite web)): CS1 maint: unfit URL (link)
((cite web)): CS1 maint: unfit URL (link)
KitGuru Says: Even though it is highly likely that “Kittson” chips will be released, it does not seem that Intel and HP actually want to invest R&D money in boosting performance of IA-64 chips. As a result, it looks like the best thing "Kittson" will offer will be a 20 per cent performance improvement over current gen offerings.
The Gentoo/IA-64 Project works to keep Gentoo the most up to date and fastest IA-64 distribution available.
((cite web)): CS1 maint: unfit URL (link)
Once touted by Intel as a replacement for the x86 product line, expectations for Itanium have been throttled well back.
((cite web)): CS1 maint: unfit URL (link)
Windows Server 2008 R2 will be the last version of Windows Server to support the Intel Itanium architecture," [...] "SQL Server 2008 R2 and Visual Studio 2010 are also the last versions to support Itanium.