News

Intel unveils Xe-architecture-based discrete GPU for HPC

Intel's new Xe-architecture-based discrete GPU will be deployed in the Aurora supercomputer at Argonne National Laboratory

Jon Peddie

Intel made several significant announcements at the 2019 Supercomputing Conference (SC19) in Denver.

  • Officially launched oneAPI, a unified and scalable programming model for heterogeneous computing architectures.
  • Announced a general-purpose GPU based on the Xe architecture, code-named “Ponte Vecchio,” which it claims is optimized for HPC/AI acceleration.
  • Revealed additional architectural details of the exascale Aurora Supercomputer at Argonne National Laboratory.

 

In addition, Intel said Ponte Vecchio will be manufactured on Intel’s 7nm technology. The GPU will use Intel’s Foveros 3D and EMIB packaging innovations (introduced last December) and feature multiple technologies in-package, including high-bandwidth memory, Compute Express Link interconnect, and other intellectual property.

oneAPI. Intel says oneAPI will offer a developer-centric approach to heterogeneous computing. It will, claims the company, define programming for an increasingly AI-infused, multi-architecture world. It is a unified and open programming environment that lets developers pick the architecture of their choice without compromising performance, and it eliminates the complexity of separate codebases, multiple programming languages, and different tools and workflows. oneAPI, says Intel, preserves existing software investments with support for existing languages while giving developers the flexibility to create versatile applications.

Intel’s oneAPI stack (Source: Intel)

 

oneAPI includes both an industry initiative based on open specifications and an Intel beta product. The oneAPI specification includes a direct programming language, powerful APIs, and a low-level hardware interface. Intel’s oneAPI beta software is a portfolio of developer tools that includes compilers, libraries, and analyzers, packaged into domain-focused toolkits. The initial oneAPI beta release targets Intel’s Xeon processors, Intel Core processors with integrated graphics, and Intel FPGAs, with additional hardware support to follow in future releases (read: GPUs). Developers can download the oneAPI tools, test drive them in the Intel oneAPI DevCloud, and learn more about oneAPI at software.intel.com/oneAPI.

First (announced) Xe-architecture-based GPU. At SC19, Intel unveiled a new category of general-purpose GPUs based on Intel’s Xe architecture. Intel says the first of these, code-named “Ponte Vecchio,” is a new high-performance, highly flexible discrete general-purpose GPU designed for HPC modeling and simulation workloads and for AI training. Ponte Vecchio will be manufactured on Intel’s 7nm technology and will be Intel’s first Xe-based GPU optimized for HPC and AI workloads.

Features of Ponte Vecchio GPU (Source: Intel)

 

Ponte Vecchio will employ Intel’s Foveros 3D and EMIB packaging innovations and feature multiple technologies in-package, including high-bandwidth memory, Compute Express Link interconnect, and other intellectual property.

Aurora supercomputer. Intel says its data-centric silicon portfolio and oneAPI initiative lay the foundation for the convergence of HPC and AI workloads at exascale within the Aurora system at Argonne National Laboratory.

Aurora exploits a lot of Intel technology (Source: Intel)

 

Aurora will be the first U.S. exascale system and will, according to Intel, leverage the full breadth of Intel’s data-centric technology portfolio, building upon the Intel Xeon Scalable platform and using Xe architecture-based GPUs, as well as Intel Optane DC Persistent Memory and connectivity technologies. The compute node architecture of Aurora will feature two 10nm-based Intel Xeon Scalable processors (code-named “Sapphire Rapids”) and six Ponte Vecchio GPUs. How many nodes Aurora will have is still a secret. Aurora will support over 10 petabytes of memory and over 230 petabytes of storage and will use the Cray Slingshot fabric to connect nodes across more than 200 racks. [ed. Assume at least two nodes in a 1U, and a 23-inch rack, which would allow for 840 nodes, or 5,040 GPUs, or fewer depending on storage.]
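The per-node arithmetic behind the editor's note can be sketched in a few lines of Python. The two-CPU, six-GPU node configuration comes from Intel's description, and the 840-node figure is the editor's own estimate. The function names, the 42U rack height used in the usage lines, and the idea of reserving rack units for storage are assumptions made here for illustration only; Intel has not disclosed rack or node counts in this form.

```python
# Back-of-the-envelope sketch, not Intel's numbers beyond the per-node mix.
CPUS_PER_NODE = 2  # from the article: two Xeon Scalable processors per node
GPUS_PER_NODE = 6  # from the article: six Ponte Vecchio GPUs per node


def totals_for_nodes(nodes: int) -> dict:
    """Return CPU and GPU counts implied by a given node count."""
    return {
        "nodes": nodes,
        "cpus": nodes * CPUS_PER_NODE,
        "gpus": nodes * GPUS_PER_NODE,
    }


def nodes_per_rack(rack_units: int, nodes_per_unit: int = 2,
                   storage_units: int = 0) -> int:
    """Upper bound on compute nodes in one rack.

    rack_units and storage_units are hypothetical inputs; the article
    only assumes "at least two nodes in a 1U".
    """
    usable_units = max(rack_units - storage_units, 0)
    return usable_units * nodes_per_unit


# The editor's estimate of 840 nodes, expanded into CPU and GPU counts.
editor_estimate = totals_for_nodes(840)
print(editor_estimate)  # {'nodes': 840, 'cpus': 1680, 'gpus': 5040}

# Hypothetical 42U rack, first fully populated, then with 2U given to storage.
print(nodes_per_rack(42))                   # 84
print(nodes_per_rack(42, storage_units=2))  # 80
```

Changing the assumed rack height or the units reserved for storage shifts the per-rack figure, which is why the note hedges with "or fewer depending on storage."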

What do we think?

Intel’s entry into a top-to-bottom, scalable GPU architecture, which it calls Xe with an annoying superscript, promises a lot, and the device named after the Florence bridge is the first reveal of a product. However, delivery of the Aurora supercomputer to Argonne National Laboratory isn’t scheduled until 2021.

Intel’s discrete GPU will span from notebooks to supercomputers (Source: Intel)

 

It’s highly unlikely that Intel will develop a single architecture and simply step-and-repeat to scale it. More likely it will have three or four versions with different caches, I/O, transcendental units, display processors, memory managers, ROPs, and texture processors. However, as the diagram above indicates, the various designs will look the same to a programmer through a common programming model. That is a perfectly logical and legitimate way to do things, and one just has to ignore the unified-architecture marketing hype.

What Intel didn’t say. There was no mention of how many shaders it has, how much and what type of memory it uses, whether a Ponte Vecchio-based AIB would have a display output, how much it will cost, or when it will be available. Intel critics will point to that dearth of information and make analogies to Larrabee, and slippages, and other denigrating possibilities, and they will be making a serious mistake, in my opinion.

This is not a product announcement. It’s a technology announcement and a peek at where the technology is going to be employed: in the greatest supercomputer ever built. Period.

oneAPI is also promising, and difficult. That is why in this early unveiling Intel didn’t include discrete GPUs, DSPs, or NNPs. They all have unique ISAs and programming rules, and it’s going to take some mighty clever programming work to make a single programming model that does all those translations so lazy programmers don’t have to think and can just use Alexa to write code for them.

Also missing from Intel’s announced plans is any mention of ray tracing. Don’t be smug about that. Intel has plenty of ray tracing capabilities (e.g., Embree) and Intel knows how to design ASPs, so I think it’s an easy forecast to say the company will have ray tracing in its dGPUs, for both consumer and professional markets.

As for ASPs, Intel has a lot: CPUs, FPGAs, NNPs, GPUs, transcoders, communications, and security. The only things missing from the company’s arsenal are a DSP and an ISP. However, we know Intel can do them because it already has. It’s just a question of demand.

Intel has finally shaken off the albatross of one-architecture-fits-all and is moving, aggressively we think, toward whatever processor suits the job. Intel should re-brand itself as the processor company; it makes “Intel Inside” ever more apt.