News

Imagination introduces its new E-Series GPU

Good for graphics and AI—simultaneously.

David Harold

In 2023, Imagination Technologies shifted its AI strategy, discontinuing its NNA to integrate AI into its new E-Series GPUs, offering up to 200 TOPS. This reflects skepticism towards dedicated NPUs and prioritizes power efficiency for edge markets like automotive and AI PCs. The E-Series features deep integration, reusing GPU components to minimize data movement and enhance efficiency. Key innovations include Neural Cores and Burst Processors, supporting diverse data types and developer tools. Imagination aims for realistic utilization and improved multitasking, with the first products expected in autumn 2025.

In 2023, Imagination Technologies shifted its AI strategy by discontinuing its Neural Network Accelerator (NNA) and focusing on integrating AI capabilities into its GPUs.

This decision reflects a degree of skepticism about the long-term viability of dedicated neural processing units (NPUs). The new AI solution, part of the E-Series, offers up to 200 TOPS, balancing performance and power efficiency. Imagination’s target market extends beyond smartphones to include automotive applications, AI-powered personal computers, and other edge devices not continuously connected to the cloud.

IMG E-Series
(Source: Imagination)

Automotive applications are a primary market, with growth also expected in robotics, which relies on AI for sensor fusion, data analysis, planning, and prediction. In mobile and consumer electronics, the focus is on computer vision, photography, intelligent assistants, and natural language processing. General edge computing sees applications in predictive maintenance and other AI-driven functions across industries. I anticipate strong adoption in the automotive sector, but success in the mobile market remains uncertain.

Imagination’s renewed focus on core competencies involves deep integration of AI acceleration within the GPU architecture.

Placing an AI accelerator elsewhere in the system, outside the GPU, increases silicon area, power consumption, and data-movement overhead, because the accelerator needs its own dedicated SRAM and custom logic. Some GPU manufacturers are exploring separate AI acceleration blocks, but this approach requires transferring large chunks of data to the block, keeping its arithmetic logic units (ALUs) utilized, and retrieving the processed results, all of which imposes bandwidth demands and area costs.
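The bandwidth cost of shipping tensors to a separate accelerator block can be sketched with rough arithmetic. All figures below (tensor size, link bandwidth) are illustrative assumptions, not Imagination's specifications:

```python
# Illustrative estimate of the round-trip cost of moving activations to an
# external AI block and back. All hardware numbers here are assumptions.

def transfer_ms(tensor_bytes: int, bandwidth_gbps: float) -> float:
    """Milliseconds to move a tensor over a link of the given bandwidth (GB/s)."""
    return tensor_bytes / (bandwidth_gbps * 1e9) * 1e3

# Hypothetical layer: a 512x512 feature map with 256 int8 channels (~67 MB).
activations = 512 * 512 * 256

# The data must travel out to the accelerator and back, doubling the cost.
round_trip = 2 * transfer_ms(activations, bandwidth_gbps=50.0)
print(f"round trip: {round_trip:.2f} ms")
```

Deep integration avoids this tax by keeping the tensor in registers and on-chip memory the GPU's compute core already uses.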

Imagination has chosen deep integration, reusing existing GPU components like registers, memory, and scheduling mechanisms and keeping processing elements close to the compute core. This minimizes data movement, maximizes operational efficiency, and optimizes power use. It also aligns with modern compute and graphics APIs such as OpenCL and Vulkan, which interleave compute and graphics work. Nvidia takes a similar approach, co-locating tensor and CUDA cores so they share resources. Like most modern GPUs, the design relies on scratchpads and buffers, not just caches, for peak performance.

The Imagination E-Series GPU IPs support both graphics and AI tasks at the edge, scaling from 2 to 200 TOPS for AI workloads using int8 and FP8 precision. The E-Series serves as the primary processor for graphics and compute in edge systems, potentially eliminating the need for separate AI hardware. It builds upon Imagination’s GPU architecture to manage these demands, offering flexibility for designers of systems in various applications.

Kristof Beets, Imagination’s VP of product management, noted that concepts similar to deferred rendering in graphics apply to compute tasks like pruning and sparse processing. This allows reusing existing GPU capabilities, enabling AI functionality with minimal silicon area increase. The memory subsystem has been made more flexible for both graphics and compute customers. While the AI implementation doesn’t reuse the exact same hardware engine as deferred rendering, the principle of workload avoidance is valuable.

The E-Series introduces Neural Cores for AI acceleration up to 200 TOPS and Burst Processors for improved power efficiency in edge applications by reducing data movement and pipeline depth. The E-Series supports ray tracing and quadruples the AI capability of the D-Series. It supports various AI data formats and incorporates a memory design that reduces reliance on external memory, improving efficiency.

The fundamental GPU building block, similar to the DXTP GPU, scales up to four cores. On current process nodes, the E-Series can reach around 13 TFLOPS and slightly over 200 TOPS of int8 performance, also supporting bfloat16 and FP16 at half-rate. This improves density over the D-Series, which had a quarter-rate 8-bit pipeline for AI.
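The half-rate relationship between int8 and the 16-bit formats follows from simple peak-rate arithmetic. The MAC count, clock, and core count below are hypothetical placeholders chosen to land near the article's "slightly over 200 TOPS" figure, not disclosed specifications:

```python
# Peak-rate arithmetic for a GPU running bfloat16/FP16 at half the int8 rate.
# All hardware parameters here are illustrative assumptions.

def peak_tops(macs_per_core: int, clock_ghz: float, cores: int) -> float:
    """Peak int8 TOPS: each MAC counts as 2 ops (multiply + add) per cycle."""
    return macs_per_core * 2 * clock_ghz * cores / 1e3

int8_tops = peak_tops(macs_per_core=16384, clock_ghz=1.6, cores=4)  # hypothetical
fp16_tops = int8_tops / 2  # half-rate 16-bit pipeline, per the article
print(f"int8: {int8_tops:.1f} TOPS, FP16/bf16: {fp16_tops:.1f} TOPS")
```

By contrast, the D-Series' quarter-rate 8-bit pipeline would put int8 at only half the 16-bit improvement shown here, which is why the E-Series claims a density gain.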

Imagination leverages the GPU's existing infrastructure, its texture-format support and data-format conversion engines, to handle flexible data types. Existing block-compression capabilities can handle compressed AI formats, so the AI-specific logic added to the E-Series is relatively small. The E-Series GPUs integrate with developer tools and frameworks such as OpenCL, Apache TVM, oneAPI, and Imagination's own libraries. Multitasking performance is enhanced, with support for up to 16 virtual machines under quality-of-service management. The first E-Series product is expected in autumn 2025, with versions targeting the automotive, consumer, desktop, and mobile markets.

Imagination views its E-Series as part of a trend where general-purpose GPUs are being adapted for AI tasks. By integrating AI capabilities, Imagination aims to reduce data movement and improve power efficiency, aligning with open standards. Architecturally, the Burst Processor reduces latency, improving performance and efficiency, especially for smaller workloads, by enabling data reuse and simplifying scheduling. Dedicated Matrix Multiplication Engines operate within this Burst framework.

The scheduling system is hierarchical. A hardware-based scheduler manages data dependencies, while a RISC-V firmware processor handles task prioritization in virtualized environments. Imagination continues to evolve the RISC-V cores it uses internally. The company targets close to 50% average ALU utilization for neural networks, an improvement over typical industry figures. Support for multiple virtual domains is crucial for automotive clients, and the E-Series doubles the previous count to 16. On-chip shared memory and embedded RISC-V controllers coordinate the GPU with other accelerators, reducing latency by keeping data movement on-chip. The architecture benefits from substantial tightly coupled memory.
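The utilization target translates directly into deliverable throughput. A one-line sketch, using the article's top-end 200-TOPS figure and the stated ~50% goal:

```python
# Effective AI throughput = peak TOPS x average ALU utilization.
peak_tops = 200.0    # top-end E-Series int8 figure cited in the article
utilization = 0.5    # Imagination's stated ~50% average target for neural networks

effective_tops = peak_tops * utilization
print(f"{effective_tops:.0f} effective TOPS")
```

This is why realistic utilization matters more than headline TOPS: a nominally faster accelerator that sustains only 20-30% utilization can deliver less real work.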

For an in-depth examination of Imagination's new E-Series, with opinion, go here.
