Modern graphics processing units (GPUs) continue to chase higher computational performance, but memory architecture, rather than processing power, increasingly imposes the greatest limitations. While static random-access memory (SRAM) currently plays a key role in ensuring fast data access, scaling this technology presents mounting challenges. The difficulty of further miniaturizing SRAM cells hinders density improvements and drives up power consumption. In response, researchers have explored alternatives such as amorphous oxide semiconductors (AOS) to create persistent, high-bandwidth memory systems that could better support GPU workloads.

A team at the Georgia Institute of Technology’s School of Electrical and Computer Engineering—Faaiq Waqar, Ming-Yen Lee, Seongwon Yoon, Seongkwang Lim, and Shimeng Yu—focused their efforts on stacking persistent embedded memories using AOS technology over traditional CMOS circuitry. In their study titled “CMOS+X: Stacking Persistent Embedded Memories based on Oxide Transistors upon GPGPU Platforms,” the researchers investigated how AOS transistors might integrate with gain-cell memory designs to bypass the bottlenecks that GPUs experience in register file access and last-level cache bandwidth.
They evaluated how these bottlenecks limit data delivery to SIMD (single-instruction, multiple-data) lanes, thereby slowing down overall performance. To address this, the team examined high-bandwidth banked memories built with multi-ported gain cells formed from AOS materials. These configurations promised persistent storage and energy efficiency, while offering faster access to stored data.
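To make the bandwidth argument concrete, a simple roofline-style check shows how a SIMD workload becomes limited by how fast memory can feed the lanes rather than by compute. The sketch below is not drawn from the paper; the peak-throughput and bandwidth figures are hypothetical placeholders.

```python
# Roofline-style sketch: is a SIMD workload compute-bound or bandwidth-bound?
# Illustrative only; the GPU figures below are hypothetical placeholders,
# not numbers from the CMOS+X paper.

def attainable_gflops(arith_intensity_flops_per_byte: float,
                      peak_gflops: float,
                      mem_bw_gbps: float) -> float:
    """Return the roofline limit for a given arithmetic intensity."""
    return min(peak_gflops, arith_intensity_flops_per_byte * mem_bw_gbps)

peak_gflops = 20000.0   # hypothetical peak SIMD throughput (GFLOP/s)
mem_bw_gbps = 900.0     # hypothetical memory/cache bandwidth (GB/s)

for ai in (0.5, 4.0, 32.0):  # FLOPs performed per byte moved
    perf = attainable_gflops(ai, peak_gflops, mem_bw_gbps)
    bound = "bandwidth-bound" if perf < peak_gflops else "compute-bound"
    print(f"AI={ai:5.1f} FLOP/B -> {perf:8.1f} GFLOP/s ({bound})")
```

At low arithmetic intensity the lanes sit idle waiting on memory, which is exactly the regime the banked AOS memories are meant to relieve.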
Using monolithic three-dimensional (M3D) integration and back-end-of-line (BEOL) fabrication, the team constructed multiplexed memory arrays, layering the new memory cells above the existing logic without disrupting the transistor layers below. By exploiting BEOL fabrication, the team preserved silicon area and interconnect bandwidth, yielding compact, vertically stacked memory blocks. M3D techniques further improved system density and reduced interconnect delay.
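The area benefit of stacking can be seen with back-of-the-envelope arithmetic. The sketch below uses made-up block sizes, not figures from the study, to show why placing memory above logic rather than beside it preserves footprint.

```python
# Illustration of why BEOL/M3D stacking preserves silicon footprint.
# Block areas are arbitrary placeholder values (in mm^2), not from the paper.

logic_area_mm2 = 4.0     # hypothetical CMOS logic tile
memory_area_mm2 = 3.0    # hypothetical AOS gain-cell array

planar_footprint = logic_area_mm2 + memory_area_mm2        # side by side in 2D
stacked_footprint = max(logic_area_mm2, memory_area_mm2)   # memory above logic

saving = 1.0 - stacked_footprint / planar_footprint
print(f"Planar: {planar_footprint:.1f} mm^2, stacked: {stacked_footprint:.1f} mm^2 "
      f"({saving:.0%} footprint saved)")
```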
The AOS-based memory architecture performed well compared to traditional technologies such as High Bandwidth Memory (HBM) and Graphics Double Data Rate 6 (GDDR6). The researchers found the new design offered competitive power efficiency and speed, while using less area. They also proposed pathways to improve these results by investigating new device materials and cell structures.
To fully leverage the AOS-based memory system, the team designed a memory controller that exploits its low latency and high throughput. This controller used an intelligent data placement method to keep frequently accessed information close to processing elements, and it featured a cache coherence protocol that ensured consistency across memory banks.
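The paper's controller is not reproduced here, but the general idea of keeping hot data in near banks can be sketched with a simple frequency-based placement policy; the bank labels, capacity, and promotion threshold below are hypothetical.

```python
# Minimal sketch of a hot/cold data-placement policy for banked memory.
# This illustrates the general idea (keep frequently accessed lines in
# near banks), not the controller described in the paper.
from collections import Counter

class PlacementController:
    def __init__(self, near_capacity: int, promote_threshold: int = 4):
        self.near_capacity = near_capacity          # lines that fit in near banks
        self.promote_threshold = promote_threshold  # hits before promotion
        self.hits = Counter()                       # access counts per line
        self.near_bank = set()                      # lines currently placed near

    def access(self, line: int) -> str:
        """Record an access and return which bank serves it."""
        self.hits[line] += 1
        if line in self.near_bank:
            return "near"
        # Promote hot lines, evicting the coldest near-bank resident if full.
        if self.hits[line] >= self.promote_threshold:
            if len(self.near_bank) >= self.near_capacity:
                coldest = min(self.near_bank, key=lambda l: self.hits[l])
                self.near_bank.remove(coldest)
            self.near_bank.add(line)
        return "far"

ctrl = PlacementController(near_capacity=2)
for addr in [7, 7, 7, 7, 7, 3, 3, 9]:
    print(addr, ctrl.access(addr))
```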
The team analyzed the system’s power consumption and identified how different components contributed to energy use. They applied optimization techniques to reduce these losses, finding that the AOS-based memory consumed significantly less standby power than SRAM. These results demonstrated its potential for greater energy efficiency in large-scale systems.
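The standby-power argument comes down to SRAM paying continuous leakage while a persistent gain-cell array pays, at most, periodic refresh. The comparison below is a minimal sketch with placeholder numbers, not measurements from the study.

```python
# Illustrative standby-energy comparison; all values are hypothetical
# placeholders, not measurements from the study.

def standby_energy_uj(leakage_uw: float, refresh_uw: float, idle_s: float) -> float:
    """Standby energy (microjoules) = (leakage + refresh power) * idle time."""
    return (leakage_uw + refresh_uw) * idle_s

idle_s = 10.0
sram = standby_energy_uj(leakage_uw=50.0, refresh_uw=0.0, idle_s=idle_s)
aos_gain_cell = standby_energy_uj(leakage_uw=0.5, refresh_uw=1.0, idle_s=idle_s)

print(f"SRAM standby energy (hypothetical):          {sram:8.1f} uJ")
print(f"AOS gain-cell standby energy (hypothetical): {aos_gain_cell:8.1f} uJ")
```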
The researchers also evaluated how well the AOS-based memory supported emerging computational fields such as AI, machine learning, and scientific simulation, workloads that demand both fast and efficient memory, and the system showed strong performance in these areas. Additionally, the team explored in-memory computing capabilities, which allow operations to occur directly within memory rather than shuttling data back and forth between memory and the processor.
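To illustrate why in-memory computing matters for these workloads, the sketch below counts the bytes that cross the memory interface for a simple reduction done conventionally versus near the data; it is a conceptual example, not the mechanism evaluated in the paper.

```python
# Conceptual illustration of why in-memory computing reduces data movement.
# Not the mechanism from the paper; the array size is illustrative.
import numpy as np

n = 1_000_000
data = np.ones(n, dtype=np.float32)

# Conventional path: every operand crosses the memory interface to the processor.
bytes_moved_conventional = data.nbytes            # ~4 MB of operands
result = float(data.sum())

# In-memory path: the reduction happens where the data lives,
# and only the scalar result crosses the interface.
bytes_moved_in_memory = np.float32(0).nbytes      # 4 bytes

print(f"Sum = {result}")
print(f"Conventional: {bytes_moved_conventional:,} bytes moved")
print(f"In-memory:    {bytes_moved_in_memory:,} bytes moved")
```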
The published work includes detailed circuit designs and simulation models, offering a road map for other researchers to replicate or extend the architecture. By releasing these tools publicly, the team hopes to encourage broader exploration of persistent memory technologies.
They concluded that AOS-based embedded memory, integrated using advanced CMOS and 3D stacking techniques, offers a promising path forward for high-performance, energy-efficient GPU architectures. Further refinement and manufacturing research could help scale the solution for widespread use in next-generation graphics and computing systems.
You can find the technical paper here and here on arXiv.