
MAX78000 brings CNN inference to battery-powered edge

ADI’s Maxim CNN accelerator targets AI at microjoule energy levels.

Jon Peddie

The MAX78000 makes AI inference practical for battery-powered devices. Maxim Integrated, acquired by Analog Devices in 2021 for $21B, built a hardware CNN accelerator directly into a microcontroller, delivering inference at microjoule energy levels. That is not a software optimization or a DSP trick; it is a dedicated neural network engine with 442 KB of weight SRAM, flexible 1- to 8-bit weight precision, and a PyTorch/TensorFlow toolchain. For IoT, wearables, and edge sensing, it shifts the design calculus toward running inference on the device itself.

MAX78000

Maxim Integrated launched the MAX78000 in October 2020, roughly a year before Analog Devices completed its acquisition of the company. The chip combines three compute elements on a single die: an Arm Cortex-M4 with FPU running at up to 100 MHz for system control, a 32-bit RISC-V coprocessor at up to 60 MHz for auxiliary processing, and a dedicated hardware CNN accelerator optimized for deep convolutional neural network inference.

The CNN engine drives the chip’s value proposition. It stores up to 442 KB of weights in on-chip SRAM and supports 1-, 2-, 4-, and 8-bit weight precision, so the same memory holds roughly 442,000 weights at 8 bits or up to 3.5 million at 1 bit. Because the weight memory is SRAM-based, network updates deploy without reflashing. The engine pairs the weight storage with 512 KB of data memory and supports input images up to 1,024 × 1,024 pixels. Network depth reaches 64 programmable layers, with per-layer channel widths up to 1,024 channels. The architecture handles 1D and 2D convolution, streaming mode, MLP, and recurrent network types.
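The link between weight precision and network size is simple arithmetic. The sketch below is a rough illustration, assuming that 442 KB means 442 × 1,024 bytes and that weights pack with no per-layer overhead; both are assumptions rather than datasheet statements, and `max_weights` is a hypothetical helper name.

```python
# Rough weight-capacity arithmetic for a fixed-size weight memory.
# Assumption: 442 KB = 442 * 1024 bytes, with no packing overhead.
WEIGHT_MEMORY_BYTES = 442 * 1024


def max_weights(bits_per_weight: int, memory_bytes: int = WEIGHT_MEMORY_BYTES) -> int:
    """Return how many weights of the given precision fit in the memory."""
    if bits_per_weight not in (1, 2, 4, 8):
        raise ValueError("supported precisions are 1, 2, 4, and 8 bits")
    return memory_bytes * 8 // bits_per_weight


for bits in (8, 4, 2, 1):
    print(f"{bits}-bit weights: {max_weights(bits):,}")
```

Under these assumptions the 1-bit case comes to about 3.6 million weights, close to the 3.5 million quoted above, while the 8-bit case holds only one-eighth as many. Larger networks fit only by giving up precision.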


Figure 1. Maxim MAX78000 block diagram. (Source: ADI)

The MCU subsystem brings 512 KB of flash, 128 KB of SRAM, a 12-bit parallel camera interface, I2S for digital audio, and up to 52 GPIO pins. Security features include AES-128/192/256 hardware acceleration, a true random number generator for key seeding, and optional secure boot.

Energy architecture

The MAX78000 does not compete on TOPS—Maxim and ADI have never published a peak TOPS figure. The chip competes on energy per inference. The hardware CNN accelerator executes inference in microjoules, while a software-equivalent inference on a standard Cortex-M MCU consumes roughly 100× more energy and takes 100× longer. That gap comes entirely from the dedicated accelerator: Moving weight data once into SRAM and executing convolution in hardware eliminates the repeated memory fetch cycles that dominate software inference on general-purpose cores.
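To see what a 100× energy gap means for a fixed energy budget, consider the sketch below. Only the 100× ratio comes from the text; the budget and the per-inference energy are hypothetical placeholders chosen for round numbers, not measured MAX78000 figures.

```python
# Illustrative energy budget: hardware CNN accelerator vs. software inference.
# Every numeric input is a hypothetical placeholder except the ~100x ratio,
# which is the figure quoted in the text.
SOFTWARE_ENERGY_RATIO = 100  # software inference costs ~100x the energy


def inferences_per_budget(budget_joules: float, energy_per_inference_uj: float) -> int:
    """Number of whole inferences a given energy budget can pay for."""
    return int(budget_joules * 1_000_000 // energy_per_inference_uj)


budget_j = 100.0   # hypothetical energy set aside for inference, in joules
accel_uj = 50.0    # hypothetical accelerator cost per inference, in microjoules
software_uj = accel_uj * SOFTWARE_ENERGY_RATIO

print("accelerator:", inferences_per_budget(budget_j, accel_uj))
print("software:   ", inferences_per_budget(budget_j, software_uj))
```

Whatever the real per-inference figure is for a given network, the ratio carries straight through: the same battery pays for about 100 times as many inferences on the accelerator.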

Power management reinforces the architecture. An integrated single-inductor multiple-output switch-mode power supply accepts 2.0 V to 3.6 V input and uses dynamic voltage scaling to minimize active core consumption. The Cortex-M4 draws 22.2 µA/MHz executing from cache at 3.0 V. Selective SRAM retention in low-power modes, with the real-time clock running, preserves application state across sleep intervals—a requirement for duty-cycled IoT endpoints.
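The per-MHz figure converts to an active-current estimate with one multiplication. This sketch assumes the 22.2 µA/MHz figure scales linearly up to the 100 MHz maximum clock and ignores regulator losses and peripheral current; the function names are illustrative.

```python
# Active-current estimate from a per-MHz current figure.
# 22.2 uA/MHz at 3.0 V is the figure quoted in the text; treating it as
# linear across frequency and ignoring regulator losses is an assumption.
UA_PER_MHZ = 22.2
SUPPLY_VOLTS = 3.0


def core_current_ma(freq_mhz: float) -> float:
    """Estimated Cortex-M4 core current in milliamps at the given clock."""
    return UA_PER_MHZ * freq_mhz / 1000.0


def core_power_mw(freq_mhz: float) -> float:
    """Estimated core power in milliwatts at the assumed supply voltage."""
    return core_current_ma(freq_mhz) * SUPPLY_VOLTS


print(f"{core_current_ma(100):.2f} mA, {core_power_mw(100):.2f} mW")
```

At 100 MHz this works out to about 2.2 mA, or roughly 6.7 mW at 3.0 V, for the Cortex-M4 core alone.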

Toolchain

Engineers train networks in PyTorch or TensorFlow using standard workflows, then run Maxim’s conversion tools to map the trained model onto the CNN engine’s weight memory and layer configuration. The toolchain handles quantization-aware training for 1-, 2-, 4-, and 8-bit weight formats, generates the weight binary, and produces the C code to configure the engine at runtime. The workflow targets the existing AI engineering base—teams working in standard frameworks do not adopt a new training environment.
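For readers unfamiliar with weight quantization, the sketch below shows a generic symmetric scheme that maps floating-point weights to signed integers of a chosen bit width. It is a textbook illustration in plain Python, not Maxim’s conversion tool or its actual quantization method, and the function names are hypothetical. It leaves out the 1-bit case, which needs a different encoding.

```python
# Generic symmetric weight quantization, shown for illustration only.
# This is NOT Maxim's conversion tool or its actual quantization scheme.


def quantize(weights: list[float], bits: int) -> tuple[list[int], float]:
    """Map float weights to signed integer codes of the given bit width.

    Returns the integer codes and the scale needed to dequantize them.
    """
    if bits not in (2, 4, 8):
        raise ValueError("this sketch covers 2-, 4-, and 8-bit signed codes")
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest code, then clamp into the representable range.
    codes = [min(qmax, max(qmin, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]


weights = [0.75, -0.25, 0.125, -1.0]
codes, scale = quantize(weights, bits=8)
print(codes)
print(dequantize(codes, scale))
```

Quantization-aware training exists because this rounding step loses information. Training with the rounding in the loop lets the network adapt to it, which matters most at 4 bits and below.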

Target applications

The combination of microjoule inference, camera interface, audio input, and security features maps directly to a set of application categories: always-on keyword spotting in wearables, face detection in battery-powered access control panels, anomaly detection in industrial sensors, gesture recognition in consumer electronics, and video content classification in edge cameras. Each of these applications shares a common constraint—inference must run continuously on a coin cell, AA battery, or energy-harvested supply, with no path to active cooling.

The MAX78000 also meets a second requirement that cloud inference cannot: low latency and independence from connectivity. A camera running face detection locally responds in milliseconds with no network round trip and no dependency on cellular or Wi-Fi availability.

A follow-on product, the MAX78002, launched in 2022 under ADI stewardship with a larger CNN engine and higher throughput for more complex networks. Both products continue in active production under the Analog Devices portfolio.

The MAX78000 established a design philosophy that has since become mainstream: AI inference at the endpoint requires hardware purpose-built for that workload, not a faster general-purpose processor. The chip validated that a dedicated CNN accelerator integrated with an MCU, priced near $5, unlocks inference in product categories that GPUs and high-power NPUs cannot reach economically, and that validation directly shaped the subsequent generation of AI microcontrollers from multiple vendors.

For CIOs evaluating AI deployment at scale, the MAX78000 represents a cost and architecture model that cloud inference cannot match at the endpoint. The economics favor local inference whenever devices are battery-powered, privacy-constrained, connectivity-limited, or deployed in volumes where per-inference API costs accumulate. ADI’s continued investment in the follow-on MAX78002 confirms the architecture found a durable market. Procurement teams sourcing AI-capable microcontrollers for new IoT programs should include this product family in their evaluation alongside Nordic Semiconductor’s nRF9161, Ambiq’s Apollo5, and STMicroelectronics’ STM32N6.

What do we think?

The MAX78000 gets the IoT AI economics right. Hardware inference at microjoule energy levels, a $5 price point, standard training tools, and no recurring cloud cost—that combination addresses the real constraints of large-scale endpoint deployment. IT decision-makers evaluating smart building, industrial, or retail AI programs should treat this chip family as a baseline reference point, not an afterthought.

The MAX78000’s 2020 launch marked an inflection point in how the industry approached IoT AI—the recognition that cloud inference cannot scale economically or reliably to billions of battery-powered endpoints. That inflection point redirected development investment toward hardware CNN accelerators embedded in microcontrollers, enabling always-on AI at costs and power levels that cloud models cannot reach. Every major MCU vendor now ships or roadmaps an on-chip AI accelerator, validating the market thesis the MAX78000 demonstrated in production first.

ADI’s Maxim is one of the 152 AI processors in our AI Processor Tracking Service, which also lists performance and other specifications for 291 products.
