Tenstorrent goes general-purpose in an age of AI specialization

Company says one system can do the whole inference job.

May 1, 2026

David Harold

Tenstorrent has made its Galaxy Blackhole systems generally available, pitching them as scalable, general-purpose AI infrastructure for LLM inference and video generation. The company claims up to 350 tokens per second per user on DeepSeek-R1-0528 671B in Blitz Mode, with sub-four-second time to first token on a 100K-token context. The larger story is not a single benchmark but Tenstorrent’s argument that AI infrastructure is becoming too specialized while model architectures are still moving. This is a shot across the bows of disaggregated inference, and a restatement of Tenstorrent’s long-running thesis around scale, openness, and AI. (Source: Tenstorrent)

Tenstorrent has announced general

...
