New software layer sits above the Aliado SDK and integrates with ONNX Runtime to cut application code and smooth integration friction. Shipping now to customers via the Software Centre on the company's website, with production-grade samples spanning LLM chat, detection, and classification.

Semidynamics has introduced Inferencing Tools, a software suite intended to accelerate the deployment of AI applications on its Cervell RISC-V NPU. The tools sit above the company’s Aliado SDK and use an ONNX Runtime Execution Provider for Cervell, exposing higher-level APIs for session setup, tensor management, and inference orchestration. The goal is to shorten the time from a trained ONNX model to a running product while keeping the application code cleaner.
Semidynamics positions the stack in two lanes: the Aliado SDK for low-level tuning and peak performance, and Inferencing Tools for faster iteration and production hardening. No model conversion is required, the company says, and it lists ready-to-adapt examples for LLM chat (e.g., Llama, Qwen), object detection (the YOLO family), and image classification (ResNet, MobileNet, AlexNet). The company states the tools have been validated across a range of ONNX models, and availability is immediate to Semidynamics customers and partners via the Software Centre.
What do we think?
A focus on enabling models, not just shipping silicon, has worked well elsewhere in the ecosystem: the vendors that win mindshare reduce time-to-first-demo and time-to-production with turnkey runtimes, tested samples, and opinionated APIs. As Semidynamics leans harder into AI differentiation, this move makes strategic sense—especially if the ONNX EP is robust and the examples cover the long tail of real integration gotchas (tokenization, streaming I/O, pre/post-processing, and memory pressure on long sequences).
The success metrics here are:
- Friction removed: zero-conversion paths from popular ONNX models, clear error reporting, and sane defaults.
- Coverage and transparency: published support matrices (opsets, operators, precisions), plus guidance on when inference falls back to CPU/RISC-V cores.
- Repeatable performance: reference configs with measured latency, throughput, and power on Cervell, not just "works on my model".
If Semidynamics keeps tightening the loop—shipping maintained examples, exposing profiling hooks, and documenting kernel offload decisions—Inferencing Tools can become the shortest route to Cervell value.