Posted: By Jon Peddie 10.10.19
Founded in 2010 in Seoul by Dr. Hyung Min Yoon, formerly at Samsung, Hee-Jin Shin from LG, Byoung Ok Lee from MtekVision, and Woo Chan Park from Sejong University, SiliconArts took on the formidable task of designing and manufacturing a ray tracing hardware accelerator co-processor, which they called RayCore.
The company showed its first implementation in an FPGA in 2014, and it was impressive then (see TechWatch volume 14, number 17, August 26, 2014, p. 19). During the past four years, the company has been steadily making improvements, expanding its product line, and developing an ambitious and impressive road map. We have tried to encapsulate all that in this report.
The first implementation of SiliconArts’ ray tracing hardware accelerator was the RayCore 1000, which is described in the following paragraphs and diagrams.
|SiliconArts’ RayCore 1000 block diagram|
The RayCore 1000 features:
- Ray tracing optics effects
- Natural expression of light (ex. reflection, refraction, transmission, shadow)
- Based on specular reflection
- Ray tracing-specific graphics effect
- Moving light
- Global lighting
- Global shadow
- Colored shadow
- Textured shadow
- 2nd shadow
- Global reflection
- Global refraction
- Global transmission
- Optical effects
- Capable of implementing various special effects (with Shader)
- Motion Blur
- Depth of field
- Light shaft
- Other filtering techniques
The company is targeting the RayCore 1000 at smartphones, tablets, and embedded and industrial processors for VR and AR.
In a test using Autodesk 3ds Max 2019, the company achieved the following results with the RayCore 1000:
- CPU only
  - VRay S/W, default option: 61 s
  - Art S/W, draft: 20 s
  - Arnold S/W, default option: 38 s
- RayCore on Intel Arria 10 FPGA
  - RayCore Plug-in S/W, max option: 1 s
The host system was an Intel Pentium Gold G5600 @ 3.9 GHz (4 CPU).
|Autodesk 3ds Max 2019 test with 2 omni lights and 12,268 triangles|
The following tables show the general performance of the RayCore 1000.
|SiliconArts’ RayCore 1000 performance|
The SiliconArts RayCore 1000 has been available for some time. The next generation is the RayCore 2000 described in the following section.
In 2018, the company introduced its RayCore 2000 realtime ray tracing IP design, capable of 250 Mray/s @ 500 MHz (4 cores) at 2048 × 2048 resolution.
|SiliconArts’ RayCore 2000 block diagram|
The new core offers several capabilities.
|Ray tracing functions||Realtime reflection, refraction, transmission, shadow; colored shadow, textured shadow, multi shadows; multiple ray bounces (up to 15 bounces)|
|Lighting||Point light, spotlight, directional light; multiple light sources support, global lighting; up to 64 light positions|
|Shading & texture mapping||Phong shading; multi-texturing, normal mapping, displacement mapping|
|Others||Anti-aliasing, SDR (Selectively Dithered Rendering); dynamic/static scene support; scalable architecture (multi-cores support); stereoscopic 3D display support|
|API||RayCore API (OpenGL ES 1.1-familiar)|
|FPGA Platform||BitWare Intel Stratix V Platform|
|RayCore 2000 features|
The RayCore 2000 offers displacement mapping, in which geometric points are actually moved according to a given height field. This enhances the level of realism while cutting out complicated modeling processes. The processor also supports ambient occlusion, which calculates how exposed each point in a scene is to ambient lighting.
It also offers multi-texturing and overlapping of multiple texture images, as well as light binding on a target object.
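To make the displacement mapping idea concrete, here is a minimal software sketch, assuming each vertex carries a unit normal and a height already sampled from the height field. The function name `displace` and the `scale` parameter are illustrative only and say nothing about how RayCore 2000 implements the feature in hardware.

```python
def displace(vertices, normals, heights, scale=1.0):
    """Move each vertex along its unit normal by its sampled height.

    vertices, normals: lists of (x, y, z) tuples; heights: one value per vertex.
    `scale` is a hypothetical artist control for the displacement strength.
    """
    displaced_vertices = []
    for (vx, vy, vz), (nx, ny, nz), height in zip(vertices, normals, heights):
        d = height * scale
        displaced_vertices.append((vx + nx * d, vy + ny * d, vz + nz * d))
    return displaced_vertices


# A flat quad facing +z, with a made-up height value at each corner.
flat = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
normals = [(0, 0, 1)] * 4
heights = [0.0, 0.5, 1.0, 0.25]

displaced = displace(flat, normals, heights, scale=0.2)
for vertex in displaced:
    print(vertex)
```

The geometry itself changes, which is what separates displacement mapping from normal mapping: the latter only perturbs shading normals and leaves the silhouette flat.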
The company used a scene from a living room to benchmark some of the performance characteristics of the processor.
|Ray tracing reveals details. (Source: SiliconArts)|
Living room content benchmark
- BitWare Altera Stratix V (100 MHz, RC2042), 4 units
- Resolution: FHD (1920 × 1080)
- Content complexity: 280k primitives
- Light source: 5 lights (1 global omni light, 4 local spot lights with light binding)
- SDR: On (SDR threshold 24)
The following table shows the results.
|SiliconArts’ RayCore 2000 benchmark results|
This month the company is introducing a powerful path tracing acceleration GPU IP solution it is calling “Lite.” It features:
- ‘Unified’ traversal and intersection test
- Based on high-performance MIMD architecture
- Highly applicable for servers and high-end GPU chips
|SiliconArts’ RayCore Lite architecture flow chart|
Hardware traversal and intersection test (T&I) performance for ray/path tracing is 760 Mray/s (16 cores @ 190 MHz). Traversal and intersection testing are common to both ray tracing and path tracing. The traversal unit searches a tree data structure to find the object hit by a ray, and the intersection test unit finds the exact or closest point at which the ray intersects that object. The RT core of Nvidia RTX is a kind of T&I unit.
The 760 Mray/s figure is an ideal performance based on the clock speed and the number of clock cycles required to complete the logic pipeline. The effective performance is 160 Mray/s, as benchmarked on an Intel Arria 10 PAC card with the Cornell Box scene.
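For readers unfamiliar with the two stages, the following is a software sketch of traversal and intersection testing, not SiliconArts' hardware design. It assumes a flat list of bounding boxes instead of a tree, uses the standard slab test for the ray/box step and the Möller–Trumbore test for the ray/triangle step, and all function names are illustrative.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_hits_box(origin, direction, box_min, box_max):
    """Traversal step (slab test): does the ray enter the box at some t >= 0?"""
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this slab: it must start inside it.
            if o < lo or o > hi:
                return False
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
    return t_near <= t_far

def ray_hits_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Intersection step (Moller-Trumbore): hit distance t, or None on a miss."""
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:          # ray is parallel to the triangle plane
        return None
    inv_det = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv_det
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv_det
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv_det
    return t if t > eps else None

def closest_hit(origin, direction, leaves):
    """Test only triangles whose box the ray enters; keep the nearest hit."""
    best = None
    for box_min, box_max, triangles in leaves:
        if not ray_hits_box(origin, direction, box_min, box_max):
            continue
        for tri in triangles:
            t = ray_hits_triangle(origin, direction, *tri)
            if t is not None and (best is None or t < best):
                best = t
    return best

# Two made-up triangles facing the origin at z = 1 and z = 3.
def make_leaf(z):
    tri = ((-1, -1, z), (1, -1, z), (0, 1, z))
    return ((-1, -1, z), (1, 1, z), [tri])

leaves = [make_leaf(3.0), make_leaf(1.0)]
hit = closest_hit((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), leaves)
miss = closest_hit((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), leaves)
print(hit, miss)
```

A real traversal unit walks a tree (such as a KD-tree) so that most boxes are never tested at all; the flat loop here only keeps the example short.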
The feature list is as follows:
|Path-tracing functions||Traversal & intersection test
Input data: ray information
Output data: hit-point calculation results
Path tracing & ray tracing functions with SW
|Others||Scalable architecture (multi-cores support)
S/W controllable T&I
Block-based ray generation & transmission
Pipelined ray block transfer
MS DirectX (under development)
3ds Max plug-in for path tracing rendering (under development)
Blender plug-in for ray tracing rendering (under development)
|FPGA platform||Intel OPAE/PAC support|
|SiliconArts’ RayCore Lite specifications|
The following tables show the expected performance of the RayCore Lite.
|SiliconArts’ RayCore Lite performance|
RayCore Lite is available now.
Just as important as the hardware is the software, and SiliconArts has a complete software stack from the FPGA to ray tracing kernels (e.g., Embree).
|SiliconArts’ software stack|
- Support IntersectM & OccludedM functions of Embree for RayCore Lite
- Support OPAE interface for PAC users
- Easy Embree application development
And what product announcement would be complete without a comparison to the leader? SiliconArts does a Pmark-like comparison (less the dollars) and claims to offer over 2× the rays per watt compared to Nvidia’s RTX 2060.
|H/W Device||Nvidia RTX Graphics AIB||Intel Programmable Acceleration Card with Intel Arria 10 GX FPGA|
|Power consumption||160W (RTX 2060) ~280W (TITAN RTX)||20 W (Benchmarked) ③|
|Ray tracing performance (ray/s)||Effective Perf. 300M~600M||Effective Perf. 160M ②|
|H/W: 5G (RTX 2060, 30 RT cores) ~11G (TITAN RTX, 72 RT cores)||H/W: 760M (16 cores @ 190 MHz) ①|
|Effective ray tracing performance/power consumption ratio||2.14||5.05|
|SiliconArts’ RayCore Lite performance comparison|
① The Performance is an ideal performance based on the clock speed and the required clock cycles to complete the logic pipeline.
② Effective Performance benchmarked on an Intel Arria 10 PAC Card with Cornell Box.
③ Power Consumption benchmarked based on an Intel Arria 10 PAC Card.
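The relationship between the ideal and effective figures in notes ① and ② can be checked with a few lines of arithmetic. The sketch below assumes the ideal figure scales linearly with core count and clock speed; the "cycles per ray" reading is an inference from the quoted numbers, not a figure published by the company.

```python
# Figures quoted in the comparison table and its notes.
PEAK_MRAYS = 760.0       # ideal hardware T&I throughput, Mray/s
CORES = 16
CLOCK_MHZ = 190.0
EFFECTIVE_MRAYS = 160.0  # Cornell Box benchmark, Mray/s

# Mray/s divided by (cores x MHz) gives rays per core per clock cycle.
rays_per_core_per_cycle = PEAK_MRAYS / (CORES * CLOCK_MHZ)
cycles_per_ray = 1.0 / rays_per_core_per_cycle
utilization = EFFECTIVE_MRAYS / PEAK_MRAYS

print(f"{rays_per_core_per_cycle:.2f} rays per core per cycle")  # 0.25
print(f"{cycles_per_ray:.0f} cycles per ray per core")           # 4
print(f"effective / ideal = {utilization:.0%}")                  # 21%
```

Under that assumption, each core completes one ray every four cycles at best, and the Cornell Box benchmark reaches roughly a fifth of the ideal rate.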
In addition to the three implementations listed above, the company is also developing other variants and extensions of the design.
Multi-core. The company will introduce a multi-core version of the design they are cleverly calling RayCore MC. It will be a photo-realistic GPU IP offering with Monte-Carlo path tracing, ray generation, and direct/indirect illumination.
|SiliconArts’ RayCore MC structure|
Shown in the following table are the RayCore MC’s proposed specifications.
|Path Tracing Functions||Path Tracing Support
Monte-Carlo Ray Generation
Real-time diffuse reflection / refraction / transmission / soft shadow
Colored shadow on transparent objects
Textured shadow, multi shadows
Depth of field, motion blur
|Lighting||Point light, spotlight, directional light
Multiple light sources support, global lighting
Dynamic/static scene support
Scalable architecture (multi-cores support)
|API||RayCore MC API|
|SiliconArts’ RayCore MC specifications|
RayCore MC will be available as IP.
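As background on what Monte Carlo ray generation involves, here is a minimal software sketch under stated assumptions: a diffuse surface with its normal along +z, cosine-weighted hemisphere sampling (a standard technique, not confirmed as RayCore MC's method), and a made-up `sky_radiance` function that is brighter toward the zenith.

```python
import math
import random

def sample_cosine_hemisphere(rng):
    """Random direction about +z with probability density cos(theta) / pi."""
    u1, u2 = rng.random(), rng.random()
    r = math.sqrt(u1)
    phi = 2.0 * math.pi * u2
    return (r * math.cos(phi), r * math.sin(phi), math.sqrt(1.0 - u1))

def sky_radiance(direction):
    """Hypothetical environment light: radiance equals the direction's z."""
    return direction[2]

def estimate_diffuse(albedo, samples, rng):
    """Monte Carlo estimate of light leaving a diffuse surface.

    With cosine-weighted sampling, the cosine term and the 1/pi of the
    diffuse BRDF cancel against the sampling density, so each sample
    contributes albedo * incoming radiance.
    """
    total = 0.0
    for _ in range(samples):
        total += albedo * sky_radiance(sample_cosine_hemisphere(rng))
    return total / samples

rng = random.Random(7)
estimate = estimate_diffuse(albedo=0.8, samples=20000, rng=rng)
exact = 0.8 * 2.0 / 3.0   # closed-form answer for this particular sky
print(estimate, exact)
```

The estimate converges on the closed-form value as the sample count grows. The noise that remains at low sample counts is the main cost of path tracing, and it is why ray generation and tracing throughput matter so much.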
RayTree. A new approach to ray tracing using a Fast KD-tree Acceleration Structure Generation H/W for Dynamic Object Rendering.
The company will offer dedicated KD-tree generation design IP for hardware-implemented ray tracing. To deliver high-quality dynamic 3D content and guarantee realtime interactivity, the company believes KD-tree re-generation is a compulsory requirement for any application. In today’s systems, the CPU has been primarily responsible for KD-tree generation despite the overhead, which causes processing delays as well as high power consumption. RayTree, says the company, can take over the CPU’s role and further maximize KD-tree re-generation performance. It will re-generate the KD-tree in real time, enabling on-the-fly dynamic scene processing without any CPU use and reducing power consumption.
Shown in the following diagram is the generalized concept of the pipeline.
|SiliconArts’ RayTree structure|
Designed for implementation in dedicated KD-tree generation hardware, RayTree scans primitives and generates an acceleration structure (KD-tree) to support realtime dynamic scene processing. The company says it solves the bottleneck between rendering and tree building by load balancing and distributing resources efficiently, yielding efficient ray tracing rendering.
Load balancing, says the company, yields exceptional KD-tree re-generation performance. The company predicts that RayTree will generate KD-trees 35× faster than a mobile CPU. Furthermore, offloading the work from the CPU improves power efficiency. Combining RayTree with RayCore effectively removes the KD-tree generation process from the mobile CPU, says the company, which reduces power consumption at the system level.
|SiliconArts’ RayTree architecture|
SiliconArts’ RayTree will offer parallel hybrid tree generation architecture with a single scan-tree unit and n KD-tree units.
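The kind of work RayTree takes off the CPU can be sketched in software as a recursive KD-tree build over primitive bounding boxes (AABBs). This is a simplified illustration, not the company's algorithm. It assumes a median split along the longest axis instead of the cost-based choices listed in the specifications, and the function name, dictionary layout, and `max_leaf`/`max_depth` parameters are all hypothetical.

```python
def build_kdtree(boxes, indices=None, depth=0, max_leaf=2, max_depth=16):
    """Build a KD-tree over AABBs given as ((min_x, min_y, min_z), (max_x, max_y, max_z)).

    Returns nested dicts: inner nodes hold a split axis and position,
    leaves hold the indices of the primitives that overlap them.
    """
    if indices is None:
        indices = list(range(len(boxes)))
    if len(indices) <= max_leaf or depth >= max_depth:
        return {"leaf": indices}

    # Bounds of all primitives in this node, then split the longest axis.
    lo = [min(boxes[i][0][a] for i in indices) for a in range(3)]
    hi = [max(boxes[i][1][a] for i in indices) for a in range(3)]
    axis = max(range(3), key=lambda a: hi[a] - lo[a])

    # Split plane at the median box center (a stand-in for a cost heuristic).
    centers = sorted((boxes[i][0][axis] + boxes[i][1][axis]) / 2.0 for i in indices)
    split = centers[len(centers) // 2]

    # A box that straddles the plane is referenced by both children.
    left = [i for i in indices if boxes[i][0][axis] < split]
    right = [i for i in indices if boxes[i][1][axis] >= split]
    if len(left) == len(indices) or len(right) == len(indices):
        return {"leaf": indices}   # the split made no progress

    return {
        "axis": axis,
        "split": split,
        "left": build_kdtree(boxes, left, depth + 1, max_leaf, max_depth),
        "right": build_kdtree(boxes, right, depth + 1, max_leaf, max_depth),
    }

def count_leaves(node):
    if "leaf" in node:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])

# Four made-up unit boxes spaced along the x axis.
boxes = [((x, 0.0, 0.0), (x + 1.0, 1.0, 1.0)) for x in (0.0, 2.0, 4.0, 6.0)]
tree = build_kdtree(boxes)
print(tree["axis"], tree["split"], count_leaves(tree))
```

For a dynamic scene this build has to be repeated whenever objects move, which is the per-frame cost the company proposes to shift from the CPU to dedicated hardware.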
Shown in the following table is the expected performance of the RayTree IP design.
|Configuration||1 Scan-tree unit & 2 KD-tree units||1 Scan-tree unit & 4 KD-tree units|
|Performance||1.2M triangles/s @ 600 MHz||1.7M triangles/s @ 600 MHz|
|Input data type||24-bit float|
|Input data format||Bound box (AABB)|
|Output data format||Node data format, list (32-bit integer)|
|Building choice||Node cost, primitive cost|
|Max data set size||2^16|
|Max data size||2^21|
|SiliconArts’ RayTree specifications|
The company expects RayTree to be employed by OEMs within existing processor offerings.
RayAI NXP2010 chip. Ray tracing isn’t only about pixels and beautiful images; it is a general-purpose wave tool that also applies to RF and audio. SiliconArts is applying its expertise to other applications and devices, such as an AI-based voice recognition MCU. Besides a low-power MCU and its peripherals, the chip includes a dedicated voice recognition hardware accelerator as well as 2-channel 24-bit data converters. The company thinks it is well structured for IoT voice recognition applications that need low power and high performance.
|SiliconArts’ NXP2010 RayAI|
- AI Deep Learning Acc. H/W embedded
- 32-bit Floating-Point DSP
- 16-bit 8MB/16MB SDRAM memory stack
- 2 channel 24-bit ADC / 1 channel 16-bit DAC
- 2Kb Boot ROM
- DMA, Interrupt Controller, Timer, WDT, DAI
- 2 UART (Host Interface / AUX)
- 2 SPI (NOR Flash / AUX)
- 2 I²C (external CODEC / AUX)
- 1 I²S (external CODEC)
- 26 GPIOs
- 2 LDO (3.3 V → 1.2 V Core voltage)
- 88pin QFN Package 10 × 10 × 0.85
- Voice Recognition SW
- It can deliver its best performance when bundled with third-party partners’ voice recognition software.
The company expects its NXP2010 to be used in lighting control, assistance for the disabled, consumer electronics, smart CCTV, and AI home speakers.
What do we think?
SiliconArts has defied predictions of its demise and found additional funding to carry on its R&D. That R&D appears to be very well invested, resulting in an impressive product line and roadmap. The company claims to have a large OEM customer, and when that customer brings a product to market with SiliconArts’ technology, SiliconArts can stop using investor money and move toward positive cash flow.
2019 will be marked as the year of ray tracing, with Nvidia’s huge commitment to and introduction of ray tracing products, Adshir’s demonstrations of real-time ray tracing in mobile devices and AR apps, Sony’s announcement that the PS5 will offer ray tracing, and the expected introduction of hardware-accelerated ray tracing capabilities in AMD’s and Intel’s forthcoming GPUs. SiliconArts is in the right place at the right time.