News

SiliconArts’ new ray tracing chip and IP

4th generation path tracing engine

Jon Peddie

Founded in 2010 in Seoul by Dr. Hyung Min Yoon, formerly of Samsung; Hee-Jin Shin from LG; Byoung Ok Lee from MtekVision; and Woo Chan Park from Sejong University, SiliconArts took on the formidable task of designing and manufacturing a ray tracing hardware accelerator co-processor, which the company called RayCore.

The company showed its first implementation in an FPGA in 2014, and it was impressive then (see TechWatch volume 14, number 17, August 26, 2014, p. 19). Since then, the company has been steadily making improvements, expanding its product line, and developing an ambitious and impressive road map. We have tried to encapsulate all that in this report.

RayCore 1000
The first implementation of SiliconArts’ ray tracing hardware accelerator was the RayCore 1000, which is described in the following paragraphs and diagrams.

SiliconArts’ RayCore 1000 block diagram

 

The RayCore 1000 features:

  • Ray tracing optics effects
    • Natural expression of light (ex. reflection, refraction, transmission, shadow)
    • Based on specular reflection
    • Ray tracing-specific graphics effect
    • Moving light
    • Global lighting
    • Global shadow
    • Colored shadow
    • Textured shadow
    • 2nd shadow
    • Global reflection
    • Global refraction
    • Global transmission
    • Optical effects
    • Capable of implementing various special effects (with Shader)
    • Defocus
    • Motion Blur
    • Depth of field
    • Light shaft
    • Other filtering techniques

The company is targeting the RayCore 1000 at smartphones, tablets, and embedded and industrial processors for VR and AR.

In a test using Autodesk 3ds Max 2019, the company achieved the following results with the RayCore 1000:

  • CPU only
    • VRay S/W
      • Default option: 61 s
    • Art S/W
      • Draft: 20 s
    • Arnold S/W
      • Default option: 38 s
  • RayCore on Intel Arria 10 FPGA
    • RayCore Plug-in S/W
      • Max option: 1 s

The host system was an Intel Pentium Gold G5600 @ 3.9 GHz (4 CPUs).

Autodesk 3ds Max 2019 test with 2 omni lights and 12,268 triangles
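To put the render times above in perspective, the speedups can be worked out directly. This is only a rough sketch: the dictionary keys are labels chosen for the illustration, and the comparison is indicative at best because the renderers ran with different quality options (default, draft, max).

```python
# Render times in seconds, as reported for the 3ds Max 2019 test scene.
cpu_times = {"VRay": 61, "Art": 20, "Arnold": 38}
raycore_time = 1  # RayCore plug-in on the Intel Arria 10 FPGA, max option

# Speedup = CPU-only render time divided by the RayCore render time.
speedups = {name: t / raycore_time for name, t in cpu_times.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.0f}x")
```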

 

The following tables show the general performance of the RayCore 1000.

SiliconArts’ RayCore 1000 performance

 

The SiliconArts RayCore 1000 has been available for some time. The next generation is the RayCore 2000 described in the following section.

RayCore 2000

In 2018, the company introduced its RayCore 2000 realtime ray tracing IP design, capable of 250 Mray/s @ 500 MHz (4 cores) and running at 2048 × 2048 resolution.
 

SiliconArts’ RayCore 2000 block diagram

 

The new core offers several capabilities.

  • Ray tracing functions
    • Realtime reflection, refraction, transmission, shadow
    • Colored shadow, textured shadow, multi shadows
    • Ambient occlusion
    • Multiple ray bounces (up to 15 bounces)
  • Lighting
    • Point light, spotlight, directional light
    • Multiple light sources support, global lighting
    • Light binding
    • Up to 64 light positions
  • Shading & texture mapping
    • Phong shading
    • Texture mapping
    • Multi-texturing, normal mapping, displacement mapping
    • MIP mapping
    • Alpha-blending (α-texture)
  • Others
    • Anti-aliasing, SDR (Selectively Dithered Rendering)
    • Dynamic/static scene support
    • Scalable architecture (multi-cores support)
    • Stereoscopic 3D display support
  • API
    • RayCore API (OpenGL ES 1.1-familiar)
  • FPGA platform
    • BitWare Intel Stratix V Platform

RayCore 2000 features

 

The RayCore 2000 offers displacement mapping, which actually moves geometric points according to a given height field. It enhances the level of realism while cutting out complicated modeling processes. The processor also supports ambient occlusion, which calculates how exposed each point in a scene is to ambient lighting.
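The idea behind displacement mapping can be sketched in a few lines. This is a minimal illustration, not SiliconArts' implementation: the function name, the `scale` parameter, and the sample height values are all hypothetical, and the normals are assumed to be unit length.

```python
def displace(vertices, normals, heights, scale=1.0):
    """Move each vertex along its unit normal by (height * scale)."""
    return [
        tuple(v_i + n_i * h * scale for v_i, n_i in zip(v, n))
        for v, n, h in zip(vertices, normals, heights)
    ]

# A flat quad in the XZ plane; every normal points up (+Y).
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
normals = [(0.0, 1.0, 0.0)] * 4
heights = [0.0, 0.5, 0.25, 1.0]  # hypothetical height-field samples

displaced = displace(vertices, normals, heights, scale=2.0)
```

The flat quad picks up real geometric relief from the height field, which is why the technique can stand in for detailed manual modeling.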

It also offers multi-texturing and overlapping of multiple texture images, as well as light binding on a target object.

The company used a scene from a living room to benchmark some of the performance characteristics of the processor. 

Ray tracing reveals details. (Source: SiliconArts)

 

Livingroom contents benchmark

  • 4 × BitWare Altera Stratix V (100 MHz, RC2042)
  • Resolution: FHD (1920 × 1080)
  • Contents Complexity: 280k Primitives
  • Light Source: 5 lights (1 global omni light, 4 local spot lights with light binding)
  • SDR: On (SDR Threshold 24)

The following table shows the results.

SiliconArts’ RayCore 2000 benchmark results

 

RayCore Lite
This month the company is introducing a powerful path tracing acceleration GPU IP solution it is calling “Lite.” It features:

  • ‘Unified’ traversal and intersection test
  • Based on high-performance MIMD architecture
  • Highly applicable for servers and high-end GPU chips
SiliconArts’ RayCore Lite architecture flow chart

 

The hardware traversal and intersection (T&I) test performance for ray/path tracing is 760 Mray/s (16 cores @ 190 MHz). Both steps are common to ray and path tracing: the traversal unit searches a tree data structure to find the object hit by a ray, and the intersection test unit finds the exact or closest point at which the ray intersects that object. The RT core in Nvidia’s RTX GPUs is a kind of T&I unit.
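For readers unfamiliar with the two steps, here is a minimal software sketch. It is not RayCore's design, which the source does not describe at this level: traversal is simplified to a ray/box (slab) test over a flat list of nodes instead of a real tree walk, and the intersection step uses the well-known Moller-Trumbore ray/triangle test as a stand-in. All function names are hypothetical.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def hits_box(origin, direction, lo, hi):
    """Traversal-style test (slab method): does the ray enter the box [lo, hi]?"""
    t_near, t_far = 0.0, math.inf
    for o, d, l, h in zip(origin, direction, lo, hi):
        if abs(d) < 1e-12:            # ray parallel to this pair of planes
            if o < l or o > h:
                return False
            continue
        t0, t1 = (l - o) / d, (h - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True

def hit_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Intersection test: distance t along the ray to the triangle, or None."""
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return None                   # ray parallel to the triangle plane
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > eps else None

def closest_hit(origin, direction, nodes):
    """nodes: list of (box_lo, box_hi, triangles). Returns the nearest t or None."""
    best = None
    for lo, hi, triangles in nodes:
        if not hits_box(origin, direction, lo, hi):
            continue                  # traversal culls this node entirely
        for v0, v1, v2 in triangles:
            t = hit_triangle(origin, direction, v0, v1, v2)
            if t is not None and (best is None or t < best):
                best = t
    return best

# Two triangles facing the ray, at z = 2 and z = 5.
tri_near = ((-1.0, -1.0, 2.0), (1.0, -1.0, 2.0), (0.0, 1.0, 2.0))
tri_far = ((-1.0, -1.0, 5.0), (1.0, -1.0, 5.0), (0.0, 1.0, 5.0))
nodes = [
    ((-1.0, -1.0, 2.0), (1.0, 1.0, 2.0), [tri_near]),
    ((-1.0, -1.0, 5.0), (1.0, 1.0, 5.0), [tri_far]),
]
t = closest_hit((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), nodes)
```

A ray fired down +Z from the origin hits the nearer triangle first; a ray that misses both boxes is rejected before any triangle math runs, which is the work a traversal unit saves.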

The 760 Mray/s figure is an ideal performance based on the clock speed and the clock cycles required to complete the logic pipeline. The effective performance is 160 Mray/s, as benchmarked on an Intel Arria 10 PAC card with the Cornell Box scene.
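The relationship between the ideal and effective figures is simple arithmetic. The sketch below assumes all 16 cores contribute equally to the ideal figure, which the source does not state explicitly; the variable names are ours.

```python
peak_rays_per_s = 760e6        # ideal hardware figure
effective_rays_per_s = 160e6   # Cornell Box benchmark on the Arria 10 PAC
cores = 16
clock_hz = 190e6

# Ideal throughput per core per clock, and its inverse.
rays_per_cycle_per_core = peak_rays_per_s / (cores * clock_hz)
cycles_per_ray = 1.0 / rays_per_cycle_per_core

# Fraction of the ideal figure reached in the benchmark.
efficiency = effective_rays_per_s / peak_rays_per_s

print(f"{rays_per_cycle_per_core:.2f} rays/cycle/core, "
      f"{cycles_per_ray:.0f} cycles/ray, {efficiency:.0%} of ideal")
```

Under that assumption, each core completes one ray every four clock cycles at peak, and the Cornell Box benchmark reaches roughly a fifth of the ideal rate.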

The feature list is as follows:

  • Path-tracing functions
    • Traversal & intersection test
    • Input data: ray information
    • Output data: hit-point calculation results
    • Path tracing & ray tracing functions with SW
  • Others
    • Scalable architecture (multi-cores support)
    • S/W controllable T&I
    • Block-based ray generation & transmission
    • Pipelined ray block transfer
  • API/Software
    • Intel Embree
    • MS DirectX (under development)
    • 3ds Max plug-in for path tracing rendering (under development)
    • Blender plug-in for ray tracing rendering (under development)
  • FPGA platform
    • Intel OPAE/PAC support

SiliconArts’ RayCore Lite specifications

 

The following tables show the expected performance of the RayCore Lite.

SiliconArts’ RayCore Lite performance

 

RayCore Lite is available now.

Software stack
Just as important as the hardware is the software, and SiliconArts has a complete software stack from the FPGA to ray tracing kernels (e.g., Embree).

SiliconArts’ software stack

 

Intel Embree

  • Support IntersectM & OccludedM functions of Embree for RayCore Lite
  • Support OPAE interface for PAC users
  • Easy Embree application development

 

And what product announcement would be complete without a comparison to the leader? SiliconArts does a Pmark-like comparison (less the dollars) and claims to offer over 2× the rays per watt compared to Nvidia’s RTX 2060.

Devices compared: Nvidia RTX graphics AIB vs. Intel Programmable Acceleration Card (PAC) with Intel Arria 10 GX FPGA

  • Power consumption
    • Nvidia RTX: 160 W (RTX 2060) to 280 W (TITAN RTX)
    • Intel PAC: 20 W (benchmarked) ③
  • Ray tracing performance (ray/s), effective
    • Nvidia RTX: 300M to 600M
    • Intel PAC: 160M ②
  • Ray tracing performance (ray/s), H/W
    • Nvidia RTX: 5G (RTX 2060, 30 RT cores) to 11G (TITAN RTX, 72 RT cores)
    • Intel PAC: 760M (16 cores @ 190 MHz) ①
  • Effective ray tracing performance/power consumption ratio
    • Nvidia RTX: 2.14
    • Intel PAC: 5.05

SiliconArts’ RayCore Lite performance comparison

 

① The H/W performance is an ideal figure based on the clock speed and the clock cycles required to complete the logic pipeline.
② Effective Performance benchmarked on an Intel Arria 10 PAC Card with Cornell Box.
③ Power Consumption benchmarked based on an Intel Arria 10 PAC Card.
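The ratio row can be sanity-checked with simple arithmetic. The sketch below assumes the ratio is effective Mray/s per watt, a unit the comparison does not state, and it pairs the ends of the Nvidia ranges (300M with 160 W, 600M with 280 W), which is also our assumption. On that reading, the Nvidia figure of 2.14 matches 600 Mray/s at 280 W. Dividing the stated 160 Mray/s by the stated 20 W gives 8.0, not the listed 5.05, so the listed figure presumably rests on inputs the comparison does not show.

```python
def mrays_per_watt(mrays_per_s, watts):
    """Effective ray throughput per watt (assumed unit: Mray/s per W)."""
    return mrays_per_s / watts

nvidia_upper = mrays_per_watt(600, 280)    # TITAN RTX end of both ranges
nvidia_lower = mrays_per_watt(300, 160)    # RTX 2060 end of both ranges
fpga_from_inputs = mrays_per_watt(160, 20)  # from the stated benchmark figures

# Advantage implied by the two ratios exactly as listed in the comparison.
listed_advantage = 5.05 / 2.14

print(f"Nvidia: {nvidia_lower:.2f} to {nvidia_upper:.2f}")
print(f"FPGA from stated inputs: {fpga_from_inputs:.2f}")
print(f"Advantage from listed ratios: {listed_advantage:.2f}x")
```

Either way, the listed ratios are consistent with the "over 2×" rays-per-watt claim.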

Road map
In addition to the three implementations listed above, the company is also developing other variants and extensions of the design.

Multi-core. The company will introduce a multi-core version of the design they are cleverly calling RayCore MC. It will be a photo-realistic GPU IP offering with Monte-Carlo path tracing, ray generation, and direct/indirect illumination.
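As background on what Monte-Carlo ray generation means, here is a minimal sketch of the general technique, not of RayCore MC itself. It estimates the light arriving at a surface point by firing randomly generated rays over the hemisphere and averaging. The function names, the uniform sampling strategy, and the constant-radiance "sky" are all assumptions made for the illustration.

```python
import math
import random

def sample_hemisphere(rng):
    """Uniformly sample a unit direction on the hemisphere around +Z."""
    z = rng.random()                       # cos(theta), uniform in [0, 1)
    phi = 2.0 * math.pi * rng.random()
    r = math.sqrt(max(0.0, 1.0 - z * z))
    return (r * math.cos(phi), r * math.sin(phi), z)

def estimate_irradiance(radiance, n_samples, seed=0):
    """Monte Carlo estimate of the hemisphere integral of radiance(d) * cos(theta)."""
    rng = random.Random(seed)
    pdf = 1.0 / (2.0 * math.pi)            # density of uniform hemisphere sampling
    total = 0.0
    for _ in range(n_samples):
        d = sample_hemisphere(rng)
        total += radiance(d) * d[2] / pdf  # d[2] is cos(theta)
    return total / n_samples

# A sky of constant radiance 1.0; the exact answer for this case is pi.
estimate = estimate_irradiance(lambda d: 1.0, 20000)
print(f"estimate = {estimate:.3f}, exact = {math.pi:.3f}")
```

The estimate converges on the exact value as the sample count grows, which is why path tracers need so many rays per pixel and why hardware ray generation is attractive.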

SiliconArts’ RayCore MC structure

 

Shown in the following table are the RayCore MC’s proposed specifications.

  • Path tracing functions
    • Path tracing support
    • Monte-Carlo ray generation
    • Real-time diffuse reflection / refraction / transmission / soft shadow
    • Glossy reflection
    • Colored shadow on transparent objects
    • Textured shadow, multi shadows
    • Depth of field, motion blur
  • Lighting
    • Point light, spotlight, directional light
    • Multiple light sources support, global lighting
  • Others
    • Anti-aliasing
    • Dynamic/static scene support
    • Scalable architecture (multi-cores support)
  • API
    • RayCore MC API

SiliconArts’ RayCore MC specifications

 

RayCore MC will be available as IP.

RayTree. This is a new approach to ray tracing that uses fast KD-tree acceleration structure generation hardware for dynamic object rendering.

The company will offer a dedicated KD-tree generation design IP for hardware-implemented ray tracing. To deliver high-quality dynamic 3D content and guarantee realtime interactivity, the company believes KD-tree regeneration is a requirement for any application. In today’s systems, the CPU has been primarily responsible for KD-tree generation despite the overhead, which causes processing delays as well as high power consumption. RayTree, says the company, can take over the CPU’s role and maximize KD-tree regeneration performance. It will regenerate the KD-tree in real time, enabling on-the-fly dynamic scene processing without any CPU use and reducing power consumption.
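To show what KD-tree generation involves, and why redoing it every frame is costly on a CPU, here is a minimal software sketch. It is not RayTree's algorithm, which the source does not describe: it uses a simple median split on primitive bounding boxes, whereas production builders typically pick splits with a cost heuristic. The function names, the `leaf_size` parameter, and the depth cap are hypothetical.

```python
def centroid(box, axis):
    """Center of an axis-aligned bounding box (lo, hi) along one axis."""
    lo, hi = box
    return 0.5 * (lo[axis] + hi[axis])

def build_kdtree(boxes, indices=None, depth=0, leaf_size=2, max_depth=16):
    """Build a KD-tree over AABBs given as (lo, hi) tuples.

    Inner nodes store a split axis and position; leaves store primitive
    indices. A primitive that straddles the split is placed on both sides.
    """
    if indices is None:
        indices = list(range(len(boxes)))
    if len(indices) <= leaf_size or depth >= max_depth:
        return {"leaf": True, "prims": indices}

    axis = depth % 3                               # cycle through X, Y, Z
    centers = sorted(centroid(boxes[i], axis) for i in indices)
    split = centers[len(centers) // 2]             # median centroid

    left = [i for i in indices if boxes[i][0][axis] < split]
    right = [i for i in indices if boxes[i][1][axis] >= split]
    if len(left) == len(indices) or len(right) == len(indices):
        return {"leaf": True, "prims": indices}    # split separated nothing

    return {
        "leaf": False,
        "axis": axis,
        "split": split,
        "left": build_kdtree(boxes, left, depth + 1, leaf_size, max_depth),
        "right": build_kdtree(boxes, right, depth + 1, leaf_size, max_depth),
    }

# Four boxes spaced along X; box 2 straddles the split plane.
boxes = [((float(i), 0.0, 0.0), (i + 0.5, 1.0, 1.0)) for i in range(4)]
tree = build_kdtree(boxes)
```

Every primitive is sorted and partitioned at each level, so when objects move the whole structure has to be rebuilt. That per-frame cost is the work RayTree is meant to take off the CPU.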

Shown in the following diagram is the generalized concept of the pipeline.

SiliconArts’ RayTree structure

 

Designed for implementation in dedicated KD-tree generation hardware, RayTree scans primitives and generates an acceleration structure (KD-tree) to support realtime dynamic scene processing. The company says it solves the bottleneck between the rendering and tree-building tasks by balancing the load and distributing resources efficiently, which yields efficient ray tracing rendering.

Load balancing, says the company, produces exceptional KD-tree regeneration performance. The company predicts that RayTree will generate KD-trees 35× faster than a mobile CPU. Furthermore, offloading the CPU improves power efficiency. Combining RayTree with RayCore effectively removes the KD-tree generation process from the mobile CPU, says the company, which reduces power consumption at the system level.

SiliconArts’ RayTree architecture

 

SiliconArts’ RayTree will offer parallel hybrid tree generation architecture with a single scan-tree unit and n KD-tree units.

Shown in the following table is the expected performance of the RayTree IP design.

  • RT1002
    • Configuration: 1 scan-tree unit & 2 KD-tree units
    • Performance: 1.2M triangles/s @ 600 MHz (40K triangles @ 30 fps)
  • RT1004
    • Configuration: 1 scan-tree unit & 4 KD-tree units
    • Performance: 1.7M triangles/s @ 600 MHz (58K triangles @ 30 fps)

 

  • Bus: AXI
  • Input data type: 24-bit float
  • Input data format: Bound box (AABB)
  • Output data format: Node data format, list (32-bit integer)
  • Building choice: Node cost, primitive cost
  • Max data set size: 2^16
  • Max data size: 2^21

SiliconArts’ RayTree specifications
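The two performance figures quoted for each RayTree part are linked by the frame rate: the number of triangles that can be rebuilt per frame is the build throughput divided by the frames per second. A quick sketch, with a function name of our own choosing:

```python
def triangles_per_frame(triangles_per_s, fps=30):
    """Triangles whose KD-tree can be rebuilt within one frame."""
    return triangles_per_s / fps

rt1002 = triangles_per_frame(1.2e6)
rt1004 = triangles_per_frame(1.7e6)

print(f"RT1002: {rt1002:,.0f} triangles per frame")
print(f"RT1004: {rt1004:,.0f} triangles per frame")
```

The RT1002 result matches the listed 40K triangles at 30 fps. The RT1004 result comes out slightly under the listed 58K, presumably because the 1.7M throughput figure is rounded.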

 

The company expects RayTree to be employed by OEMs within existing processor offerings.

RayAI NXP2010 chip. Ray tracing isn’t only about pixels and beautiful images; it is a general-purpose wave tool that also covers RF and audio. SiliconArts is applying its expertise to other applications and devices, such as an AI-based voice recognition MCU. Besides a low-power MCU and its peripherals, the chip includes a dedicated voice recognition hardware accelerator as well as 2-channel 24-bit data converters. The company thinks it is well structured for low-power, high-performance voice recognition in IoT applications.

SiliconArts’ NXP2010 RayAI 

 

 

Key features:

  • AI Deep Learning Acc. H/W embedded
  • 32-bit Floating-Point DSP
  • 16-bit 8MB/16MB SDRAM memory stack
  • 2 channel 24-bit ADC / 1 channel 16-bit DAC
  • 2Kb Boot ROM
  • DMA, Interrupt Controller, Timer, WDT, DAI
  • 2 UART (Host Interface / AUX)
  • 2 SPI (NOR Flash / AUX)
  • 2 I2C (external CODEC / AUX)
  • 1 I2S (external CODEC)
  • 26 GPIOs
  • 2 LDO (3.3 V → 1.2 V Core voltage)
  • 88pin QFN Package 10 × 10 × 0.85
  • Voice Recognition SW
    • Best performance is expected when third-party partners’ voice recognition software is packaged with the chip as a bundled solution.

 

The company expects applications of its NXP2010 in light control, assistance for the disabled, consumer electronics, smart CCTV, and AI home speakers.

What do we think?

SiliconArts has defied predictions of its demise and found additional funding to carry on its R&D. That R&D appears to have been very well invested, resulting in an impressive product line and road map. The company claims to have a large OEM customer; when that customer brings a product to market with SiliconArts’ technology, SiliconArts can stop using investor money and move toward positive cash flow.

2019 will be marked as the year of ray tracing: Nvidia’s huge commitment to and introduction of ray tracing products; Adshir’s demonstrations of real-time ray tracing in mobile devices and AR apps; Sony’s announcement that the PS5 will offer ray tracing; and the expected introduction of hardware-accelerated ray tracing capabilities in AMD’s and Intel’s forthcoming GPUs. SiliconArts is in the right place at the right time.