Before our cars can become the robot slaves we want them to be, taking us here and there without making us tell them how or when, they first have to be able to reliably see their surrounding environment. Right now, their vision is limited. Environmental factors such as rain, snow, or other blockages can degrade a camera's view, just as they can block human vision, especially if you wear glasses. In addition to making sense of its surroundings, a robust robot slave vehicle's perception system should be able to reason about the validity of the data coming in from its sensors (that common-sense thing, something many humans could use). To apply such common sense, the system must detect any invalidity in the sensor data, and do it as early as possible in the processing pipeline, before the data is consumed by downstream modules. That's just common sense, isn't it?
Nvidia thinks it has this all figured out and has developed ClearSightNet, a deep neural network (DNN) trained to evaluate a robot slave car's cameras' ability to see clearly and to help determine the root causes of occlusions, blockages, and reductions in visibility. The company developed it with the following requirements in mind:
- Reason across a large variety of potential causes of camera blindness.
- Output meaningful information that is actionable.
- Be lightweight enough to run on multiple cameras with minimal computational overhead.
ClearSightNet, Nvidia informs us, segments camera images into regions corresponding to two types of blindness: occlusion and reduction in visibility — the rain, dirt, or snow on your glasses or camera.
Occlusion segmentation corresponds to regions of the camera's field of view that are either covered by an opaque blockage, such as dust, mud, or snow, or contain no information, such as pixels saturated by direct sunlight. Perception is generally completely impaired in such regions, for humans and robot slaves alike.
Reduced visibility segmentation corresponds to regions that are not completely blocked but have compromised visibility due to heavy rain, water droplets, glare, fog, etc. Perception is often partially impaired in these regions and should be regarded as having lower confidence.
The network generates a mask that is overlaid on the input image: fully occluded regions are shown as red blobs, while reduced-visibility and partially occluded regions are indicated by green areas.
|Real world on left, ClearSightNet’s interpretation on right. (Source: Nvidia)|
ClearSightNet, says Nvidia, also outputs a ratio or percentage indicating the fraction of the input image that is affected by occlusion or reduced visibility. That's probably useful, though I'm not sure how. In the image above, 84% of the image pixels are affected by occlusion, with partial occlusion visualized in green and full occlusion in red.
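To make that percentage concrete, here is a minimal sketch of how such a ratio could be computed from a per-pixel segmentation mask. The class encoding (0 = clear, 1 = reduced visibility, 2 = fully occluded) is an assumption for illustration; Nvidia has not published ClearSightNet's actual output format.

```python
import numpy as np

def blindness_ratio(mask: np.ndarray) -> float:
    """Fraction of pixels flagged as occluded or reduced-visibility.

    Assumes a hypothetical encoding: 0 = clear, 1 = reduced
    visibility (green), 2 = fully occluded (red).
    """
    return float(np.count_nonzero(mask) / mask.size)

# Toy 4x4 mask: 8 of 16 pixels are flagged, so the ratio is 0.5.
mask = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 0, 0],
    [2, 2, 0, 0],
])
print(blindness_ratio(mask))  # 0.5
```

In the example image above, a mask covering 84% of the pixels would yield a ratio of 0.84.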
One thing Nvidia and other developers won’t have to solve for is visual and common-sense impairment due to the consumption of spirits, fermented hops, and grapes. But if the robot slave car is doing its job, the humans in it can enjoy those pleasures while the windshield wipers are slapping time and the occupants are singing every song that driver knew…
Nvidia explains that this information can be used in multiple ways. For example, the vehicle can choose not to engage its autonomous functions when blindness is high, alert the user to consider cleaning the camera lens or windshield, or use ClearSightNet's output to inform its camera perception confidence calculations. This could be problematic if one were napping or otherwise occupied after consuming spirits, fermented hops, or grapes. And if it were snowing outside, I doubt they’d want to get dressed and go clear the spectacles of the car.
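The three uses Nvidia describes amount to a simple decision rule over the blindness ratio. The sketch below shows one way a downstream module might implement it; the thresholds and action names are invented for illustration, not anything Nvidia has published.

```python
def plan_action(blindness: float) -> str:
    """Map a blindness ratio (0.0-1.0) to a vehicle response.

    Thresholds are hypothetical, chosen only to illustrate the
    three uses Nvidia describes: refuse to engage, alert the
    driver, or engage while discounting perception confidence.
    """
    if blindness > 0.5:
        return "disengage"   # too blind: don't engage autonomy
    if blindness > 0.2:
        return "alert"       # suggest cleaning the lens or windshield
    return "engage"          # proceed; feed ratio into perception confidence

# The 84%-occluded example image from above would warrant disengaging.
print(plan_action(0.84))  # disengage
```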
Not to worry, says Nvidia. Current and future versions of ClearSightNet will continue to generate both end-to-end analyses and detailed information about camera blindness, enabling a large degree of control over the implementation that ships with vehicles. Building a robot slave car is a lot trickier than a lot of people originally thought when these programs got started. Nvidia continues to pile up experience and invest in computer vision R&D, and it surprises us almost every month.