CG and CV are black holes for processing power

Posted: 09.09.14

Too much, good enough—nonsense  

Some of you reading this may know I’ve postulated a few axioms over the years, all of them about scale in one way or another. One of my favorites is my first: In computer graphics, too much is not enough—1981. It was true then, and it’s true now. It’s also why I get so tired of the question, “But hasn’t integrated graphics caught up?” No. There is no catchup. You can’t catch up. You’ll never catch up.

A friend of mine more famous than I, Jim Blinn, also has an axiom: Blinn’s Law: As technology advances, the rendering time remains constant—2002 (but I’m pretty sure I heard him say that long before 2002).

NHK’s 8K camera shown at NAB 2014.

At Siggraph this year, another old friend, Lincoln Wallen, CTO of DreamWorks, and I were trying to explain to an earnest younger colleague, Neil Schneider, that even with all the additional horsepower remote graphics and virtualization will bring you, you can’t make a movie (in this case) any faster. Well, that’s not totally correct—yes, you could make a movie faster, but no one in the movie business wants to do that. Why? Well, it’s not union rules or stretching work to keep their jobs. It’s because the movie industry, especially the special effects and animation folks, understands my first axiom and practices Jim’s. They want to make more beautiful, believable, delightful movies, and they will take every MIPS, FLOPS, and polygon you can give them—can I have a bit more, sir?

My second axiom, along the same lines of scale, is: The more you can see, the more you can do—1998. If you’re going to generate a lot of pixels, you have to be able to see them. That’s why I love 4K and have two of them plus a third WUXGA sitting in front of me—there’s a lot to see, whether it’s a visualization, a movie, a game, or just my ridiculously large quarterly spreadsheet.

But that’s all back-end stuff. What if it’s not computer-generated but captured from a camera—a 4K, 5K, or 8K camera? Modern movies are shot with 8K cameras, squeezed down to 5K because that’s about all current pipelines can handle, and then squeezed further down to 4K because of screen technology limitations, and the videographers cry. All those beautiful pixels just dumped on the floor for the cleaning people to sweep up when the project is over.
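To put a number on the pixels dumped on the floor, here is a back-of-the-envelope comparison. The rasters are assumptions for illustration—8K UHD, a 5K of 5120 × 2880, and 4K UHD—since actual camera rasters vary (RED’s 5K, for example, is 5120 × 2700):

```python
# Pixel counts for the 8K -> 5K -> 4K squeeze.
# Resolutions are assumed: 8K UHD, a generic 5K, 4K UHD.
RESOLUTIONS = {
    "8K": (7680, 4320),
    "5K": (5120, 2880),
    "4K": (3840, 2160),
}

def pixels(name):
    """Total pixels per frame for a named resolution."""
    w, h = RESOLUTIONS[name]
    return w * h

for name in RESOLUTIONS:
    kept = pixels(name) / pixels("8K")
    print(f"{name}: {pixels(name):,} pixels ({kept:.0%} of the 8K capture)")
```

Going from 8K to 4K UHD keeps exactly a quarter of the captured pixels—three out of every four are swept up by the cleaning people.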

Which, as you probably have guessed by now, leads to my third axiom: In computer vision, there is never enough—2000. Doesn’t sound too startling until you get a closer look. The amount of work that has to happen to get a meaningful image out of the big (in pixels and size) sensors is astounding: Bayer-to-RGB conversion (to remove mosaic effects), autofocus, auto exposure, auto white balance, noise reduction, and lens correction, to mention a few, plus stabilization, HDR, and transfer functions. And today we’re squeezing all that stuff into a mobile phone and taking 4K videos at 30 frames per second. Thirty frames a second!? Hell, I can blink that fast; I want 120 frames per second, if you don’t mind—there’s never enough … Actually, at 4K I probably want 240 frames per second—no artifacts, no blur, no pixel losses—I want it, I want it all, I want it now. And guess what?—we’re all going to get it … with depth … in a mobile phone.
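Those frame-rate wishes translate into serious bandwidth. A rough sketch of the raw (uncompressed) data rates involved, assuming 4K UHD frames with 3 color channels at 8 bits each—real sensor readouts and codec pipelines will differ:

```python
# Back-of-the-envelope uncompressed video bandwidth.
# Assumes 3 channels x 8 bits per pixel; real pipelines vary.
def raw_rate_gbps(width, height, fps, channels=3, bits=8):
    """Raw video data rate in gigabits per second."""
    return width * height * channels * bits * fps / 1e9

for fps in (30, 120, 240):
    print(f"4K @ {fps} fps: {raw_rate_gbps(3840, 2160, fps):.1f} Gbit/s")
```

At 30 fps that is already about 6 Gbit/s of raw pixels; at the 240 fps I’m asking for, it’s nearly 48 Gbit/s—every bit of which the phone’s vision pipeline has to demosaic, denoise, and correct in real time. There is never enough.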

We’ll also get it in other cameras, like security cameras. No more blurry, grainy, frame-dropping crap like we’ve had to put up with. That puts computer vision in the middle of video/movies, machine vision, and the new catch-all IoT, which conveniently brings me to my fourth axiom, The technology works when it’s invisible—2003.

The ultimate invisible power is a black hole, isn’t it?

If you want a taste of 8K, take a look at this (below) on a 4K monitor full screen.