February 21, 2008 4:00 AM PST

Stanford camera chip can see in 3D

Most folks think of a photo as a two-dimensional representation of a scene. Stanford University researchers, however, have created an image sensor that also can judge the distance of subjects within a snapshot.

To accomplish the feat, Keith Fife and his colleagues have developed technology called a multi-aperture image sensor that sees things differently than the light detectors used in ordinary digital cameras.

Each subarray on the multi-aperture sensor captures a small portion of the overall image, a portion that overlaps slightly with that of the neighboring subarrays. By comparing the differences, a camera can judge the distance of elements in the subject. (Note that this mock-up differs from reality, in which each subimage would be rotated 180 degrees, but this makes the idea easier to grasp.)

(Credit: Keith Fife/Stanford University)

Instead of devoting the entire sensor to one big representation of the image, Fife's 3-megapixel sensor prototype breaks the scene up into many small, slightly overlapping 16x16-pixel patches called subarrays. Each subarray has its own lens to view the world--thus the term multi-aperture.

After a photo is taken, image-processing software then analyzes the slight location differences for the same element appearing in different patches--for example, where a spot on a subject's shirt is relative to the wallpaper behind it. These differences from one subarray to the next can be used to deduce the distance of the shirt and the wall.
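The comparison between patches is, at heart, triangulation. Here is a minimal sketch of the idea, assuming a simple pinhole model in which two neighboring apertures share a focal length and sit a known distance apart. The function name and every number below are illustrative assumptions, not values from Fife's prototype or his software:

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Estimate distance with the standard stereo relation Z = f * B / d.

    focal_length_px: focal length expressed in pixels (assumed value)
    baseline_m: spacing between the two viewpoints, in meters (assumed value)
    disparity_px: shift of the same feature between the two views, in pixels
    """
    if disparity_px <= 0:
        # No measurable shift: the feature is effectively at infinity.
        return float("inf")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical measurements: the nearby shirt shifts more than the wall behind it.
shirt = depth_from_disparity(focal_length_px=1000, baseline_m=0.001, disparity_px=2.0)
wall = depth_from_disparity(focal_length_px=1000, baseline_m=0.001, disparity_px=0.5)
print(shirt, wall)
```

The point of the sketch is the inverse relationship: the larger the shift between patches, the closer the object.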

"In addition to the two-dimensional image, we can simultaneously capture depth info from the scene," Fife said when describing the technology in a talk at the International Solid State Circuits Conference earlier this month in San Francisco.

The result is a photo accompanied by a "depth map" that not only describes each pixel's red, blue, and green light components but also how far away the pixel is. Right now, the Stanford researchers have no specific file format for the data, but the depth information can be attached to a JPEG as accompanying metadata, Fife said.

Recording photos in three dimensions is a pretty radical overhaul of the concept. Depending on your preferences, it could be anything from an exciting new frontier to the latest annoying digital gimmick.

Either way, you'd best start thinking about the implications because Fife isn't the only one working on the challenge. Image-editing powerhouse Adobe Systems has shown off some 3D camera technology too. It should be noted, of course, that stereoscopy itself is an old and respected photographic subject.

Even if you don't want to print holographic pictures of your new kitten, I suspect that 3D technology could help with some traditional photography challenges. Just as face detection can make a camera decide better where to focus and how to expose a shot, having a depth map could make this sort of calculation that much more sophisticated.

This diagram shows the multi-aperture sensor, which puts a small lens over a group of image sensor pixels. Each subarray gets its own microlens.

(Credit: Keith Fife/Stanford University)

Other advantages
Depth isn't the only potential advantage of the multi-aperture approach, Fife said. It could also help reduce noise, which in digital photography takes the form of colored speckles that are a particular plague when shooting at higher ISO sensitivity settings.

The noise is reduced because multiple subarrays capture the same views. It's therefore easier to distinguish the true color of the subject from off-color noise. In addition, each subarray can be set to record a specific color, which could reduce the "color crosstalk" of current image sensors, he said. Today's "Bayer" pattern sensors employ a checkerboard of red, green, and blue pixel sensors, but bright red light captured by a red pixel can, for example, leak out a bit and affect the neighboring blue and green pixels.
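The statistics behind the noise claim can be illustrated with a small simulation. This is a sketch under stated assumptions: the noise is modeled as independent Gaussian error, and the figure of 16 redundant views is hypothetical, since the talk as reported here does not say how many subarrays see the same point.

```python
import random
import statistics


def noisy_reading(true_value, noise_sigma, rng):
    """One simulated pixel reading: the true value plus Gaussian noise."""
    return true_value + rng.gauss(0.0, noise_sigma)


def averaged_reading(true_value, noise_sigma, n_views, rng):
    """Average the readings of n_views redundant views of the same point."""
    readings = [noisy_reading(true_value, noise_sigma, rng) for _ in range(n_views)]
    return statistics.fmean(readings)


rng = random.Random(0)  # fixed seed so the run is repeatable
true_value, sigma, trials = 100.0, 10.0, 2000

single = [noisy_reading(true_value, sigma, rng) for _ in range(trials)]
merged = [averaged_reading(true_value, sigma, 16, rng) for _ in range(trials)]

spread_single = statistics.pstdev(single)
spread_merged = statistics.pstdev(merged)
print(spread_single, spread_merged)
```

With independent noise, averaging N views shrinks the spread by about the square root of N, so 16 views should cut it to roughly a quarter.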

Each subarray gets its own microlens. Although that complicates the manufacturing of the sensor, it could simplify the lenses used in existing cameras, Fife said. And lens manufacturing today certainly has no shortage of difficulties with a variety of exotic glass and even fluorite crystal elements, aspherical elements, and other avant-garde optics.

"There is opportunity for most of the complexity of the lens design to sit at the semiconductor rather than at the objective lens," Fife said. "Although the local optics (on the sensor) may be challenging, it is possible that the optics can be better controlled with lithography and semiconductor processes than with the injection molding and grinding that is used in the conventional camera lenses."

The microlenses might even be all that's needed for some applications, such as taking super-closeup "in vivo" photos inside plant and animal subjects where there's no room for a camera, Fife said. "The multiaperture sensor can form images at close proximity...because no objective lens is needed," Fife said.

This photo shows the prototype chip with 12,616 subarrays. Each pixel on the chip is 0.7 microns on edge, and the chip consumes 10.45 milliwatts of power.

(Credit: Keith Fife/Stanford University)
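The figures in the caption square with the "3-megapixel" description. A quick check, using only the numbers quoted in this article (the variable names are my own):

```python
SUBARRAYS = 12_616        # subarrays on the prototype chip
SUBARRAY_SIDE_PX = 16     # each subarray is 16x16 pixels
PIXEL_PITCH_UM = 0.7      # each pixel is 0.7 microns on edge

total_pixels = SUBARRAYS * SUBARRAY_SIDE_PX ** 2
subarray_side_um = SUBARRAY_SIDE_PX * PIXEL_PITCH_UM

print(total_pixels)                # 3229696, i.e. about 3.2 million pixels
print(round(subarray_side_um, 1))  # 11.2 microns along each subarray edge
```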

No free lunch
Lest you get carried away by the technology, you should be aware of a number of caveats:

• Because the same subject matter is captured redundantly by multiple pixels, the ultimate sensor resolution is lower than the raw number on the overall sensor.

• Processing the image, both to figure out how to merge the subimages into one overall image and to create the depth map, takes about 10 times as much processing horsepower as conventional on-chip image processing. Cameras already are battery hogs, and nobody wants to draw any more power or slow down camera performance.

• 3D images are possible only with subjects that have texture and other detail. "If a picture is captured of a perfectly smooth white wall, it is impossible to estimate the distance to that wall," Fife said.
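The white-wall caveat is easy to demonstrate. The sketch below uses sum-of-absolute-differences matching on a single row of pixels, a common textbook way to find how far a feature has shifted between two views. It is an assumption for illustration, not necessarily the method the Stanford software uses, and the pixel values are made up.

```python
def matching_costs(reference, other, max_shift):
    """Sum of absolute differences between two pixel rows at each candidate shift."""
    width = len(reference) - max_shift
    costs = []
    for shift in range(max_shift + 1):
        cost = sum(abs(reference[i] - other[i + shift]) for i in range(width))
        costs.append(cost)
    return costs


def best_shift(costs):
    """Return the single lowest-cost shift, or None if several shifts tie."""
    lowest = min(costs)
    candidates = [shift for shift, cost in enumerate(costs) if cost == lowest]
    return candidates[0] if len(candidates) == 1 else None


# A textured row, and the same row as seen two pixels over in the neighboring patch.
textured = [10, 80, 200, 80, 10, 10, 10, 10]
textured_shifted = [10, 10] + textured[:-2]

# A perfectly smooth white wall looks identical at every shift.
white_wall = [255] * 8

print(best_shift(matching_costs(textured, textured_shifted, 3)))  # 2
print(best_shift(matching_costs(white_wall, white_wall, 3)))      # None
```

With texture, one shift clearly fits best. With a blank wall, every shift fits equally well, so there is nothing to triangulate from.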

So those are the downsides, but that's par for the course with new technology. And even if the technology never materializes, it's a strong indicator of the radical transformations that are in store for digital photography.

17 comments
Robotic applications.
by ralfthedog February 21, 2008 7:57 AM PST
I think this has far more applications in robotics than photography.
Why not binocular vision
by samhuff February 21, 2008 8:32 AM PST
Two regular lenses give 3D vision that can record the distance to the white wall. The advantage here is that there is one lens, so the lenses don't have to be spaced apart.
Stanford should look at Seeingaid.com they use something like this
by Manhattan2 February 21, 2008 8:46 AM PST
I don't think people grasp the power that this type of technology has. We are working on solutions that can calculate depth with regular cameras and even recorded digital and analog footage. The folks at Stanford should get in touch with the people making the seeing aid.
Great Post
by thomashawk February 21, 2008 10:02 AM PST
Great post on this exciting new technology, Stephen. Scoble and I interviewed Marc Levoy down at Stanford last fall, and he showed us some of this and lots of other new technology being worked on at Stanford firsthand. Interesting stuff for sure.
Another practical advantage
by Hernys February 21, 2008 10:39 AM PST
With depth information you can adjust lighting for flash pictures on a per-pixel basis. This might seem like a not-so-exciting application, but it would have orders of magnitude more impact than actual 3D pictures. Think about this: how many of your pictures don't look great because of poor lighting, movement due to underexposure, or overexposure of close elements and underexposure of far elements of the picture? I would say, for most users, the majority of night pictures fall in one of those areas! Using strong flash and then adjusting the exposure based on depth would solve those issues. Currently you have to decide between risking underexposure or having horrible pictures of overexposed subjects in front of dark backgrounds. On the other hand, how many uses do you have for 3D pictures? Can you easily print 3D albums? View them on the computer? Use a digital picture frame? Yes, at some point you probably will be able to do those things, but not in less than a decade, and even then it will be impractical for most uses (as 3D pictures are unrealistic unless seen from a specific distance and angle). Flat photography will remain mainstream for many decades. Use this tech to solve today's flash issues, and you have a great, practical product.
O, for a good true stereo digital camera
by ArtInvent February 21, 2008 11:11 AM PST
It's nice to see people thinking about 3D technology. Yet here's what's baffling: my 40-year-old Stereo Realist slide film camera produces startling 3D images that will blow away pretty much any digital image you've likely ever seen. Digital cameras are so cheap and ubiquitous that most cell phones even have them, whether anyone uses them or not. AND YET - still no simple stereoscopic digital cameras. Not a single one. A stereo digital camera/viewer would just consist of two lens-sensor pairs combined into one chassis, with one synchronized shutter release. (Just a modern equivalent of a Stereo Realist or any 100-year-old stereo camera.) It wouldn't even need a zoom lens: most stereo cameras just use a fixed 35mm-ish lens with phenomenal results. Instead of a separate stereo viewer, a proper digital stereo cam could easily incorporate two high-resolution micro-LCDs or OLEDs (also fairly cheap these days) for high-fidelity stereo playback of the recorded images. There's little reason this couldn't double as a 3D video camera with absolutely stunning realism. How about it Canon, Sony, Pentax, Nikon . . . ? It's all existing technology. Come on.
Sounds interesting, but I'll wait and see
by Arbalest05 February 21, 2008 11:25 AM PST
This article states that the image being processed arrives at all the multiple sensors through a single lens. Since it is not really possible to detect depth when the lens you're looking through is set to a single focus point, the best you can do is estimate three dimensions based on visual cues (the article states that objects must have texture). Real 3D would require comparison of at least two images taken from different vantage points (just like the two eyes that most people have).
Imaging Opportunity
by bdennis410 February 21, 2008 11:32 AM PST
That same capability, in reverse, as an image projector, will do wonders in the virtual reality arena, like in Second Life, and in gaming or almost any image presentation. Biology, CAD, genetics, and a host of other research and practical applications could benefit immensely. VR gaming already exists, but it is easy to imagine a headset or heads-up display environment that will blow your mind. The closer we get to "perfect" imaging and presentation, the better off we are. It just takes computing "horsepower," and we get better at that all the time.
Shading Issues
by Jack Gratteau February 21, 2008 1:21 PM PST
The light that forms the image strikes the sensor at oblique angles as you move away from the center. This creates differing gain, called shading. Placing a lens on this surface makes this effect even greater. There are already sensors with micro-lenses on a per-pixel basis, and this is a big problem. For depth perception, stereo and time-of-flight imaging require much less processing. The idea that this might make a macro or micro imaging sensor is where I think this makes the best sense.
Impossible?
by Brad S. S. February 21, 2008 2:25 PM PST
It wouldn't be "impossible" to estimate the distance to a perfectly smooth white wall, just the opposite, really: http://www.pages.drexel.edu/~twd25/webcam_laser_ranger.html You'd think someone from Stanford could figure that one out.
About Underexposed
This blog sheds light on digital photography, science and open-source software--Stephen Shankland's eclectic beat. Shankland joined CNET News.com in 1998 after a five-year stint as a science writer. He's a lab rat who grew up in Los Alamos, New Mexico, and graduated from Harvard.

Contact Stephen at Stephen.Shankland@cnet.com
