Blog Post

TechFest 2011: 3D Scanning with a regular camera or phone!



3-D television is creating a huge buzz in the consumer space, but the generation of 3-D content remains a largely professional endeavor. Our research demonstrates an easy-to-use system for creating photorealistic, 3-D-image-based models simply by walking around an object of interest with your phone, still camera, or video camera. The objects might be your custom car or motorcycle, a wedding cake or dress, a rare musical instrument, or a hand-crafted artwork. Our system uses 3-D stereo matching techniques combined with image-based modeling and rendering to create a photorealistic model you can navigate simply by spinning it around on your screen, tablet, or mobile device.




The Discussion

  •

    This could be huge for 3D printing or game developers. How hard would it be to use such a 3D model in an (XNA) game? Is that something you are looking at?

  •

    Very, very cool

  •

    I've often wondered if people were trying to create 3D models out of Photosynth point clouds. I'm assuming this is how it works?

  •

    Wow! I can't wait until this is an app on my Windows Phone 7!

  •

    Like CKurt, I wonder if you can get at the model data somehow...

    Also, I wonder if you can record the data using video instead of snapping individual pics.


  •

    Good job, Laura! I'm glad you got this interview. 

    I'd love to see an interview with these guys where they go into all the gory details, but take the time to break down all of the academic terms into plain language. I'm especially interested to know how much 3D data is downloaded with the photos when you open one of these and how that 3D data is formatted. I would think that the camera positions, relative to each other, would certainly be used, but what about a geometric model or depth maps for the images? What do you see if the images are turned off? 

    I'm interested to know how their method of playback differs from the end of Noah Snavely's Photo Tourism presentation (05:00 - 05:31), where he projects Rick Szeliski's photos of the Great Wall of China onto a mesh computed from the sparse structure-from-motion point cloud rather than onto flat geometric planes. It also recalls a very similar feature that Blaise demoed from a research version of the Photosynth viewer back in 2007 (15:03 - 18:10), which has never been released to the public.

    From what I can see, parts of the images are turned off sooner than other parts as each fades to the next image, depending on how far away an object is (there's sometimes some image tearing around the edges of objects). That says to me that this isn't as simple as just playing the images back in sequence, with camera tilt and differences in distance from the object compensated for when Photosynth solves for the photos' positions, as commenters on Gizmodo's story seemed to think.

    The reflections being morphed between input photos reminds me of Pravin Bhat's video from 2007: Using Photographs to Enhance Videos of a Static Scene, although he had many low resolution video frames to sample reflections and specularities from before mapping them to the high resolution textures from the photos.

    The digital narrative (second segment) says that the component technologies are Photosynth (to solve for the photos' positions), then multi-view stereo (to do dense 3D reconstruction like Yasutaka Furukawa's PMVS2 does), and piecewise planar modeling, which hearkens back to Sudipta's work seen in 2008 here and here (00:43:20 - 00:55:55). Lastly, it mentions "Image-based rendering (view interpolation)", which I take to be the image morphing seen in playback. I'd like to understand that part better.

    I'm very curious to know why dense 3D reconstruction is necessary on the server side if the 3D data is going to be compressed down to such a lightweight 3D model which relies so heavily on the original photos. 


    I would love to see this technology wrapped into a Photosynth update later this year. After Internet Explorer 9 is released this year with its ability to use your video card and IE's support for the 3D capabilities of CSS3, this ought to be able to run in your browser without the need for a plugin. On the plugin side, Silverlight 5 also comes out this fall with support for your video card as well. If Silverlight 5's 3D power is greater than what you can do with CSS3, then my ideal would be to have a JavaScript viewer that works in all browsers (and on iPads and such things that don't allow Silverlight), but then seamlessly and silently upgrade to an enhanced Silverlight 5 viewer for anyone who already has Silverlight installed, similar to viewers that use Seadragon AJAX unless you already have Silverlight.
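The "image-based rendering (view interpolation)" step mentioned in the comment above can be illustrated with a minimal sketch. This is not the project's actual renderer: a real image-based renderer would first reproject each photo onto the proxy geometry (the piecewise-planar model) before blending, and all names here are illustrative. It simply cross-dissolves two already-aligned photos, weighting each by how close the virtual camera is to the camera that took it.

```python
import numpy as np

def interpolate_views(img_a, img_b, pos_a, pos_b, pos_virtual):
    """Blend two source photos, weighting each by how close the
    virtual camera is to the camera that captured it."""
    d_a = np.linalg.norm(pos_virtual - pos_a)
    d_b = np.linalg.norm(pos_virtual - pos_b)
    w_a = d_b / (d_a + d_b)          # closer camera -> larger weight
    return w_a * img_a + (1.0 - w_a) * img_b

# Two dummy 2x2 "photos" with constant brightness.
photo_a = np.full((2, 2, 3), 100.0)
photo_b = np.full((2, 2, 3), 200.0)

# Virtual camera halfway between the two shots -> a 50/50 blend.
blended = interpolate_views(photo_a, photo_b,
                            np.array([0.0, 0.0, 0.0]),
                            np.array([1.0, 0.0, 0.0]),
                            np.array([0.5, 0.0, 0.0]))
```

As the virtual camera moves toward one source camera, that photo's weight approaches 1, which matches the fading behaviour described in the comments.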

  •

    @dentaku: @aL_

    You guys should look up Henri Astre's PhotosynthToolkit. It enables you to simply download the primary point cloud in a synth, optionally download the photos in the synth, and then load the images and Photosynth's solution for their positions into Yasutaka Furukawa's PMVS2 to perform dense reconstruction.

    You can choose to create a mesh from either Photosynth's sparse reconstruction or PMVS2's dense reconstruction, and a low-quality texture can be derived from the colours of the points and transferred to the mesh.

    If you have 3D Studio Max installed, Josh Harle also wrote a script to project the original photos back onto a mesh created from the point clouds which provides a much higher quality texture for the mesh. Josh's code now comes packed into Henri's PhotosynthToolkit in version 7 and up.

    Other must-sees are Greg Downing's work posted over on the Photogrammetry Forum and Kean Walmsley's sneak peek at what Autodesk Labs is up to with their upcoming version of PhotoSceneEditor.
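The colour transfer described in the comment above (deriving a low-quality texture from the colours of the points) can be sketched with a brute-force nearest-neighbour assignment. This is an illustration under assumed inputs, not the actual PhotosynthToolkit or MeshLab code; those tools use spatial indexing for large clouds. Here vertices, points, and colours are plain NumPy arrays.

```python
import numpy as np

def transfer_point_colors(mesh_vertices, cloud_points, cloud_colors):
    """Give each mesh vertex the colour of its nearest cloud point."""
    # Pairwise squared distances: a (V, P) matrix.
    diff = mesh_vertices[:, None, :] - cloud_points[None, :, :]
    sq_dist = np.einsum('vpd,vpd->vp', diff, diff)
    nearest = np.argmin(sq_dist, axis=1)  # closest point per vertex
    return cloud_colors[nearest]

# A red point near the origin and a blue point at (1, 1, 1).
points = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
colors = np.array([[255, 0, 0], [0, 0, 255]])

# Two mesh vertices, one near each point.
verts = np.array([[0.1, 0.0, 0.0], [0.9, 1.0, 1.0]])
vertex_colors = transfer_point_colors(verts, points, colors)
```

Rendering the mesh with these per-vertex colours gives the "low quality texture" effect the comment describes; projecting the original photos, as Josh Harle's script does, replaces it with real image detail.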

  •

    @CKurt: For stationary objects that you can get enough coverage of, this might work well.

    For the moment 3D reconstruction doesn't handle deformable objects so well and even for something like the car, I would think that some manual carving of the car away from the background or the ground underneath it would be necessary. Even then, some manual editing would be needed to separate the wheels from the body of the car so that they could spin and the doors would need to be cut out so that they could hinge properly. 

    You could actually have a taste of this with MeshLab. I've used it to mesh some of the point clouds I've extracted with Christoph Hausner's SynthExport, Henri Astre's PhotosynthToolkit, and the new Photosynth import plugin for MeshLab 1.3.0. I know that the same thing is possible with Kean Walmsley's BrowsePhotosynth plugin for AutoCAD 2011.

    As I've said already, I'm very interested to know what the 3D model solved for here really looks like without the photos. Sudipta's work at MSR from a few years ago added extracting line segments from images (edge detection) in addition to points and did some interesting things there. I'd love to see what sort of a model for that car was computed on the server and how that model differs from what is beamed out to the web. 

    They're using some slick visual trick which uses the input photographs and interpolates between them, but I doubt that the model has great reconstructions of the car's windshields or the reflective properties of the entire surface of the car. I would love to be proved wrong about that, though.

  •


    Cool, I'll definitely check that out.

  •

    For those of you who haven't seen this by now, there's now a third video of this research project over at MIT Technology Review, to add to Gizmodo's video and Channel 9's.

  •

    I just want to know how to get this on my phone. No one seems to have said anything about that yet, but clearly they have a working application SOMEWHERE.

  • Ron Brown

    Why is this using a .ms TLD in the URL? Doesn't that belong to Montserrat?

  •

    Eric Stollnitz of MSR recently gave a more in-depth explanation of this project and how it is accomplished in the last session at the end of the MIX'11 conference.

  • Miro Gawinski

    This is very interesting. I have a few applications I would like to try this with. How do I get my hands on the application? Will Photosynth work?
