Coffeehouse Thread

21 posts

Looking for your Questions for the Developers behind Project Natal

  • W3bbo

    BitFlipper said:
    CreamFilling512 said:
    *snip*

    That is surprising, because everything I have read up until now stated it was a time-of-flight camera. Here is an example. What changed?

     

    Also, saying "it projects a grid on the scene in near-infrared light" doesn't explain how it works. At all. How does it come up with a depth value for each pixel? I am not saying it doesn't do that, I am just saying that a huge amount of information is left out, so we still have no real idea of how this actually works.

     

    The link you provided doesn't help much either. It says another sensor "reads and then interprets" the grid. Great, but that doesn't actually explain how it works. Reading depth is much more complex than reading light intensity.

     

    Microsoft probably bought up the ToF company thinking "this'll be useful" before realising it wasn't.

     

    The grid approach works if the resolution is high enough: assuming the grid isn't projected using collimated light, objects further away will have wider grid spacing on them than objects that are closer.

     

    ...that's just my hypothesis.
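
    In numbers, that hypothesis would look something like this back-of-the-envelope Python sketch (the 0.5-degree line pitch is an invented figure, not anything from Natal):

        import math

        # Hypothetical angular pitch between projected grid lines.
        ANGULAR_PITCH = math.radians(0.5)

        def spacing_on_surface(distance_m):
            # Diverging (non-collimated) projection: the physical spacing
            # between lines on a fronto-parallel surface grows linearly
            # with distance, s = Z * tan(theta).
            return distance_m * math.tan(ANGULAR_PITCH)

        def distance_from_spacing(spacing_m):
            # Invert the relation to recover depth from measured spacing.
            return spacing_m / math.tan(ANGULAR_PITCH)

        for z in (1.0, 2.0, 4.0):
            s = spacing_on_surface(z)
            print("%.1f m -> lines %.1f mm apart -> recovered %.2f m"
                  % (z, s * 1000, distance_from_spacing(s)))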

  • rhm

    W3bbo said:
    *snip*

    I notice the thing has two lenses. It could be that one is for the infra-red camera and the other for the visible-light camera, but it could also be that it has an infra-red camera behind both, and either uses a sensor that can sense infra-red and visible light simultaneously or has some way of splitting the light on one side to drive both kinds of sensor.

     

    If it can sense infra-red from both lenses then detecting depth is a standard computer vision problem: find correspondences between parts of the two images and use the horizontal disparity to determine distance. The reason for using infra-red and projecting a grid would be to make the machine-vision task a heck of a lot easier than it is when you just have the subject's natural texture to go on.
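
    If that's the mechanism, the depth recovery itself is the standard pinhole triangulation Z = f * B / d. A toy Python sketch (the focal length and baseline are invented numbers, not the actual hardware's):

        def depth_from_disparity(disparity_px, focal_px=580.0, baseline_m=0.075):
            # Standard stereo relation: depth = focal length * baseline / disparity.
            # focal_px and baseline_m are made-up figures for illustration.
            if disparity_px <= 0:
                raise ValueError("disparity must be positive")
            return focal_px * baseline_m / disparity_px

        # A grid feature matched between the two views with a 20-pixel
        # horizontal offset would sit about 2.2 m away with these numbers.
        print(depth_from_disparity(20.0))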

  • CreamFilling512

    rhm said:
    *snip*

    It seems that there are three lenses on the front. One is an RGB camera, which has nothing to do with motion tracking; it's for other scenarios. Of the other two, one floods the scene with "coded" infrared light, and the other reads back where the "coded" light hit objects.
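
    "Coded" light usually means the pattern itself tells you which projector ray you are looking at. One classic scheme (which may or may not be what Natal uses) is a temporal sequence of Gray-code stripes; decoding a single pixel's readings looks like this in Python:

        def gray_to_binary(bits):
            # Decode a Gray-code bit sequence (MSB first) into the integer
            # index of the projector stripe that lit this camera pixel.
            value = bits[0]
            result = value
            for b in bits[1:]:
                value ^= b              # running XOR undoes the Gray coding
                result = (result << 1) | value
            return result

        # A pixel that read bright/dark as 1,1,0,1 across four successive
        # stripe patterns was lit by projector column 9.
        print(gray_to_binary([1, 1, 0, 1]))  # -> 9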

     

    My guess is there is some processing that reads how the IR projection is perturbed by objects in the scene and gets a 3D point cloud out of that. Then the 3D point cloud gets fed into some kind of neural net trained on a huge dataset of human poses (kind of like handwriting recognition), then "magic", and out comes a bunch of skeletal joints.
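
    Once you have per-pixel depth, getting the point cloud is just pinhole back-projection. A sketch with NumPy (the intrinsics fx, fy, cx, cy are invented figures, not the real camera's):

        import numpy as np

        # Made-up pinhole intrinsics for illustration only.
        fx = fy = 580.0
        cx, cy = 320.0, 240.0

        def depth_to_point_cloud(depth):
            # depth: (H, W) array of metres -> (H*W, 3) XYZ points via
            # X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
            h, w = depth.shape
            u, v = np.meshgrid(np.arange(w), np.arange(h))
            x = (u - cx) * depth / fx
            y = (v - cy) * depth / fy
            return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

        # A flat wall 2 m away yields 640 * 480 points, all with Z = 2.
        cloud = depth_to_point_cloud(np.full((480, 640), 2.0))
        print(cloud.shape)  # (307200, 3)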

     

    Anyway, regardless of how exactly it works, it's one of the most high-tech consumer products out there.

     

     

    Edit: Found a site that talks about using "coded light" for 3D object recognition, here: http://academic.research.microsoft.com/Paper/6148176.aspx
