Coffeehouse Thread

70 posts

Article about kinect internals

  • Bas

    Charles said:
    CreamFilling512 said:
    *snip*

    Everybody who agrees, post here. We will use this thread to convince Kinect product management that a C9 interview is worth it. Smiley

    C

    Yes.

  • daSmirnov

    Charles said:
    *snip*

    Yes please.

  • aL_

    good ol' giz also seems to have visited the kinect lab and did a pretty nice write-up

    http://gizmodo.com/5604308/deep-inside-xbox-360-kinect-the-interface-of-microsofts-dreams

  • aL_

    aL_ said:
    *snip*

    eurogamer/digital foundry also posted a new kinect article recently. some new stuff in there, and they do a nice lag analysis from E3. they're doing it with Kinect Adventures though, and not the NUIview debug application we've seen brief shots of, but still Smiley

  • Ian2

    aL_ said:
    *snip*

    Sounds good to me!

  • Bass

    Charles said:

    Would you like to see some technical deep dives into Kinect here on C9? Is this something we should pursue?

    C

    Do it.

  • Blue Ink

    Charles said:

    We need more than 9 replies, though who doesn't love the number nine? Smiley
    C

    Pretty please...

     

    Also, what figuerres said. From innovative UI to 3D modelling, there are several scenarios worth exploring...

  • Dovella

    Blue Ink said:
    *snip*

    GO GO Big Smile

  • BitFlipper

    Charles said:

    Would you like to see some technical deep dives into Kinect here on C9? Is this something we should pursue?

    C

    YES!!

     

    @rhm

    "Saying  "it projects a grid in infra-red" doesn't tell us how it turns whatever comes in from the sensor into a depth map, let alone how the depth map becomes a list of points being tracked and how those tracked points then become the skeletons of the players"

     

    I had the exact same response in another thread, because my earlier assumption was that this was based on a time-of-flight camera, which apparently it never was. A TOF camera is technically difficult to implement but easy to understand: each pixel simply measures the time it takes for a light pulse to return after hitting an object, kinda like radar. This "projecting a grid" thing was new to me and I didn't understand it. After searching for answers, I think I figured out how it works.
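
    Just to put numbers on how fast that is, here's a tiny Python sketch. The distances are arbitrary living-room values I picked for illustration, not Kinect specs:

        C = 299_792_458.0  # speed of light, m/s

        def round_trip_ns(depth_m):
            # Time for a light pulse to reach an object at depth_m and
            # bounce back to the sensor, in nanoseconds.
            return 2.0 * depth_m / C * 1e9

        # At living-room distances the round trip is only a handful of
        # nanoseconds, which is why TOF pixels need such fast (and
        # expensive) timing electronics:
        for depth_m in (1.0, 2.0, 4.0):
            print(f"{depth_m:.0f} m -> {round_trip_ns(depth_m):.1f} ns round trip")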

     

    Basically, it indeed projects this grid of light (or "structured light", as it is also called, although I think that could be more complex than just a simple grid). Imagine you project a single vertical line that runs from the top of the scene to the bottom, and you have a camera that is slightly offset to the side from the projector. If the light hits a wall with no 3D features, the line is captured as a perfectly straight line. Now place an object somewhere between the wall and the projector/camera: the line gets "distorted", so that wherever it hits the object it appears "shifted" to one side. The amount of shifting is related directly (although not linearly) to how far away that part of the scene is from the camera. If the grid is dense enough, with sharp enough lines, and the camera has high enough resolution, you should be able to recover fine depth detail. BTW, I also believe that, to prevent ambiguity where a big jump in depth could shift a line past an adjacent line, they project multiple lower-density grids sequentially.
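
    To make the geometry concrete, here's a minimal triangulation sketch in Python. All the constants (focal length, baseline, reference-plane depth) are invented for illustration; they are not Kinect's actual calibration values:

        # Toy structured-light triangulation: a projector and a camera
        # separated by a horizontal baseline, calibrated against a flat
        # reference plane. An object closer than the plane shifts the
        # projected pattern sideways in the camera image, and the size
        # of the shift maps back to depth. All constants are made up.
        FOCAL_PX = 580.0    # camera focal length, pixels (assumed)
        BASELINE_M = 0.075  # projector-to-camera offset, meters (assumed)
        REF_DEPTH_M = 4.0   # depth of the calibration plane (assumed)

        def depth_from_shift(shift_px):
            # Similar triangles give shift = f * b * (1/Z - 1/Z_ref),
            # so Z = 1 / (shift / (f * b) + 1 / Z_ref). Zero shift means
            # the surface lies on the reference plane; bigger shifts
            # mean the surface is closer.
            return 1.0 / (shift_px / (FOCAL_PX * BASELINE_M) + 1.0 / REF_DEPTH_M)

        # The shift maps to depth directly but not linearly:
        for shift_px in (0.0, 5.0, 10.0, 20.0, 40.0):
            print(f"shift {shift_px:4.1f} px -> depth {depth_from_shift(shift_px):.2f} m")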

     

    The advantage of this is that a TOF camera is expensive, and its accuracy is limited by the fact that light travels so fast that each pixel only has time to capture a few photons per exposure, making the readings very "noisy". With the structured light method, the limitations come down to how high-resolution you are willing to make the components (you have more control over the results).
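
    The "few photons" problem is just counting statistics: photon arrivals are roughly Poisson-distributed, so a pixel that collects N photons has relative noise around 1/sqrt(N). A quick simulation (the photon counts are invented for illustration):

        import numpy as np

        rng = np.random.default_rng(0)

        # Simulate many exposures of a single pixel; Poisson photon
        # arrivals make the relative noise fall off as 1/sqrt(mean).
        for mean_photons in (10, 100, 10_000):
            samples = rng.poisson(mean_photons, size=100_000)
            measured = samples.std() / samples.mean()
            expected = mean_photons ** -0.5
            print(f"{mean_photons:6d} photons -> relative noise "
                  f"{measured:.3f} (1/sqrt(N) = {expected:.3f})")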

     

    Anyway, this is all speculation and Kinect could work in a completely different way. An in-depth technical discussion with those who would actually know would be helpful. And as rhm mentioned, how do they get from the point cloud to the skeletons?

     

    Some other questions:

    • Which parts are processed on the Kinect hardware, and which parts on Xbox?
    • What is the final CPU usage on Xbox with the final implementation (early reports said 10% - 15%)?
    • What is the final lag amount?
    • Is it expected that the lag can be improved in later updates to the software/firmware?
    • EDIT: Can two (or more) Kinect interfaces co-exist side-by-side? Does it do some sort of synchronized multiplexing when it detects another interface in the vicinity? I ask because I currently have 2 Xboxes set up next to each other. Can I add Kinect to both without them interfering with each other?

  • CreamFilling512

    BitFlipper said:
    *snip*

    Laser light would work well for this purpose.

  • BitFlipper

    CreamFilling512 said:
    BitFlipper said:
    *snip*

    Laser light would work well for this purpose.

    The problem with lasers is that you need some way to "scan" them, which typically means spinning mirrors or other methods that would drive up the complexity/cost. I am sure there are some "cheap" projectors that could do the trick.

  • CreamFilling512

    BitFlipper said:
    CreamFilling512 said:
    *snip*

    The problem with lasers are that you need some way to "scan" them, which is typically something like spinning mirrors or other methods that would drive up the complexity/cost. I am sure there are some "cheap" projectors that could do the trick.

    http://images.eurogamer.net/assets/articles//a/1/2/1/2/1/1/6/joyride1.jpg.jpg

     

    This image somewhat captures the infrared projection from the Kinect.

  • BitFlipper

    CreamFilling512 said:
    *snip*

    I don't know, that looks like it could be a lens flare or something. I really doubt it sends out circular patterns.

  • aL_

    BitFlipper said:
    *snip*

    still interesting to see that the flare is not uniform, so it's sending out some kind of pattern Smiley (that could be due to irregularities in the camera lens though)

     

    it's also worth noting that just because it's a laser, it doesn't have to be a collimated laser. laser light is just like regular light except that it's highly monochromatic, that is, it's only at a narrow frequency band [or color]. even if the light is collimated at one point, it can be scattered by a lens into more of a conic projection.

    that might be how they get around focusing problems, since such a light would always be in focus anyway, both in the foreground and the background.

    it's likely that the kinect sensor uses this narrow band to filter out light from other sources, such as lamps.
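
    a rough way to see why the narrow band helps, in Python (all the numbers below are invented, this is just the arithmetic):

        # Back-of-the-envelope gain from a narrow optical band-pass
        # filter in front of the sensor. All values are invented for
        # illustration; nothing here is a Kinect measurement.
        laser_power = 1.0        # relative laser power, all in one narrow band
        ambient_power = 50.0     # relative ambient power, spread across the spectrum
        ambient_span_nm = 400.0  # spectral span the ambient light covers
        filter_width_nm = 20.0   # pass band centered on the laser wavelength

        # Without a filter, the sensor sees the laser plus all ambient light.
        snr_unfiltered = laser_power / ambient_power

        # With the filter, only the slice of ambient light inside the pass
        # band gets through, while (ideally) all of the laser light does.
        ambient_in_band = ambient_power * (filter_width_nm / ambient_span_nm)
        snr_filtered = laser_power / ambient_in_band

        print(f"signal/background without filter: {snr_unfiltered:.2f}")
        print(f"signal/background with filter:    {snr_filtered:.2f}")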

     

    we would of course know for sure if the c9 team asked the hardware folks Wink

  • kettch

    aL_ said:
    *snip*

    The concentric pattern could be a recursive reflection of the lens elements off the back of the lens glass itself; I've seen something like that before. There is some kind of irregularity towards the center, but that looks more like a moiré pattern.

  • tina10

    I just wanted to jump in here and let you all know what's happening with our Kinect content. I have been working with the Xbox team for the past couple of months on getting developer interviews and deep dives onto Channel 9. I am working diligently to make this happen. I hope to have some great interviews with some great developers behind Kinect for you very soon.

  • Richard.Hein

    tina10 said:
    *snip*

    Sounds good!  Can't wait.  Smiley

  • Dovella

    OT.

     

    Excuse me, how do I receive a Kinect beta test invite?
