When I read the depth video stream from the Kinect, I rely on the data which the IR sensor provides, the depth data in particular. I leave the first byte unmodified and shift the second byte left by 8 to get the depth information. Based on the acquired depth data, I assign pixels different colours. The problem I found is that beyond the maximum threshold of 4000 mm, any farther points are identified as "critically near" (less than 850 mm).
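To make the byte handling concrete, here is a minimal sketch in Python (chosen only for brevity). It assumes the two bytes are little-endian and are OR'd together after the shift; `decode_depth` is a hypothetical name, not a Kinect SDK function.

```python
def decode_depth(low_byte: int, high_byte: int) -> int:
    """Combine two bytes into one 16-bit depth value in millimetres.

    Assumption: the first byte is the low byte and is used as-is,
    the second byte is the high byte and is shifted left by 8.
    """
    return low_byte | (high_byte << 8)


if __name__ == "__main__":
    # 0xA0 is 160 and 0x0F << 8 is 3840, so the result is 4000
    print(decode_depth(0xA0, 0x0F))
```

With this layout, the bytes `0xA0, 0x0F` decode to 4000, which is exactly the far threshold mentioned above.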
P.S. I haven't checked the raw sensor data, but I suspect it might be returning either negative or 0 values for those points (thus passing the < 850 check)...
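If that guess is right, a check for non-positive values before the near test would separate "no reading" from "too near". The sketch below is in Python for brevity and rests on the unverified assumption that out-of-range points come back as 0 or negative; `classify` and the label strings are hypothetical, only the 850 and 4000 thresholds come from my setup.

```python
NEAR_MM = 850    # below this a point counts as "critically near"
FAR_MM = 4000    # maximum depth threshold


def classify(depth_mm: int) -> str:
    """Label a depth value, treating non-positive values as 'no reading'.

    Assumption (not verified against raw sensor data): points outside
    the usable range are reported as 0 or a negative number.
    """
    if depth_mm <= 0:
        return "unknown"      # checked first so it cannot fall into "too_near"
    if depth_mm < NEAR_MM:
        return "too_near"
    if depth_mm > FAR_MM:
        return "too_far"
    return "in_range"


if __name__ == "__main__":
    for sample in (0, 500, 2000, 4500):
        print(sample, classify(sample))
```

The order of the checks is the point here: without the first branch, a value of 0 satisfies `depth_mm < NEAR_MM` and gets coloured as critically near.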