A 2.5D model probably requires less data?
The value provided was, IMO, quite minimal. It was not easier to use or discernibly better in any user-facing way; the 3D stuff just felt like a gimmick. 3D photos taken by the phone and viewed on its screen did not feel more lifelike. The color depth and image quality were poor even by the standards of other phones of its era.
While it was very cool and felt very futuristic, it did not feel worth the cost.
I’m curious how big of a step forward this is from the previous state of the art, and at what computational cost.
Also curious whether the technique scales well to multiple cameras with overlapping fields of view. That is to say, I assume accuracy can be increased through sensor fusion in the basic sense of averaging out errors, but what I really mean is molding a cohesive 3D view of a 360° environment and understanding that an object at the edge of one camera's frame is the same object, seen from a different perspective, at the edge of another camera's frame.
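A rough sketch of the basic-averaging half of that, in Python/NumPy with made-up intrinsics and extrinsics: back-project each camera's depth map into a shared world frame, then take a confidence-weighted average wherever views overlap. (Associating objects across views is the hard part this doesn't touch.)

```python
import numpy as np

def backproject(depth, K, cam_to_world):
    """Back-project a per-pixel depth map (meters) into world-frame 3D points.
    K is the 3x3 intrinsic matrix, cam_to_world the 4x4 camera pose."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    rays = pix @ np.linalg.inv(K).T            # camera-frame rays per pixel
    pts_cam = rays * depth.reshape(-1, 1)      # scale rays by depth
    pts_h = np.concatenate([pts_cam, np.ones((pts_cam.shape[0], 1))], axis=1)
    return (pts_h @ cam_to_world.T)[:, :3]     # into the shared world frame

def fuse(depths, confidences, Ks, poses, voxel=0.1):
    """Naive fusion: pool world-frame points from all cameras and take a
    confidence-weighted average per voxel, so overlapping views of the
    same surface collapse into one estimate."""
    pts = np.concatenate([backproject(d, K, T) for d, K, T in zip(depths, Ks, poses)])
    w = np.concatenate([c.reshape(-1) for c in confidences])
    keys = np.floor(pts / voxel).astype(np.int64)
    fused = {}
    for key, p, wi in zip(map(tuple, keys), pts, w):
        s_p, s_w = fused.get(key, (np.zeros(3), 0.0))
        fused[key] = (s_p + wi * p, s_w + wi)
    return np.array([s_p / s_w for s_p, s_w in fused.values()])
```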
Obviously this seems like it should be extremely useful for AutoPilot. Compared to the relatively inaccurate positional information of adjacent cars on the AutoPilot guidance display we have today, this seems like a big step forward.
I think it’s interesting how the RNN is identifying specific types of objects and then depth mapping them. I assume it can’t just depth map the whole image without that first classification step? I’m thinking of the Smart Summon application, where depth mapping everything around you is pretty crucial and obviously not entirely working at this point.
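For what it's worth, dense monocular depth models do exist that produce per-pixel depth for the whole frame with no classification step at all. A minimal sketch using the publicly available MiDaS model (not the network discussed here), assuming a local image file frame.jpg:

```python
import cv2
import torch

# Dense monocular depth: one relative-depth value per pixel, no detector in the loop.
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
midas.eval()
transform = torch.hub.load("intel-isl/MiDaS", "transforms").small_transform

img = cv2.cvtColor(cv2.imread("frame.jpg"), cv2.COLOR_BGR2RGB)
with torch.no_grad():
    pred = midas(transform(img))                       # (1, h, w) relative depth
    depth = torch.nn.functional.interpolate(
        pred.unsqueeze(1), size=img.shape[:2],
        mode="bicubic", align_corners=False,
    ).squeeze().numpy()                                # per-pixel, whole image
```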
A few of the models I've shrunk down and posted: https://sketchfab.com/darkphibre
I've used the original Kinect, Kinect v2, and Intel RealSense D435, and it was much more accurate than all of those.
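If you want to put numbers on "more accurate", the usual approach is to compare each sensor's depth map against a reference scan using standard depth-error metrics. A rough sketch, with made-up valid-range values:

```python
import numpy as np

def depth_error_metrics(pred, gt, valid_min=0.2, valid_max=10.0):
    """Standard depth accuracy metrics over the sensor's valid range.
    pred, gt: (H, W) depth maps in meters (gt from a reference scan)."""
    mask = (gt > valid_min) & (gt < valid_max) & (pred > 0)
    p, g = pred[mask], gt[mask]
    abs_rel = np.mean(np.abs(p - g) / g)                 # mean absolute relative error
    rmse = np.sqrt(np.mean((p - g) ** 2))                # root-mean-square error (m)
    delta1 = np.mean(np.maximum(p / g, g / p) < 1.25)    # fraction within 25% of gt
    return {"abs_rel": abs_rel, "rmse": rmse, "delta<1.25": delta1}
```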
I hope the researchers are advocating for its ethical use.