3D Reconstruction: Algorithm from the 3D Optical Metrology (3DOM) Unit Wins Award

October 9, 2018

The Best Paper Award went to work co-authored by Fabio Remondino (FBK-3DOM) and presented at the ECCV 2018 International Workshop on Recovering 6D Object Pose in Munich.

The article “Image-to-Voxel Model Translation with Conditional Adversarial Networks”, co-authored by Fabio Remondino, Head of Fondazione Bruno Kessler’s 3D Optical Metrology Unit (3DOM), has received the Best Paper Award at the International Workshop on Recovering 6D Object Pose, held as part of the European Conference on Computer Vision 2018.

The paper, one of five selected and admitted to the Workshop, presents an innovative algorithm that generates 3D information from images using generative adversarial networks (GANs). It reconstructs scenes with better quality and finer resolution of detail than existing models for reconstructing scenes with multiple non-rigid objects.

Besides improving on current 3D reconstruction methods, the algorithm shows that conditional adversarial volumetric networks can generate voxel models of complex scenes with multiple objects, and that skip connections between 2D convolutional and 3D deconvolutional layers help reconstruct fine, higher-quality detail in cluttered scenes containing multiple 3D objects of different classes.
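
As a rough illustration of the skip-connection idea, the minimal sketch below uses plain Python nested lists rather than a deep learning framework. It shows one way a 2D feature map could be handed to a 3D layer: the 2D map is repeated along a new depth axis and stacked with the 3D features channel by channel. The function names, the data layout and the repeat-and-concatenate scheme are assumptions made for illustration; the article does not say how the paper implements these connections.

```python
def inflate_2d_to_3d(feat2d, depth):
    """Repeat a 2D feature map [C][H][W] along a new depth axis -> [C][D][H][W]."""
    return [
        [[row[:] for row in channel] for _ in range(depth)]
        for channel in feat2d
    ]


def skip_connect(feat2d, feat3d):
    """Concatenate an inflated 2D map with a 3D map along the channel axis.

    feat2d: [C2][H][W], feat3d: [C3][D][H][W]; result: [C2 + C3][D][H][W].
    (Hypothetical helper, not taken from the paper.)
    """
    depth = len(feat3d[0])
    height, width = len(feat3d[0][0]), len(feat3d[0][0][0])
    if len(feat2d[0]) != height or len(feat2d[0][0]) != width:
        raise ValueError("2D and 3D feature maps must share height and width")
    return inflate_2d_to_3d(feat2d, depth) + feat3d


# Toy example: one 2D channel (2x2) joined with one 3D channel (3 slices of 2x2).
encoder_map = [[[1, 2], [3, 4]]]
decoder_volume = [[[[0, 0], [0, 0]] for _ in range(3)]]
merged = skip_connect(encoder_map, decoder_volume)

# Channels, depth slices, rows, columns of the merged volume.
print(len(merged), len(merged[0]), len(merged[0][0]), len(merged[0][0][0]))
```

In this toy setup every depth slice of the first merged channel carries the same 2D detail, so a later 3D layer could draw on image-level features at any depth.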

“The method presented in the paper uses correspondences between 2D silhouettes and slices of the camera’s field of view to predict the voxel model of a scene with multiple objects,” Fabio Remondino explains. “We used pyramid-shaped voxels and a generator network with skip connections between 2D and 3D feature maps, working directly in 3D, and developed a new Z-GAN framework for translating a single colour image into a voxel model of a scene. To train the framework and demonstrate its efficiency and reliability, a dataset was collected with approximately 36,000 images and their ground-truth 3D models, depth maps and object poses. The outcome,” the researcher adds, “is that the tested model delivers strong results in 3D scene reconstruction, outperforming state-of-the-art reconstruction models in both the number of reconstructed objects and the quality and level of geometric detail.”
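
The correspondence between image pixels and slices of the camera’s field of view can be sketched as follows. In this minimal plain-Python example, each cell of a view-aligned grid is indexed by a pixel position and a depth slice, so projecting the grid along depth recovers the 2D silhouette. The function names, the near/far range, the uniform slice spacing and the toy depth map are illustrative assumptions, not details taken from the paper.

```python
def depth_to_frustum_voxels(depth_map, near, far, num_slices):
    """Build an occupancy grid [slice][row][col] from a depth map.

    A pixel with depth None is background. Otherwise the cell in the
    slice containing that depth is marked occupied (visible surface only).
    All parameters are hypothetical and chosen for illustration.
    """
    height, width = len(depth_map), len(depth_map[0])
    grid = [[[0] * width for _ in range(height)] for _ in range(num_slices)]
    step = (far - near) / num_slices
    for r in range(height):
        for c in range(width):
            d = depth_map[r][c]
            if d is None or not (near <= d < far):
                continue
            # Clamp to guard against floating-point rounding at the far edge.
            k = min(int((d - near) / step), num_slices - 1)
            grid[k][r][c] = 1
    return grid


def silhouette(grid):
    """Project the grid along depth: a pixel is set if any slice is occupied."""
    height, width = len(grid[0]), len(grid[0][0])
    return [
        [int(any(sl[r][c] for sl in grid)) for c in range(width)]
        for r in range(height)
    ]


# Toy 2x3 depth map: None marks background pixels.
depth_map = [
    [None, 1.5, None],
    [2.5, 2.5, None],
]
voxels = depth_to_frustum_voxels(depth_map, near=1.0, far=5.0, num_slices=4)
mask = silhouette(voxels)
print(mask)
```

Because each cell spans one pixel’s viewing cone between two depths, the cells widen with distance from the camera, which is one possible reading of the “pyramid-shaped voxels” mentioned in the quote.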

Potential application areas are many: from the creative industries to cultural heritage, from robot vision to 6D pose estimation, from augmented reality to autonomous driving.
