TransProteus: Predicting 3D Structure, Masks, and Material Properties of Objects and Liquids Inside Transparent Vessels from a Single Image
We present TransProteus, a dataset and methods for predicting the 3D
structure, masks, and properties of materials, liquids, and objects inside
transparent vessels from a single image, without prior knowledge of the image
source or camera parameters. Manipulating materials in transparent containers
is essential in many fields and depends heavily on vision. This work supplies
a new procedurally generated dataset of 50k images of liquids and solid
objects inside transparent containers. The image annotations include 3D
models, material properties (color, transparency, roughness, ...), and
segmentation masks for the vessel and its content.
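As a concrete illustration, the sketch below shows one plausible way to
represent such an annotated sample in code. The class and field names
(TransProteusSample, vessel_xyz, content_xyz, properties, ...) are
hypothetical and do not reflect the dataset's actual schema or file format.

```python
# Hypothetical layout of one annotated sample; names are illustrative only.
from dataclasses import dataclass, field
import numpy as np

@dataclass
class TransProteusSample:
    image: np.ndarray         # RGB image, (H, W, 3), uint8
    vessel_xyz: np.ndarray    # per-pixel XYZ map of the vessel, (H, W, 3), float32
    content_xyz: np.ndarray   # per-pixel XYZ map of the content, (H, W, 3), float32
    vessel_mask: np.ndarray   # binary segmentation mask of the vessel, (H, W)
    content_mask: np.ndarray  # binary segmentation mask of the content, (H, W)
    properties: dict = field(default_factory=dict)  # e.g. color/transparency/roughness

# Build a dummy sample just to show the layout.
H, W = 480, 640
sample = TransProteusSample(
    image=np.zeros((H, W, 3), dtype=np.uint8),
    vessel_xyz=np.zeros((H, W, 3), dtype=np.float32),
    content_xyz=np.zeros((H, W, 3), dtype=np.float32),
    vessel_mask=np.zeros((H, W), dtype=np.uint8),
    content_mask=np.zeros((H, W), dtype=np.uint8),
    properties={"color": (0.8, 0.2, 0.1), "transparency": 0.6, "roughness": 0.3},
)
```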
The synthetic (CGI) part of the dataset was procedurally generated using 13k
different objects, 500 different environments (HDRI), and 1,450 material
textures (PBR), combined with simulated liquids and procedurally generated
vessels. In addition, we supply 104 real-world images of objects inside
transparent vessels, with depth maps of both the vessel and its content.
We propose a camera-agnostic method that predicts the 3D model from an image
as an XYZ map, i.e., a map giving the XYZ coordinates of the surface visible
at each pixel. This allows the trained net to recover the 3D model without
prior knowledge of the image source.
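To make the XYZ-map formulation concrete, here is a minimal sketch, assuming
a simple PyTorch encoder-decoder: the network consumes an RGB image and emits
a three-channel map holding a predicted XYZ coordinate for every pixel. The
architecture is illustrative only, not the paper's actual network.

```python
# Minimal fully convolutional net predicting a per-pixel XYZ map (sketch).
import torch
import torch.nn as nn

class XYZNet(nn.Module):
    def __init__(self, width: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, width, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width * 2, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(width * 2, width, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(width, 3, 4, stride=2, padding=1),  # 3 channels: X, Y, Z
        )

    def forward(self, rgb: torch.Tensor) -> torch.Tensor:
        # rgb: (B, 3, H, W) -> XYZ map: (B, 3, H, W), one 3D coordinate per pixel
        return self.decoder(self.encoder(rgb))

xyz = XYZNet()(torch.rand(1, 3, 128, 128))
print(xyz.shape)  # torch.Size([1, 3, 128, 128])
```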
To calculate the training loss, we use the distances between pairs of points
inside the 3D model instead of their absolute XYZ coordinates. This makes the
loss function translation invariant. We use this approach to predict 3D
models of vessels and their content from a single image.
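A minimal sketch of such a translation-invariant loss follows, assuming
PyTorch and random sampling of point pairs from a valid-pixel mask; the
function name and the sampling scheme are our own illustration, not the
paper's exact formulation. Because translating the predicted model shifts
both points of every pair equally, the pair distances, and hence the loss,
are unchanged.

```python
# Translation-invariant loss over distances between sampled point pairs (sketch).
import torch

def pairwise_distance_loss(pred_xyz: torch.Tensor,
                           gt_xyz: torch.Tensor,
                           mask: torch.Tensor,
                           num_pairs: int = 4096) -> torch.Tensor:
    """pred_xyz, gt_xyz: (B, 3, H, W); mask: (B, H, W) valid-pixel mask."""
    losses = []
    for b in range(pred_xyz.shape[0]):
        idx = mask[b].flatten().nonzero(as_tuple=False).squeeze(1)
        if idx.numel() < 2:
            continue
        # Sample random pairs of valid pixels.
        i = idx[torch.randint(idx.numel(), (num_pairs,))]
        j = idx[torch.randint(idx.numel(), (num_pairs,))]
        p = pred_xyz[b].reshape(3, -1)  # (3, H*W)
        g = gt_xyz[b].reshape(3, -1)
        # Distance between each sampled pair, in prediction and ground truth.
        d_pred = (p[:, i] - p[:, j]).norm(dim=0)
        d_gt = (g[:, i] - g[:, j]).norm(dim=0)
        losses.append((d_pred - d_gt).abs().mean())
    return torch.stack(losses).mean()

# Usage with dummy data: shifting the ground truth by a constant translation
# leaves all pair distances identical, so the loss is ~0.
pred = torch.rand(2, 3, 64, 64)
gt = pred + torch.tensor([1.0, 2.0, 3.0]).view(1, 3, 1, 1)
mask = torch.ones(2, 64, 64, dtype=torch.bool)
print(pairwise_distance_loss(pred, gt, mask))  # near zero
```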
Finally, we demonstrate a net that uses a single image to predict the
material properties of the vessel's content and surface.
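As an illustration of this last component, here is a minimal sketch, assuming
the properties form a short vector (e.g., RGB color, transparency, roughness)
regressed by a small convolutional encoder with global pooling. The
architecture and the property layout are assumptions, not the paper's
network.

```python
# Small encoder + regression head predicting a material-property vector (sketch).
import torch
import torch.nn as nn

class MaterialNet(nn.Module):
    def __init__(self, num_properties: int = 5):  # assumed: 3 color + transparency + roughness
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),  # global pooling -> one feature vector per image
        )
        self.head = nn.Linear(64, num_properties)

    def forward(self, rgb: torch.Tensor) -> torch.Tensor:
        x = self.features(rgb).flatten(1)   # (B, 64)
        return torch.sigmoid(self.head(x))  # properties normalized to [0, 1]

props = MaterialNet()(torch.rand(2, 3, 128, 128))
print(props.shape)  # torch.Size([2, 5])
```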