In this paper we propose a novel framework, Latent-Class Hough Forests, for 3D object detection and pose estimation in heavily cluttered and occluded scenes. Firstly, we adapt the state-of-the-art template matching feature, LINEMOD [1], into a scale-invariant patch descriptor and integrate it into a regression forest using a novel template-based split function. In training, rather than explicitly collecting representative negative samples, our method is trained on positive samples only and we treat the class distributions at the leaf nodes as latent variables. During the inference process we iteratively update these distributions, providing accurate estimation of background clutter and foreground occlusions and thus a better detection rate. Furthermore, as a by-product, the latent class distributions can provide accurate occlusion-aware segmentation masks, even in the multi-instance scenario. In addition to an existing public dataset, which contains only single-instance sequences with large amounts of clutter, we have collected a new, more challenging dataset for multiple-instance detection containing heavy 2D and 3D clutter as well as foreground occlusions. We evaluate the Latent-Class Hough Forest on both of these datasets, where we outperform state-of-the-art methods.
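The iterative update of latent class distributions described above can be illustrated with a toy sketch. This is not the authors' implementation: it assumes each patch has already cast a single 2D vote for the object centre, replaces the per-leaf class distributions with one foreground probability per patch, and uses a simple distance threshold as the update rule. The function name `iterative_hough_voting` and the parameters `grid_size`, `n_iters` and `radius` are hypothetical.

```python
import numpy as np

def iterative_hough_voting(votes, grid_size, n_iters=3, radius=2.0):
    """Toy illustration only (assumed simplification, not the paper's algorithm).

    votes: (N, 2) array of object-centre predictions, one per patch.
    Returns the Hough peak and a per-patch foreground probability.
    """
    votes = np.asarray(votes, dtype=float)
    # Latent foreground probability per patch; start uninformed at 0.5.
    fg_prob = np.full(len(votes), 0.5)
    # Map each vote to its accumulator cell once.
    cells = np.clip(np.rint(votes).astype(int), 0, grid_size - 1)
    peak = None
    for _ in range(n_iters):
        # Accumulate votes weighted by the current foreground estimate.
        acc = np.zeros((grid_size, grid_size))
        np.add.at(acc, (cells[:, 0], cells[:, 1]), fg_prob)
        # The strongest cell is the current object-centre hypothesis.
        peak = np.array(np.unravel_index(np.argmax(acc), acc.shape), dtype=float)
        # Patches agreeing with the hypothesis become likely foreground;
        # the rest are treated as clutter or occluders (assumed rule).
        dist = np.linalg.norm(votes - peak, axis=1)
        fg_prob = np.where(dist <= radius, 0.9, 0.1)
    return peak, fg_prob

# Five patches agree on a centre near (10, 10); three are clutter.
votes = [[10, 10], [10, 10], [10, 11], [11, 10], [10, 10],
         [2, 3], [17, 5], [4, 15]]
peak, fg_prob = iterative_hough_voting(votes, grid_size=20)
```

In this sketch the final `fg_prob` doubles as a crude foreground mask over patches, which loosely mirrors how the latent class distributions yield segmentation masks as a by-product.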
Thanks to Wadim Kehl, we have new, corrected annotations for our dataset. Please use the updated ground truth with either the new or the old object models.
Read info – Old Object Models [12.9 MB] – New Object Models [16.7 MB]
New corrected annotations for: Old Object Models [2.4 MB] – New Object Models [2.4 MB]
Objects: Coffee Cup [809 MB] – Shampoo [1.16 GB] – Joystick [1.17 GB] – Camera [820 MB] – Juice Carton [980 MB] – Milk [962 MB]
If you make use of the dataset, please cite:
@incollection{tejani2014latent,
title={Latent-class hough forests for 3D object detection and pose estimation},
author={Tejani, Alykhan and Tang, Danhang and Kouskouridas, Rigas and Kim, Tae-Kyun},
booktitle={Computer Vision--ECCV 2014},
pages={462--477},
year={2014},
publisher={Springer}
}
[1] "Model Based Training, Detection and Pose Estimation of Texture-Less 3D Objects in Heavily Cluttered Scenes", S. Hinterstoisser, V. Lepetit, S. Ilic, S. Holzer, G. R. Bradski, K. Konolige, N. Navab, ACCV 2012 – paper
For any inquiries or feedback please contact: