3D object detection is a critical component of applications such as autonomous driving, robotics, and augmented reality, all of which require a precise understanding of the 3D environment. A key challenge in 3D object detection lies in the high cost of annotating 3D bounding boxes, which makes it difficult to scale supervised learning methods to new applications.
To address this, learning paradigms such as semi-supervised learning [1][2], weakly supervised learning [3], and unsupervised domain adaptation have been proposed to reduce the need for large amounts of annotated data while maintaining or improving performance. By leveraging minimal labeled data, or even unannotated data, these approaches reduce the reliance on costly 3D box annotations.
Most state-of-the-art methods rely on a teacher-student architecture. A crucial aspect of this approach is pseudo-label filtering, which can be done using two main strategies. The first relies on untrained heuristics, such as the confidence scores produced by detection models (illustrated in the sketch below); the second uses uncertainty estimation modules trained on a small set of annotated 3D data. Both approaches have limitations, however: heuristics depend heavily on hyperparameters that may overfit, while uncertainty estimators can prove unreliable.
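As a concrete illustration, the minimal PyTorch sketch below shows the heuristic strategy: the teacher's detections are kept as pseudo-labels only when their confidence exceeds a hand-tuned threshold. The function name, box layout, and threshold value are illustrative assumptions, not taken from any specific method.

```python
import torch

def filter_pseudo_labels(boxes, scores, score_threshold=0.7):
    """Keep teacher detections whose confidence exceeds a fixed threshold.

    boxes:  (N, 7) tensor of 3D boxes (x, y, z, dx, dy, dz, yaw)
    scores: (N,)   tensor of per-box confidence scores from the teacher
    """
    # The heuristic itself: a single hand-tuned cutoff. This threshold is
    # the hyperparameter that can overfit to the subset it was tuned on.
    keep = scores >= score_threshold
    return boxes[keep], scores[keep]

# The student is then trained on the surviving boxes as if they were
# ground-truth annotations.
boxes = torch.rand(5, 7)
scores = torch.tensor([0.95, 0.40, 0.81, 0.66, 0.72])
pseudo_boxes, pseudo_scores = filter_pseudo_labels(boxes, scores)
```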
Recent breakthroughs in 2D vision-language models (VLMs) have inspired research in 3D vision, particularly around the potential of these models for pretraining [4][5].
However, despite the promise of VLMs, their use in semi-supervised, weakly supervised, or unsupervised domain adaptation for 3D object detection remains largely unexplored. We therefore aim to fill this gap by leveraging the power of foundation models for more robust pseudo-label filtering. This could involve using pixel features from the 2D projections of 3D points to compute intra-object coherence and neighborhood incoherence scores, ensuring that objects are correctly detected and isolated (see the sketch below). Additionally, 2D features could be used as a pretext for scene completion tasks, providing finer object contours and estimating the occluded parts of detected objects.
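One possible realization of these two scores is sketched below, under our own assumptions: the `coherence_scores` name, the `project_to_image` calibration function, the frozen-VLM feature map, and the `margin` parameter are hypothetical placeholders, not a committed design. The idea is to project a candidate box's LiDAR points into the image, sample the VLM's pixel features there, and compare feature similarity inside the object against pixels just outside its footprint.

```python
import torch
import torch.nn.functional as F

def coherence_scores(points_3d, feature_map, project_to_image, margin=8):
    """points_3d:       (N, 3) LiDAR points falling inside one detected box
    feature_map:        (C, H, W) dense pixel features from a frozen 2D VLM
    project_to_image:   maps (N, 3) 3D points to (N, 2) pixel coordinates
    """
    C, H, W = feature_map.shape
    uv = project_to_image(points_3d).long()  # (N, 2) pixel coords (u, v)
    u = uv[:, 0].clamp(0, W - 1)
    v = uv[:, 1].clamp(0, H - 1)

    # Sample the VLM features under the object's projected points.
    obj_feats = F.normalize(feature_map[:, v, u].T, dim=1)          # (N, C)
    centroid = F.normalize(obj_feats.mean(0, keepdim=True), dim=1)  # (1, C)

    # Intra-object coherence: mean cosine similarity to the object centroid.
    intra = (obj_feats @ centroid.T).mean()

    # Neighborhood incoherence: dissimilarity between the object centroid
    # and pixels just outside its footprint (crudely approximated here by
    # a horizontal shift of `margin` pixels).
    ring_feats = F.normalize(
        feature_map[:, v, (u + margin).clamp(0, W - 1)].T, dim=1)
    inter = 1.0 - (ring_feats @ centroid.T).mean()

    # A pseudo-label would be kept when both scores are high, i.e. the
    # object looks internally consistent and distinct from its surroundings.
    return intra, inter
```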
[1] Zhao, N., et al. (2020). SESS: Self-Ensembling Semi-Supervised 3D Object Detection. CVPR.
[2] Xu, H., et al. (2021). Semi-Supervised 3D Object Detection via Adaptive Pseudo-Labeling. ICIP.
[3] Yao, B., et al. (2024). Uncertainty-Guided Contrastive Learning for Weakly Supervised Point Cloud Segmentation. IEEE Transactions on Geoscience and Remote Sensing.
[4] Chen, Z., et al. (2024). Bridging the Domain Gap: Self-Supervised 3D Scene Understanding with Foundation Models. NeurIPS.
[5] Sirko-Galouchenko, S., et al. (2024). OccFeat: Self-Supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks. CVPR.
- Students in their 5th year of studies (M2 or gap year)
- Computer vision skills
- Machine learning skills (deep learning, perception models, generative AI, etc.)
- Python proficiency with a deep learning framework (especially TensorFlow or PyTorch)
- Scientific research experience will be appreciated
In line with CEA's commitment to integrating people with disabilities, this job is open to all.
CEA is a major player in research, serving citizens, the economy, and the State.
It provides concrete solutions to their needs in four main areas: the energy transition, the digital transition, technologies for the medicine of the future, and defense and security, all built on a foundation of fundamental research. For more than 75 years, CEA has been committed to the scientific, technological, and industrial sovereignty of France and Europe, for a present and a future that are better controlled and more secure.
Located at the heart of regions equipped with very large research infrastructures, CEA benefits from a wide range of academic and industrial partners in France, in Europe, and internationally.
CEA's 20,000 employees share three fundamental values:
- A sense of responsibility
- Cooperation
- Curiosity