With the recent developments in AI, models produced by machine learning are now in widespread use, even in industrial settings. However, a growing number of studies show the dangers that such models can pose in terms of safety, privacy or even fairness. To mitigate these dangers and improve trust in AI, one possible avenue of research consists in designing methods for generating *explanations* of the model behaviour. Such methods, grouped under the umbrella term "eXplainable AI" (XAI), empower users by providing them with the relevant information to make an informed choice on whether or not to trust the model.
In particular, rather than attempting to explain the model behaviour after it has been trained (post-hoc explanations), some XAI methods propose to enforce explainability constraints directly during the design phase of the machine learning process, resulting in so-called self-explainable models. In this regard, case-based reasoning is currently considered a viable alternative to more opaque black-box convolutional neural networks (CNNs), with works such as (Chen et al. 2019) or (Nauta, Bree, and Seifert 2021). In case-based reasoning, new instances of a problem are solved by drawing comparisons with examples encountered before, such that the decision is taken and motivated by the fact that the new instance resembles some known cases (also called prototypes).
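To give a concrete feel for the idea, here is a minimal sketch of a prototype-based decision, assuming that inputs have already been turned into small feature vectors. The names `cosine_similarity` and `classify`, the prototype labels and all numeric values are hypothetical and chosen for illustration only; this is neither the CaBRNet API nor the method of the cited works.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def classify(features, prototypes):
    """Return (label, similarity) of the prototype that best matches `features`.

    `prototypes` is a list of (label, vector) pairs (hypothetical format).
    """
    scored = [(cosine_similarity(features, vec), label) for label, vec in prototypes]
    best_score, best_label = max(scored)
    return best_label, best_score

# Made-up prototypes: one known case per class.
prototypes = [("robin", [1.0, 0.1, 0.0]), ("blackbird", [0.0, 0.2, 1.0])]
label, score = classify([0.9, 0.2, 0.1], prototypes)
print(f"predicted '{label}' because it resembles the '{label}' prototype (similarity {score:.2f})")
```

The returned similarity doubles as the explanation: the decision is justified by pointing at the known case that the new instance most resembles.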
CEA-LIST has already developed an open-source library for case-based reasoning networks, called CaBRNet (Xu-Darme et al. 2024), and wishes to extend its application to new domains and modalities.
This internship focuses on the use of machine learning models for the recognition and classification of bird songs. Audio clips are often encoded in the form of spectrograms, i.e. 2D representations of the intensity of the signal at various frequencies, across a given period of time. Since spectrograms can be interpreted as images, a common practice consists in processing them using deep CNNs originally designed for computer vision (Kahl et al. 2021). Hence, the goal of the internship is to extend the case-based reasoning approach to this new task, by adapting existing computer vision methods to learn audio prototypes. In particular, the new approach will take into account the temporal specificities of audio samples. Indeed, contrary to computer vision models, which are spatially invariant (the nature of an object remains identical regardless of its position inside the image), spatial location is crucial in spectrograms, as it corresponds to different frequency ranges and different periods of time.
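As an illustration of what a spectrogram contains, the sketch below computes one with a naive short-time Fourier transform using only the Python standard library. The function names, the frame size, the hop length and the synthetic two-tone signal are assumptions made for this example; in practice one would rely on an optimised audio library rather than this explicit loop.

```python
import cmath
import math

def spectrogram(signal, frame_size=64, hop=32):
    """Return a list of time frames, each a list of magnitudes for bins 0..frame_size//2."""
    # A Hann window reduces spectral leakage at the frame boundaries.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size) for n in range(frame_size)]
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_size)]
        # Naive discrete Fourier transform, kept explicit for readability.
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames

def tone(freq, n_samples, sample_rate):
    """Pure sine wave at `freq` Hz."""
    return [math.sin(2 * math.pi * freq * t / sample_rate) for t in range(n_samples)]

# Synthetic "song": a 1 kHz tone followed by a 3 kHz tone, sampled at 8 kHz.
sample_rate = 8000
signal = tone(1000, 512, sample_rate) + tone(3000, 512, sample_rate)
spec = spectrogram(signal)

# Strongest frequency bin in each time frame; bin k corresponds to k * sample_rate / 64 Hz.
peak_bins = [max(range(len(frame)), key=frame.__getitem__) for frame in spec]
```

Here the same kind of pattern (a single tone) lands in a different row depending on its frequency and in a different column depending on when it occurs, which is why position matters in a spectrogram in a way it does not for an object in a photograph.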
In practice, the internship will be split into several subtasks as follows:
- Establish a baseline using the reference BirdNET model.
- Identify a body of existing works on self-explainable models for audio classification.
- Design and train a case-based reasoning model for audio classification, using the CaBRNet framework.
As it is not realistic to be an expert in machine learning, computer vision and XAI all at once, we encourage candidates who do not meet all of the qualification requirements to apply nonetheless. We strive to provide an inclusive and enjoyable workplace. We are aware of discrimination based on gender (especially prevalent in our field), race or disability, and we are doing our best to fight it.
Minimum qualifications:
- Master's student or equivalent (2nd/3rd year of engineering school) in computer science
- Knowledge of Python and the PyTorch framework
- Ability to work in a team, some knowledge of version control
Preferred:
- Notions of AI and neural networks
- Notions of computer vision
- Notions of explainable AI
The CEA is a major player in research, serving citizens, the economy and the State.
It provides concrete solutions to their needs in four main areas: energy transition, digital transition, technologies for the medicine of the future, and defence and security, all built on a foundation of fundamental research. For more than 75 years, the CEA has been committed to serving the scientific, technological and industrial sovereignty of France and Europe, for a better-controlled and safer present and future.
Established at the heart of regions equipped with very large research infrastructures, the CEA has a wide range of academic and industrial partners in France, in Europe and internationally.
The CEA's 20,000 employees share three core values:
- A sense of responsibility
- Cooperation
- Curiosity