Hello,
Would it be possible to circulate the internship announcement below?
Best regards
Master Internship
Autonomous object recognition in videos using deep learning and developmental learning
Keywords: Autonomous systems, Deep learning, Developmental learning, Unsupervised learning, Siamese Neural Networks, Similarity learning, Object discovery, Saliency, Spatio-temporal coherence
Period: 5 months, starting in February/March (negotiable)
Desired skills:
– Master's student in artificial intelligence or computer vision. Previous experience with neural networks, particularly for image recognition, would be appreciated. Interest in developmental learning. Scientific curiosity. Ability to read and write scientific articles. Ability to work autonomously.
– Good programming skills required (C++, Python, OpenCV, TensorFlow, Git)
Supervision:
Frédéric Armetta and Mathieu Lefort (SMA Team) – Stefan Duffner (Imagine Team)
Send applications (CV and cover letter) and any questions to mathieu.lefort@liris.cnrs.fr, frederic.armetta@liris.cnrs.fr, stefan.duffner@liris.cnrs.fr
Location:
LIRIS Laboratory, Lyon, France
Remuneration:
~500 € per month
Subject:
This internship aims to contribute to the development of an autonomous object recognition system for videos. The system is exposed to a visual stream (videos) and has to extract (proto-)objects in order to iteratively refine its internal representation of them. The goal is a system that can autonomously recognize and differentiate objects by building such an internal representation. Taking inspiration from human perception and following the constructivist learning paradigm [1], we want to avoid relying on a large labeled database, prior knowledge or sophisticated object detectors, and instead let the system develop autonomously. The problem and its objectives therefore differ from the usual approach of training supervised methods such as deep learning models. Indeed, ideally no large dataset would be available to the system: the extraction of visually identified objects should be part of the result, not an input.
This internship will build on promising preliminary results. The current system first extracts global shapes to capture candidate objects from the video, using simple temporal filters (for instance, a Kalman filter) and the spatio-temporal coherence of objects (movement and spatial overlap help identify instances of the same object). It then uses Siamese Neural Networks [2][3] to learn a similarity metric from pairs of examples marked as coming from the same class or from different classes. This model constructs a manifold that can be used to classify examples of unknown classes. Following these guidelines, preliminary results show that the system classifies new instances of objects with good accuracy. Nevertheless, how to maintain this representation and make it evolve raises many questions that can be explored in the short or long term, depending on the needs identified in the course of the project (catastrophic forgetting [4], active learning [5], overfitting, ability to generalize, scarce data, etc.).
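To make the two steps above concrete, here is a minimal Python sketch, not the actual implementation of the current system. It assumes candidate objects are axis-aligned boxes (x1, y1, x2, y2), uses an overlap (IoU) threshold of 0.5 to decide that two boxes in consecutive frames show the same object, and uses the classic contrastive loss as one common training objective for Siamese networks; the loss used in [2][3] may differ. All function names and parameter values are hypothetical.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def label_pairs(prev_boxes, next_boxes, threshold=0.5):
    """Label every cross-frame pair of boxes: 1 if they overlap enough to be
    treated as the same object, 0 otherwise (the threshold is an assumption)."""
    return [(i, j, 1 if iou(p, n) >= threshold else 0)
            for i, p in enumerate(prev_boxes)
            for j, n in enumerate(next_boxes)]

def contrastive_loss(emb_a, emb_b, same, margin=1.0):
    """Contrastive loss for one pair of embeddings: 'same' pairs are pulled
    together, other pairs are pushed at least `margin` apart."""
    d = math.sqrt(sum((x - y) ** 2 for x, y in zip(emb_a, emb_b)))
    if same:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2
```

In a real pipeline the embeddings would come from the two weight-sharing branches of the Siamese network, and the pair labels produced by `label_pairs` would replace human annotation.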
A challenging topic that we would like to explore during this internship is the possibility of using the internal representation built in this way to facilitate the extraction of objects from the videos. Indeed, without any knowledge of the objects, and because the temporal filtering used to detect candidate objects is relatively simple, the first extraction is coarse and highly sensitive to environmental noise. The internal representation could then be used to validate and outline the candidate objects. In this case, object capture and the internal representation of objects evolve together. The process we want to develop is a self-starting one that operates without external input. In other words, the resulting system has to learn how to perceive efficiently in order to be able to learn more, and vice versa. We face here a chicken-and-egg cognitive problem, also known as a representation bootstrapping problem [6].
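One possible reading of this loop, given purely as an illustration and not as the method to be developed, is sketched below in Python. It assumes each known object is summarized by a prototype vector in the learned embedding space, that a candidate is validated when its embedding falls within a fixed radius of the nearest prototype, and that validated candidates then refine that prototype through a running mean. The prototype representation, the radius and all function names are hypothetical.

```python
import math

def distance(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def validate_candidates(embeddings, prototypes, radius=0.5):
    """Split candidates into accepted (index, prototype index) pairs and
    rejected indices, based on distance to the nearest known prototype."""
    accepted, rejected = [], []
    for idx, emb in enumerate(embeddings):
        if not prototypes:
            rejected.append(idx)  # nothing learned yet: cannot validate
            continue
        nearest = min(range(len(prototypes)),
                      key=lambda k: distance(emb, prototypes[k]))
        if distance(emb, prototypes[nearest]) <= radius:
            accepted.append((idx, nearest))
        else:
            rejected.append(idx)
    return accepted, rejected

def update_prototypes(prototypes, counts, embeddings, accepted):
    """Refine each matched prototype in place with a running mean, so that
    perception (validation) and representation evolve together."""
    for idx, k in accepted:
        counts[k] += 1
        prototypes[k] = [p + (e - p) / counts[k]
                         for p, e in zip(prototypes[k], embeddings[idx])]
```

Calling `validate_candidates` and then `update_prototypes` on each new batch of candidates gives a crude version of the coupling described above: better prototypes filter out more noise, and cleaner candidates yield better prototypes. How rejected candidates should seed new prototypes is exactly the kind of open question the internship addresses and is deliberately left out.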
The project could lead to a PhD position if the associated submitted project is funded.
Bibliography:
[1] Piaget, J. (1948). « La naissance de l’intelligence chez l’enfant ».
[2] Zheng, L., Duffner, S., Idrissi, K., Garcia, C., Baskurt, A. (2016). « Pairwise Identity Verification via Linear Concentrative Metric Learning ». IEEE Transactions on Cybernetics.
[3] Berlemont, S., Lefebvre, G., Duffner, S., Garcia, C. (2017). « Class-Balanced Siamese Neural Networks ». Neurocomputing.
[4] Goodfellow, I. J., Mirza, M., Xiao, D., Courville, A., Bengio, Y. (2015). « An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks ». CoRR.
[5] Lefort, M., Gepperth, A. (2015). « Active learning of local predictable representations with artificial curiosity ». International Conference on Development and Learning and Epigenetic Robotics (ICDL-Epirob), Providence (USA).
[6] Mazac, S., Armetta, F., Hassas, S. (2014). « On bootstrapping sensori-motor patterns for a constructivist learning system in continuous environments ». In Alife 14.
—
LEFORT Mathieu