Recognition of Body Expressions in Human Movement for Style Synthesis

Arthur Crenn

Abstract

My thesis addresses the recognition and synthesis of facial and body expressions. Our aim is to study, understand, and extract the elements that convey a person's emotional state through the expressions of their face and body, in order both to recognize and to synthesize the emotion or style in a gesture. This dual objective of recognition and synthesis should enable new modes of interaction to be generalized in applications such as video games and human-machine interaction. Such applications could be enriched with scenarios that adapt to the user's emotional state. To address this problem, the crucial point is to understand "where" the style information is located in an action or an animation. A human can characterize an expression within a few seconds of seeing it, whereas recognition algorithms are still far from that, especially for body expressions, where, to our knowledge, little work has been done compared with facial expression recognition. Regarding expression recognition, our first objective is to propose descriptors capable of recognizing the expression carried by a movement. The main obstacle is separating the movement performed from the expression perceived. Regarding facial expression recognition, we are interested in the societal problem of parental protection, which requires understanding and recognizing children's facial expressions. To this end, we built and proposed a new database to help the computer vision community understand the specific characteristics of children's facial expressions.

Secondly, we also want the descriptors proposed for body expression recognition to be usable in animation synthesis. In animation, creating an action that conveys an emotion or a style demands a great deal of work, know-how, and time from an animator. For example, in a video game, producing such animations often requires a huge database of movements covering each style for each virtual character. Such a database is usually built in one of two ways. In the first, all the movements are captured from different actors performing different styles. In the second, a graphic designer creates the animations by hand using animation software. Our objective here is to use the descriptors that quantify the expression detected during facial and body expression recognition to develop tools capable of changing or editing the style or expression of an animation. These tools will assist graphic designers by letting them quickly synthesize a stylized "primal animation", which can then be refined in post-processing by adding an artistic touch.

Capturing and transferring expression from children's faces for interaction with virtual modes
