Search results

45 records were found.

This thesis presents a system for the automatic tracking of lip gestures from a video sequence of a speaker. The system combines bottom-up and top-down analysis of lip shape. A first chromatic-analysis step, based on statistical learning, produces a greyscale image in which the contrast between lips and skin is enhanced. In parallel, a 3D linear model of lip gestures is learned for a speaker from phonetically relevant key shapes. The model is then synthesised and projected onto the image, imposing a priori the characteristics of the lip shape. It is fitted to the lip-enhanced image by optimising its control parameters. The system thus combines, in a hybrid manner, the precision of chromatic analysis with the robustness imposed by the model. This syste...
It is far more common to hear of motion capture than of motion measurement. One may see in this the intuition that behind the word motion lies a phenomenon more complex than the marker data that today constitute its standard of rigorous instrumentation. While the marker is quantifiable, motion retains an additional quality to be explored. I retrace here the research work I have supervised in recent years on this notion of motion capture, through the scientific tools of measurement and modelling. My activities were initially devoted to 3D animation, then progressively turned towards questions related to anatomy. The contributions in 3D animation first concerned the search for optimal parametric spaces in which to measure and generate articulated motion...
This paper presents a method for the extraction of articulatory parameters from direct processing of raw images of the lips. The system architecture is made of three independent parts. First, a new greyscale mouth image is centred and downsampled. Second, the image is aligned and projected onto a basis of artificial images. These images are the eigenvectors computed from a PCA applied to a set of 23 reference lip shapes. Then, a multilinear interpolation predicts articulatory parameters from the image projection coefficients onto the eigenvectors. In addition, the projection coefficients and the predicted parameters were evaluated by an HMM-based visual speech recogniser. Recognition scores obtained with our method are compared to reference scores and discussed.
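The pipeline described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: all dimensions, the random placeholder data, and the function names (`predict`) are assumptions; the abstract's "multilinear interpolation" is approximated here by a least-squares linear map from projection coefficients to parameters.

```python
# Hypothetical sketch of the eigen-image pipeline: centre a greyscale
# mouth image, project it onto PCA eigenvectors ("eigen-images"),
# then predict articulatory parameters with a fitted linear map.
import numpy as np

rng = np.random.default_rng(0)

# 23 reference lip shapes, each a flattened 16x16 greyscale patch
# (random placeholders standing in for real reference images)
refs = rng.random((23, 256))
mean = refs.mean(axis=0)

# PCA basis: right singular vectors of the centred reference set
_, _, vt = np.linalg.svd(refs - mean, full_matrices=False)
basis = vt[:5]                      # keep 5 eigen-images

# Linear predictor from projection coefficients to 3 articulatory
# parameters, fitted by least squares on the reference set
ref_coeffs = (refs - mean) @ basis.T
ref_params = rng.random((23, 3))    # placeholder ground-truth parameters
w, *_ = np.linalg.lstsq(ref_coeffs, ref_params, rcond=None)

def predict(image):
    """Project a new mouth image and predict its 3 parameters."""
    coeffs = (image - mean) @ basis.T
    return coeffs @ w

params = predict(rng.random(256))
print(params.shape)
```

In practice the projection step presumes the input image has already been aligned and downsampled as the abstract describes; only the coefficient-to-parameter mapping would be retrained per speaker.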
We introduce a novel approach to capture linear and non-linear motions of the skin surface due to muscle bulging and other complex sub-surface interactions. Our method uses standard cameras and consists of a video motion capture of the desired body surface covered by a stretchy, calibrated cloth. We develop a non-model-based tracking method using the regularity and topology of dedicated fabrics. Our tracking and 3D reconstruction methods support occlusions: we detect occlusions of 3D surfaces using temporal and spatial coherence, and are able to restart surface tracking once the occlusions have ended. The resulting 3D mesh can easily be integrated and adapted into an animation sequence of a virtual human. We demonstrate our method on the skinning of non-linear biceps deformations.
This work presents a methodology for 3D modeling of lip motion in speech production and its application to lip tracking and visual speech animation. Firstly, geometric modeling creates a 3D lip model from 30 control points for any lip shape. Secondly, a statistical analysis, performed on a set of 10 key shapes, generates a lip gesture coding with three articulatory-oriented parameters, specific to one speaker. The choice of the key shapes is based on general phonetic observations. Finally, the application of the 3D model, controlled by the three parameters, to lip tracking is presented and evaluated.
This paper presents a new integration of a data-driven approach using dimension reduction and a physically-based simulation for real-time character animation. We exploit Lie group statistical analysis techniques (Principal Geodesic Analysis, PGA) to approximate the pose manifold of a motion capture sequence by a reduced set of pose geodesics. We integrate this kinematic parametrization into a physically-based animation approach of virtual characters, by using the PGA-reduced parametrization directly as generalized coordinates of a Lagrangian formulation of mechanics. In order to achieve real-time without sacrificing stability, we derive an explicit time integrator by approximating existing variational integrators. Finally, we test our approach in task-space motion control. By formulating both physical simulation and inverse kinematics ...
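The Lie-group statistics mentioned above can be illustrated on the simplest case, a single rotational joint. The following is a sketch of Principal Geodesic Analysis (PGA) on unit quaternions, assuming a Karcher mean and log/exp maps; all function names, the toy data, and the one-geodesic reduction are illustrative, not the paper's reduced-coordinate simulator.

```python
# Hypothetical sketch of PGA on unit quaternions: compute an intrinsic
# (Karcher) mean, map poses to the tangent space at the mean, and find
# principal geodesics by SVD of the tangent vectors.
import numpy as np

def qmul(a, b):
    """Hamilton product of two quaternions (w, x, y, z)."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def qconj(q):
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def qlog(q):
    """Log map of a unit quaternion to a 3-vector (angle * axis)."""
    v = q[1:]
    n = np.linalg.norm(v)
    return np.zeros(3) if n < 1e-12 else v / n * np.arctan2(n, q[0])

def qexp(v):
    """Exp map of a 3-vector back to a unit quaternion."""
    t = np.linalg.norm(v)
    if t < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    return np.concatenate(([np.cos(t)], np.sin(t) * v / t))

def karcher_mean(quats, iters=20):
    """Intrinsic mean by iterating log / average / exp."""
    mean = quats[0]
    for _ in range(iters):
        step = np.mean([qlog(qmul(qconj(mean), q)) for q in quats], axis=0)
        mean = qmul(mean, qexp(step))
    return mean

def pga(quats, k=1):
    """Keep the k principal geodesics of a set of poses."""
    mean = karcher_mean(quats)
    x = np.array([qlog(qmul(qconj(mean), q)) for q in quats])
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    basis = vt[:k]                 # principal geodesic directions
    coords = x @ basis.T           # reduced (generalized) coordinates
    return mean, basis, coords

# Toy "motion capture sequence": rotations about the z-axis
quats = [qexp(np.array([0.0, 0.0, a])) for a in np.linspace(-0.3, 0.3, 7)]
mean, basis, coords = pga(np.array(quats), k=1)

# Reconstruct the first pose from its single reduced coordinate
recon = qmul(mean, qexp(coords[0] @ basis))
print(coords.shape)
```

The `coords` array plays the role of the reduced generalized coordinates: in the paper's setting these would be differentiated and fed to the Lagrangian dynamics rather than used only for reconstruction.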
This article presents a visual characterization of the facial motions inherent in speaking. We propose a set of four Facial Speech Parameters (FSP): jaw opening, lip rounding, lip closure, and lip raising, to represent the primary visual gestures of speech articulation within a multidimensional linear manifold. This manifold is initially generated as a statistical model, obtained by analyzing accurate 3D data of a reference human subject. The FSP are then associated with the linear modes of this statistical model, resulting in a 3D parametric facial mesh. We have tested the speaker-independent hypothesis of this manifold with a model-based video tracking task applied to different subjects. Firstly, the parametric model is adapted and aligned to a subject's face for a single shape. Then the face motion is tracked by optimally aligning the i...
This article presents the first version of a talking head, called MOTHER (MOrphable Talking Head for Enhanced Reality), based on an articulatory model describing the degrees of freedom of visible (lips, cheeks...) but also partially or indirectly visible (jaw, tongue...) speech articulators. Skin details are rendered using texture mapping/blending techniques. We illustrate here the flexibility of such articulatory control of video-realistic speaking faces by first demonstrating its ability to track facial movements through optical-to-articulatory inversion using an analysis-by-synthesis technique. The stability and reliability of the results allow the automatic inversion of large video sequences. The inversion results are then used to automatically build a coarticulation model for the generation of facial movements from text. It improve...
This paper considers 3D imaging of moving objects and introduces a technique that exploits visible and x-ray images to recover dense 3D models. While recent methods such as tomography from cone-beam x-ray can advantageously replace more expensive and higher-dose CT scanners, they still require specific equipment and immobilised patients. We investigate an alternative strategy that combines a single x-ray source and a set of colour cameras to capture rigidly moving samples. The colour cameras allow for coarse markerless motion tracking, which is further refined with the x-ray information. Once the sample poses are correctly estimated, a dense 3D attenuation model is reconstructed from the set of x-ray frames. Preliminary results on simulated data compared to ground truth, as well as actual in-vivo experiments, are presented.
Creating and animating subject-specific anatomical models is traditionally a difficult process involving medical image segmentation, geometric corrections, and the manual definition of kinematic parameters. In this paper, we introduce a novel template morphing algorithm that facilitates 3D modeling and parameterization of skeletons. Target data can be either medical images or surfaces of the whole skeleton. We incorporate prior knowledge about bone shape, the feasible skeleton pose, and the morphological variability in the population. This allows for noise reduction, bone separation, and the transfer, from the template, of anatomical and kinematic information not present in the input data. Our approach treats both local and global deformations in successive regularization steps: smooth elastic deformations are represented by a displa...