Defended thesis

Author: Yann Jacob
Supervisor: Patrick Gallinari
Type: Doctoral thesis
Discipline(s): Computer science
Date: Defended in 2013
Institution(s): Paris 6

Abstract


The emergence of Web 2.0 has produced a large quantity of data that can naturally be represented as complex graphs. Many information analysis, prediction and retrieval tasks arise on these data, yet state-of-the-art models are not adapted to them. In this thesis, we consider the task of node classification (labeling) in complex, partially labeled content networks. Applications include video and photo annotation on Web 2.0 websites, web spam detection, and user labeling in social networks. The originality of our work is its focus on two types of complex networks rarely considered in existing work: multi-relational graphs, composed of multiple relation types, and heterogeneous networks, composed of multiple node types and therefore of multiple joint labeling problems. First, we propose two new algorithms for multi-relational graph labeling. These algorithms learn to weight the different relation types in the label propagation process according to their usefulness for the labeling task. They thereby learn to combine the relation types optimally for classification, while also using the node content information. Then, we propose an algorithm for heterogeneous graph labeling. Here, a specific difficulty is that each node type has its own label set (for instance, visual tags for a photo and groups for a user), so these different classification problems must be solved simultaneously using the graph structure. Our algorithm relies on a latent representation common to all node types, which allows the different node types to be processed in a uniform manner. Our experimental results show that this model is able to take into account the correlations between labels of different node types.
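
To illustrate how relation weights can influence label propagation on a multi-relational graph, here is a minimal sketch in Python. It is not the thesis's algorithm: the weights are fixed by hand rather than learned, node content is ignored, and every name (`propagate_labels`, the `friend` and `same_group` relations, the toy graph) is a hypothetical chosen for illustration.

```python
def propagate_labels(n_nodes, relations, weights, seeds, n_iters=50):
    """Iterative label propagation on a multi-relational graph.

    relations: dict mapping relation name -> list of undirected edges (i, j)
    weights:   dict mapping relation name -> non-negative weight
               (assumption: fixed by hand here, not learned)
    seeds:     dict mapping labeled node -> score (+1.0 or -1.0)
    """
    # Merge all relation types into one weighted adjacency list.
    neighbors = [[] for _ in range(n_nodes)]
    for name, edges in relations.items():
        w = weights[name]
        for i, j in edges:
            neighbors[i].append((j, w))
            neighbors[j].append((i, w))

    # Unlabeled nodes start at a neutral score of 0.
    scores = [seeds.get(i, 0.0) for i in range(n_nodes)]
    for _ in range(n_iters):
        new_scores = list(scores)
        for i in range(n_nodes):
            if i in seeds:
                continue  # labeled nodes stay clamped to their seed score
            total = sum(w for _, w in neighbors[i])
            if total > 0:
                # Weighted average of the neighbors' current scores.
                new_scores[i] = sum(w * scores[j] for j, w in neighbors[i]) / total
        scores = new_scores
    return scores


# Toy graph: node 0 is labeled +1, node 2 is labeled -1, node 1 is unlabeled.
relations = {"friend": [(0, 1)], "same_group": [(1, 2)]}
seeds = {0: 1.0, 2: -1.0}

# Trusting "friend" more pulls node 1 toward the +1 seed.
scores = propagate_labels(3, relations, {"friend": 0.9, "same_group": 0.1}, seeds)

# Swapping the weights pulls node 1 toward the -1 seed instead.
flipped = propagate_labels(3, relations, {"friend": 0.1, "same_group": 0.9}, seeds)
```

In this toy setting the predicted sign for node 1 depends entirely on which relation type receives more weight, which is why learning those weights from the labeled nodes, as the abstract describes, can matter for classification.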