Thesis defended

Sentiment Classification on the Web 2.0 (Classification de sentiments sur le Web 2.0)

FR
Author: Abdelhalim Rafrafi
Supervisor: Patrick Gallinari
Type: Doctoral thesis
Discipline(s): Computer science
Date: Defended in 2013
Institution(s): Paris 6

Abstract

The Internet has become an essential medium in everyday life: we use it to check the news, to do our shopping, to shape our opinions, and to share our feelings and feedback on our experiences. This process generates a large amount of data about our personalities and lifestyles, and faced with this much information we are quickly overwhelmed. "Looks like the overload of information gives a sense of emptiness" (translated from a French quotation by Jean-Pierre April). Automated filtering and analysis tools are therefore required to make the information accessible to everybody. In this general context, our work focuses on sentiment analysis, and on sentiment classification in particular.

Classical text categorization algorithms such as SVM, NB, PLSA or LDA show several limitations for sentiment analysis. These limitations stem from the particularity of the task: sentiment classification requires taking into account the structure of the text (negations, for instance), and modeling the lexical field alone is not sufficient to understand user messages. However, taking text structure into account requires complex representations and/or algorithms that hardly scale up.

We investigated many solutions to tackle these antagonistic objectives simultaneously. First, we focused on regularized formulations adapted to sentiment classification in order to perform efficient feature selection in N-gram space. Then, we explored an orthogonal research axis: given a basic classifier, we simply increased the size of the learning set, using the Web 2.0 as an infinite source of labeled data. Finally, we tried to combine the advantages of both solutions using an original neural network architecture.