Thesis defended

Extraction and formalization of the semantics of hypertext links in cultural, scientific and technical documents

Author: Moustafa Al-Hajj
Supervisors: Hubert Cardot, Gilles Verley
Type: Doctoral thesis
Discipline(s): Computer science
Date: Defended in 2007
Institution(s): Tours

Abstract


The use of hypertext links on the web makes sites more attractive and easier to read, and allows sites to be enriched with information from other sites. However, these links create difficulties for readers and search engines. Hypertext links carry semantic information which, if fully formalized, could be exploited by programs to improve navigation and information retrieval, and would contribute to the emergence of the semantic web. In this thesis, we propose an original methodology for the formal extraction of the semantics of hypertext links. The proposed method was tested on the links of a corpus. The RDF formalism was used to represent the link semantics. An ontology of links specific to the domain of biographies of famous people was built from the extracted link semantics and then represented in RDFS. Supervised learning tools and keyword-based characterization of web pages were used to assist the formal extraction of the semantics.
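
As an illustration of what representing link semantics in RDF might look like, here is a minimal sketch in Python. It is not taken from the thesis: the namespace `http://example.org/links#`, the relation name `bornIn`, the page URLs, and the helper functions are all hypothetical, chosen only to show a hypertext link in a biography expressed as a subject, predicate, object triple, plus one schema-level statement declaring the relation as an RDF property.

```python
from typing import NamedTuple

# Standard W3C vocabulary IRIs.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDF_PROPERTY = "http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"

# Hypothetical namespace for link relations (an assumption, not from the thesis).
EX = "http://example.org/links#"


class Triple(NamedTuple):
    """One RDF statement whose three terms are all IRIs."""
    subject: str
    predicate: str
    obj: str


def link_to_triple(source_url: str, relation: str, target_url: str) -> Triple:
    """Express a typed hypertext link as an RDF triple.

    The source page is the subject, the link's meaning is the predicate,
    and the target page is the object.
    """
    return Triple(source_url, EX + relation, target_url)


def declare_property(relation: str) -> Triple:
    """Schema-level statement: the relation is an rdf:Property."""
    return Triple(EX + relation, RDF_TYPE, RDF_PROPERTY)


def to_ntriples(triple: Triple) -> str:
    """Serialize one triple as a line of N-Triples."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.obj}> ."


# Hypothetical example: a link from a biography page to a place page.
link = link_to_triple(
    "http://example.org/bio/victor-hugo",
    "bornIn",
    "http://example.org/place/besancon",
)
lines = [to_ntriples(declare_property("bornIn")), to_ntriples(link)]
print("\n".join(lines))
```

Once links are typed in this way, a program can select them by meaning (for example, every `bornIn` link in a corpus) rather than by anchor text, which is the kind of use the abstract has in mind for navigation and information retrieval.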