Find a copy online
Links to this item
theses.fr Full-text access

Details
Genre/Form: | Theses and academic writings |
---|---|
Material Type: | Document, Thesis/dissertation, Internet resource |
Document Type: | Internet Resource, Computer File |
All Authors / Contributors: | Elena Viorica Epure; Camille Salinesi; David Naccache; Frank Hopfgartner; Christophe Cerisara; Rébecca Deneckere; Alain Wegmann; Université Paris 1 Panthéon-Sorbonne.; École doctorale de Management Panthéon-Sorbonne (Paris).; Université Paris 1 Panthéon-Sorbonne. Centre de recherche en informatique. |
OCLC Number: | 1128206874 |
Notes: | Title from the title screen. |
Description: | 1 online resource |
Responsibility: | Elena Viorica Epure; under the supervision of Camille Salinesi. |
Abstract:
The proliferation of digital data has enabled scientific and practitioner communities to create new data-driven technologies that learn about user behaviors in order to deliver better services and support to people in their digital experience. The majority of these technologies derive value from data logs passively generated during human-computer interaction. A particularity of these behavioral traces is that they are structured. However, the text pro-actively generated across the Internet is highly unstructured and represents the overwhelming majority of behavioral traces. To date, despite its prevalence and the relevance of behavioral knowledge to many domains, such as recommender systems, cyber-security and social network analysis, digital text is still insufficiently treated as traces of human behavior from which extensive insights into behavior can be revealed automatically.

The main objective of this thesis is to propose a corpus-independent method to automatically exploit asynchronous communication as pro-actively generated behavior traces in order to discover process models of conversations, centered on comprehensive speech intentions and relations. The solution is built in three iterations, following a design science approach.

Multiple original contributions are made. The only systematic study to date on the automatic modeling of asynchronous communication with speech intentions is conducted. A speech intention taxonomy is derived from linguistics to model asynchronous communication; compared to all taxonomies from the related works, it is corpus-independent and comprehensive (both finer-grained and exhaustive in the given context), and its application by non-experts is proven feasible through extensive experiments. A corpus-independent, automatic method to annotate utterances of asynchronous communication with the proposed speech intention taxonomy is designed based on supervised machine learning. For this, validated ground-truth corpora are created and groups of features (discourse, content and conversation-related) are engineered to be used by the classifiers. In particular, some of the discourse features are novel and defined by considering linguistic means to express speech intentions, without relying on the corpus's explicit content, domain or on specificities of the asynchronous communication types.

Then, an automatic method based on process mining is designed to generate process models of interrelated speech intentions from conversation turns annotated with multiple speech intentions per sentence. As process mining relies on well-defined structured event logs, an algorithm to produce such logs from conversations is proposed. Additionally, an extensive design rationale on how conversations annotated with multiple labels per sentence can be transformed into event logs, and on how different decisions affect the output behavioral models, is released to support future research. Experiments and qualitative validations in medicine and conversation analysis show that the proposed solution yields reliable and relevant results; limitations are also identified, to be addressed in future work.
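
As an illustration only, the step from annotated conversations to an event log might look like the sketch below. It is not the algorithm proposed in the thesis, which the abstract does not detail. The toy conversations, the label names (`greet`, `ask_info` and so on), the function names and the `merge_labels` switch are all hypothetical. The switch stands in for one possible design decision: whether a sentence carrying several speech intentions yields one composite event or one event per intention.

```python
from collections import Counter

# Hypothetical toy data (not from the thesis): conversation id -> turns,
# each turn is a list of sentences, each sentence a list of intention labels.
conversations = {
    "c1": [[["greet"], ["ask_info", "state_problem"]], [["give_info"]]],
    "c2": [[["ask_info"]], [["give_info"], ["thank"]]],
}


def to_event_log(conversations, merge_labels=False):
    """Flatten annotated conversations into (case_id, position, activity) rows."""
    log = []
    for case_id, turns in conversations.items():
        position = 0
        for turn in turns:
            for labels in turn:
                # Assumed design decision: one composite event per sentence,
                # or one event per label in annotation order.
                if merge_labels:
                    activities = ["+".join(sorted(labels))]
                else:
                    activities = list(labels)
                for activity in activities:
                    log.append((case_id, position, activity))
                    position += 1
    return log


def directly_follows(log):
    """Count how often one activity immediately follows another within a case.

    Assumes the rows of each case appear in position order, as produced above.
    """
    counts = Counter()
    previous = {}
    for case_id, _, activity in log:
        if case_id in previous:
            counts[(previous[case_id], activity)] += 1
        previous[case_id] = activity
    return counts
```

Calling `to_event_log(conversations)` and then `to_event_log(conversations, merge_labels=True)` on the same data gives different directly-follows counts, which is a small-scale view of how such decisions can change the resulting behavioral model.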