Description
Emotion recognition in conversation has applications in numerous domains, such as affective computing and medicine. Recent methods for emotion recognition jointly utilize conversational data across several modalities, including audio, video, and text. However, state-of-the-art frameworks for this task devote little attention to the feature extraction and feature fusion steps of the process. This thesis aims to improve on the state-of-the-art method by incorporating two components to better accomplish these steps: one produces improved representations for the text modality, and the other better models the relationships among all modalities. The thesis proposes two methods built on these ideas and shows that they improve accuracy over the state-of-the-art framework for multimodal emotion recognition in dialogue.
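For readers unfamiliar with multimodal feature fusion, the sketch below illustrates one common baseline: concatenation-based late fusion of per-utterance text, audio, and video features followed by an emotion classifier. All dimensions, layer choices, and the six-way label set are illustrative assumptions for this sketch, not the components proposed in the thesis.

```python
# Minimal sketch of concatenation-based late fusion for utterance-level
# emotion classification. Feature dimensions and the emotion count are
# hypothetical placeholders.
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    def __init__(self, text_dim=768, audio_dim=128, video_dim=512,
                 hidden_dim=256, num_emotions=6):
        super().__init__()
        # Project each modality into a shared hidden space.
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        self.audio_proj = nn.Linear(audio_dim, hidden_dim)
        self.video_proj = nn.Linear(video_dim, hidden_dim)
        # Classify from the concatenated (fused) representation.
        self.classifier = nn.Sequential(
            nn.ReLU(),
            nn.Linear(3 * hidden_dim, num_emotions),
        )

    def forward(self, text_feat, audio_feat, video_feat):
        # Fuse modalities by concatenating their projections.
        fused = torch.cat([
            self.text_proj(text_feat),
            self.audio_proj(audio_feat),
            self.video_proj(video_feat),
        ], dim=-1)
        return self.classifier(fused)

# Example: a batch of 4 utterances with pre-extracted features.
model = LateFusionClassifier()
logits = model(torch.randn(4, 768), torch.randn(4, 128), torch.randn(4, 512))
print(logits.shape)  # torch.Size([4, 6])
```

More sophisticated fusion (e.g., attention across modalities, as the thesis pursues) replaces the simple concatenation step while keeping the same overall extract-then-fuse pipeline.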
Details
Contributors
- Rawal, Siddharth (Author)
- Baral, Chitta (Thesis director)
- Shah, Shrikant (Committee member)
- Computer Science and Engineering Program (Contributor)
- Barrett, The Honors College (Contributor)
Date Created
2020-05
Language
- eng
Additional Information
Series
- Academic Year 2019-2020
Extent
- 22 pages