Semi-supervised sentiment clustering on natural language texts

Frigau, Luca; Romano, Maurizio; Ortu, Marco; Contu, Giulia

2023-01-01

Abstract

In this paper, we propose a semi-supervised method for clustering unstructured textual data, called semi-supervised sentiment clustering on natural language texts. The aim is to identify clusters that are homogeneous with respect to the overall sentiment of the texts analyzed. The method combines three techniques: Sentiment Analysis, the Threshold-based Naïve Bayes classifier, and Network-based Semi-supervised Clustering. It proceeds in three steps. In the first step, the unstructured text is transformed into structured text and categorized into positive or negative classes using a sentiment analysis algorithm. In the second step, the Threshold-based Naïve Bayes classifier is applied to identify the overall sentiment of the texts and to assign a specific sentiment value to the topics. In the last step, Network-based Semi-supervised Clustering is applied to partition the instances into disjoint groups. The proposed algorithm is tested on a collection of customer reviews from Booking.com. The results highlight the capacity of the proposed algorithm to identify clusters that are distinct, non-overlapping, and homogeneous with respect to the overall sentiment. The results are also easily interpretable thanks to the network representation of the instances, which helps to clarify the relationships between them.
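As a rough illustration of the second step only, the sketch below scores a text by summing smoothed per-word log-odds learned from labeled examples and compares the total against a threshold. It is a simplified stand-in written for this record, not the authors' Threshold-based Naïve Bayes implementation: the function names (`train_log_odds`, `score_text`, `classify`), the smoothing constant `alpha`, the threshold `tau`, and the toy reviews are all assumptions, and the sentiment-analysis and network-clustering steps are not shown.

```python
import math
from collections import Counter


def train_log_odds(docs, labels, alpha=1.0):
    """Return a smoothed log-odds score per word (positive vs. negative texts).

    Hypothetical helper: word presence per document, Laplace smoothing.
    """
    pos_docs = [set(d.lower().split()) for d, y in zip(docs, labels) if y == 1]
    neg_docs = [set(d.lower().split()) for d, y in zip(docs, labels) if y == 0]
    pos_counts = Counter(w for d in pos_docs for w in d)
    neg_counts = Counter(w for d in neg_docs for w in d)
    scores = {}
    for w in set(pos_counts) | set(neg_counts):
        p_pos = (pos_counts[w] + alpha) / (len(pos_docs) + 2 * alpha)
        p_neg = (neg_counts[w] + alpha) / (len(neg_docs) + 2 * alpha)
        scores[w] = math.log(p_pos / p_neg)
    return scores


def score_text(text, scores):
    """Sum the log-odds of the known words present in the text."""
    return sum(scores.get(w, 0.0) for w in set(text.lower().split()))


def classify(text, scores, tau=0.0):
    """Label a text positive (1) when its score exceeds the threshold tau."""
    return 1 if score_text(text, scores) > tau else 0


# Toy, made-up reviews used purely for illustration.
docs = [
    "great clean room",
    "friendly staff great location",
    "dirty room",
    "rude staff dirty bathroom",
]
labels = [1, 1, 0, 0]
scores = train_log_odds(docs, labels)
```

In this simplified form, raising `tau` makes the classifier more conservative about labeling a text positive; how the threshold is actually chosen in the paper is not stated in the abstract.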

Year of publication: 2023

Keywords: Tb-NB; NeSSC; Reviews; Tourism data; Booking.com

Type: 1.1 Journal article

Files in this product:

File	Size	Format
s10260-023-00691-4-7.pdf (open access; type: publisher's version)	1.35 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Università degli Studi di Cagliari
