Manuela Sanguinetti

Treebanking User-Generated Content: A Proposal for a Unified Representation in Universal Dependencies

2020-01-01

Abstract

The paper discusses the main linguistic phenomena of user-generated texts found on the web and in social media, and proposes a set of annotation guidelines for their treatment within the Universal Dependencies (UD) framework. Given the increasing number of treebanks featuring user-generated content on the one hand, and its somewhat inconsistent treatment in these resources on the other, the aim of this paper is twofold: (1) to provide a short, though comprehensive, overview of such treebanks, based on the available literature, along with their main features and a comparative analysis of their annotation criteria, and (2) to propose a set of tentative UD-based annotation guidelines that promote consistent treatment of the particular phenomena found in these types of texts. The main goal of this paper is to provide a common framework for teams interested in developing similar resources in UD, thus enabling cross-linguistic consistency, a principle that has always been in the spirit of UD.
2020
English
Proceedings of the 12th Language Resources and Evaluation Conference
979-10-95546-34-4
European Language Resources Association (ELRA)
5240
5250
11
https://aclanthology.org/2020.lrec-1.645
12th International Conference on Language Resources and Evaluation, LREC 2020
Scientific committee
11-16 May 2020
Marseille (France)
international
scientific
Web; social media; treebanks; Universal Dependencies; annotation guidelines; UGC
4 Contribution in Conference Proceedings::4.1 Contribution in conference proceedings
Sanguinetti, Manuela; Bosco, Cristina; Cassidy, Lauren; Cetinoglu, Ozlem; Cignarella, Alessandra Teresa; Lynn, Teresa; Rehbein, Ines; Ruppenhofer, Jos ...
273
10
4.1 Contribution in conference proceedings
open
info:eu-repo/semantics/conferencePaper
Files in this record:
lrec2020_ud.pdf

open access

Description: conference paper
Type: publisher's version (VoR)
Size: 278.7 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
