Asymmetries in extraction from nominal copular sentences: a challenging case study for NLP tools

Lorusso, Paolo; Greco, Matteo Paolo; Chesi, Cristiano; Moro, Andrea Carlo

In this paper we discuss two types of nominal copular sentences (Canonical and Inverse, Moro 1997) and show that the peculiarities of these two configurations are largely ignored by the standard NLP tools that are currently publicly available. We show that example-based MT tools (e.g. Google Translate) as well as other NLP tools (UDpipe, LinguA, Stanford Parser, and Google Cloud AI API) fail to capture the critical distinctions between the two structures, producing wrong analyses and, in the case of MT tools, incorrect translations, possibly as a consequence of an incoherent (or missing) structural analysis. To support the proposed analysis, we also present an empirical study showing that native speakers are indeed sensitive to these distinctions. This poses a sharp challenge for NLP tools that aim to be cognitively plausible or at least descriptively adequate (Chowdhury & Zamparelli 2018).

Publication year	2019
Scientific Disciplinary Sector (valid until 24/06/2024)	Sector L-LIN/01 - Glottology and Linguistics
Scientific Disciplinary Sector (valid from 09/05/2024)	Sector GLOT-01/A - Glottology and Linguistics
Conference title	Sixth Italian Conference on Computational Linguistics CLiC-it
Conference location	Bari (Italy)
Conference dates	November 13-15, 2019
Volume title	Proceedings of the Sixth Italian Conference on Computational Linguistics CLiC-it 2019 (Bari, November 13-15, 2019)
Publisher	CEUR
ISBN	9791280136008
Keywords	non-local dependencies; deep parsing; grammaticality judgments; self-paced reading
Publication type	4.1 Contribution in conference proceedings

Files in this record:

File	Size	Format
Lorusso et al 2019 - CLIC-IT-2019.pdf (open access; Type: Published version; License: Creative Commons)	1.28 MB	Adobe PDF