Working both sides of the street: computational and psycholinguistic investigations on idiomatic variability / Senaldi, Marco Silvio Giuseppe; relatore: Bertinetto, Pier Marco; Scuola Normale Superiore, 2019.

Working both sides of the street: computational and psycholinguistic investigations on idiomatic variability

Senaldi, Marco Silvio Giuseppe
2019

Abstract

Over the years, the original conception of idioms as semantically empty and formally frozen units (Bobrow and Bell, 1973; Swinney and Cutler, 1979) has been replaced by a more complex view, whereby some idioms display an analyzable semantic structure (Nunberg, 1978) that allows for greater formal plasticity (Nunberg et al., 1994; Gibbs and Nayak, 1989). Corpus data have nonetheless shown that all types of idioms allow for a certain degree of manipulation if an appropriate context is provided (Duffley, 2013; Vietri, 2014). On the other hand, psycholinguistic data have revealed that the processing of idiom variants is not necessarily harder than the processing of canonical idiom forms, or that it can be similar to the processing of literal language (McGlone et al., 1994; Geeraert et al., 2017a). Despite this possible variability, in two computational studies we show that focusing on lexical fixedness is still an effective method for automatically telling apart non-compositional idiomatic expressions and compositional non-idiomatic expressions. We do so by means of distributional-semantic indices of compositionality that compute the cosine similarity between the vector of a given phrase to be classified and the vectors of lexical variants of the same phrase, which are generated either distributionally or from the Italian section of MultiWordNet (Pianta et al., 2002). Overall, the vectors of idioms turn out to be less similar to the vectors of their lexical variants than the vectors of compositional expressions are, confirming that idioms tend to be employed in a more formally conservative way in language use. In two eye-tracking studies we then compare the reading times of idioms and literals in the active form, in a passive form with preverbal subject, and in a passive form with postverbal subject, which preserves the verb-noun order of the canonical active form. The first experiment reveals that passives take longer to read than actives, with no significant effect of idiomaticity in passive forms. A second experiment with more ecological dialogic stimuli reveals that preserving the surface verb-noun order of the active form facilitates the processing of passive idioms, suggesting that one of the core issues with idiom passivization could be the violation of canonical verb-noun order rather than verb voice per se.
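
The variant-based index described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the choice of the mean as the aggregation over variants, and the toy three-dimensional vectors are all assumptions made here for illustration. The abstract specifies only that cosine similarity is computed between a phrase vector and the vectors of its lexical variants.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def variant_similarity(phrase_vec, variant_vecs):
    """Mean cosine between a phrase vector and the vectors of its lexical variants.

    Averaging is an assumption of this sketch; other aggregations are possible.
    """
    if not variant_vecs:
        raise ValueError("at least one variant vector is required")
    return sum(cosine(phrase_vec, v) for v in variant_vecs) / len(variant_vecs)


# Hypothetical toy vectors, not real distributional data.
idiom = [1.0, 0.0, 0.0]
idiom_variants = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]      # variants drift away in meaning
literal = [1.0, 1.0, 0.0]
literal_variants = [[1.0, 1.0, 0.0], [2.0, 2.0, 0.0]]    # variants stay close in meaning

idiom_score = variant_similarity(idiom, idiom_variants)
literal_score = variant_similarity(literal, literal_variants)
print(idiom_score < literal_score)
```

Under these made-up vectors the idiom scores lower than the literal phrase, which mirrors the direction of the reported finding: replacing a word in an idiom tends to yield a phrase whose vector is far from the original.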
L-LIN/01 Glottology and Linguistics
computational linguistics
idioms
Linguistics
psycholinguistics
Scuola Normale Superiore
Bertinetto, Pier Marco
Lenci, Alessandro
Files in this item:
File: Senaldi_thesis_def.pdf
Access: open access
Description: doctoral thesis full text
Type: PhD thesis
License: Read only
Size: 2.08 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11384/86016