B4DS @ PRELEARN: Ensemble method for prerequisite learning
Puccetti, Giovanni
2020
Abstract
In this paper we describe the methodologies we proposed to tackle the EVALITA 2020 shared task PRELEARN. We propose both a methodology based on gated recurrent units and one using more classical word embeddings together with ensemble methods. Our goal in choosing these approaches is twofold. On the one hand, we wish to see how much of the prerequisite information is present within the pages themselves. On the other, we would like to assess how much using information from the rest of Wikipedia helps in identifying this type of relation. This second approach is particularly useful for extending to new entities that are close to the ones in the corpus provided for the task but not actually present in it. With these methodologies we reached second position in the challenge.

File | Access | Type | License | Size | Format
---|---|---|---|---|---
b4ds@prelearn.pdf | open access | Accepted version (post-print) | Creative Commons | 83.84 kB | Adobe PDF
aaccademia-7528.pdf | open access | Published version | Creative Commons | 144.15 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.