Italian in the Trenches: Linguistic Annotation and Analysis of Texts of the Great War
Irene De Felice
2018-01-01
Abstract
The paper illustrates the design and development of a textual corpus representative of the historical variants of Italian during the Great War, which was enriched with linguistic (lemmatization and POS tagging) and meta-linguistic annotation. After a manual revision of the linguistic annotation, the corpus was used to specialize existing NLP tools for processing historical texts, with promising results.
Files in this item:
File | Size | Format |
---|---|---|
DeFelice_etal_clic-it_2018.pdf (open access file; license: not specified) | 149.08 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.