Corpus Taurinense (CT)
The Corpus Taurinense (CT) is a corpus of Old Italian (more specifically, 13th-century Florentine) comprising 259,299 tokens (21,087 types and 7,599 lemmata). It is fully lemmatized, POS-tagged, and disambiguated, and is marked up for text structure, literary genre, and philological forms. The CT has a long history and is the first corpus we planned. In fact, it was this project that first aroused Manuel Barbera's interest in corpus linguistics and NLP, cemented his partnership with Carla Marello, and eventually set in motion the train of events that brought into existence bmanuel.org, the computational group associated with it and with Turin University, and corpora.unito.it, the hub for distributing linguistic resources. The CT was conceived by Barbera and Marello on the night of March 14th, 1998, in Padua during a meeting of ItalAnt, and was born in Stuttgart on April 29th, 2000, when the first working demo ("ANT4") was ready for querying (the midwives were Arne Fitschen, Manuel Barbera, and Ulrich Heid).
- Other metadata
- author: Barbera, Manuel
- author: Marello, Carla
- author: Tomatis, Marco
- publisher: Barbera, Manuel
- publisher: Università degli Studi di Torino
Submitted by Fernando Martínez de Carnero on 30/11/2015 in the project Strumenti e tecnologie per insegnare le lingue; last updated 04/12/2015.