dc.date.accessioned: 2022-01-21T22:27:04Z
dc.date.available: 2022-01-21T22:27:04Z
Identifier: ELRA-W0022
dc.identifier.uri: http://metashare.elda.org/repository/browse/1e401bfade6e11e2b1e400259011f6ea341de6811cf44c1b8fd0436535c6d1ca/
dc.identifier.uri: http://linghub.org/handle/123456789/1127128
Description: The ILSP/ELEFTHEROTYPIA Corpus contains approximately 3 million words, classified and annotated according to the common core PAROLE encoding standard. Each file is classified by Medium, Topic and Genre, and structurally annotated at paragraph level (CES Level 1). The corpus is distributed as SGML files. The texts come from the Greek daily newspaper ELEFTHEROTYPIA. A subset of the corpus (250,000 words) is morphosyntactically tagged; all words in it are also lemmatised and checked. The morphosyntactic annotation followed a four-step procedure: automatic morphosyntactic annotation, automatic disambiguation, manual disambiguation and checking, and conversion to meet the PAROLE format requirements. In certain texts, some passages are written in "katharevoussa", an older version of Greek; these passages are marked as "distinct" and have not been morphosyntactically annotated. The tagset used for the morphological annotation of the corpus is presented in the "Addendum to TA - Encoding features and values for the morphological layer in the lexicon Merged Tags" (P-WP1.1.-MEMO-ERLI-5). More information about the PAROLE project: http://www.elda.org/catalogue/fr/text/doc/parole.html
Language: gre/ell
dc.language.iso: ell
Rights: ELRA_END_USER
Source: META-SHARE
Title: ILSP/ELEFTHEROTYPIA Corpus (Greek corpus)
Type: corpus
dcterms.created: 2005-05-12
dcterms.hasVersion: 1.0
dcterms.issued: 2014-09-23
dcterms.license: http://www.elra.info/IMG/pdf_ENDUSER_140312.pdf
dcterms.modified: 2004-05-12
rdf.type: http://purl.org/net/def/metashare#LanguageResource
rdf.type: http://www.w3.org/ns/dcat#Dataset



Copyright © 2020 All Rights Reserved by Prêt-à-LLOD Project.

Horizon 2020

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 825182.