Croatian corpus of non‐professional written language by typical speakers and speakers with language disorders RAPUT 1.0

Creator: Kologranić Belić, Lana
Creator: Kuvač Kraljević, Jelena
Creator: Štefanec, Vanja
Creator: Hržica, Gordana
Creator: Ljubešić, Nikola
Date: 2021-07-02T08:32:23Z
dc.date.accessioned: 2021-07-25T11:58:25Z
dc.date.available: 2021-07-25T11:58:25Z
Identifier: http://hdl.handle.net/11356/1435
dc.identifier.uri: https://linghub.org/handle/123456789/1041381
Description: The corpus consists of texts produced by non-professional typical speakers and speakers with different language disorders (developmental language disorder, dyslexia, traumatic brain injury, aphasia, other). Roughly half of the corpus consists of texts by typical speakers, and the other half of texts by speakers with language disorders. Language samples were elicited by six groups of tasks representing different writing styles (descriptive, expository, narrative, and letter) and different levels of formality. The corpus has been manually annotated for normalized forms, lemmas, morphosyntactic information (following the MULTEXT-East tagset), and type of error (phonological segmentation, orthography, non-standard spelling, typo, syntax, etc.). The UD morphosyntactic description was for the most part generated automatically from the MULTEXT-East morphosyntactic information.
Publisher: Jožef Stefan Institute
Publisher: Faculty of Education and Rehabilitation, University of Zagreb
Rights: https://creativecommons.org/licenses/by-sa/4.0/
Rights: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Subject: non-professional written language
Subject: speakers with language disorders
Subject: typical speakers
Title: Croatian corpus of non‐professional written language by typical speakers and speakers with language disorders RAPUT 1.0
Type: corpus
Type: Text
dcterms.available: 2021-07-02T08:32:23Z
dcterms.bibliographicCitation: http://hdl.handle.net/11356/1435
dcterms.creator: Kologranić Belić, Lana
dcterms.creator: Kuvač Kraljević, Jelena
dcterms.creator: Štefanec, Vanja
dcterms.creator: Hržica, Gordana
dcterms.creator: Ljubešić, Nikola
dcterms.date: 2021-07-02T08:32:23Z
dcterms.description: The corpus consists of texts produced by non-professional typical speakers and speakers with different language disorders (developmental language disorder, dyslexia, traumatic brain injury, aphasia, other). Roughly half of the corpus consists of texts by typical speakers, and the other half of texts by speakers with language disorders. Language samples were elicited by six groups of tasks representing different writing styles (descriptive, expository, narrative, and letter) and different levels of formality. The corpus has been manually annotated for normalized forms, lemmas, morphosyntactic information (following the MULTEXT-East tagset), and type of error (phonological segmentation, orthography, non-standard spelling, typo, syntax, etc.). The UD morphosyntactic description was for the most part generated automatically from the MULTEXT-East morphosyntactic information.
dcterms.identifier: http://hdl.handle.net/11356/1435
dcterms.publisher: Jožef Stefan Institute
dcterms.publisher: Faculty of Education and Rehabilitation, University of Zagreb
dcterms.rights: https://creativecommons.org/licenses/by-sa/4.0/
dcterms.rights: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
dcterms.subject: non-professional written language
dcterms.subject: speakers with language disorders
dcterms.subject: typical speakers
dcterms.title: Croatian corpus of non‐professional written language by typical speakers and speakers with language disorders RAPUT 1.0
dcterms.type: corpus
dcterms.type: Text
odrl.Policy: http://purl.org/net/rdflicense/cc-by-sa4.0


Files in this item

There are no files associated with this item.

This item appears in the following Collection(s)

  • OLAC
    Main data from the OLAC dataset


Copyright © 2020 All Rights Reserved by Prêt-à-LLOD Project.

Horizon 2020

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 825182.