Consonant-vowel structures in the GOS 1.0 corpus

Creator: Dobrovoljc, Kaja
Creator: Krek, Simon
Creator: Čibej, Jaka
Creator: Arhar Holdt, Špela
Date: 2020-02-19T11:09:29Z
dc.date.accessioned: 2021-07-24T21:27:43Z
dc.date.available: 2021-07-24T21:27:43Z
Identifier: http://hdl.handle.net/11356/1290
dc.identifier.uri: https://linghub.org/handle/123456789/924957
Description: The lists contain consonant-vowel structures of all lemmas, word forms, and normalized word forms in the GOS 1.0 Corpus of Spoken Slovene (http://hdl.handle.net/11356/1040). In each unit, the characters were converted as follows: C - consonant (in lists with fine-grained character categorization, consonants are divided into Z - sonorant, G - voiced obstruent, and K - voiceless obstruent), V - vowel, X - foreign consonant, Y - foreign vowel, S - symbol, P - punctuation, N - number, F - non-Latin-script character, ! - other. Each consonant-vowel structure is listed with its frequency in the corpus (i.e. the sum of the frequencies of all units corresponding to that structure) and with either the set of all its units (in the lists labeled "entire") or the set of its 30 most frequent units (in the lists labeled "short"), along with their part-of-speech categories and individual frequencies. The lists also give the number of unique units within each consonant-vowel structure. They were prepared from frequency lists extracted from GOS 1.0 using LIST (http://hdl.handle.net/11356/1276). A related resource, "Consonant-vowel structures in the Gigafida 2.0 corpus", is available at http://hdl.handle.net/11356/1289
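The conversion and aggregation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the letter classes (which characters count as sonorants, foreign vowels, and so on), the function names `cv_structure` and `build_cv_list`, and the `(unit, part of speech, frequency)` input rows are all assumptions made for illustration, and the published lists may classify individual characters differently.

```python
import unicodedata
from collections import defaultdict

# ASSUMPTION: illustrative letter classes, not taken from the resource itself.
VOWELS = set("aeiou")
SONORANTS = set("mnrlvj")
VOICED_OBSTRUENTS = set("bdgzž")
VOICELESS_OBSTRUENTS = set("ptksšcčfh")
FOREIGN_CONSONANTS = set("qwxćđ")
FOREIGN_VOWELS = set("yäöü")


def classify(ch, fine_grained=False):
    """Map one character to a category code from the description."""
    c = ch.lower()
    if c in VOWELS:
        return "V"
    if c in SONORANTS:
        return "Z" if fine_grained else "C"
    if c in VOICED_OBSTRUENTS:
        return "G" if fine_grained else "C"
    if c in VOICELESS_OBSTRUENTS:
        return "K" if fine_grained else "C"
    if c in FOREIGN_CONSONANTS:
        return "X"
    if c in FOREIGN_VOWELS:
        return "Y"
    if c.isdigit():
        return "N"
    category = unicodedata.category(ch)
    if category.startswith("P"):
        return "P"
    if category.startswith("S"):
        return "S"
    # ASSUMPTION: a letter whose Unicode name lacks "LATIN" is non-Latin script.
    if ch.isalpha() and "LATIN" not in unicodedata.name(ch, ""):
        return "F"
    return "!"


def cv_structure(unit, fine_grained=False):
    """Convert a lemma or word form to its consonant-vowel structure."""
    return "".join(classify(ch, fine_grained) for ch in unit)


def build_cv_list(freq_rows, fine_grained=False, top_n=None):
    """Group (unit, pos, freq) rows by structure.

    top_n=None keeps all units (as in the "entire" lists);
    top_n=30 keeps the 30 most frequent units (as in the "short" lists).
    """
    groups = defaultdict(list)
    for unit, pos, freq in freq_rows:
        groups[cv_structure(unit, fine_grained)].append((unit, pos, freq))

    result = {}
    for structure, units in groups.items():
        units.sort(key=lambda row: row[2], reverse=True)
        result[structure] = {
            "frequency": sum(freq for _, _, freq in units),
            "unique_units": len({unit for unit, _, _ in units}),
            "units": units if top_n is None else units[:top_n],
        }
    return result


# Hypothetical frequency rows, invented for this example only.
rows = [("hiša", "N", 10), ("roka", "N", 7), ("in", "C", 50)]
cv_list = build_cv_list(rows, top_n=30)
```

Under these assumed letter classes, "hiša" and "roka" share one structure, so their frequencies are summed into a single entry, while "in" forms an entry of its own.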
Publisher: Jožef Stefan Institute
Publisher: Centre for Language Resources and Technologies, University of Ljubljana
Rights: https://creativecommons.org/licenses/by-sa/4.0/
Rights: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Subject: spoken Slovene
Subject: consonant-vowel structures
Subject: sonorants
Subject: consonants
Subject: GOS
Subject: vowels
Subject: obstruents
Subject: frequency list
Title: Consonant-vowel structures in the GOS 1.0 corpus
Type: lexicalConceptualResource
Type: Text
dcterms.available: 2020-02-19T11:09:29Z
dcterms.bibliographicCitation: http://hdl.handle.net/11356/1290
dcterms.creator: Dobrovoljc, Kaja
dcterms.creator: Krek, Simon
dcterms.creator: Čibej, Jaka
dcterms.creator: Arhar Holdt, Špela
dcterms.date: 2020-02-19T11:09:29Z
dcterms.description: The lists contain consonant-vowel structures of all lemmas, word forms, and normalized word forms in the GOS 1.0 Corpus of Spoken Slovene (http://hdl.handle.net/11356/1040). In each unit, the characters were converted as follows: C - consonant (in lists with fine-grained character categorization, consonants are divided into Z - sonorant, G - voiced obstruent, and K - voiceless obstruent), V - vowel, X - foreign consonant, Y - foreign vowel, S - symbol, P - punctuation, N - number, F - non-Latin-script character, ! - other. Each consonant-vowel structure is listed with its frequency in the corpus (i.e. the sum of the frequencies of all units corresponding to that structure) and with either the set of all its units (in the lists labeled "entire") or the set of its 30 most frequent units (in the lists labeled "short"), along with their part-of-speech categories and individual frequencies. The lists also give the number of unique units within each consonant-vowel structure. They were prepared from frequency lists extracted from GOS 1.0 using LIST (http://hdl.handle.net/11356/1276). A related resource, "Consonant-vowel structures in the Gigafida 2.0 corpus", is available at http://hdl.handle.net/11356/1289
dcterms.identifier: http://hdl.handle.net/11356/1290
dcterms.isReplacedBy: http://hdl.handle.net/11356/1367
dcterms.publisher: Jožef Stefan Institute
dcterms.publisher: Centre for Language Resources and Technologies, University of Ljubljana
dcterms.rights: https://creativecommons.org/licenses/by-sa/4.0/
dcterms.rights: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
dcterms.subject: spoken Slovene
dcterms.subject: consonant-vowel structures
dcterms.subject: sonorants
dcterms.subject: consonants
dcterms.subject: GOS
dcterms.subject: vowels
dcterms.subject: obstruents
dcterms.subject: frequency list
dcterms.title: Consonant-vowel structures in the GOS 1.0 corpus
dcterms.type: lexicalConceptualResource
dcterms.type: Text
odrl.Policy: http://purl.org/net/rdflicense/cc-by-sa4.0


Files in this item

There are no files associated with this item.

This item appears in the following Collection(s)

  • OLAC
    Main data from the OLAC dataset

Copyright © 2020 All Rights Reserved by Prêt-à-LLOD Project.

Horizon 2020

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 825182.