
A Small Dataset for English-to-Czech Speech Translation in the Travel Domain

Creator: Cífka, Ondřej
Creator: Bojar, Ondřej
Date: 2016-06-14T15:45:50Z
dc.date.accessioned: 2021-07-25T12:06:57Z
dc.date.available: 2021-07-25T12:06:57Z
Identifier: http://hdl.handle.net/11234/1-1735
dc.identifier.uri: https://linghub.org/handle/123456789/1042796
Description: This small dataset contains 3 speech corpora collected using the Alex Translate telephone service (https://ufal.mff.cuni.cz/alex#alex-translate). The "part1" and "part2" corpora contain English speech with transcriptions and Czech translations. These recordings were collected from users of the service. Part 1 contains earlier recordings, filtered to include only clean speech; Part 2 contains later recordings with no filtering applied. The "cstest" corpus contains recordings of artificially created sentences, each containing one or more Czech names of places in the Czech Republic. These were recorded by a multinational group of students studying in Prague.
Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Rights: http://creativecommons.org/licenses/by-sa/4.0/
Rights: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Subject: machine translation
Subject: ASR
Subject: speech corpus
Title: A Small Dataset for English-to-Czech Speech Translation in the Travel Domain
Type: corpus
Type: Text
dcterms.available: 2016-06-14T15:45:50Z
dcterms.bibliographicCitation: http://hdl.handle.net/11234/1-1735
dcterms.creator: Cífka, Ondřej
dcterms.creator: Bojar, Ondřej
dcterms.date: 2016-06-14T15:45:50Z
dcterms.description: This small dataset contains 3 speech corpora collected using the Alex Translate telephone service (https://ufal.mff.cuni.cz/alex#alex-translate). The "part1" and "part2" corpora contain English speech with transcriptions and Czech translations. These recordings were collected from users of the service. Part 1 contains earlier recordings, filtered to include only clean speech; Part 2 contains later recordings with no filtering applied. The "cstest" corpus contains recordings of artificially created sentences, each containing one or more Czech names of places in the Czech Republic. These were recorded by a multinational group of students studying in Prague.
dcterms.identifier: http://hdl.handle.net/11234/1-1735
dcterms.publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
dcterms.rights: http://creativecommons.org/licenses/by-sa/4.0/
dcterms.rights: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
dcterms.subject: machine translation
dcterms.subject: ASR
dcterms.subject: speech corpus
dcterms.title: A Small Dataset for English-to-Czech Speech Translation in the Travel Domain
dcterms.type: corpus
dcterms.type: Text
odrl.Policy: http://purl.org/net/rdflicense/cc-by-sa4.0



This item appears in the following Collection(s)

  • OLAC
    Main data from the OLAC dataset




Horizon 2020

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 825182.