
dc.date.accessioned: 2021-07-19T17:05:01Z
dc.date.available: 2021-07-19T17:05:01Z
dc.identifier.uri: https://datahub.ckan.io/dataset/901d9642-880f-44c3-b5a1-688e2b0d5752
dc.identifier.uri: https://linghub.org/handle/123456789/561790
dc.language.iso: deu
dcterms.description: This corpus contains a conversion of Wikipedia abstracts in six languages (Dutch, English, French, German, Italian and Spanish) into the NLP Interchange Format (NIF). The corpus contains the abstract texts, as well as the position, surface form and linked article of all links in the text. As such, it contains entity mentions manually disambiguated to Wikipedia/DBpedia resources by native speakers, which makes it well suited for NER training and evaluation. Furthermore, the abstracts represent a special form of text that lends itself to more sophisticated tasks, such as open relation extraction. Their encyclopedic style, which follows Wikipedia guidelines on opening paragraphs, adds further interesting properties. The first sentence puts the article in a broader context. Most anaphors refer to the original topic of the text, making them easier to resolve. Finally, should the same string occur with different meanings, Wikipedia guidelines suggest that the new meaning should again be linked for disambiguation. In short, this type of text is highly interesting.
dcterms.identifier: 901d9642-880f-44c3-b5a1-688e2b0d5752
dcterms.issued: 2016-01-18 22:08:58.014152
dcterms.language: http://lexvo.org/id/iso639-3/deu
dcterms.modified: 2016-01-22 10:14:02.315000
dcterms.publisher: https://datahub.ckan.io/organization/58f089dd-98c3-4de7-ba0f-e4c06680bf27
dcterms.title: DBpedia abstract German corpus
rdf.type: http://www.w3.org/ns/dcat#Dataset
dcat.contactPoint: Nc3212615d9d24abdb649c059d249ae6a
dcat.distribution: https://datahub.ckan.io/dataset/901d9642-880f-44c3-b5a1-688e2b0d5752/resource/08c9e1f2-7c75-4ed0-9d0a-04c2d4cce5b4
dcat.landingPage: https://datahub.io/dataset/dbpedia-abstract-corpus
dcat.keyword: ner
dcat.keyword: freme project
dcat.keyword: dbpedia
dcat.keyword: rdf
dcat.keyword: linguistics
dcat.keyword: nif
dcat.keyword: llod
dcat.keyword: corpus


Files in this item

There are no files associated with this item.


Copyright © 2020 All Rights Reserved by Prêt-à-LLOD Project.

Horizon 2020

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 825182.