Corpora, Lexica and Ontologies
Wordnet
- Open Dutch Wordnet
Vaccination debate
- Vaccination corpus: Corpus from online sources debating vaccination, annotated with attribution relations, claims, events, and sentiments. Also annotated with propositional alignment relations.
Historical data
- The clariah-vocab-conversion repository was created as part of the CLARIAH project. It contains several historical and contemporary lexicons for Dutch that have been converted to one common format (LEMON-RDF), as well as code to convert lexicons and vocabularies to RDF.
- Mining-Ministers: Data and scripts used for investigating Fred van Lieburg's corpus.
Biographical data
- BiographicalDataModels
- BiographyNet: NLP tools and data used in BiographyNet
Event detection, (co)reference and identity
- ECB+: An extension of the Event Coreference Bank to increase referential ambiguity
- Event Storyline corpus: An extension of ECB+ with narrative structures to create storylines
- Circumstantial Event Ontology and Corpus: An ontology with pre-conditions and post-conditions for events which reflect circumstantial causal relations, and an extension of ECB+ annotated with these relations
- GunViolenceCorpus: Corpus with referential data on incidents, exhibiting extreme variation and ambiguity
- OldBailey: Processing the Old Bailey data to create Linked Open Data (LOD)
- CorpusComparison: Data for van Son et al. (2018), "Resource Interoperability for Sustainable Benchmarking: The Case of Events"
- MWEP
- MWEP on one incident
Dutch Framenet
- Spinoza-Dutch-Framenet-Corpus: SoNaR PropBank annotation extended with FrameNet 1.7 frames and elements
- Dutch Framenet Lexicon
Image description
- DutchDescriptions: Dutch descriptions for the Flickr30K validation and test data, plus a cross-lingual comparison tool.
Word-sense disambiguation
Discourse
- content-types
- ContentZones: Annotated news data with document content zones
Generic resources
- vua-resources: Lexical resources that are used for semantic parsing by the CLTL modules