Word Sense Disambiguation
Performance on head and tail of WSD
More is Not Always Better - We describe a set of experiments to analyze properties such as the volume, provenance, and balancing of training data in the framework of a state-of-the-art WSD system when evaluated on the SemEval-2013 English all-words dataset.
The role of unannotated data
(replication, demo) This paper presents a reproduction study of Yuan et al. (2016) using mostly openly available datasets (GigaWord, SemCor, OMSTI) and software (TensorFlow). The study shows that similar results can be obtained with much less data.
Other
- CornettoExportParser: Utilities to process the DebVisDic export of the Cornetto database
- ECB-parser
- Graph-based-WSD
- MFS_classifier: This repo contains scripts that attempt to remove the most-frequent-sense (MFS) bias from a WSD system.
- OpenDutchWordnet: This repo provides a Python module to work with Open Dutch WordNet. It was created using Python 3.4.
- PostmaVossenGWC2014: This repository provides the code to replicate the results from PostmaVossenGWC2014
- pwgc: Tool to load the Princeton WordNet Gloss Corpus
- SemanticOverfitting
- sval_systems
- svm_wsd: Word Sense Disambiguation system developed in the DutchSemCor project using Support Vector Machines. The input is plain text and the output is XML.
- vua-wsd-sem2015: System for the CLTL participation in SemEval-2015 Task 13: Multilingual All-Words Sense Disambiguation and Entity Linking
- WNEventTopicInspection
- WordNetMapper: This repo maps lexical keys, offsets, and ilidefs from one WordNet version to another (supported versions: "16", "17", "171", "20", "21", "30"). It uses the index.sense files from WordNet (http://wordnet.princeton.edu/) and the automatically generated mappings between WordNet offsets (http://nlp.lsi.upc.edu/tools/download-map.php).
- WordNetSimilarity: Programs and scripts that test performance of WordNet similarity measurements using different settings
- WordnetTools: Set of functions to work with a wordnet in WordNet-LMF format
- WordNet_Ambiguity: This repository provides a way to compute the WordNet ambiguity of a sequence of text.
- WSD-gold-standards-analysis: This module provides an IPython Notebook to analyze existing gold standards used to evaluate Word Sense Disambiguation.
- WSD_corpora
- WSD_error_analysis: This repo provides various ways to analyse system submissions to WSD competitions.
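
Several entries above refer to the most-frequent-sense (MFS) bias of WSD systems. As background, the MFS baseline itself can be sketched in a few lines. This is a minimal illustration only, not code from any of the repositories listed here: the function names `train_mfs` and `predict_mfs`, the sense labels, and the toy training data are all hypothetical.

```python
from collections import Counter, defaultdict


def train_mfs(annotated):
    """Build a lemma -> most frequent sense table from (lemma, sense) pairs."""
    counts = defaultdict(Counter)
    for lemma, sense in annotated:
        counts[lemma][sense] += 1
    # Keep only the single most frequent sense per lemma.
    return {lemma: c.most_common(1)[0][0] for lemma, c in counts.items()}


def predict_mfs(model, lemma, fallback=None):
    """Return the most frequent training sense, or `fallback` for unseen lemmas."""
    return model.get(lemma, fallback)


# Toy, made-up training data (hypothetical sense labels, not real WordNet keys).
train = [
    ("bank", "bank%finance"),
    ("bank", "bank%finance"),
    ("bank", "bank%river"),
    ("plant", "plant%factory"),
]

model = train_mfs(train)
print(predict_mfs(model, "bank"))
print(predict_mfs(model, "crane"))
```

A baseline like this always predicts the dominant sense, so every instance whose gold sense is a less frequent one is misclassified by construction.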