Scoring and Classifying Positive Interpretations
This repository contains the code for conducting the experiments as reported in the following paper:
C. van Son, R. Morante, L. Aroyo, and P. Vossen. Scoring and Classifying Implicit Positive Interpretations: A Challenge of Class Imbalance. In Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018), Santa Fe, New Mexico, 2018 (to appear).
The code scores and classifies the positive interpretations generated from verbal negations in OntoNotes, following the approach of, and evaluating on the dataset described in, the following paper:
E. Blanco and Z. Sarabi. Automatic generation and scoring of positive interpretations from negated statements. In Proceedings of NAACL-HLT, San Diego, CA, pages 1431–1441, 2016.
Requirements
The Jupyter notebooks in this repository have already been rendered, so the results can be inspected directly. Note, however, that running the code requires first obtaining the following data:
- OntoNotes 5.0 / OntoNotes 4.0
- CoNLL-2011 Shared Task distribution of OntoNotes
- Positive Interpretations dataset: please contact Eduardo Blanco or Zahra Sarabi to obtain this data
The code has been tested with Python 3.6 and needs the following packages:
- nltk
- pandas
- scipy
- numpy
- scikit-learn
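The required packages can be installed with pip (a minimal sketch; the repository does not pin package versions, so the latest releases compatible with Python 3.6 are assumed):

```shell
# Install the Python packages the notebooks depend on
pip install nltk pandas scipy numpy scikit-learn
```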
Content
The repository contains the following folders:
- `code`: contains helper scripts and 4 notebooks for running the experiments
- `data`: the required data (see above) should be placed here
- `data_analysis`: contains the results of the Data Analysis notebook
- `results`: contains the feature files used for training/testing, the predictions, and the summarizing tables/figures

The notebooks in the `code` folder are best run in the following order:
- 1-Data_Preparation.ipynb
- 2-Data_Analysis.ipynb
- 3-Replication_Experiment.ipynb
- 4-Error_Analysis.ipynb
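The notebooks can also be executed non-interactively in that order from the command line (a sketch; it assumes Jupyter with `nbconvert` is installed and the required data has been placed in the `data` folder):

```shell
# Run each notebook in sequence, saving the executed output in place
for nb in 1-Data_Preparation 2-Data_Analysis 3-Replication_Experiment 4-Error_Analysis; do
    jupyter nbconvert --to notebook --execute --inplace "code/${nb}.ipynb"
done
```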
Contact
Chantal van Son (c.m.van.son@vu.nl / c.m.van.son@gmail.com)
Vrije Universiteit Amsterdam