Keras '2.0.8'
Hyperopt '0.1'
- Download th data from competition Codalab.
https://competitions.codalab.org/competitions/19938
- Download 300d word vectors pre-trained on Google News corpus.
https://code.google.com/archive/p/word2vec/
- Change filepaths for data and for word embeddings if needed:
in help_prep_functions.py
in loadW2vModel()
function insert filepath for word embeddings
in preprocessing_tweets.py
and preprocessing_reddit.py
change filepaths for data if needed.
- Choose features option:
In prep_pipeline.py
on line 98:
def main(data ='RumEval2019', feats = 'SemEvalfeatures')
feats can be either text
for avgw2v representation of the tweets or SemEvalfeatures
for additional extra features concatenated with avgw2v.
- Download necessary nltk packages, if they have not been downloaded.
In Python interactive interpreter, input the following commands:
import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
- Run preprocessing script
python prep_pipeline.py
The description of the model architecture can be found in https://www.aclweb.org/anthology/S/S17/S17-2083.pdf The features used in this code are different to the ones used in the paper.
-
In
outer_semeval2019.py
you can choose the number of trials that the search algorithm performs while searching for the parameter combination. -
In
parameter_search.py
you can define search_space. -
Make sure
output/
folder has been created
mkdir output
- Run the baseline
python outer_semeval2019.py
If you have any questions feel free to contact me [email protected] or other task organisers [email protected]