ExplainaBoard-experiments

For keeping track of experiments done with ExplainaBoard.

Summary of scripts

src/process_wmt21reports.py: organizes the JSON reports from ExplainaBoard into data points with metrics, and also calculates URIEL distances.
src/process_wmt21train.py: organizes the training data from WMT21 into data points with data size, type-token ratio (TTR), TTR distance, and subword tokenization (subword tokenization is not implemented yet).
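
As a rough illustration of the type-token ratio feature mentioned above, here is a minimal sketch. It is not the code in src/process_wmt21train.py: the function names, the whitespace tokenization, and the definition of TTR distance as an absolute difference are all assumptions made for the example.

```python
def type_token_ratio(tokens):
    """Return (number of distinct tokens) / (total tokens); 0.0 for empty input."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def ttr_distance(ttr_a, ttr_b):
    """Assumed definition: absolute difference between two TTR values."""
    return abs(ttr_a - ttr_b)


# Toy sentences with whitespace tokenization (an assumption, not the WMT21 setup).
source = "the cat sat on the mat".split()
target = "die Katze sass auf der Matte".split()

src_ttr = type_token_ratio(source)  # 5 distinct tokens out of 6
tgt_ttr = type_token_ratio(target)  # 6 distinct tokens out of 6
print(src_ttr, tgt_ttr, ttr_distance(src_ttr, tgt_ttr))
```

On real training data the ratio depends strongly on corpus size, so comparing TTRs across language pairs probably only makes sense for samples of similar size.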
src/linear_regression.py: helper functions for linear regression.
- builds regression pipelines for simple or polynomial regression (basis expansion could also be supported, but is not added yet),
- trains pipelines using bootstrapping,
- gets feature importances,
- prints results (MSE and R2).
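
The steps listed above (fit, bootstrap, inspect coefficients, report MSE and R2) can be sketched in a dependency-free way for a single feature. This is only an illustration on synthetic data. The helper names `fit_line`, `mse_and_r2`, and `bootstrap_slopes` are hypothetical and are not the functions in src/linear_regression.py, and the spread of the bootstrapped slope stands in for a feature importance.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse_and_r2(xs, ys, slope, intercept):
    """Return (mean squared error, coefficient of determination)."""
    preds = [slope * x + intercept for x in xs]
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return ss_res / len(ys), 1.0 - ss_res / ss_tot


def bootstrap_slopes(xs, ys, n_rounds=200, seed=0):
    """Refit on samples drawn with replacement and collect the slopes."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # skip degenerate resamples with a single x value
        slopes.append(fit_line(bx, by)[0])
    return slopes


# Synthetic data: y = 2x + 1 plus bounded noise (made up for the example).
rng = random.Random(42)
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + rng.uniform(-1.0, 1.0) for x in xs]

slope, intercept = fit_line(xs, ys)
mse, r2 = mse_and_r2(xs, ys, slope, intercept)
slopes = bootstrap_slopes(xs, ys)
mean_slope = sum(slopes) / len(slopes)

print(f"slope={slope:.3f} intercept={intercept:.3f}")
print(f"MSE={mse:.3f} R2={r2:.3f}")
print(f"bootstrap slope range: {min(slopes):.3f} to {max(slopes):.3f}")
```

A wide range of bootstrapped slopes would suggest the coefficient is not stable across resamples, which is worth checking before reading it as an importance.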
src/generate_reports.sh: generates ExplainaBoard reports from a given input directory.
sample_data: for now, contains a pkl of a data frame to be used for regression-related models.
notebooks/linear-regression-analysis.ipynb: notebook containing the results/plots/analysis so far for regression models. Mostly linear regression was explored, but I also tried SVM and GPR as non-linear examples.

feedback:

add input/output data and a problem statement to the description
reconsider the performance metrics (R2/MSE): what would be a reasonable performance to expect for these models? how should these performances be baselined? (refer to the LangRank paper)
consider comparing predicting the metric for each data point vs. predicting the mean value of the metric
correlation per bucketed feature?
double-check the data processing
try other models like XGBoost
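
One way to act on the baseline suggestion above is to score a model against a predictor that always outputs the mean of the metric. The sketch below uses made-up numbers and hypothetical helper names. In a real comparison the mean should come from the training split, not from the evaluation data as done here for brevity.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination, relative to predicting the mean of y_true."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up metric values (e.g. BLEU-like scores) and made-up model predictions.
y_true = [20.0, 25.0, 30.0, 35.0, 40.0]
model_pred = [22.0, 24.0, 31.0, 33.0, 41.0]

# Baseline: always predict the mean of the metric.
mean_value = sum(y_true) / len(y_true)
baseline_pred = [mean_value] * len(y_true)

baseline_mse = mse(y_true, baseline_pred)
model_mse = mse(y_true, model_pred)
baseline_r2 = r2(y_true, baseline_pred)
model_r2 = r2(y_true, model_pred)

print(f"baseline: MSE={baseline_mse:.2f} R2={baseline_r2:.3f}")
print(f"model:    MSE={model_mse:.2f} R2={model_r2:.3f}")
```

Since R2 is defined relative to the mean predictor, an R2 near zero means the model is doing no better than this baseline.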

leftover still to do:

beyond predicting bleu/mover_score/etc. from uriel/input data/sys output/reports, consider the following analyses:

system-by-system analysis: for which systems/buckets did a system do better or worse than expected?
which features of the language are correlated with over-/under-performance on particular phenomena?
does a system perform better on one metric than on another? metric vs. metric analysis
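
A possible starting point for the system-by-system analysis above is to look at residuals (actual score minus predicted score) per system and bucket. The sketch below is only an illustration: the records, system names, bucket names, and the `mean_residuals` helper are all made up.

```python
from collections import defaultdict

# (system, bucket, actual score, predicted score); values are made up.
records = [
    ("sys_a", "short", 30.0, 28.0),
    ("sys_a", "long", 22.0, 25.0),
    ("sys_b", "short", 27.0, 28.0),
    ("sys_b", "long", 26.0, 25.0),
]


def mean_residuals(rows):
    """Average (actual - predicted) for every (system, bucket) pair."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for system, bucket, actual, predicted in rows:
        key = (system, bucket)
        sums[key] += actual - predicted
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


residuals = mean_residuals(records)
for (system, bucket), value in sorted(residuals.items()):
    if value > 0:
        verdict = "better than expected"
    elif value < 0:
        verdict = "worse than expected"
    else:
        verdict = "as expected"
    print(f"{system} / {bucket}: {value:+.1f} ({verdict})")
```

The same residuals could then be correlated with language features to look at over- and under-performance on particular phenomena.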