NLU 2023 Course Project
- Data should be data/train.jsonl and data/test.jsonl
  - Each line should be {"text_a": "sentence A.", "text_b": "sentence B.", "label": "entailment"}; lines in the test dataset have no label
- See report at Report.pdf and latex/
- Run check_length.py and determine the proper max_length for the tokenizer
- Run preprocess.py to tokenize the dataset
- Run nli.py
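
To illustrate the data format and the max_length decision, here is a minimal sketch. It is not the contents of check_length.py, which are not shown here. The function names (read_jsonl, pair_lengths, length_at_coverage) and the extra_tokens parameter are hypothetical, and whitespace splitting is only a stand-in for the real tokenizer, so pass your own tokenize function to get meaningful numbers.

```python
import json
import math
from typing import Callable, Iterable, List


def read_jsonl(lines: Iterable[str], require_label: bool = True) -> List[dict]:
    """Parse JSONL lines and check that the expected keys are present."""
    required = {"text_a", "text_b"}
    if require_label:
        required.add("label")  # the test set has no label
    records = []
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        missing = required - record.keys()
        if missing:
            raise ValueError(f"line {line_number}: missing keys {sorted(missing)}")
        records.append(record)
    return records


def pair_lengths(
    records: List[dict],
    tokenize: Callable[[str], list] = str.split,  # stand-in, not the real tokenizer
    extra_tokens: int = 0,  # hypothetical allowance for special tokens
) -> List[int]:
    """Token count of each sentence pair under the given tokenize function."""
    return [
        len(tokenize(r["text_a"])) + len(tokenize(r["text_b"])) + extra_tokens
        for r in records
    ]


def length_at_coverage(lengths: List[int], coverage: float = 0.99) -> int:
    """Smallest length that covers at least `coverage` of the examples."""
    if not lengths:
        raise ValueError("no lengths given")
    ordered = sorted(lengths)
    index = max(math.ceil(coverage * len(ordered)) - 1, 0)
    return ordered[index]
```

One possible use is to open data/train.jsonl, pass the file object to read_jsonl, and pick a max_length at or slightly above length_at_coverage(pair_lengths(records)). For data/test.jsonl, call read_jsonl with require_label=False.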