If we use SST-2 accuracy as the yardstick, DeBERTa takes the crown, going by the numbers each model reports in its own paper.
Super Hero | SST-2 Acc (%) |
---|---|
ALBERT | 95.2 / 97.1 |
BART | 96.6 |
BERT | 94.9 |
BORT | 96.2 |
ConvBERT | 95.7 |
DeBERTa | 96.8 / 97.5 |
DistilBERT | 91.3 |
ELECTRA | 96.9 |
Funnel Transformer | 95.0 |
FLAVA | 90.94 |
FNet | 95 |
GPT | 92 |
I-BERT | 96.4 |
MobileBERT | 92.8 |
MPNet | 95.5 / 96 |
Nystromformer | 91.4 |
RoBERTa | 96.5 |
SqueezeBERT | 92.2 |
XLNet | 94.4 |
YOSO | 92.3 |
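
To make the ranking easy to check, here is a minimal Python sketch that sorts the table by accuracy. The numbers are copied straight from the table. The helper name `rank_by_best` is made up for illustration, and using the higher figure when a model lists two is an assumption, since the table does not say what the two numbers stand for.

```python
# SST-2 accuracies copied from the table above.
# Models with two reported numbers keep both values.
sst2_acc = {
    "ALBERT": [95.2, 97.1],
    "BART": [96.6],
    "BERT": [94.9],
    "BORT": [96.2],
    "ConvBERT": [95.7],
    "DeBERTa": [96.8, 97.5],
    "DistilBERT": [91.3],
    "ELECTRA": [96.9],
    "Funnel Transformer": [95.0],
    "FLAVA": [90.94],
    "FNet": [95],
    "GPT": [92],
    "I-BERT": [96.4],
    "MobileBERT": [92.8],
    "MPNet": [95.5, 96],
    "Nystromformer": [91.4],
    "RoBERTa": [96.5],
    "SqueezeBERT": [92.2],
    "XLNet": [94.4],
    "YOSO": [92.3],
}


def rank_by_best(scores):
    """Return (model, best_accuracy) pairs, highest accuracy first.

    Assumption: when a model has two numbers, the higher one is used.
    """
    best = {name: max(values) for name, values in scores.items()}
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


ranking = rank_by_best(sst2_acc)
for name, acc in ranking[:3]:
    print(f"{name}: {acc}")
```

Under that assumption the top three are DeBERTa (97.5), ALBERT (97.1), and ELECTRA (96.9). If you compare only the first number of each pair instead, ELECTRA (96.9) edges out DeBERTa (96.8), so the crown depends on which reported figure you count.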