Comparing BERT models with modified LSTMs
The initial implementation was done by:
- Adam Kapica
- Piotr Kramek

Resources for implementation:
- Pretrained GloVe word embeddings
- Data files to place under the data directory: https://drive.google.com/drive/folders/1dWAChGX-tV9eeJ9gldqEXKMXLk3J4gtf?usp=sharing
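
Since the LSTM baselines build on the pretrained GloVe vectors, here is a minimal sketch of reading them into an embedding matrix; the file name `glove.6B.100d.txt` and the `vocab` word-to-index mapping are placeholders, not names taken from this repo.

```python
# Sketch: load GloVe vectors into a matrix aligned with a vocabulary.
# Assumptions: a plain-text GloVe file and a dict vocab {word: index}.
import numpy as np

def load_glove(path, vocab, dim=100):
    # Words missing from the GloVe file keep a small random init.
    matrix = np.random.normal(scale=0.1, size=(len(vocab), dim))
    with open(path, encoding="utf-8") as f:
        for line in f:
            word, *values = line.rstrip().split(" ")
            if word in vocab:
                matrix[vocab[word]] = np.asarray(values, dtype=np.float32)
    return matrix

# e.g. embeddings = load_glove("glove.6B.100d.txt", vocab, dim=100)
```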

Task list:
- Get familiar with the code!
- Test and run LSTM
- Add loaders for all datasets (a `Dataset` sketch follows this list)
- Implement BERT (a loading sketch follows this list)
- Test and run BERT
- Write evaluation code for BERT
- Add DistilBERT
- Compare models
- Refactor code
- Analyze and document results
- Write report
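
For the dataset loaders, a hypothetical PyTorch `Dataset` wrapper for a text-classification split might look like the following; the tokenizer interface and field names are assumptions, since the actual dataset formats are not described here.

```python
# Sketch: a generic Dataset for (text, label) pairs. The tokenizer is
# assumed to follow the Hugging Face callable interface.
import torch
from torch.utils.data import Dataset

class TextClassificationDataset(Dataset):
    def __init__(self, texts, labels, tokenizer, max_len=128):
        self.texts, self.labels = texts, labels
        self.tokenizer, self.max_len = tokenizer, max_len

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, idx):
        # Tokenize one example and strip the batch dimension added by
        # return_tensors="pt".
        enc = self.tokenizer(self.texts[idx], truncation=True,
                             max_length=self.max_len, padding="max_length",
                             return_tensors="pt")
        item = {k: v.squeeze(0) for k, v in enc.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item
```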
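And for the BERT and DistilBERT steps, a minimal sketch of loading both models with the Hugging Face transformers library; the checkpoint names and the two-label head are assumptions, not necessarily this repo's configuration.

```python
# Sketch: load BERT and DistilBERT with a classification head and run a
# single forward pass as a smoke test.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

def load_model(name, num_labels=2):
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(
        name, num_labels=num_labels)
    return tokenizer, model

bert_tok, bert = load_model("bert-base-uncased")            # ~110M params
distil_tok, distil = load_model("distilbert-base-uncased")  # ~66M params

inputs = bert_tok("An example sentence.", return_tensors="pt")
with torch.no_grad():
    logits = bert(**inputs).logits  # shape: (1, num_labels)
```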

Approximate parameter counts:
- LSTM_Layer: ~250k
- LSTM_Single_Cell: ~100k
- LSTM_POS_Penn: ~4M
- LSTM_POS_Universal: ~1.5M
- BERT: ~110M
- DistilBERT: ~65M
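
One way to sanity-check the transformer counts above (PyTorch and Hugging Face assumed; the LSTM variants would use the same helper on this repo's own model classes):

```python
# Sketch: count trainable parameters for the pretrained checkpoints.
from transformers import AutoModel

def count_parameters(model):
    return sum(p.numel() for p in model.parameters())

for name in ("bert-base-uncased", "distilbert-base-uncased"):
    model = AutoModel.from_pretrained(name)
    print(f"{name}: ~{count_parameters(model) / 1e6:.0f}M parameters")
```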