Comments (2)
Hi,
Yes, pre-trained models like BERT and RoBERTa cannot be fine-tuned with sequence lengths longer than the maximum sequence length they were pre-trained on, as that would exceed the pre-trained positional embeddings. This is why the Hugging Face implementation prevents it and raises an error when you attempt to increase max-seq-length above 512. Since the current version of LTP is implemented on top of the RoBERTa model as its baseline, it has the same issue.
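For reference, this limit is visible directly in the checkpoint config; here is a minimal sketch using the standard transformers API (not LTP-specific code):

```python
# Inspect the positional-embedding limit of a pre-trained checkpoint.
from transformers import AutoConfig, AutoTokenizer

config = AutoConfig.from_pretrained("roberta-base")
tokenizer = AutoTokenizer.from_pretrained("roberta-base")

# RoBERTa reports 514 positions: 512 usable tokens plus 2 offset slots
# reserved for the padding index.
print(config.max_position_embeddings)  # -> 514
print(tokenizer.model_max_length)      # -> 512
```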
The sequence length of 1024 that you found in the paper (in Section A.2, probably?) was used to demonstrate the effect of long sequence lengths on processing latency and did not require a pre-trained checkpoint.
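To be concrete, here is a minimal sketch of how such a latency measurement can be done without a checkpoint (this is my reconstruction of the setup, not the paper's exact benchmark code):

```python
import time

import torch
from transformers import RobertaConfig, RobertaModel

# A randomly initialized RoBERTa can be given a longer position table,
# since no pre-trained weights need to be loaded.
config = RobertaConfig(max_position_embeddings=1026)  # 1024 tokens + 2 offset slots
model = RobertaModel(config).eval()

input_ids = torch.randint(0, config.vocab_size, (1, 1024))
start = time.perf_counter()
with torch.no_grad():
    model(input_ids)
print(f"{time.perf_counter() - start:.3f}s for a 1024-token forward pass")
```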
The workaround I would suggest is to find a checkpoint that has been trained with longer sequence lengths (there are models specialized in processing long documents) and extend/migrate the LTP implementation to that model class.
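One example of such a checkpoint (just an illustration on my part, not something LTP currently supports) is Longformer, which was pre-trained on 4096-token sequences:

```python
from transformers import AutoConfig

# Check a candidate checkpoint's position limit before migrating.
config = AutoConfig.from_pretrained("allenai/longformer-base-4096")
print(config.max_position_embeddings)  # -> 4098, i.e. well above 512
```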
Hope this helps answer your question.
Hi Sehoon,
Thanks for your reply. It's very helpful! I recently came across several works that address the 512-token length limitation, e.g., Longformer and BigBird, which reduce the memory requirement by modifying the attention mechanism. Those models are pre-trained with a max sequence length over 512. I'm wondering whether the LTP implementation can be extended/migrated to those models without additional pre-training.
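From what I can tell, the main interface difference any migration would have to handle is the sparse-attention inputs; for example, Longformer expects an extra global attention mask (a minimal sketch using the standard transformers API):

```python
import torch
from transformers import LongformerModel, LongformerTokenizer

name = "allenai/longformer-base-4096"
tokenizer = LongformerTokenizer.from_pretrained(name)
model = LongformerModel.from_pretrained(name).eval()

inputs = tokenizer("a long document ...", return_tensors="pt")
# Longformer combines sliding-window local attention with a handful of
# global tokens; here only the <s> token attends globally.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1
with torch.no_grad():
    outputs = model(**inputs, global_attention_mask=global_attention_mask)
```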
And another quick question about your token pruning implementation: is it possible to reconstruct the pruned tokens, as demonstrated in Figure 2 of the paper, and make the final pruning result interpretable for the downstream task? Thanks!
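To make the question concrete, something like the sketch below is what I have in mind; the keep_mask here is a random stand-in for the per-layer mask that LTP computes, since I don't know the actual variable names in your code:

```python
import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
input_ids = tokenizer("the movie was surprisingly good", return_tensors="pt")["input_ids"][0]

# Stand-in for the binary keep-mask produced by the pruning module at a
# given layer (True = token kept, False = token pruned).
keep_mask = torch.rand(input_ids.shape) > 0.5

tokens = tokenizer.convert_ids_to_tokens(input_ids.tolist())
kept = [tok for tok, keep in zip(tokens, keep_mask.tolist()) if keep]
print(kept)  # surviving tokens, mapped back to readable strings
```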