Comments (1)
Maybe you need to pad ids, mask, and token_type_ids as below?
    padding_length = self.max_len - len(ids)
    ids = ids + ([0]*padding_length)
    mask = mask + ([0]*padding_length)
    token_type_ids = token_type_ids + ([0]*padding_length)
I noticed that this small portion is not in your code but is in Abhishek's.
import torch

# NOTE: `tokenizer` and `max_len` are not defined in this snippet; they are
# assumed to be module-level names available when the class is instantiated.

class BERTDataset:
    def __init__(self, review, target):
        self.review = review
        self.target = target
        self.tokenizer = tokenizer
        self.max_len = max_len

    def __len__(self):
        return len(self.review)

    def __getitem__(self, item):
        # Collapse runs of whitespace in the review text.
        review = str(self.review[item])
        review = " ".join(review.split())

        # With a single input, padding=True pads only to the longest sequence
        # in the call, so the result is not padded up to max_length here.
        tokenized_inputs = self.tokenizer.encode_plus(
            review,
            None,
            add_special_tokens=True,
            max_length=self.max_len,
            padding=True,
            truncation=True
        )

        ids = tokenized_inputs["input_ids"]
        mask = tokenized_inputs["attention_mask"]
        token_type_ids = tokenized_inputs["token_type_ids"]

        # Right-pad all three lists with zeros so every item has length max_len.
        padding_length = self.max_len - len(ids)
        ids = ids + ([0]*padding_length)
        mask = mask + ([0]*padding_length)
        token_type_ids = token_type_ids + ([0]*padding_length)

        return {
            "ids": torch.tensor(ids, dtype=torch.long),
            "mask": torch.tensor(mask, dtype=torch.long),
            "token_type_ids": torch.tensor(token_type_ids, dtype=torch.long),
            "targets": torch.tensor(self.target[item], dtype=torch.float),
        }
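To see what the padding step does on its own, here is a minimal sketch in plain Python. The helper name `pad_encoding`, the `pad_id` parameter, and the sample token ids are all made up for illustration and do not come from the repo. It assumes the pad token id is 0, as the snippet above does. Without this step, reviews of different lengths produce tensors of different lengths, which the default DataLoader collation cannot stack into a batch.

```python
def pad_encoding(ids, mask, token_type_ids, max_len, pad_id=0):
    """Right-pad three parallel lists to exactly max_len entries.

    Hypothetical helper: the name and signature are illustrative only.
    Assumes the inputs were already truncated to at most max_len.
    """
    if len(ids) > max_len:
        raise ValueError("input is longer than max_len; truncate first")
    padding_length = max_len - len(ids)
    return (
        ids + [pad_id] * padding_length,          # pad token ids
        mask + [0] * padding_length,              # 0 = ignore this position
        token_type_ids + [0] * padding_length,    # single-segment input
    )


# Illustrative values only: three tokens padded out to length 6.
ids, mask, token_type_ids = pad_encoding(
    [101, 2023, 102], [1, 1, 1], [0, 0, 0], max_len=6
)
print(ids)   # [101, 2023, 102, 0, 0, 0]
print(mask)  # [1, 1, 1, 0, 0, 0]
```

As an alternative to padding by hand, recent versions of the Hugging Face tokenizers accept `padding="max_length"` together with `max_length`, which should pad a single sequence to the full length. Whether that works in your setup depends on the installed `transformers` version.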
from bert-sentiment.