keitakurita / practical_nlp_in_pytorch

A repository containing tutorials for practical NLP using PyTorch
In the bert_with_fastai notebook, would you know how to do a single text prediction?
Thanks
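Not an official answer, but a minimal sketch of how single-item prediction usually works with a fastai v1 Learner; it assumes `learn` is the trained Learner from the notebook, and the input string is hypothetical:

```python
# Hedged sketch: fastai v1's Learner.predict runs the databunch's processors
# (tokenizer, numericalizer) on one raw string and returns the predicted
# class, its index, and the per-class probabilities.
text = "This is a single input sentence."   # hypothetical input
pred_class, pred_idx, probs = learn.predict(text)
print(pred_class, probs)
```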
I think it would be a tiny bit faster if you allocated the whole memory for h_t and c_t from the beginning in OptimizedLSTM, i.e.:

```python
if init_states is None:
    h_t, c_t = (torch.zeros(bs, self.hidden_size).to(x.device),
                torch.zeros(bs, self.hidden_size).to(x.device))
```

instead of:

```python
if init_states is None:
    h_t, c_t = (torch.zeros(self.hidden_size).to(x.device),
                torch.zeros(self.hidden_size).to(x.device))
```
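For what it's worth, a small self-contained check that the two initializations are numerically equivalent, so the proposed change is purely about avoiding the per-step broadcast (`bs` stands for batch size, as in the notebook):

```python
import torch

bs, hidden_size = 4, 8
x = torch.randn(bs, hidden_size)
h_small = torch.zeros(hidden_size)      # (hidden_size,): broadcast on every use
h_full = torch.zeros(bs, hidden_size)   # (bs, hidden_size): allocated once up front
assert torch.equal(x + h_small, x + h_full)  # same result, different cost
```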
Hi, very helpful repo, I learned a lot from it.

I have a question about an implementation detail in TransformerXL. In the transformer_xl_from_scratch notebook, the memory length during validation is calculated as `val_memory_length + train_bptt - val_bptt`. Why isn't it just set to `val_memory_length`?

Looking forward to your reply.
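One possible reading, offered as an assumption rather than a confirmed answer: the adjustment keeps the total attention context (memory plus the BPTT window) the same at validation time as during training. With hypothetical numbers:

```python
# Hypothetical values, just to illustrate the arithmetic:
train_bptt, val_bptt = 128, 64
val_memory_length = 256
adjusted = val_memory_length + train_bptt - val_bptt   # 320

# memory + bptt stays constant across training and validation:
assert adjusted + val_bptt == val_memory_length + train_bptt   # 384 == 384
```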
In /deep_dives/lstm_from_scratch.ipynb:

```python
class NaiveLSTM(nn.Module):
    ...
    def forward(self, x: torch.Tensor,
                init_states: Optional[Tuple[torch.Tensor]]=None
                ) -> Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]]:
        """Assumes x is of shape (batch, sequence, feature)"""
        ...
```

Shouldn't `init_states` be:

```python
init_states: Optional[Tuple[torch.Tensor, torch.Tensor]] = None
```

instead of just:

```python
init_states: Optional[Tuple[torch.Tensor]] = None
```

because `init_states` needs exactly two values to unpack into `h_t, c_t` in:

```python
...
else:
    h_t, c_t = init_states
```
I need to get the predicted class category for classification. Can you please tell me the required changes in the code? I am a beginner.
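Not from the repo, but as a general sketch: for a single-label classifier, the predicted category is the argmax over the model's output logits, mapped back through the label list. The names here (`logits`, `label_names`) are hypothetical:

```python
import torch

logits = torch.tensor([[1.2, -0.3, 0.5]])        # one example, 3 classes
label_names = ["class_a", "class_b", "class_c"]  # hypothetical label list
pred_idx = logits.argmax(dim=-1).item()          # index of the top class
print(label_names[pred_idx])                     # -> "class_a"
```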
From the transformer implementation here:

```python
class DecoderBlock(nn.Module):
    level = TensorLoggingLevels.enc_dec_block

    def __init__(self, d_model=512, d_feature=64,
                 d_ff=2048, n_heads=8, dropout=0.1):
        super().__init__()
        self.masked_attn_head = MultiHeadAttention(d_model, d_feature, n_heads, dropout)
        self.attn_head = MultiHeadAttention(d_model, d_feature, n_heads, dropout)
        self.position_wise_feed_forward = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.ReLU(),
            nn.Linear(d_ff, d_model),
        )
        self.layer_norm1 = LayerNorm(d_model)
        self.layer_norm2 = LayerNorm(d_model)
        self.layer_norm3 = LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, enc_out,
                src_mask=None, tgt_mask=None):
        # Apply attention to inputs
        att = self.masked_attn_head(x, x, x, mask=src_mask)
        x = x + self.dropout(self.layer_norm1(att))
        # Apply attention to the encoder outputs and outputs of the previous layer
        att = self.attn_head(queries=att, keys=x, values=x, mask=tgt_mask)
        x = x + self.dropout(self.layer_norm2(att))
        # Apply position-wise feedforward network
        pos = self.position_wise_feed_forward(x)
        x = x + self.dropout(self.layer_norm2(pos))
        return x
```

In the forward method, shouldn't:

```python
att = self.attn_head(queries=att, keys=x, values=x, mask=tgt_mask)
```

be:

```python
att = self.attn_head(queries=att, keys=enc_out, values=enc_out, mask=tgt_mask)
```
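For reference, in the standard Transformer decoder (Vaswani et al., 2017) the second attention sub-layer does take its keys and values from the encoder output, which is what the suggested fix expresses. A self-contained illustration using PyTorch's built-in module (not the notebook's MultiHeadAttention), with hypothetical shapes:

```python
import torch
import torch.nn as nn

d_model, n_heads = 512, 8
cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

x = torch.randn(2, 10, d_model)        # decoder-side activations (batch, tgt_len, d_model)
enc_out = torch.randn(2, 12, d_model)  # encoder output (batch, src_len, d_model)

# Encoder-decoder ("cross") attention: queries from the decoder stream,
# keys and values from the encoder output.
att, _ = cross_attn(query=x, key=enc_out, value=enc_out)
print(att.shape)  # torch.Size([2, 10, 512])
```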
Hello, I want to ask why the input word embeddings have shape (seq, batch_size, emb_dim)? I think it should be (batch_size, seq_len, emb_dim). The batch_size represents the number of sentences, and seq represents the length of each sentence.
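A likely reason, stated as a general PyTorch convention rather than the author's confirmed intent: PyTorch's recurrent layers default to (seq_len, batch, input_size) and only accept (batch, seq_len, input_size) when constructed with batch_first=True, so many tutorials keep the sequence-first layout.

```python
import torch.nn as nn

# Default layout: input of shape (seq_len, batch, emb_dim)
lstm = nn.LSTM(input_size=300, hidden_size=128)

# batch_first=True switches to (batch, seq_len, emb_dim)
lstm_bf = nn.LSTM(input_size=300, hidden_size=128, batch_first=True)
```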
```python
class Config(dict):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        for k, v in kwargs.items():
            setattr(self, k, v)

    def set(self, key, val):
        self[key] = val
        setattr(self, key, val)

config = Config(
    testing=False,
    bert_model_name="bert-base-uncased",
    max_lr=3e-5,
    epochs=1,
    use_fp16=False,
    bs=4,
    discriminative=False,
    max_seq_len=128,
)
```

What is the importance of the `set` function here? Could you please give an example?
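A minimal illustration, using the `config` instance defined above: `set` writes both the dict entry and the mirrored attribute, so dict-style and attribute-style access stay in sync, which plain dict assignment would not guarantee.

```python
config.set("bs", 8)      # updates both self["bs"] and self.bs
print(config["bs"])      # 8  (dict-style access)
print(config.bs)         # 8  (attribute-style access)

# Plain dict assignment updates only one of the two views:
config["epochs"] = 2
print(config["epochs"])  # 2
print(config.epochs)     # still 1, since setattr was never called
```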
Hey Kei,

I found your excellent tutorial when I was searching for ELMo. I installed the latest AllenNLP (0.8.4) and downloaded your code. I ran it, and the cell that contains:

```python
train_ds, test_ds = (reader.read(DATA_ROOT / fname) for fname in ["train.csv", "test.csv"])
```

produced the following error message:

```
KeyError: "None of [['toxic', 'severe_toxic', 'obscene', 'threat', 'insult', 'identity_hate']] are in the [index]"
```

The train.csv is from a download of the Jigsaw dataset from when I participated in the toxic comment competition. The header of train.csv looks like:

```
"id","comment_text","toxic","severe_toxic","obscene","threat","insult","identity_hate"
```

Have you had a chance to run the code with the latest AllenNLP? If not, which version were you using? Just being lazy and hoping for a quick pointer before I dive in...

Thx,
SH
;-)
```python
attn(q, k, v)
```

```
RuntimeError                              Traceback (most recent call last)
----> 1 attn(q, k, v)

/opt/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    323         for hook in self._forward_pre_hooks.values():
    324             hook(self, input)
--> 325         result = self.forward(*input, **kwargs)
    326         for hook in self._forward_hooks.values():
    327             hook_result = hook(self, input, result)

in forward(self, q, k, v, mask)
     28         attn = attn / attn.sum(dim=-1, keepdim=True)
     29         attn = self.dropout(attn)
---> 30         output = torch.bmm(attn, v)  # (Batch, Seq, Feature)
     31         log_size(output, "attention output size")  # (Batch, Seq, Seq)
     32         return output

RuntimeError: bmm(): argument 'mat2' (position 1) must be Variable, not torch.FloatTensor
```

The same happens with:

```python
attn_head = AttentionHead(20, 20)
attn_head(q, k, v)
```

```
TypeError: torch.mm received an invalid combination of arguments - got (torch.FloatTensor, Variable), but expected one of:
```
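A guess based on the error text, not a confirmed diagnosis: these errors look like PyTorch < 0.4, where raw Tensors had to be wrapped in autograd.Variable before being passed through a module. On those versions, something like this should work:

```python
# Assumption: q, k, v and attn are the objects from the failing cell above.
from torch.autograd import Variable

q, k, v = Variable(q), Variable(k), Variable(v)
attn(q, k, v)
# On PyTorch >= 0.4, Variable and Tensor are merged and no wrapping is needed,
# so upgrading PyTorch is the other likely fix.
```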
Hi!

What if I want to use SciBERT embeddings in my model? Is it enough to just replace this code:

```python
from allennlp.data.token_indexers import PretrainedBertIndexer

token_indexer = PretrainedBertIndexer(
    pretrained_model="bert-base-uncased",
    max_pieces=config.max_seq_len,
    do_lowercase=True,
)

def tokenizer(s: str):
    return token_indexer.wordpiece_tokenizer(s)[:config.max_seq_len - 2]
```

with this code:

```python
from allennlp.data.token_indexers import PretrainedBertIndexer

token_indexer = PretrainedBertIndexer(
    pretrained_model="scibert-scivocab-uncased",
    max_pieces=config.max_seq_len,
    do_lowercase=True,
)

def tokenizer(s: str):
    return token_indexer.wordpiece_tokenizer(s)[:config.max_seq_len - 2]
```
When trying to understand bert_text_classification.ipynb, this part of the notebook:

```python
from allennlp.data.token_indexers import PretrainedBertIndexer

token_indexer = PretrainedBertIndexer(
    pretrained_model="bert-base-uncased",
    max_pieces=config.max_seq_len,
    do_lowercase=True,
)

def tokenizer(s: str):
    return token_indexer.wordpiece_tokenizer(s)[:config.max_seq_len - 2]
```

gives this error:

```
ImportError                               Traceback (most recent call last)
----> 1 from allennlp.data.token_indexers import PretrainedBertIndexer
      2
      3 token_indexer = PretrainedBertIndexer(
      4     pretrained_model="bert-base-uncased",
      5     max_pieces=config.max_seq_len,

ImportError: cannot import name 'PretrainedBertIndexer'

NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.
To view examples of installing some common dependencies, click the
"Open Examples" button below.
```

The AllenNLP version used is 1.0.0. It seems the versions differ; I could not find a solution. What should I do?
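One plausible explanation, offered as an assumption: `PretrainedBertIndexer` was removed in AllenNLP 1.0, whose 1.x releases moved to the `PretrainedTransformerIndexer` family, so the notebook needs the older release it was written against, e.g.:

```
pip install allennlp==0.9.0
```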
Hi, thanks for your tutorial. But I can't figure out why some weight tensors have shape (input_size, hidden_size) and others have shape (hidden_size, hidden_size).
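A general LSTM fact that probably explains it, not specific to this repo: each gate combines two linear maps, one applied to the current input and one applied to the previous hidden state, and those two weight matrices necessarily have different shapes.

```python
import torch

input_size, hidden_size, bs = 10, 20, 4
x_t = torch.randn(bs, input_size)      # current input
h_prev = torch.randn(bs, hidden_size)  # previous hidden state

# Input-to-hidden weights map the input into hidden space:
W_ii = torch.randn(input_size, hidden_size)    # (input_size, hidden_size)
# Hidden-to-hidden (recurrent) weights map the hidden state onto itself:
W_hi = torch.randn(hidden_size, hidden_size)   # (hidden_size, hidden_size)

i_t = torch.sigmoid(x_t @ W_ii + h_prev @ W_hi)  # e.g. the input gate
print(i_t.shape)  # torch.Size([4, 20])
```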
Hi,

I have created a custom databunch which I am trying to load using load_data, but I am getting an attribute error:

```
File "/home/views.py", line 641, in get
    path, r"/home/data_save.pkl")
File "/usr/local/lib/python3.7/site-packages/fastai/basic_data.py", line 281, in load_data
    ll = torch.load(source, map_location='cpu') if defaults.device == torch.device('cpu') else torch.load(source)
File "/usr/local/lib/python3.7/site-packages/torch/serialization.py", line 529, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/usr/local/lib/python3.7/site-packages/torch/serialization.py", line 702, in _legacy_load
    result = unpickler.load()
AttributeError: Can't get attribute 'FastAiBertTokenizer' on <module '__main__' from 'manage.py'>
```

FastAiBertTokenizer has been defined in the program, but I am still getting the error. Maybe I have to define this function or import it in the context where I'm loading the databunch, but I don't know how.

This is the code:

```python
path = Path()
data = load_data(path, r"data_save.pkl")
bert_model = CustomBertModel()
learn = Learner(data, bert_model, metrics=[accuracy])
st2 = torch.load(r"final_model_base.pth", map_location=torch.device('cpu'))
learn.model.load_state_dict(st2)  # note: state_dict(st2) would not load the weights
```

Can you help me with this?
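A likely cause, based on how pickle resolves names rather than anything fastai-specific: the databunch was saved from a script where FastAiBertTokenizer lived in `__main__`, so unpickling from Django's manage.py process fails because that name no longer resolves there. One common workaround (the module name `my_tokenizers` is hypothetical; use wherever the class is actually defined):

```python
import __main__
from my_tokenizers import FastAiBertTokenizer  # hypothetical location of the class

# Re-expose the class under __main__ so pickle can find it during load_data:
__main__.FastAiBertTokenizer = FastAiBertTokenizer

data = load_data(path, r"data_save.pkl")
```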
You do not have a license file.
Any restrictions on reusing / modifying your code and/or blog text?
Thanks
Thanks for this great tutorial on transformers. In the transformer_xl model I don't see any Encoder class, and the decoder is fed directly with the inputs. Could you clarify which part of the transformer_xl code is responsible for the encoder?