azadyasar / neuralmachinetranslation
PyTorch implementation of NMT models along with custom tokenizers, models, and datasets
The wrong test file is specified in the README:
python -m nmt evaluate --test_dataset ../data/test.csv
=>
python -m nmt evaluate --test_dataset ../data/eng-tur-test.csv
The correct file is given in https://towardsdatascience.com/neural-machine-translation-inner-workings-seq2seq-and-transformers-229faff5895b.
Could you please tell me how to generate the .model files? And what is the en_sp.vocab file for? Thanks.
I was wondering: how does the decoder work at inference time? Since the transformer takes in an entire batch (tensor) and therefore outputs an entire tensor, it seemed to me that, unlike an RNN, it can't consume its own predictions without re-running the decoder every time it generates a token (it first takes the start token and generates its first prediction, then takes the start token plus that prediction, and so on). This process didn't seem vectorized, which worried me. Regardless, is this how it's done (e.g. in PyTorch)?
I guess that's how it's done since your code does it:
and so does the official pytorch tutorial code:
def evaluate(eval_model, data_source):
    eval_model.eval()  # Turn on the evaluation mode
    total_loss = 0.
    src_mask = eval_model.generate_square_subsequent_mask(bptt).to(device)
    with torch.no_grad():
        for i in range(0, data_source.size(0) - 1, bptt):
            data, targets = get_batch(data_source, i)
            if data.size(0) != bptt:
                src_mask = eval_model.generate_square_subsequent_mask(data.size(0)).to(device)
            output = eval_model(data, src_mask)
            output_flat = output.view(-1, ntokens)
            total_loss += len(data) * criterion(output_flat, targets).item()
    return total_loss / (len(data_source) - 1)
https://pytorch.org/tutorials/beginner/transformer_tutorial.html
So, as of 2021, is there no way to vectorize this?
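For reference, greedy decoding with a standard encoder-decoder transformer is inherently sequential over the output length (each step conditions on the tokens generated so far), but every step is fully vectorized over the batch, and the encoder runs only once. A minimal sketch with torch.nn.Transformer (the BOS/EOS ids, model sizes, and length cap here are illustrative assumptions, not the repo's actual values):

```python
import torch
import torch.nn as nn

BOS, EOS, VOCAB, D = 1, 2, 32, 16  # illustrative token ids and sizes
model = nn.Transformer(d_model=D, nhead=4, num_encoder_layers=1,
                       num_decoder_layers=1, batch_first=True)
embed = nn.Embedding(VOCAB, D)
proj = nn.Linear(D, VOCAB)
model.eval()

with torch.no_grad():
    src = torch.randint(0, VOCAB, (1, 5))   # (batch, src_len)
    memory = model.encoder(embed(src))      # encode once, reuse every step

    ys = torch.tensor([[BOS]])              # start with the BOS token
    for _ in range(10):                     # re-run the decoder per new token
        mask = model.generate_square_subsequent_mask(ys.size(1))
        out = model.decoder(embed(ys), memory, tgt_mask=mask)
        # Only the logits at the last position matter for the next token.
        next_tok = proj(out[:, -1]).argmax(-1, keepdim=True)
        ys = torch.cat([ys, next_tok], dim=1)
        if next_tok.item() == EOS:
            break
    print(ys)
```

The loop itself can't be collapsed (step t depends on step t-1's output), but batching many sentences, caching the encoder memory, and with extra bookkeeping caching the decoder's key/value states all keep the per-step work vectorized.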
btw, thanks for sharing your awesome code and blog! :)
related: https://www.quora.com/unanswered/Do-transformers-use-their-own-output-during-inference-time