roxot / aevnmt.pt
PyTorch implementation of Auto-Encoding Variational Neural Machine Translation
License: MIT License
Transformer training still needs some improvements:
When prior.sizes is defined, prior.size is not needed, since sum(prior.sizes) == prior.size.
Make prior.size behave the same way as prior.params: multiple semicolon-separated values, which the argument parser converts to a list of ints. Changes need to be applied in the argument parser and wherever hparams.prior.size / hparams.prior.sizes is used in the code base.
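The semicolon-separated parsing could look like the following minimal sketch; the argument name mirrors hparams.prior.sizes, but the converter function name is an assumption:

```python
import argparse

def int_list(value: str):
    """Convert a semicolon-separated string such as "64;32" to [64, 32]."""
    return [int(v) for v in value.split(";")]

parser = argparse.ArgumentParser()
# argparse keeps the dotted name as the attribute name on the namespace.
parser.add_argument("--prior.sizes", type=int_list, default=[32])

hparams = parser.parse_args(["--prior.sizes", "64;32"])
print(getattr(hparams, "prior.sizes"))  # [64, 32]
```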
log_prob should not be part of any model component; rather, a component should return an object with a log_prob method. Because of padding, this is currently not possible for Categorical distributions.
Possible solution: add a class PaddedCategorical(Categorical) that takes a padding argument in log_prob (and in other required methods). This should be straightforward to extend to other likelihoods as well.
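A minimal sketch of that proposal, assuming the padding argument is a boolean mask that is True at padded positions (the exact convention is an assumption):

```python
import torch
from torch.distributions import Categorical

class PaddedCategorical(Categorical):
    """Categorical whose log_prob can zero out padded positions.

    Sketch of the proposed solution above; the mask convention
    (True = padding) is an assumption, not the library's API.
    """

    def log_prob(self, value, pad_mask=None):
        lp = super().log_prob(value)
        if pad_mask is not None:
            # Padded positions contribute zero log-probability.
            lp = lp.masked_fill(pad_mask, 0.0)
        return lp

logits = torch.randn(2, 5, 10)          # [batch, time, classes]
tokens = torch.randint(0, 10, (2, 5))   # [batch, time]
mask = torch.zeros(2, 5, dtype=torch.bool)
mask[:, 3:] = True                      # last two positions are padding
dist = PaddedCategorical(logits=logits)
lp = dist.log_prob(tokens, pad_mask=mask)
```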
For the library to grow, it really needs some testing functionality.
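As an illustration of the kind of unit test the library could start with (a hypothetical pytest-style sketch, not an existing test file):

```python
import torch

def test_categorical_log_prob_shape():
    # log_prob of per-position tokens should have shape [batch, time].
    logits = torch.randn(2, 5, 10)
    dist = torch.distributions.Categorical(logits=logits)
    tokens = torch.randint(0, 10, (2, 5))
    assert dist.log_prob(tokens).shape == (2, 5)
```

Run with `pytest`; similar shape and masking checks for each model component would catch most regressions cheaply.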
feed_z arguments (for example, gen.tm.dec.feed_z) are currently boolean, which mirrors the RNN implementations, but the Transformer architectures support multiple options.
Additionally, we could support feed_z methods for TM encoders: RNN architectures always initialize the hidden state with z, but for Transformers there are multiple options.
While the new hparams are much better than before, they still get very verbose. One reason is that we have a single argparser for every model and evaluation mode, which results in many unused and confusing arguments.
I've added support for adding/removing arguments with arg groups (see the bottom of args.py). These could be used to construct different argparsers based on which model is used in which context.
Possible solution: each model in the library gets its own arg groups with model-specific arguments, which are combined with train/eval-specific arg groups and (if needed) other general arguments. These can already be combined by the existing argparser in hparams/hparams.py.
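The composition could look like this minimal sketch; the group contents and the helper name are assumptions, not the actual args.py API:

```python
import argparse

def make_parser(model_group, mode_group):
    """Combine a model-specific arg group with a train/eval arg group."""
    parser = argparse.ArgumentParser()
    for name, kwargs in {**model_group, **mode_group}.items():
        parser.add_argument(name, **kwargs)
    return parser

# Hypothetical groups; real groups would live next to each model.
transformer_group = {"--num_heads": dict(type=int, default=8)}
train_group = {"--learning_rate": dict(type=float, default=3e-4)}

parser = make_parser(transformer_group, train_group)
hparams = parser.parse_args([])
```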
When translating, Vocabulary imports the vocab file with from_file(), which opens the file with ISO-8859-2 encoding. When this encoding is removed, the vocab file is read correctly.
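A sketch of the fix, assuming the vocab file stores one token per line (the class and method names mirror the ones above, but the file layout is an assumption):

```python
class Vocabulary:
    def __init__(self, tokens):
        self.tokens = tokens

    @classmethod
    def from_file(cls, path):
        # Open with explicit UTF-8 rather than the hard-coded ISO-8859-2.
        with open(path, encoding="utf-8") as f:
            return cls([line.rstrip("\n") for line in f])
```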
The current implementation of distributed training uses DataParallel, which has some drawbacks (a single process drives all GPUs, and the model is replicated on every forward pass); DistributedDataParallel is generally the preferred alternative.
Jsonargparse changes its behaviour depending on the position of the --hparams_file argument: any command line arguments that appear before --hparams_file are overridden by the contents of hparams_file.
This can cause issues that are hard to track down; the preferred behavior is that command line arguments always take precedence over config file arguments.
I've added a temporary workaround that checks for the first argument in hparams.py, but a better solution would be to somehow disable this feature in jsonargparse.
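One workaround in this spirit is to reorder argv so --hparams_file is always parsed first, letting later flags keep precedence; this sketch is an assumption about the approach, not a copy of what hparams.py does:

```python
def reorder_hparams_file_first(argv):
    """Move --hparams_file (and its value) to the front of argv."""
    if "--hparams_file" not in argv:
        return argv
    i = argv.index("--hparams_file")
    pair, rest = argv[i:i + 2], argv[:i] + argv[i + 2:]
    return pair + rest

argv = ["--batch_size", "64", "--hparams_file", "config.json"]
print(reorder_hparams_file_first(argv))
# ['--hparams_file', 'config.json', '--batch_size', '64']
```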
I've left a TODO in the AEVNMTTrainer code for this issue for now.
To fix it, the loss functions need arguments that accept these aux likelihoods, and a method that adds the additional likelihoods to the loss (mixture, auxiliary, dir prior, etc.).
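A hypothetical sketch of such a loss function; the signature, names, and uniform weighting are assumptions, not the library's API:

```python
import torch

def total_loss(nll, kl, aux_log_likelihoods=None, aux_weight=1.0):
    """ELBO-style loss with optional auxiliary log-likelihood terms."""
    loss = nll + kl
    for ll in (aux_log_likelihoods or {}).values():
        # Maximizing an auxiliary likelihood lowers the loss.
        loss = loss - aux_weight * ll
    return loss
```

In practice each auxiliary term (mixture, auxiliary, dir prior, etc.) would likely need its own weight hyperparameter rather than a shared one.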
The RNN initialization in initialize_models uses different parameters for cell_type=lstm and initializes all params with "rnn." in the parameter name. The new hparam format supports different cell types for the inference, TM, and LM components, which breaks this method.
Possible solution: split initialize_models into initialize_tm, initialize_lm, and initialize_inf.
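The split could look like this sketch, with one initializer per component so each can read its own cell type; the hparam paths and init schemes are assumptions:

```python
import torch.nn as nn

def _init_rnn_params(module, cell_type):
    # Initialize every weight matrix whose name marks it as part of an RNN.
    for name, param in module.named_parameters():
        if "rnn." in name and param.dim() > 1:
            if cell_type == "lstm":
                nn.init.orthogonal_(param)      # e.g. orthogonal for LSTMs
            else:
                nn.init.xavier_uniform_(param)  # e.g. Xavier otherwise

def initialize_tm(tm, hparams):
    _init_rnn_params(tm, hparams.gen.tm.cell_type)

def initialize_lm(lm, hparams):
    _init_rnn_params(lm, hparams.gen.lm.cell_type)

def initialize_inf(inf_model, hparams):
    _init_rnn_params(inf_model, hparams.inf.cell_type)
```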