mahalrs / newsgen
Multi-Modal Image Generation for News Stories
License: Apache License 2.0
requirements.txt is empty. Add the required dependencies to it.
Right now TechCrunch is the only target news source in crawl-input.json. We should add more news sources to the list to get more diverse news data.
Data loading is currently very slow. It can be sped up by splitting the data files into train/val/test; update encode_data.py accordingly. Also, remove raw text (headlines, captions, image paths, etc.) to reduce size. Save the result with torch.save to keep the files compact.
The original VQGAN implementation used a PyTorch Lightning module, but we converted it to a plain PyTorch NN module. Distributed training is much easier with PyTorch Lightning, so let's convert it back to a Lightning module.
The crawler needs to normalize URLs. For example, https://www.example.com and https://www.example.com/ are the same, and the crawler shouldn't treat them as separate pages.
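One way the normalization could look is sketched below, using only the standard library. The function name normalize_url is hypothetical. Lowercasing the host, dropping default ports and fragments, and stripping trailing slashes are assumptions about what the crawler should treat as equivalent. Some sites do distinguish /a from /a/, so that rule may need relaxing.

```python
from urllib.parse import urlsplit, urlunsplit

# Default ports that can be dropped without changing the target.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Return a canonical form of `url` for duplicate detection (hypothetical helper)."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    # Keep the port only when it is not the default for the scheme.
    if port is None or port == DEFAULT_PORTS.get(scheme):
        netloc = host
    else:
        netloc = f"{host}:{port}"
    # Treat an empty path and "/" as the same; strip other trailing slashes.
    path = parts.path.rstrip("/") or "/"
    # Drop the fragment, which is never sent to the server.
    return urlunsplit((scheme, netloc, path, parts.query, ""))
```

The crawler could then key its visited set on normalize_url(url) rather than on the raw string.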
Use AdamW from transformers instead of Adam. Use a linear schedule with warmup, transformers.get_linear_schedule_with_warmup.
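For intuition, the learning-rate multiplier that a linear warmup schedule applies can be written in a few lines of plain Python. This is only an illustration of the schedule's shape, and the function name below is hypothetical. In the training code, the scheduler returned by transformers.get_linear_schedule_with_warmup(optimizer, num_warmup_steps, num_training_steps) would be used instead.

```python
def linear_warmup_multiplier(step: int, warmup_steps: int, total_steps: int) -> float:
    """Factor applied to the base learning rate at `step` (illustrative sketch)."""
    if step < warmup_steps:
        # Ramp linearly from 0 up to 1 over the warmup phase.
        return step / max(1, warmup_steps)
    # Then decay linearly from 1 down to 0 at the end of training.
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
```

With 10 warmup steps out of 100, the multiplier peaks at step 10 and reaches zero at step 100.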
Using PyTorch Lightning, add an easy-to-use training/fine-tuning script.
Using the VQGAN encoder, convert all images in the given dataset to image tokens. These image tokens, along with tokens from the BART encoder (encoded captions/news headlines), are fed to the BART decoder to train it to generate image tokens given encoded captions/headlines.
We can do this as part of the data transform step; however, doing it beforehand will speed up training.
create_subset.py takes a fixed subset size argument. Change it to take the subset size as a percentage.
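A minimal sketch of how the percentage argument might be parsed and converted to an item count follows. The flag name --subset-percent, its default, and the helper names are assumptions for illustration, not the script's current interface.

```python
import argparse
import math


def percent(value: str) -> float:
    """argparse type: a percentage in the range (0, 100]."""
    pct = float(value)
    if not 0 < pct <= 100:
        raise argparse.ArgumentTypeError("percentage must be in (0, 100]")
    return pct


def subset_size(total: int, pct: float) -> int:
    """Number of items to keep; at least one item when the dataset is non-empty."""
    if total == 0:
        return 0
    return max(1, math.floor(total * pct / 100))


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Create a dataset subset.")
    # Hypothetical flag name; replaces a fixed-size argument.
    parser.add_argument("--subset-percent", type=percent, default=10.0,
                        help="share of the dataset to keep, as a percentage")
    return parser
```

Rounding down with a floor of one item keeps very small percentages from producing an empty subset.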
Since the current VQGAN model is an NN module and not a Lightning module, the training loop has to handle calls such as loss.backward(), optimizer.step(), optimizer.zero_grad(), etc. itself.
Update the README with instructions for running Newsgen.
Currently we log images every 1000 mini-batches during validation and testing. However, too few images get logged when the dataset is small or the number of devices is high. We should either change the interval to 100 mini-batches or take it as a command line argument.
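The command line option could be sketched as below. The flag name --log-images-every, the default of 100, and the helper should_log_images are hypothetical and would need to be wired into the actual validation and test steps.

```python
import argparse


def positive_int(value: str) -> int:
    """argparse type: an integer greater than zero."""
    number = int(value)
    if number <= 0:
        raise argparse.ArgumentTypeError("must be a positive integer")
    return number


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Hypothetical flag; 100 is the smaller interval suggested in the note above.
    parser.add_argument("--log-images-every", type=positive_int, default=100,
                        help="log images every N mini-batches")
    return parser


def should_log_images(batch_idx: int, every_n: int) -> bool:
    """True on mini-batches 0, N, 2N, ... so that small datasets still log something."""
    return batch_idx % every_n == 0
```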
Add a script that takes the processed dataset (headlines.json and captions.json) and creates a train/validation/test split.
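The core of such a script might look like the sketch below. It assumes each JSON file can be loaded into a list of records, which is not confirmed here. The function name, the split fractions, and the seed are placeholders.

```python
import random
from typing import Dict, Iterable, List


def split_dataset(items: Iterable, val_frac: float = 0.1, test_frac: float = 0.1,
                  seed: int = 0) -> Dict[str, List]:
    """Shuffle deterministically and cut into train/val/test (hypothetical helper)."""
    items = list(items)
    # A seeded, local RNG keeps the split reproducible across runs.
    random.Random(seed).shuffle(items)
    n_val = int(len(items) * val_frac)
    n_test = int(len(items) * test_frac)
    n_train = len(items) - n_val - n_test
    return {
        "train": items[:n_train],
        "val": items[n_train:n_train + n_val],
        "test": items[n_train + n_val:],
    }
```

Each split can then be written to its own file, so that data loading only has to read the split it needs.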
Add a generate method using greedy search or beam search.
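As a rough illustration of the greedy variant, the loop below repeatedly appends the highest-scoring next token until an end token or a length limit is reached. It is framework-free: next_token_scores is a hypothetical callable standing in for a decoder forward pass, and bos_id and eos_id are assumed special-token ids.

```python
from typing import Callable, List, Sequence


def greedy_generate(next_token_scores: Callable[[List[int]], Sequence[float]],
                    bos_id: int, eos_id: int, max_new_tokens: int) -> List[int]:
    """Greedy search: always pick the single best-scoring next token."""
    tokens = [bos_id]
    for _ in range(max_new_tokens):
        scores = next_token_scores(tokens)
        # Argmax over the vocabulary.
        next_id = max(range(len(scores)), key=lambda i: scores[i])
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens
```

Beam search would keep the k best partial sequences at each step instead of only one, at k times the decoding cost.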
Currently we calculate the warmup and total training steps from the data loader length. However, in multi-device training each device sees only a share of the batches, so the data loader length has to be divided by the number of devices.
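The adjusted arithmetic can be sketched as follows. The function name, the warmup_ratio parameter, and the gradient accumulation handling are assumptions for illustration. Rounding up mirrors how a distributed sampler typically pads the last shard, so the result is an estimate.

```python
import math
from typing import Tuple


def training_steps(batches_per_epoch: int, num_devices: int, max_epochs: int,
                   warmup_ratio: float = 0.1,
                   accumulate_grad_batches: int = 1) -> Tuple[int, int]:
    """Return (warmup_steps, total_steps) in optimizer steps (hypothetical helper)."""
    # Each device processes roughly 1/num_devices of the batches.
    per_device = math.ceil(batches_per_epoch / num_devices)
    # One optimizer step happens every `accumulate_grad_batches` batches.
    steps_per_epoch = math.ceil(per_device / accumulate_grad_batches)
    total_steps = steps_per_epoch * max_epochs
    warmup_steps = int(total_steps * warmup_ratio)
    return warmup_steps, total_steps
```

For example, 1000 batches per epoch on 4 devices for 10 epochs gives 2500 total steps rather than 10000.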