Comments (10)
Yeah, I will finish the code after this week, because I am busy this week. :)
from wgan-gp.
Thanks @caogang. Will the change just involve swapping the discriminator's LinearBlock for a ResBlock? If that's all it takes, I could submit a simple PR.
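For reference, here is a minimal sketch of what such a swap might look like: a 1D residual block in PyTorch, roughly in the style of the WGAN-GP language-model discriminator. The names (`ResBlock`, `DIM`), the kernel size, and the 0.3 residual scaling are assumptions for illustration, not necessarily the repo's exact code.

```python
import torch
import torch.nn as nn

DIM = 512  # hidden width; illustrative value, not the repo's setting


class ResBlock(nn.Module):
    """1D residual block: two ReLU+Conv1d layers with a skip connection."""

    def __init__(self, dim):
        super().__init__()
        self.block = nn.Sequential(
            nn.ReLU(),
            nn.Conv1d(dim, dim, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(dim, dim, kernel_size=5, padding=2),
        )

    def forward(self, x):
        # skip connection with a scaled residual branch;
        # padding=2 keeps the sequence length unchanged
        return x + 0.3 * self.block(x)
```

Note that `nn.Conv1d` expects a 3-D `(batch, channels, length)` input, whereas a fully connected LinearBlock works on 2-D `(batch, features)` tensors, so a naive swap also requires reshaping the discriminator's input.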
Yes, that is one urgent modification that needs to be done. It would be very nice if you could submit such a PR.
Hello @caogang, making the change that simply leads to an error (I think it was about a 3D tensor being expected where a 2D one was given). I'll need to understand the code a bit better before I'm able to make it work.
If I manage to get it fixed I'll submit a PR.
Hi @thvasilo, I have finished the gan_language code, but I have only tested it on CPU. I will test it on GPU as well and push the new results soon.
Thank you @caogang, I'll try to test this today and will close the issue if everything works.
Hello @caogang, I've run 200 iterations of the model and can verify that it doesn't crash on GPU; however, I'm not sure about the quality of the model. Using the default settings and token-level parsing (instead of character-level), all of the output is just the unk token. Do you have examples where you got the model to produce meaningful output? If you can recommend parameter settings, I can try them out.
Another unusual thing I've noticed is that the cost figures (train_gen, train_disc, wasserstein_distance) are only reported for the first few iterations (7 the last time I tried), and I couldn't come up with a good reason why this would happen. Upon reviewing the training log, the above happens because after iteration 7 the metrics go to nan.
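A simple way to make this failure mode visible instead of silently dropping log lines is to validate the metrics before plotting them. This is a generic sketch, not code from the repo; `check_finite` is a hypothetical helper.

```python
import math


def check_finite(metrics, iteration):
    """Raise as soon as any tracked cost becomes nan/inf, so the
    offending iteration is reported instead of logging just stopping."""
    for name, value in metrics.items():
        if not math.isfinite(value):
            raise RuntimeError(
                f"{name} became {value} at iteration {iteration}"
            )


# example with the per-iteration costs being tracked in the thread
check_finite(
    {"train_disc_cost": -66.5, "wasserstein_distance": 115.9}, 7
)
```

Calling this right after each discriminator/generator update pinpoints the exact iteration where the costs diverge.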
The nan problem may be because of the PyTorch version (see #12). I am running gan_language.py now, but it is very slow :(.
The command output is this:
iter 1599 tmp/lang/js4 0.649939207431 tmp/lang/js1 0.0690228257595 tmp/lang/js2 0.181709832974 tmp/lang/wasserstein distance 115.948425293 tmp/lang/train disc cost -66.5119781494 tmp/lang/train gen cost -18.3391399384 tmp/lang/time 9.30638914585 tmp/lang/js3 0.390863805779
iter 1699 tmp/lang/js4 0.654350427735 tmp/lang/js1 0.082869597317 tmp/lang/js2 0.207291999431 tmp/lang/wasserstein distance 119.094825745 tmp/lang/train disc cost -67.5338134766 tmp/lang/train gen cost -0.702518522739 tmp/lang/time 9.30622985125 tmp/lang/js3 0.407354697129
iter 1799 tmp/lang/js4 0.63200969616 tmp/lang/js1 0.062857846673 tmp/lang/js2 0.165385939136 tmp/lang/wasserstein distance 122.879089355 tmp/lang/train disc cost -69.4046707153 tmp/lang/train gen cost 6.02346229553 tmp/lang/time 9.30193270206 tmp/lang/js3 0.367523874878
iter 1899 tmp/lang/js4 0.618222129172 tmp/lang/js1 0.0649802950323 tmp/lang/js2 0.172988959998 tmp/lang/wasserstein distance 124.128738403 tmp/lang/train disc cost -70.9303894043 tmp/lang/train gen cost 4.98324775696 tmp/lang/time 9.30568416595 tmp/lang/js3 0.366494706406
iter 1999 tmp/lang/js4 0.591232030789 tmp/lang/js1 0.056358720747 tmp/lang/js2 0.14490814063 tmp/lang/wasserstein distance 125.916267395 tmp/lang/train disc cost -71.5480270386 tmp/lang/train gen cost 7.57119989395 tmp/lang/time 9.28405963659 tmp/lang/js3 0.33542050967
iter 2099 tmp/lang/js4 0.566582456808 tmp/lang/js1 0.0537158501776 tmp/lang/js2 0.142843420001 tmp/lang/wasserstein distance 129.192977905 tmp/lang/train disc cost -73.951171875 tmp/lang/train gen cost 11.3406629562 tmp/lang/time 9.32169566154 tmp/lang/js3 0.315354519341
iter 2199 tmp/lang/js4 0.583774236572 tmp/lang/js1 0.0563736255341 tmp/lang/js2 0.14336678617 tmp/lang/wasserstein distance 130.59588623 tmp/lang/train disc cost -73.9788131714 tmp/lang/train gen cost 14.0651540756 tmp/lang/time 9.33803822517 tmp/lang/js3 0.325970952853
iter 2299 tmp/lang/js4 0.560677620088 tmp/lang/js1 0.0559929911158 tmp/lang/js2 0.139462115734 tmp/lang/wasserstein distance 132.936828613 tmp/lang/train disc cost -75.1215515137 tmp/lang/train gen cost 15.6881551743 tmp/lang/time 9.35610331297 tmp/lang/js3 0.309859271646
iter 2399 tmp/lang/js4 0.600723672393 tmp/lang/js1 0.0626923963729 tmp/lang/js2 0.167899202686 tmp/lang/wasserstein distance 133.518753052 tmp/lang/train disc cost -75.1536636353 tmp/lang/train gen cost 17.0819015503 tmp/lang/time 9.29543369532 tmp/lang/js3 0.351785832441
iter 2499 tmp/lang/js4 0.559481759817 tmp/lang/js1 0.0532214248546 tmp/lang/js2 0.133561413303 tmp/lang/wasserstein distance 136.571792603 tmp/lang/train disc cost -77.7102508545 tmp/lang/train gen cost 17.6871318817 tmp/lang/time 11.9401491189 tmp/lang/js3 0.305097104994
iter 2599 tmp/lang/js4 0.556855971949 tmp/lang/js1 0.0528149367548 tmp/lang/js2 0.132663441013 tmp/lang/wasserstein distance 137.70753479 tmp/lang/train disc cost -77.2345962524 tmp/lang/train gen cost 20.4136238098 tmp/lang/time 12.6488033676 tmp/lang/js3 0.305337972041
iter 2699 tmp/lang/js4 0.545001725004 tmp/lang/js1 0.0511753393725 tmp/lang/js2 0.128217736777 tmp/lang/wasserstein distance 139.771087646 tmp/lang/train disc cost -78.144317627 tmp/lang/train gen cost 19.9227981567 tmp/lang/time 11.9526500607 tmp/lang/js3 0.293619363332
The sample output after iteration 7099 is:
Tchin oat and , dreave ain atbon
Ampors ives rlad bats anilg the
" Itthis pest tore budt lical by
In have Carmees manfseon frem Su
Bose in actation Peen garger the
Thenaan har ucic awalh chinds ,
Hhw sain reauld thandid lagery w
Butt the Coict offage , fhom for
Soch on ease whost timertewed ,n
Hut , arged fort rapreauad will
Thire ofterding on ouksand came
The revensrand st porgd coucerv
But of coupe thoed incent ahe co
In thken rhan the came on parts
Thie is runcomledts ut har thata
He would Cpllation onday c hista
Tupartovesai cyrain Calledsacan
" Becumed che hendts wastrefpors
Pellows on ther phiadint on comm
Chings to vicast inrecpliits oft
Hy Moded thear oues icall comple
St he cavs the confovedts not go
Thilelace is found con inden ind
"rum ofged fangy is mave nor to
Horger sre gect and , the alse t
ome comn leln in atoll aroose or
The said fortrwith came stowacie
Shmes aver that ouvicy puitts la
The kut gest ill but bracived in
Hougd , New Moved cared that ofd
H Indangut gingi gan iceits P ow
Itsin sappation pirsealuasion th
Hey Mading onds eich is parment
It tenders cat son in his has in
In the Mp snd hery laced is she
Phvice Cuppeet goass , cormovt a
The condirent for charddents wab
Hough is coungedsty gan bies , t
A Cowraloss ion in the Funlivts
Some mont collint in the mas itt
Pupeclick , drnal impordts cat a
Nost vitergs is is sheworpaced `
And ofpendation ,ighe sad apocie
Hmm, I'll try pulling the latest master and running again then. I assume the code you are running uses the parameters as defined in gan_language.py without any changes? I agree that training is very slow; on the GPU I'm using it's about 5x slower than yours. I'll take a look at Fisher-GAN as an alternative in the meantime :/
I changed MAX_N_EXAMPLES = 10000000 to perform a full training run. All other parameters are the same as in gan_language.py on the master branch. When training on the GPU, it uses about 8981 MiB of memory.