Comments (4)
from pytorch-a3c.
Hey, that's good to hear. I can create this PR. I've been adapting your great code in another project and have been adding docstrings to some functions and classes, e.g. this ASCII art architecture diagram for the ActorCritic class:
"""
Implementation of A3C (https://arxiv.org/abs/1602.01783).
_______________________________________________________________________________________________________________
                                        A3C policy model architecture

         Image Processing module  -> Flattened output ->        Policy Learning Module         ---> Final output
          ______________________                          ___________________________________
         |      ______     ___  |          __            |      __________                   |
image -> | 4x  |conv2d| + |ELU| |         |__|           |     |   LSTM   | --> Critic FC -> | -> value
         |     |______|   |___| |   -->   |__|    -->    | --> |__________| --> Actor FC  -> | -> policy logits
         |                      |         |__|           |        ^    ^                     | -> (hx, cx)
         |                      |         |__|           |        |    |                     |
         |______________________|                        |   prev cx   hx                    |
                                                         |___________________________________|
_______________________________________________________________________________________________________________
Processes an input image (with num_input_channels) with 4 conv layers,
each followed by an ELU activation function. The output of the final layer is then flattened
and passed to an LSTM (with previous or initial hidden and cell states (hx and cx)).
The new hidden state is used as an input to the critic and actor nn.Linear layer heads.
The final output is then the predicted value, action logits, hx and cx.
"""
I can add a few things like this to the PR if you want. Otherwise, if you just want the name change, I can do that as well.
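For anyone reading the docstring above and wondering how the "flattened output" size relates to the LSTM input size, here is a minimal sketch of the shape arithmetic. The concrete numbers (a 42x42 input, 3x3 kernels, stride 2, padding 1, 32 output channels) are assumptions for illustration only and are not stated in this thread; the helper names are hypothetical too.

```python
def conv2d_out_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a single conv2d layer, assuming dilation 1."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(height: int, width: int, out_channels: int,
                       num_layers: int = 4, kernel: int = 3,
                       stride: int = 2, padding: int = 1) -> int:
    """Number of features left after num_layers identical conv layers and a flatten.

    All default hyperparameters are assumed values, not taken from the thread.
    """
    for _ in range(num_layers):
        height = conv2d_out_size(height, kernel, stride, padding)
        width = conv2d_out_size(width, kernel, stride, padding)
    return out_channels * height * width


# With the assumed settings the spatial size shrinks 42 -> 21 -> 11 -> 6 -> 3,
# so the LSTM would need an input size of 32 * 3 * 3.
lstm_input_size = flattened_features(42, 42, out_channels=32)
print(lstm_input_size)
```

If the real conv settings differ, only the arguments need to change; the flattened size is what the LSTM's input dimension has to match.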
I created the PR here with the typo fix:
#61
If you would like the ASCII art and some other documentation, happy to create another PR.
Thanks for merging the PR!
I assume I'll sell my ASCII art on the black market to the next highest bidder? 😛 I'm happy to close this issue.