markpwoodward / active_osl

Code for "Active One-shot Learning"

License: MIT License

Python 100.00%

active_osl's Issues

Help me understand :)

I am currently looking into your code. I've read the paper behind it, and I find it impressive and really interesting. The code is readable and mostly easy to understand, but there are a few details I need clarification on. I am rather new to TensorFlow's Estimator mechanism, although I've done a lot of reading to understand your code better.

The agent contains all the trainable parts, that is, the complete network architecture. The LSTM cell is stored as a private attribute of the agent, while the dense layer behind it is created "just in time". For each new batch, the agent starts with "reuse=False", creates a new dense layer, and then changes "reuse" to True. So at t=1 a new dense layer is created, and for t>1 (for the rest of the batch of episodes) the existing dense layer is used.
This confuses me. Why do you treat the dense layer differently from the LSTM cell? Does this mean that a new "blank" dense layer is created for each new batch of episodes?
I assume the same LSTM cell, once created, is used at every training step, but its internal state is reset for each new batch of episodes. So the LSTM's memory is not carried over from batch to batch, am I right?

Would you be so kind as to answer these questions?
Thanks in advance!
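For readers with the same question, the "reuse" behaviour described above can be illustrated with a toy model of TF1-style variable sharing. This is only a sketch in plain Python, not the repository's code: `VariableStore`, `dense`, and the weight values are hypothetical names and numbers made up for illustration. It assumes the time steps are unrolled in a Python loop while the graph is being built, so "reuse=False" at the first step creates the weights once and every later step looks up the same weights.

```python
class VariableStore:
    """Toy stand-in for a variable scope with a reuse flag (hypothetical)."""

    def __init__(self):
        self._vars = {}
        self.created = 0  # counts how many variables were actually created

    def get_variable(self, name, init, reuse):
        if reuse:
            # reuse=True: the variable must already exist; nothing new is made.
            return self._vars[name]
        if name in self._vars:
            raise ValueError(f"Variable {name} already exists")
        self._vars[name] = list(init)
        self.created += 1
        return self._vars[name]


def dense(store, x, reuse):
    """Hypothetical one-unit dense layer without bias: dot(x, kernel)."""
    w = store.get_variable("dense/kernel", init=[0.5, -0.5], reuse=reuse)
    return sum(xi * wi for xi, wi in zip(x, w))


store = VariableStore()
inputs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # one input per time step
outputs = []
reuse = False
for x_t in inputs:
    outputs.append(dense(store, x_t, reuse))
    reuse = True  # all later time steps share the weights made at the first step
```

Under these assumptions only one kernel is ever created, however many time steps are unrolled. Whether the graph itself is rebuilt between batches depends on how the repository drives training; with an Estimator, the model function is normally run once to build the graph, and the trained weights then persist across training steps.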

Regression

Hi Mark!

Interesting work! I'm currently exploring online active-learning schemes for a time-series regression problem. Your paper focuses on classification problems. Have you explored regression problems? Is your code suited to such applications? Thanks!

Using model for predictions

Hi Mark! I'm interested in using and citing your model in my own research, but I'm currently struggling to modify the code so that I can use the trained model for new predictions once it has finished training. Would you mind elaborating on the example code you have in the README? Specifically, where would that example code fit in with the rest of the code, and where and how do the tensors last_label_t and features_t get generated (and how can I capture them)?

Thank you for your help!

how can the lstm learn from request label?

1. How can the LSTM learn from the requested label when it is only given at the next time step?
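One way to picture this, assuming the setup described in the paper (a requested label arrives as part of the next observation, and a zero vector is supplied otherwise), is the sketch below. The function names are hypothetical and the code is plain Python rather than the repository's TensorFlow code; the "request" action is assumed to be the last action index, matching the padding in the reward code quoted further down.

```python
def one_hot(index, size):
    """Returns a list of length `size` with 1.0 at `index` and 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def next_observation(features_next, true_label, action, num_labels):
    """Builds the input for step t+1 from the action taken at step t.

    action == num_labels is assumed to mean "request the label".
    """
    requested = action == num_labels
    # The true label is revealed only if it was requested; otherwise zeros.
    last_label = one_hot(true_label, num_labels) if requested else [0.0] * num_labels
    return last_label + list(features_next)


# The agent requested the label of class 2 at step t:
obs_requested = next_observation([0.1, 0.2], true_label=2, action=3, num_labels=3)
# The agent predicted class 0 instead, so no label is revealed:
obs_predicted = next_observation([0.1, 0.2], true_label=2, action=0, num_labels=3)
```

Under this assumption the recurrent state is what links the image seen at step t to the label that arrives at step t+1, so the LSTM can learn the association even though the two arrive one step apart.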

I found something that could be wrong

The reward code:
a_t = tf.cast(a_t, tf.float32)
label_t = tf.cast(label_t, tf.float32)
rewards_t = label_t*params.reward_correct + (1-label_t)*params.reward_incorrect # (batch_size, num_labels), tf.float32
rewards_t = tf.pad(rewards_t, [[0,0],[0,1]], constant_values=params.reward_request) # (batch_size, num_labels+1), tf.float32

r_t = tf.reduce_sum(rewards_t*a_t, axis=1) # (batch_size), tf.float32

Does this code mean that the agent gets a reward at every step, whether or not it requests the label, and that the reward guides the agent toward the right choice?
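The quoted reward computation can be traced by hand for a single example. The sketch below mirrors it in plain Python for one sample instead of a batch. The three reward values are assumptions chosen for illustration (the real ones come from `params`), and `a_t` is assumed to be a one-hot action vector of length num_labels+1 whose last entry means "request the label".

```python
REWARD_CORRECT = 1.0     # assumed stand-in for params.reward_correct
REWARD_INCORRECT = -1.0  # assumed stand-in for params.reward_incorrect
REWARD_REQUEST = -0.05   # assumed stand-in for params.reward_request


def reward(label_one_hot, action_one_hot):
    """Single-sample version of the quoted rewards_t / r_t computation."""
    # Reward for predicting each class: correct for the true class, else incorrect.
    per_class = [l * REWARD_CORRECT + (1 - l) * REWARD_INCORRECT
                 for l in label_one_hot]
    # Equivalent of tf.pad(...): append the request reward as the last entry.
    per_action = per_class + [REWARD_REQUEST]
    # The one-hot action selects exactly one entry, like reduce_sum(rewards_t * a_t).
    return sum(r * a for r, a in zip(per_action, action_one_hot))


label = [0, 1, 0]                         # true class is 1
r_correct = reward(label, [0, 1, 0, 0])   # predicted class 1
r_wrong = reward(label, [1, 0, 0, 0])     # predicted class 0
r_request = reward(label, [0, 0, 0, 1])   # requested the label
```

Read this way, every action yields exactly one reward: the correct-prediction reward, the incorrect-prediction reward, or the request reward, depending on which entry of the action vector is set.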
