yunzhuli / infogail

[NIPS 2017] InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations

License: MIT License


infogail's Issues

About the implementation of formula in the paper

Dear authors,

I apologize for the interruption. I would like to know how you implemented the mutual information formula (2) in your paper using code. Alternatively, where can I find the code related to this section? Thank you for your response!

Best regards!
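
For readers landing on this question: the quantity being asked about appears to be the variational lower bound on the mutual information between the latent code c and the trajectory τ. The transcription below is a best-effort reconstruction of that bound, with Q denoting the approximate posterior over codes; it should be checked against the paper before being relied on.

```latex
L_I(\pi, Q) = \mathbb{E}_{c \sim p(c),\, a \sim \pi(\cdot \mid s, c)}\left[ \log Q(c \mid \tau) \right] + H(c) \le I(c; \tau)
```

The reward expression quoted in another issue on this page, np.sum(np.log(output_p) * path["encodes"], axis=1), looks like a per-step sample of the log Q term for a one-hot code, so the posterior network in wgail_info.py is a plausible place to start reading. That mapping is an inference from the quoted line, not something the authors confirm here.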

Can I run your code on GPU

Dear

I have tried to run your code on GPU by these steps.

1. Create a conda virtual environment with the packages listed in README.md
2. pip install tensorflow-gpu in that environment
3. Run python wgail_info.py

But I get an error like this:

Traceback (most recent call last):
  File "wgail_info.py", line 12, in <module>
    from models import TRPOAgent
  File "/home/dl-box/InfoGAIL_Survey/original_InfoGAIL/wgail_info_0/models.py", line 1, in <module>
    from utils import *
  File "/home/dl-box/InfoGAIL_Survey/original_InfoGAIL/wgail_info_0/utils.py", line 22, in <module>
    tf.set_random_seed(seed)
AttributeError: 'module' object has no attribute 'set_random_seed'

I suppose the installed tensorflow-gpu version doesn't match your code.
If you have run your code on a GPU, please tell me which tensorflow-gpu version you used, or give me some guidance on running it on a GPU.

Thank you very much.
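
A note on the error above: tf.set_random_seed is the TensorFlow 1.x name, and in TensorFlow 2.x the equivalent call is tf.random.set_seed, so the traceback is consistent with a 2.x build having been installed. The sketch below shows one way to pick whichever seeding call exists. The helper name set_tf_seed and the fake modules are hypothetical and only there so the snippet runs without TensorFlow; in real use you would pass the imported tensorflow module.

```python
from types import SimpleNamespace


def set_tf_seed(tf_module, seed):
    """Seed TensorFlow using whichever API this build exposes.

    Returns the dotted name of the call that was used.
    """
    # TensorFlow 1.x exposes the function at the top level.
    if hasattr(tf_module, "set_random_seed"):
        tf_module.set_random_seed(seed)
        return "tf.set_random_seed"
    # TensorFlow 2.x moved it to tf.random.set_seed.
    random_ns = getattr(tf_module, "random", None)
    if random_ns is not None and hasattr(random_ns, "set_seed"):
        random_ns.set_seed(seed)
        return "tf.random.set_seed"
    raise AttributeError("no known seeding function on this TensorFlow module")


# Stand-ins for the two API layouts (hypothetical, for demonstration only).
calls = []
fake_tf1 = SimpleNamespace(set_random_seed=lambda s: calls.append(("tf1", s)))
fake_tf2 = SimpleNamespace(
    random=SimpleNamespace(set_seed=lambda s: calls.append(("tf2", s)))
)

used_tf1 = set_tf_seed(fake_tf1, 1024)
used_tf2 = set_tf_seed(fake_tf2, 1024)
print(used_tf1, used_tf2)
```

Patching the seed call alone is unlikely to be enough: the test script quoted elsewhere on this page also uses tf.Session, tf.ConfigProto and tf.placeholder, which are likewise 1.x-style top-level names. Installing a tensorflow-gpu release that matches the versions listed in README.md is probably the simpler route.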

What did you observe when training the Discriminator K times?

Hi @YunzhuLi ,

Thanks for releasing the code. I observed here that you usually train the discriminator 10 times for each generator update. Is it standard (at least in GAIL-like settings) to train the Discriminator more times than the Generator? The GAN tutorials seem to imply one gradient update for each of the players each iteration, but does the Discriminator need to be "strengthened" more here? I am also wondering what other values you tested with (other than 10) and if you had comments on those.

Thank you.
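
For context, the update pattern described in the question (several discriminator steps per generator step) can be sketched as below. The function name update_schedule is hypothetical and the loop only records who would be updated; it is not the repository's training loop. As a reference point, Algorithm 1 of the Wasserstein GAN paper also trains the critic several times per generator step (n_critic = 5 by default there).

```python
def update_schedule(n_iterations, n_critic):
    """Yield ("D", it) n_critic times, then ("G", it) once, per outer iteration."""
    for it in range(n_iterations):
        for _ in range(n_critic):
            yield ("D", it)
        yield ("G", it)


schedule = list(update_schedule(n_iterations=2, n_critic=10))
n_d = sum(1 for who, _ in schedule if who == "D")
n_g = sum(1 for who, _ in schedule if who == "G")
print(n_d, n_g)
```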

Could you give instructions for running experiments described in the paper?

It would be great to know the commands to run to reproduce table 2, for instance.

Great paper, BTW!

reward for wasserstein gail

hello. Can anyone give the reward formula for wasserstein gail in this paper.
The author said he use wasserstein for infogail

Thank you.

SetFromFlat function doesn't work

I'm sure there is a bug in the set_from_flat function: under Python 3, shapes must be a list,

shapes = list(map(var_shape, var_list))

instead of a map

shapes = map(var_shape, var_list)

Otherwise the following loop never iterates:

        for (shape, v) in zip(shapes, var_list):
            size = np.prod(shape)
            assigns.append(tf.assign(v, tf.reshape(theta[start:start + size], shape)))
            start += size
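
A minimal illustration of the behaviour described here, assuming shapes is consumed once before the loop (for example to compute a total size; that surrounding code is not quoted in this issue, so this is an assumption). In Python 3, map() returns a one-shot iterator, so a second pass over it yields nothing. The var_shape stand-in and the shape tuples are hypothetical.

```python
def var_shape(v):
    # Hypothetical stand-in for the repo's var_shape(): a "variable" is just its shape tuple.
    return v


def prod(shape):
    out = 1
    for d in shape:
        out *= d
    return out


var_list = [(3, 3), (4,), (2, 5)]

# Python 3: map() returns a one-shot iterator.
shapes = map(var_shape, var_list)
total_size = sum(prod(s) for s in shapes)      # first pass consumes the iterator
pairs_with_map = list(zip(shapes, var_list))   # second pass sees nothing

# Materialising the shapes first allows both passes.
shapes = list(map(var_shape, var_list))
total_size_list = sum(prod(s) for s in shapes)
pairs_with_list = list(zip(shapes, var_list))

print(total_size, len(pairs_with_map), len(pairs_with_list))
```

Under Python 2, map() already returns a list, which would explain why the original code worked in the environment it was written for.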

Because this function was not working, the generator policy was not updating either: xnew (the "new theta") was never actually assigned. As a result, fval was equal to newfval (both are surrogate losses; the variable names make the code really hard to read and understand), which gives actual_improve == 0 and ratio == 0.

def linesearch(f, x, fullstep, expected_improve_rate):
    accept_ratio = .1
    max_backtracks = 10
    fval = f(x)
    for (_n_backtracks, stepfrac) in enumerate(.5 ** np.arange(max_backtracks)):
        xnew = x + stepfrac * fullstep
        newfval = f(xnew)
        actual_improve = fval - newfval
        expected_improve = expected_improve_rate * stepfrac
        ratio = actual_improve / expected_improve
        if ratio > accept_ratio and actual_improve > 0:
            return xnew
    return x
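
The effect described above can be reproduced in isolation. The sketch below is a scalar, pure-Python copy of the quoted line search, run once with a normal loss and once with a loss that ignores its argument, which mimics a set_from_flat that never applies the proposed parameters. The quadratic loss and the numbers passed in are hypothetical.

```python
def linesearch(f, x, fullstep, expected_improve_rate,
               accept_ratio=0.1, max_backtracks=10):
    """Scalar, pure-Python copy of the backtracking line search quoted above."""
    fval = f(x)
    for k in range(max_backtracks):
        stepfrac = 0.5 ** k
        xnew = x + stepfrac * fullstep
        actual_improve = fval - f(xnew)
        expected_improve = expected_improve_rate * stepfrac
        if actual_improve / expected_improve > accept_ratio and actual_improve > 0:
            return xnew
    return x


def loss(theta):
    return (theta - 3.0) ** 2


current_theta = 0.0


def loss_ignoring_argument(theta):
    # Mimics a broken set_from_flat: the proposed theta is never applied,
    # so the loss is always evaluated at the current parameters.
    return loss(current_theta)


x_ok = linesearch(loss, 0.0, fullstep=2.0, expected_improve_rate=6.0)
x_stuck = linesearch(loss_ignoring_argument, 0.0, fullstep=2.0,
                     expected_improve_rate=6.0)
print(x_ok, x_stuck)
```

In the second call every candidate step gives actual_improve == 0, so no step is ever accepted and the original x is returned.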

You can reproduce this with a simple test:

from keras.initializers import VarianceScaling
from keras.layers import Conv2D

from wgail_info_0.utils import *


def create_generator(feats_tensor, auxs_tensor, encodes_tensor):
    feats = Input(tensor=feats_tensor)
    x = Conv2D(256, (3, 3))(feats)
    x = LeakyReLU()(x)
    x = Conv2D(256, (3, 3), strides=(2, 2))(x)
    x = LeakyReLU()(x)
    x = Flatten()(x)
    auxs = Input(tensor=auxs_tensor)
    h = merge.concatenate([x, auxs])
    h = Dense(256)(h)
    h = LeakyReLU()(h)
    h = Dense(128)(h)
    encodes = Input(tensor=encodes_tensor)
    c = Dense(128)(encodes)
    h = merge.add([h, c])
    h = LeakyReLU()(h)

    steer = Dense(1, activation='tanh', kernel_initializer=lambda shape: VarianceScaling(scale=1e-4)(shape))(h)
    accel = Dense(1, activation='sigmoid', kernel_initializer=lambda shape: VarianceScaling(scale=1e-4)(shape))(h)
    brake = Dense(1, activation='sigmoid', kernel_initializer=lambda shape: VarianceScaling(scale=1e-4)(shape))(h)
    actions = merge.concatenate([steer, accel, brake])
    return Model(inputs=[feats, auxs, encodes], outputs=actions)


if __name__ == '__main__':
    np.random.seed(1024)

    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    sess = tf.Session(config=config)
    from keras import backend as K

    K.set_session(sess)

    feat_dim = [7, 13, 1024]
    img_dim = [50, 50, 3]
    aux_dim = 10
    encode_dim = 2
    action_dim = 3

    feats = tf.placeholder(dtype, shape=[None, feat_dim[0], feat_dim[1], feat_dim[2]], name="feats")
    auxs = tf.placeholder(dtype, shape=[None, aux_dim], name="auxs")
    encodes = tf.placeholder(dtype, shape=[None, encode_dim], name="encodes")

    pi = create_generator(feats, auxs, encodes)
    var_list = pi.trainable_weights
    get_flat = GetFlat(sess, var_list)
    set_from_flat = SetFromFlat(sess, var_list)

    sess.run(tf.global_variables_initializer())

    th_prev = get_flat()
    th_new = th_prev * 2
    print("th_prev equals th_new:", np.array_equal(th_prev, th_new))

    print("Set th_new")
    set_from_flat(th_new)
    th_after = get_flat()

    print("Didn't update" if np.array_equal(th_prev, th_after) else "Updated")

    print("th_new equals th_after:", np.array_equal(th_new, th_after))

Can you please upload the code you used to convert the image data into npz files?

I am referring to demo.npz. How did you create it from the folders containing trajectory images?

Is class Generator only used for behavior cloning?

Is class Generator at line 533 of models.py only used for behavior cloning?

get_servers_input_tcp returns all-zero image

If I just run the file snakeoil_gym.py, I get an all-zero image every time. If I print the recv_data values within get_servers_input_tcp(), they are basically all \x00. The simulation plays fine in the UI, i.e., drive_example executes fine and I can see the graphics in the Torcs window. Any idea why this is happening? Can you point me to the Torcs source files you modified to get the TCP data? I might be able to debug this better then.

failed to send the image to the client

Hi @YunzhuLi,
I tried to run the demo with your pretrained model, but it printed "Error: cannot send image via tcp" and the image stayed frozen.
Could you help me make it work?
Thanks a lot!

Please explain what the pre_actions in wgail_info.py are.

There is a pre_action.npz file in the human0 dataset, and inside it there is an array with 200 rows and 3 columns. Please explain what it is.

Issue about error: cannot send image via tcp

I have installed your code, torcs and all required packages, but I have run into some problems.

When I run python drive.py or train with python wgail_info.py, the torcs window opens and gets stuck at "SET" after starting a new game. The terminal repeatedly shows "Error: cannot send image via tcp".

I tried modifying torcs.1.3.4/src/drivers/human/human.cpp (the only file in your code that includes an absolute path, '/home/yunzhu/....'), but the problem still exists.

Could you please tell me what happened?

Thanks.

Error in using np.concatenate !!!

In this line you use np.concatenate to merge all the RGB image pixels into the imgs_d array. But if you print one 50x50x3 image, it comes out distorted (even before reducing 128). I think there is an error in how np.concatenate is applied to the 3-dimensional arrays. As a result, the 50x50x3 image has very little effect on the discriminator and prior network.

standardize the advantage function

In models.py:

        # Standardize the advantage function to have mean=0 and std=1
        advants_n = np.concatenate([path["advants"] for path in paths])
        # advants_n -= advants_n.mean()
        advants_n /= (advants_n.std() + 1e-8)

Is it a mistake that advants_n -= advants_n.mean() is commented out? The comment says the advantage function should have a mean of 0.
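
To make the question concrete, the sketch below compares the two variants on made-up numbers: dividing by the standard deviation alone (as in the quoted snippet) versus subtracting the mean first. It uses the standard library's population standard deviation, which matches NumPy's default std(); the advantage values are hypothetical.

```python
from statistics import mean, pstdev

advants = [1.0, 2.0, 3.0, 6.0]

# As in the quoted snippet: scale only (the mean subtraction is commented out).
std = pstdev(advants)
scaled_only = [a / (std + 1e-8) for a in advants]

# Full standardization: subtract the mean, then scale.
mu = mean(advants)
standardized = [(a - mu) / (std + 1e-8) for a in advants]

print(mean(scaled_only), pstdev(scaled_only))
print(mean(standardized), pstdev(standardized))
```

With scaling only, the result has unit standard deviation but keeps a nonzero mean, and every advantage keeps its original sign. Whether that was intended is for the authors to say.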

one question about clip

In the original Wasserstein GAN paper, the weights have to be clipped to [-c, c].
However, in the code I see that the gradients are clipped instead. Why clip the gradients rather than the weights?
Thank you in advance.

self.gradients = gradients = tf.gradients(loss, self.network.var_list)
clipped_gradients = hgail.misc.tf_utils.clip_gradients(
    gradients, self.grad_norm_rescale, self.grad_norm_clip)

self.global_step = tf.Variable(0, name='critic/global_step', trainable=False)
self.train_op = self.optimizer.apply_gradients(
    [(g, v) for (g, v) in zip(clipped_gradients, self.network.var_list)],
    global_step=self.global_step)
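
The two operations being compared are different in kind, which the sketch below tries to show on plain lists. Weight clipping bounds the parameters themselves (the constraint used in the Wasserstein GAN paper), while gradient-norm clipping only bounds the size of an update and leaves the weights unconstrained. The function names and numbers are hypothetical, and the snippet says nothing about whether the quoted code also clips weights somewhere else.

```python
import math


def clip_weights(weights, c):
    """WGAN-style constraint: force every parameter into [-c, c] after an update."""
    return [max(-c, min(c, w)) for w in weights]


def clip_gradient_norm(grads, max_norm):
    """Rescale the gradient vector if its L2 norm exceeds max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    return [g * max_norm / norm for g in grads]


weights = [0.5, -0.02, 0.003]
grads = [3.0, 4.0, 0.0]

clipped_w = clip_weights(weights, c=0.01)
clipped_g = clip_gradient_norm(grads, max_norm=1.0)
print(clipped_w)
print(clipped_g)
```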

What's the difference between 0 and 1?

What's the difference between wgail_info_0 and wgail_info_1?

Segfault while loading track

The below segfault occurs when running python drive.py with code 0. Any idea how to fix this? Thanks.

`Loading Track chenyi-street-1...
GfParmReadFile: Openning "tracks/road/chenyi-street-1/chenyi-street-1.xml" (0x233add0)
Loading Track Geometry...

Track Name no name
Track Author none
Track Length 0.00 m
Track Width 15.00 m
++++++++++++ Track ++++++++++++
name = no name
author = none
filename = tracks/road/chenyi-street-1/chenyi-street-1.xml
nseg = 0
version = 0
length = 0.000000
width = 15.000000
XSize = 0.000000
YSize = 0.000000
ZSize = 0.000000
Pits = none
Loading Track 3D Description...
GfParmReadFile: Openning "tracks/road/chenyi-street-1/chenyi-street-1.xml" (0x233b050)
Loading data/textures/background.png
Loading Environment Mapping Image env.png
Loading data/textures/env.png
Loading data/textures/env.png
Loading data/textures/envshadow.png
File shadow2.rgb not found
File Path was tracks/road/chenyi-street-1;data/img;data/textures;.
grSsgLoadTexState: File shadow2.rgb not found
WARNING: grscene:initBackground Failed to open shadow2.rgb for reading
WARNING: no shadow mapping on cars for this track
LoadAC3D loading track.ac
WARNING: ssgLoadAC: Failed to open ' tracks/road/chenyi-street-1/track.ac' for reading
/usr/local/bin/torcs: line 53: 20563 Segmentation fault (core dumped) $LIBDIR/torcs-bin -l $LOCAL_CONF -L $LIBDIR -D $DATADIR $*
Start driving ...
Waiting for server ............
Waiting for server ............
`

Generating new expert data

Could you share the code for extracting human expert data?
Thanks for sharing the code btw.

Some issues with keras.initializations in the code

When I try to run with the pretrained weights (python wgail_info_0/drive.py), I get the following error:
Traceback (most recent call last):
  File "/home/lunatic/下载/InfoGAIL-master/wgail_info_0/models.py", line 6, in <module>
    from keras.initializations import normal, identity, uniform
ImportError: No module named initializations

So I tried changing 'keras.initializations' to 'keras.initializers', but the code still fails to run.

Can you help me fix this problem?
Thanks for sharing the code. It's interesting to see the paper's experiments running in torcs, BTW!

What is the meaning of logstd in the code?

What is the meaning of logstd in the code? And how should its value be initialized? Why did you set it to [-2.8, -3.5] or [-3.5, -4]?
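
One reading, offered as an assumption rather than the authors' answer: in TRPO-style code, logstd usually denotes the log of the standard deviation of a Gaussian policy, one entry per action dimension. Under that reading the quoted values correspond to quite small standard deviations, as the sketch below computes. The mean action is a hypothetical policy output.

```python
import math
import random

logstd = [-2.8, -3.5]            # one of the settings quoted in the question
std = [math.exp(v) for v in logstd]

rng = random.Random(0)
mean_action = [0.1, 0.6]         # hypothetical policy-network output
# Sample each action dimension independently from N(mean, std^2).
sampled = [rng.gauss(m, s) for m, s in zip(mean_action, std)]

print([round(s, 4) for s in std])
print(sampled)
```

Storing the log keeps the standard deviation positive for any real value, and a more negative logstd means less exploration noise around the mean action.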

What is the lambda of mutual information L?

Hi,
The code shows:
output_d.flatten() * 0.1 + np.sum(np.log(output_p) * path["encodes"], axis=1)
Does this mean the lambda for L is effectively 10?
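
The arithmetic behind the question can be checked on a single made-up step. In the quoted line the discriminator term is weighted 0.1 and the posterior term 1, so multiplying the whole reward by 10 gives a discriminator weight of 1 and a posterior weight of 10. All numbers below are hypothetical.

```python
import math

d_out = 0.8                     # hypothetical discriminator output for one step
p_out = [0.7, 0.3]              # hypothetical posterior over two codes
code = [1.0, 0.0]               # one-hot latent code

# For a one-hot code this picks out log Q of the active code.
log_q = sum(math.log(p) * c for p, c in zip(p_out, code))

reward_as_quoted = d_out * 0.1 + log_q      # form used in the quoted line
reward_rescaled = d_out + 10.0 * log_q      # same reward multiplied by 10

print(reward_as_quoted, reward_rescaled)
```

So the ratio between the two terms is 10 to 1. Whether that ratio equals the paper's lambda depends on how the paper normalises the objective, which this page does not settle. Since the advantages are later divided by their standard deviation (see the models.py snippet in another issue here), the overall scale of the reward may matter less than this ratio.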

How to train for long-duration end-to-end driving

Hi @YunzhuLi,
I've trained a pass model, but I noticed that the parameter 'max_step_limit' defaults to 300, so the car can only drive for a short time.
The video on YouTube shows long end-to-end driving. Do I just need to change the value of 'max_step_limit'?
I tried changing it to 10000, but my GPU runs out of memory because the conjugate gradient method needs the whole batch of data. What should I do then?
By the way, I found that driving performance gets worse with long training (with max_step_limit at 300). What's wrong with my training?
Thank you.
