
fcn.tensorflow's People

Contributors

jakobu5, lababidi, shekkizh


fcn.tensorflow's Issues

[Solved] Problems with TensorFlow 1.0 and Windows

Hi there,

First, I wanted to say thanks for sharing! I'm working through the code to help with my own segmentation project and having something to work from is a big help.

Second, I came across a few issues (minor really) that I've figured out and wanted to share:

  • TensorFlow 1.0 replaced tf.pack() with tf.stack().
  • In TensorFlow 1.0, variables should be initialised using tf.global_variables_initializer()
  • On Windows, the path handling around os.path.splitext() should split on '\\' (or os.sep) rather than '/'. Otherwise the program can't find any files to pickle (the MITSceneParsing.pickle file ends up empty), which in turn means 0 records are found and the feed dict step fails. (See the sketch after this list.)
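
A minimal sketch of all three fixes together, assuming TensorFlow 1.0; the tensors and path below are only illustrative:

    import os
    import tensorflow as tf

    # 1) TensorFlow 1.0 renamed tf.pack() to tf.stack()
    a = tf.constant([1, 2])
    b = tf.constant([3, 4])
    stacked = tf.stack([a, b])  # was: tf.pack([a, b])

    # 2) Variables are now initialised with tf.global_variables_initializer()
    init_op = tf.global_variables_initializer()  # was: tf.initialize_all_variables()

    # 3) Split paths with os-aware helpers so Windows '\\' separators work
    name = os.path.splitext(os.path.basename(r"data\images\train_0001.png"))[0]

    with tf.Session() as sess:
        sess.run(init_op)
        print(sess.run(stacked), name)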

Like I said, pretty minor stuff, but I wanted to post in case anyone else had any issues.

Best regards,

Frazer

P.S. If you get an out of memory error, it's likely because you're trying to work with 20,000 images, which might be a bit too much. I deleted some of the training images and it worked.

FCN vs Deepmask

Hi, you have more experience in ML and computer vision so I just want to know your opinion on Image Segmentation.

What do you think will yield better results in a case of image segmentation, this implementation of FCN or Deepmask from Facebook? Can you also elaborate on why?

Right now I am still in the learning phase, so thank you for any insight.

ValueError: Cannot feed value of shape (1, 6000, 6000, 3, 1) for Tensor 'annotation:0', which has shape '(?, 6000, 6000, 1)'

Thanks for sharing your resource, Mr. Shekkizh. The annotations give me an error. I used Python 3.6 and TensorFlow 1.0. Is it caused by my working environment? I only changed the dataset.

setting up vgg initialized conv layers ...
Setting up summary op...
Setting up image reader...

Found pickle file!
40
8
Setting up dataset reader

Initializing Batch Dataset Reader...
{'resize': True, 'resize_size': 6000}
(40, 6000, 6000, 3)
(40, 6000, 6000, 3, 1)

Initializing Batch Dataset Reader...
{'resize': True, 'resize_size': 6000}
(8, 6000, 6000, 3)
(8, 6000, 6000, 3, 1)
Setting up Saver...

Traceback (most recent call last):

File "", line 1, in
runfile('C:/Users/Yared/Desktop/Project/FCN.py', wdir='C:/Users/Yared/Desktop/Project')

File "C:\Users\Yared\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
execfile(filename, namespace)

File "C:\Users\Yared\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "C:/Users/Yared/Desktop/Project/FCN.py", line 225, in
tf.app.run()

File "C:\Users\Yared\Anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 44, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))

File "C:/Users/Yared/Desktop/Project/FCN.py", line 196, in main
sess.run(train_op, feed_dict=feed_dict)

File "C:\Users\Yared\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 767, in run
run_metadata_ptr)

File "C:\Users\Yared\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 944, in _run
% (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))

ValueError: Cannot feed value of shape (1, 6000, 6000, 3, 1) for Tensor 'annotation:0', which has shape '(?, 6000, 6000, 1)'
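
The extra axes suggest the annotations were read as 3-channel RGB images and then given an additional singleton dimension; a hedged sketch of collapsing such a batch to the (?, H, W, 1) shape the placeholder expects (the array is a stand-in, and keeping channel 0 assumes the label is encoded identically in all three channels):

    import numpy as np

    ann = np.zeros((1, 6000, 6000, 3, 1), dtype=np.int32)  # stand-in for the bad batch
    ann = ann.squeeze(axis=4)   # drop the trailing singleton -> (1, 6000, 6000, 3)
    ann = ann[..., :1]          # keep a single channel       -> (1, 6000, 6000, 1)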

How can I solve these problems?

runfile('C:/Users/PROCOMP-9/Desktop/FCN.tensorflow-master/FCN.tensorflow-master/FCN.py', wdir='C:/Users/PROCOMP-9/Desktop/FCN.tensorflow-master/FCN.tensorflow-master')
setting up vgg initialized conv layers ...
Setting up summary op...
Setting up image reader...
Found pickle file!
0
0
Setting up dataset reader
Initializing Batch Dataset Reader...
{'resize': True, 'resize_size': 224}
(0,)
(0,)
Initializing Batch Dataset Reader...
{'resize': True, 'resize_size': 224}
(0,)
(0,)
Setting up Saver...
****************** Epochs completed: 1******************

Traceback (most recent call last):

  File "<ipython-input-1-6062f5716837>", line 1, in <module>
    runfile('C:/Users/PROCOMP-9/Desktop/FCN.tensorflow-master/FCN.tensorflow-master/FCN.py', wdir='C:/Users/PROCOMP-9/Desktop/FCN.tensorflow-master/FCN.tensorflow-master')

  File "C:\Users\PROCOMP-9\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
    execfile(filename, namespace)

  File "C:\Users\PROCOMP-9\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "C:/Users/PROCOMP-9/Desktop/FCN.tensorflow-master/FCN.tensorflow-master/FCN.py", line 223, in <module>
    tf.app.run()

  File "C:\Users\PROCOMP-9\Anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))

  File "C:/Users/PROCOMP-9/Desktop/FCN.tensorflow-master/FCN.tensorflow-master/FCN.py", line 194, in main
    sess.run(train_op, feed_dict=feed_dict)

  File "C:\Users\PROCOMP-9\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 766, in run
    run_metadata_ptr)

  File "C:\Users\PROCOMP-9\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 943, in _run
    % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))

ValueError: Cannot feed value of shape (0,) for Tensor 'input_image:0', which has shape '(?, 224, 224, 3)'
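
The 0 counts in the log above mean the reader found no training or validation records (most likely the Windows path-splitting problem from the first issue), so an empty array gets fed to input_image. Note that the empty file list is cached in MITSceneParsing.pickle, so after fixing the path handling the stale pickle must be deleted; a hedged sketch, with the path taken from the repo's default data_dir layout:

    import os

    # Delete the stale cache so the file list is rebuilt on the next run:
    pickle_path = os.path.join("Data_zoo", "MIT_SceneParsing", "MITSceneParsing.pickle")
    if os.path.exists(pickle_path):
        os.remove(pickle_path)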

How to train on Pascal dataset?

Hello,

I was wondering what needs to be done to get this working with the Pascal dataset. I see that the output placeholder channel size is hardcoded to 1, and the Pascal annotations are RGB images, so that would need to be changed to 3, along with the number of classes and so on. I've tried changing those, but it gives me an error in the loss function line, and I'm having a hard time understanding how logits, which has shape (?, ?, ?, num_classes), can be compared with y_output, which has shape (?, width, height, channel).

Also, a separate question: do you know how to compute intersection over union for the output?

Thanks

Edit: I spent a little time looking around, and it looks like I need to figure out how to map the colour-mapped segmentation labels given in the dataset to a 0-20 integer-indexed version. (See the sketch below.)
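
A hedged sketch of both pieces (the palette lookup and a per-class IoU) in plain numpy; the palette array and function names are illustrative, not from this repo:

    import numpy as np

    def rgb_to_index(label_rgb, palette):
        # Map an (H, W, 3) colour-coded label to (H, W) integer class indices.
        # palette is a (num_classes, 3) array of the dataset's label colours.
        index = np.zeros(label_rgb.shape[:2], dtype=np.int32)
        for cls, colour in enumerate(palette):
            index[np.all(label_rgb == colour, axis=-1)] = cls
        return index

    def mean_iou(pred, gt, num_classes):
        # Intersection-over-union per class, averaged over classes present.
        ious = []
        for cls in range(num_classes):
            inter = np.logical_and(pred == cls, gt == cls).sum()
            union = np.logical_or(pred == cls, gt == cls).sum()
            if union > 0:
                ious.append(float(inter) / union)
        return np.mean(ious)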

Test with single image

How can we test it with a single image, without giving it a training label/mask, to predict the segmentation?
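
One possible recipe, sketched under the assumption that the graph is built exactly as FCN.py builds it (image, keep_probability and pred_annotation come from that file; the checkpoint directory and file names here are illustrative):

    import numpy as np
    import scipy.misc as misc
    import tensorflow as tf

    # After building the graph as in FCN.py and creating `saver = tf.train.Saver()`:
    with tf.Session() as sess:
        saver.restore(sess, tf.train.latest_checkpoint("logs/"))
        img = misc.imresize(misc.imread("my_image.jpg"), [224, 224])
        pred = sess.run(pred_annotation,
                        feed_dict={image: np.expand_dims(img, 0),
                                   keep_probability: 1.0})  # 1.0 disables dropout
        misc.imsave("pred.png", pred[0].squeeze().astype(np.uint8))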

errors with loss

I use TensorFlow 1.0 GPU on Windows and I got this error:

Traceback (most recent call last):
  File "FCN.py", line 221, in <module>
    tf.app.run()
  File "C:\Users\SEELE\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\platform\app.py", line 44, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "FCN.py", line 152, in main
    loss = tf.reduce_mean((tf.nn.sparse_softmax_cross_entropy_with_logits(logits, tf.squeeze(annotation, squeeze_dims=[3]), name="entropy")))
  File "C:\Users\SEELE\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 1684, in sparse_softmax_cross_entropy_with_logits
    labels, logits)
  File "C:\Users\SEELE\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 1533, in _ensure_xent_args
    "named arguments (labels=..., logits=..., ...)" % name)
ValueError: Only call sparse_softmax_cross_entropy_with_logits with named arguments (labels=..., logits=..., ...)

How can I solve it?
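
TensorFlow 1.0 requires keyword arguments here (the positional order also changed, with labels first), so the fix is to name both arguments, exactly as a later issue in this list shows:

    loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits,
        labels=tf.squeeze(annotation, squeeze_dims=[3]),
        name="entropy"))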

AttributeError: 'module' object has no attribute 'get_model_data'

/usr/bin/python2.7 /home/yared/Desktop/Project/FCN.py
setting up vgg initialized conv layers ...
Traceback (most recent call last):
  File "/home/yared/Desktop/Project/FCN.py", line 225, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/default/_app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "/home/yared/Desktop/Project/FCN.py", line 150, in main
    pred_annotation, logits = inference(image, keep_probability)
  File "/home/yared/Desktop/Project/FCN.py", line 75, in inference
    model_data = utils.get_model_data(FLAGS.model_dir, MODEL_URL)
AttributeError: 'module' object has no attribute 'get_model_data'

A possible bug when resizing annotations.

If resize == True, the image and annotation are resized to a given size.
However, when the annotation is resized, the default interpolation method (bilinear) is used, which produces incorrect labels near the edges of objects.
So I think the interpolation method should be 'nearest' when resizing annotations.

Thanks a lot!
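
A sketch of the suggested change, assuming annotations are resized with scipy.misc.imresize as elsewhere in this codebase (annotation and resize_size stand for the reader's variables):

    import scipy.misc as misc

    # Bilinear interpolation blends class indices near object edges (a border
    # between class 3 and class 7 can average to a non-existent "class 5"),
    # so label maps must be resized with nearest-neighbour interpolation:
    resize_image = misc.imresize(annotation, [resize_size, resize_size],
                                 interp='nearest')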

Test the trained model?

Dear all,
How I can test the trained model on a group of new images?
thanks for your help.

[SOLVED] Loss won't decrease, predictions are all the same

Hi! I would like to reproduce your results.
Just running the code like python FCN.py doesn't seem to do the job for me.
The default parameters are:

  • IMAGE_SIZE = 224 (changing it to 256 does not affect the results)
  • learning_rate = 1e-4
  • batch_size = 2

What I get is that training and validation loss start at about 400, and very quickly (200 iterations) decrease until they settle to about 3.

Step: 0, Train_loss:415.754
2016-10-13 12:19:13.407670 ---> Validation_loss: 395.876
Step: 10, Train_loss:28.7208
Step: 20, Train_loss:10.2944
Step: 30, Train_loss:5.06159
Step: 40, Train_loss:4.51668
Step: 50, Train_loss:4.17936
Step: 60, Train_loss:4.55051
Step: 70, Train_loss:4.98752
Step: 80, Train_loss:3.63942
Step: 90, Train_loss:3.56676
Step: 100, Train_loss:3.96641
Step: 110, Train_loss:3.72767
Step: 120, Train_loss:3.26587
Step: 130, Train_loss:3.89015
Step: 140, Train_loss:5.48371
Step: 150, Train_loss:4.27173
Step: 160, Train_loss:3.81378
Step: 170, Train_loss:3.58391
Step: 180, Train_loss:2.79207
Step: 190, Train_loss:4.10269
Step: 200, Train_loss:4.57686
Step: 210, Train_loss:4.00551
Step: 220, Train_loss:3.1667
Step: 230, Train_loss:3.7841
Step: 240, Train_loss:3.74983
Step: 250, Train_loss:3.03212
Step: 260, Train_loss:2.85248
Step: 270, Train_loss:3.64257
Step: 280, Train_loss:3.765
Step: 290, Train_loss:4.16679
Step: 300, Train_loss:4.0291
Step: 310, Train_loss:3.95092
Step: 320, Train_loss:3.38709
Step: 330, Train_loss:2.48646
Step: 340, Train_loss:2.98015
Step: 350, Train_loss:3.59501
Step: 360, Train_loss:3.80755
Step: 370, Train_loss:3.73314
Step: 380, Train_loss:3.40185
Step: 390, Train_loss:3.89394
Step: 400, Train_loss:3.80676
Step: 410, Train_loss:2.78324
Step: 420, Train_loss:3.14695
Step: 430, Train_loss:3.29019
Step: 440, Train_loss:3.16163
Step: 450, Train_loss:3.64598
Step: 460, Train_loss:2.74009
Step: 470, Train_loss:3.93917
Step: 480, Train_loss:3.815
Step: 490, Train_loss:3.83076
Step: 500, Train_loss:4.45192
2016-10-13 12:24:10.606606 ---> Validation_loss: 3.02666

I kept it running up to 35000 iterations, which should be about 3.5 epochs, but the loss won't decrease any further.
If I then visualize the model at 35000 iterations (Train_loss: 2.73392, Validation_loss: 3.51286) with python FCN.py --mode visualize, I always get the same prediction, whatever the input image:

[attached image: prediction output]

This is also, by the way, the same prediction I get with an earlier model (200 iterations).

Is there something I'm getting wrong?
Thank you

regarding patch wise training and convolutional training

Hi Sarath,

Thanks for sharing the code. I have a question regarding generating the training data set.

In the FCN paper, the authors discuss the patch wise training and fully convolutional training. What is the difference between these two?

Please refer to section 4.4 attached in the following.

It seems to me that the training mechanism is as follows: assume the original image is M×M, then iterate over the M×M pixels to extract N×N patches (where N<M). The iteration stride can be some number like N/3, to generate overlapping patches. Moreover, assume each single image corresponds to 20 patches; then we can put these 20 patches, or 60 patches (if we want to have 3 images), into a single mini-batch for training. Is this understanding right? It seems to me that this so-called fully convolutional training is the same as patch-wise training.

[attached screenshot: section 4.4 of the FCN paper]

Fusing layer

According to the paper, we should add a 1×1 convolutional layer on top of pool4 to get a score for each class, and use that score to fuse with the final layer of FCN-32s. Finally, we use a deconv layer to get the target image.
However, in your implementation, you convert the final layer of FCN-32s, using a deconv layer, to have the same shape as the pool4 layer. Then you directly fuse pool4 with that score.
I just want to know whether the order of these operations matters.
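
For reference, a hedged sketch of the paper's ordering (score pool4 with a 1×1 convolution first, then add the 2× upsampled coarse scores); the tensor names and layer helpers are illustrative, not this repo's utils:

    import tensorflow as tf

    def fuse_with_pool4(pool4, coarse_score, num_classes):
        # 1x1 conv turns pool4 features (N, H/16, W/16, 512) into class scores.
        score_pool4 = tf.layers.conv2d(pool4, num_classes, 1)
        # Upsample the coarse scores (N, H/32, W/32, C) by 2x onto the same grid.
        up_score = tf.layers.conv2d_transpose(coarse_score, num_classes, 4,
                                              strides=2, padding='same')
        # Element-wise sum fuses the two streams, as in FCN-16s.
        return score_pool4 + up_score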

How to test and demo?

Hi, I have completed the training, but I don't know how to test it or how to run a demo. Can you help me?

some problem about the result(output map)

@shekkizh When I run FCN.py to train the network on my laptop, the loss drops from 400 to 3, but the segmentation map is terrible; I can hardly see the shape of the segmentation in the output map. Should I continue to train the network, or are there other factors causing this?

Train the model on our own dataset !

Hello, I've been trying to train the model on my own dataset. I've converted my annotations to grayscale so they have the same format as the MIT dataset, but when I launch the training I get a training loss of nan for all iterations. Is there something I'm doing wrong, or something I didn't take into consideration?
Thank you

[[Node: entropy/entropy = SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](entropy/Reshape, entropy/Reshape_1)]]

----- 7 198 197 196 196 197 196 196 201 197 197 198 190 166 133 130 131 131 132 132 132 132 132 133 133 133 134 129 114 104 96 69 17 10 10 28 124 133 132 132 132 132 132 131 131 131 131 131 130 131 133 132 132 132 133 133 132 132 132 132 132 132 132 132 132 132 132 133 134 133 133 133 133 133 133 133 133 133 133 127
[[Node: entropy/entropy = SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](entropy/Reshape, entropy/Reshape_1)]]

This error stopped the training. Is it related to the softmax?

    loss = tf.reduce_mean((tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits,
                                                                          labels=tf.squeeze(annotation, squeeze_dims=[3]),
                                                                          name="entropy")))

CUDA_ERROR_OUT_OF_MEMORY

Hi, @shekkizh ,

I got the following error:

root@milton-OptiPlex-9010:/data/code/FCN.tensorflow# python FCN.py 
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so.7.5 locally
setting up vgg initialized conv layers ...
Setting up summary op...
Setting up image reader...
Found pickle file!
20210
2000
Setting up dataset reader
Initializing Batch Dataset Reader...
{'resize_size': 224, 'resize': True}
(20210, 224, 224, 3)
(20210, 224, 224, 1)
Initializing Batch Dataset Reader...
{'resize_size': 224, 'resize': True}
(2000, 224, 224, 3)
(2000, 224, 224, 1)
E tensorflow/core/common_runtime/direct_session.cc:135] Internal: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OUT_OF_MEMORY; total memory reported: 18446744073709551615
Traceback (most recent call last):
  File "FCN.py", line 223, in <module>
    tf.app.run()
  File "/root/anaconda3/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "FCN.py", line 177, in main
    sess = tf.Session()
  File "/root/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1186, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/root/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 551, in __init__
    self._session = tf_session.TF_NewDeprecatedSession(opts, status)
  File "/root/anaconda3/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/root/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.

Could you suggest how to fix this error: "CUDA_ERROR_OUT_OF_MEMORY; total memory reported: 18446744073709551615"?

Why are you padding on the sixth convolutional layer?!

Hi,

On the sixth convolutional layer, the code is:

    W6 = utils.weight_variable([7, 7, 512, 4096], name="W6")
    b6 = utils.bias_variable([4096], name="b6")
    conv6 = utils.conv2d_basic(pool5, W6, b6)

At this stage, pool5 has size [batch_size, 7, 7, 512]. Now, as far as I understand, you are using a filter of size 7 by 7, in order to make your feature map of size [batch_size, 1, 1, 4096]. However, if you look at the code of conv2d_basic(...), the code is:

    def conv2d_basic(x, W, bias):
        conv = tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding="SAME")
        return tf.nn.bias_add(conv, bias)

The problem here is that the output of conv6 actually remains [batch_size, 7, 7, 4096] because you're using padding, and I am not sure that this is what we want. If we look at the official code, we'll see the sixth convolutional layer coded as:

layer {
    name: "fc6"
    type: "Convolution
    bottom: "pool5"
    top: "fc6"
    param {
        lr_mult: 1
        decay_mult: 1
    }
    param {
        lr_mult: 2
        decay_mult: 0
    }
    convolution_param {
    num_output: 4096
    pad: 0
    kernel_size: 7
    stride: 1
    }
}

They aren't using padding in this layer, which means that conv6 is actually of size [batch_size, 1, 1, 4096]. Pretty sure this is the entire point of conv6.
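
For comparison, a hedged sketch of what a padding-free variant of that layer would look like (reusing the conv2d_basic shape conventions above; this is not the repo's actual code):

    import tensorflow as tf

    def conv2d_no_pad(x, W, bias):
        # With a 7x7 filter and VALID padding, a (N, 7, 7, 512) pool5 becomes
        # a (N, 1, 1, 4096) output, matching the Caffe layer quoted above.
        conv = tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding="VALID")
        return tf.nn.bias_add(conv, bias)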

Am I missing something, or was that part of the code a mistake on your part?

Anyway, cheers for the code. The cleanest TF implementation of FCN I have seen so far.

Line 152 in FCN.py

Sorry to bother you, Mr. Shekkizh. I just have a problem when I run this code; I get the following:

Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tensorflow/python/framework/tensor_shape.py", line 547, in merge_with
self.assert_same_rank(other)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tensorflow/python/framework/tensor_shape.py", line 593, in assert_same_rank
"Shapes %s and %s must have the same rank" % (self, other))
ValueError: Shapes (?, ?, ?, 151) and (?, ?) must have the same rank

And I find out that this exception comes from here
    loss = tf.reduce_mean((tf.nn.sparse_softmax_cross_entropy_with_logits(logits, tf.squeeze(annotation, squeeze_dims=[3]), name="entropy")))

I tested the shapes of logits and tf.squeeze(annotation, squeeze_dims=[3]); they are (?, ?, ?, 151) and (?, 224, 224) respectively.
So is something wrong here? I'm a little confused and hoping for your advice. Thanks a lot.

higher order of channel?

Is there any way to use this model for images with more than 3 channels, like 4 or 5 channels?

Line 47 BatchDatsetReader.py

Hi @shekkizh
First, thank you for sharing the code with us.
I have a question: the if statement on line 47 of BatchDatsetReader says:

if self.image_options.get("resize", False) and self.image_options["resize"]:
    resize_size = int(self.image_options["resize_size"])
    resize_image = misc.imresize(image,
                                  [resize_size, resize_size], interp='nearest')
else:
     resize_image = image

I think it should be:

if self.image_options.get("resize", True) and self.image_options["resize_size"]:
    resize_size = int(self.image_options["resize_size"])
    resize_image = misc.imresize(image,
                                [resize_size, resize_size], interp='nearest')
else:
    resize_image = image

Am I misinterpreting the parameters?

Thanks!
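
For what it's worth, a quick way to compare the two conditions (the dictionaries below are illustrative):

    image_options = {'resize': True, 'resize_size': 224}

    # Original: .get("resize", False) returns False when the key is absent,
    # so resizing stays off unless the caller asked for it.
    print(image_options.get("resize", False) and image_options["resize"])  # True
    print({}.get("resize", False))                                         # False

    # Proposed: .get("resize", True) would default to resizing and then
    # raise KeyError on a missing "resize_size".
    print({}.get("resize", True))                                          # True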

How is this executed?

Hi there!

This setup looks very useful to me! I am trying to run this on my own set of images and labels, but I don't see any instructions on how to execute the app. Could you perhaps shed some light on this?

Thanks in advance!
Lilly

The loss is Nan

I trained on my own dataset, which has only two classes, so I set NUM_CLASSES to 2, and the loss turned out to be nan. If I change NUM_CLASSES to 3 or 151 without changing my dataset, it works.
I'm very confused by this; please help me.

I have tried decreasing the learning rate to 1e-7 and 1e-8, but it didn't work.
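
One common cause, offered as a guess rather than a confirmed diagnosis: sparse softmax cross-entropy produces nan or undefined losses if any annotation pixel holds a value outside [0, NUM_CLASSES - 1] (for example 255 as an "ignore" colour), which setting NUM_CLASSES to 3 or 151 can mask. A quick numpy check (path illustrative):

    import numpy as np
    import scipy.misc as misc

    ann = misc.imread("annotations/training/sample.png")  # illustrative path
    print("label values present:", np.unique(ann))
    # For two classes, everything must be 0 or 1; remap anything else, e.g.:
    ann = (ann > 0).astype(np.uint8)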

Slight modification, large loss values?

Thanks for your work here.

I'm trying to train a slightly modified* version of this network.

In the figure under the 'Observations' heading on the main GitHub page for this repo (the image at logs/images/sparse_entropy.png), is the entropy on the y-axis the training or validation loss, with iterations on the x-axis? If so, I seem to be getting huge loss values in comparison. The plot shows an entropy of 4.5 initially, decreasing to around 3.5 after 2500 iterations, whereas I'm getting an initial loss of 300-600, with training loss hovering between 30 and 100 by iteration 2500, and validation loss somewhere between 70 and 100 at iteration 2500.

Are these reasonable values, or has something gone seriously wrong here?

(*Details of the modified version of the network:

  • I've reintroduced relu5_3 between conv5_3 and pool5
  • I'm loading the parameters for the first two fully connected layers of VGG-19 from vlfeat.org/matconvnet/models/beta16/imagenet-vgg-verydeep-19.mat instead of initialising them randomly
  • Due to memory constraints, I'm holding all parameters fixed apart from the last fully connected layer of VGG-19 and the deconvolution/upsampling layers. )

Why Accuracy and Loss are not corresponding?

I have implemented the code and extended it to log statistics (accuracy and loss). I expected accuracy and loss to correspond in TensorBoard, but unfortunately the graphs do not agree (see attachment). Does anyone know why?
Thanks in Advance!
[attached TensorBoard graphs: accuracy and loss]

how to decrease space used for logging?

Dear all,
I have limited hard disk space on a shared server.
Is logging necessary at all? If not, how can I use a smaller space for logging?
How can I decrease the size needed for logging?
Please guide me step by step.
Any comments appreciated.
Thanks

Training loss remains nan

Hi,

I want to use the code to train on my own image data.
The images are 512×512 grayscale.
But when I train with them, the loss remains nan:

Setting up Saver...
Step: 0, Train_loss:nan
2017-03-17 13:55:04.919050 ---> Validation_loss: nan
Step: 10, Train_loss:nan
Step: 20, Train_loss:nan
Step: 30, Train_loss:nan
Step: 40, Train_loss:nan

In BatchDatsetReader I changed the way images are read, like this:

def _read_images(self):
    self.__channels = False
    self.images = np.array(
        [np.expand_dims(self._transform(filename['image']), axis=3) for filename in self.files])

But it did not work.

How can I solve this problem so I can train on my grayscale images?
And what is the meaning of NUM_OF_CLASSESS?
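
A hedged sketch of one workaround: instead of changing the reader's channel logic, replicate the single grey channel three times so images match the (H, W, 3) input the VGG-initialised layers expect (function name illustrative):

    import numpy as np

    def gray_to_rgb(image):
        # Turn an (H, W) grayscale array into (H, W, 3) by channel replication.
        if image.ndim == 2:
            image = np.stack([image] * 3, axis=-1)
        return image

    # NUM_OF_CLASSESS is the number of segmentation classes the net predicts;
    # annotation pixel values must lie in [0, NUM_OF_CLASSESS - 1].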

relu5_3

Hello,

Is there a particular reason why the inference continues from the "conv5_3" level of the vgg_net and not from "relu5_3"? I mean, the input to pool5 is conv5_3 and not relu5_3.

What am i missing?

Thanks!

GPU training Nan loss

Hi developers,

I encountered a weird issue. I tried the code on our Lab server, it worked well.

Because I recently added an external graphics card to my MacBook Pro, I wanted to try the model on it. The strange thing is, the graphics card in the lab server is identical to the one on my laptop (a 980 Ti), but on the MBP I got nan loss after a few steps:

Step: 880, Train_loss:4.60517
Step: 890, Train_loss:4.10488
Step: 900, Train_loss:3.88846
Step: 910, Train_loss:3.37081
Step: 920, Train_loss:2.04156
Step: 930, Train_loss:3.50961
Step: 940, Train_loss:nan
Step: 950, Train_loss:nan
Step: 960, Train_loss:nan
Step: 970, Train_loss:nan
Step: 980, Train_loss:nan
Step: 990, Train_loss:nan
Step: 1000, Train_loss:nan
2017-06-17 01:29:08.570643 ---> Validation_loss: nan
Step: 1010, Train_loss:nan
Step: 1020, Train_loss:nan

I googled for a while but did not find an answer. Do you know why? It might not be relevant to the code itself, but it would be good to hear any hints you have :)

Final convolution layer 5_3 and not 5_4?

Hello!

I've been reading up on the paper as well as reading your code to get a good grasp of how to do image segmentation using ConvNets. I was wondering why, in FCN.py line 86, you set the final conv layer to 5_3 instead of 5_4? Also, to clarify for FCN in general: we're changing the fully connected layers to convolution layers as well, which is what you're doing from lines 87 to 108, and then afterwards you start working backwards, "deconvolving" the image, which it appears you do three times?

Thanks.

Prediction output of size 224x224 as opposed to original dimensions

Hey again!

Sorry for opening another issue, but I am in the home stretch of getting this codebase suited to my data and almost everything is coming together.

So the output predictions are of size 224x224x1 and for my purposes I need to reconstruct the images back together with their geo-location intact. In order to do so however, I need the output predictions to be of size 256x256x1.

Do you know if there is a way to restore the predictions to those dimensions without changing the segmentation mapping or resolution?

Thank you again for both this awesome repo and for your help!!
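
A hedged sketch of one approach: since the prediction is a label map, it can be upsampled from 224×224 back to 256×256 with nearest-neighbour index mapping, which preserves the class ids exactly (no interpolation invents new values):

    import numpy as np

    pred = np.zeros((224, 224), dtype=np.uint8)  # stand-in for a real prediction
    idx = np.arange(256) * 224 // 256            # nearest source row/col per target pixel
    pred_256 = pred[idx][:, idx]                 # (256, 256), same class ids as pred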

The order of the fc layers

Do you know how to load the fully connected layers the way you load the conv layers?
Does the following still hold for fully connected layers?

matconvnet: weights are [width, height, in_channels, out_channels]

tensorflow: weights are [height, width, in_channels, out_channels]

The output of pool5 for an image of shape [1, 224, 224, 3] is [1, 7, 7, 512], and the fc6 weights are [7, 7, 512, 4096], so how do we rearrange the MatConvNet weights to fit TensorFlow's format?
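
A hedged sketch of the axis swap, extending the transpose the repo applies to conv weights to fc6 treated as a 7×7 convolution; the array here is random stand-in data:

    import numpy as np

    # MatConvNet stores [width, height, in_channels, out_channels];
    # TensorFlow expects [height, width, in_channels, out_channels].
    fc6_matconvnet = np.random.randn(7, 7, 512, 4096).astype(np.float32)
    fc6_tensorflow = np.transpose(fc6_matconvnet, (1, 0, 2, 3))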

How can I train arbitrary sized images?

Hello, thanks for your code firstly.

I found this problem was previously discussed in issue #18, sadly... I tried changing the code to image_options = {'resize': False} and changing IMAGE_SIZE to None in the image and annotation placeholders, but it always throws "ValueError: setting an array element with a sequence." So it still can't be solved?

But I really want a predicted image whose size is the same as the original (un-resized) input image. How can I solve this problem?

Thanks in advance!
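
A hedged sketch of the placeholder change; note that "setting an array element with a sequence" usually comes from calling np.array() on differently-sized images, which cannot be stacked into one batch, so arbitrary sizes also mean batch size 1 with images fed one at a time:

    import tensorflow as tf

    # A fully convolutional net accepts variable spatial dimensions:
    image = tf.placeholder(tf.float32, shape=[None, None, None, 3],
                           name="input_image")
    annotation = tf.placeholder(tf.int32, shape=[None, None, None, 1],
                                name="annotation")
    # Feed one un-resized image per step: feed_dict={image: img[None, ...], ...}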

Inference using large amount of GPU memory

Thanks so much for your wonderful work here. I have been able to modify this code to train some really accurate segmentation models. I am now trying to get one of them running on a Jetson TX1, but I am having some issues. I have TensorFlow 1.0.1 installed and running correctly on the TX1, but when I try to run the --visualize setting to just do inference, I run out of memory. I went back to my regular desktop, which has a Titan X Pascal, and did some tests using nvidia-smi to see how much memory was being used. It appears that even during inference it uses over 10 GB of GPU memory on my system.

Here is the output from sudo watch nvidia-smi while doing inference only:
Before running: 745MiB / 12183MiB
While Running: 11630MiB / 12183MiB

Do you have any idea why that would be happening? Any ideas on how to reduce this to 1.5 GB or less for inference? I can see why it would need that much memory for training, but I am not sure why it would do so for inference.
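
By default TensorFlow reserves most of the GPU's memory at session creation regardless of what the graph needs, so nvidia-smi overstates actual usage; a hedged sketch of capping the allocation (the fraction is illustrative):

    import tensorflow as tf

    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True                     # allocate on demand
    config.gpu_options.per_process_gpu_memory_fraction = 0.12  # hard cap (~1.5 GB of 12 GB)
    sess = tf.Session(config=config)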

How to visualize the prediction?

I have run the command python FCN.py --mode=visualize, but I got a result far different from the author's. Here is my result:

[attached image: visualization result]

Does anyone know what I should do, and how to apply the whole model to my own dataset and test set?

Thanks.

The output are always an black picture

Hello!
I just ran your code for 10 minutes, and the loss quickly converged to 3. But when I go to TensorBoard, the prediction is always a black picture. Why does this happen? Is there anything wrong with the parameters?
Thanks!

training termination

Dear all,
I have two questions about the training process:
1. What is the termination criterion for learning? It seems to run for a very long time, which may not be needed for some problems.
2. How can I stop training manually and restore the last saved net (structure and weights) to use for testing? In other words, if I stop training manually (I don't know how), can I use the saved net for testing? (See the sketch below.)
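
On question 2: training can simply be interrupted (e.g. Ctrl-C) once a checkpoint has been written; a hedged sketch of restoring the latest checkpoint for testing, assuming the graph is rebuilt first and checkpoints live in the logs directory:

    import tensorflow as tf

    # After rebuilding the same graph used for training:
    saver = tf.train.Saver()
    with tf.Session() as sess:
        ckpt = tf.train.get_checkpoint_state("logs/")
        if ckpt and ckpt.model_checkpoint_path:
            saver.restore(sess, ckpt.model_checkpoint_path)
        # ...run the test/visualize ops here...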
