unsupervised_detection's People

Contributors

antonilo, masahiroogawa

unsupervised_detection's Issues

Can this repo run prediction on my own video?

I want to detect the moving objects in my own video, 001.mp4. I changed todaiura_traffic.MOV to 001.mp4 in test_video.sh, but when I run the script I get tensorflow.python.framework.errors_impl.NotFoundError: download/video/JPEGImages/480p/todaiura_traffic/00001.jpg; No such file or directory. The error is raised at inference = learner.inference(sess) in test_generator.py.
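
The error message suggests that the test pipeline reads pre-extracted JPEG frames rather than the video file itself. The sketch below only rebuilds the path from the error message, assuming the layout it shows (`<root>/JPEGImages/480p/<sequence>/<five-digit index>.jpg`); `expected_frame_path` is a hypothetical helper, not part of the repository.

```python
import posixpath


def expected_frame_path(root, sequence, index):
    """Build the frame path the loader appears to look for.

    The layout is inferred from the error message only, so treat it as an
    assumption: <root>/JPEGImages/480p/<sequence>/<5-digit index>.jpg
    """
    return posixpath.join(root, "JPEGImages", "480p", sequence,
                          "{:05d}.jpg".format(index))


# Reproduces the path quoted in the error message.
print(expected_frame_path("download/video", "todaiura_traffic", 1))
```

Under that assumption, the frames of 001.mp4 would need to exist in a folder named after the sequence, and the sequence name used by the script would need to match it.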

How to use in the wild?

How would one go about using the network in the wild, on a random video sequence/sequence of frames?

Why does the network learn the moving segment instead of its complement?

Hi @antonilo
Thank you for sharing your impressive work. I ran your code and got results similar to yours.
However, I do not understand how you ensure that the network learns the moving segment rather than its complement. It seems that the moving mask and its complement are trained in a completely symmetrical way.

Image size of the predicted masks on FBMS

The provided FBMS results are not consistent with the ground-truth images in terms of image size. For example, the result image for the first frame of cars10 is 384x192, but the corresponding ground-truth image is 640x480.
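
One way to compare masks of different sizes is to resize the prediction to the ground-truth resolution before scoring. The following is a minimal pure-Python sketch of nearest-neighbour resizing for a binary mask stored as a list of rows; `resize_nearest` is a hypothetical helper, and whether the authors evaluated at 384x192 or at the original resolution is not stated in this thread.

```python
def resize_nearest(mask, new_h, new_w):
    """Resize a 2D mask (list of rows) with nearest-neighbour sampling."""
    old_h, old_w = len(mask), len(mask[0])
    out = []
    for y in range(new_h):
        # Pick the source row that covers this output row.
        src_y = min(old_h - 1, y * old_h // new_h)
        row = [mask[src_y][min(old_w - 1, x * old_w // new_w)]
               for x in range(new_w)]
        out.append(row)
    return out


small = [[0, 1],
         [1, 0]]
big = resize_nearest(small, 4, 4)
```

Nearest-neighbour sampling keeps the mask binary, which is why it is used here instead of an interpolating filter.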

Result of post-processing

I ran the provided model and the post-processing code on the DAVIS 2016 dataset. The final result is 69.3, which is lower than the 71.5 reported in the paper. I am not sure whether my result is correct.

About testing new datasets

Hi! Thank you for this repo. I was wondering: in order to test on our own dataset, do we have to create object masks along with the original images? I prepared my annotations with the PixelAnnotation tool, but the detections were not on the same level as your results. Was my method correct? Or do you have another way of preparing annotations and working with the test bash script?

data/segtrackv2_data_utils.py does not work as expected

I tried to run the train.py script with SegTrack v2 and ran into problems with the format of the database. I could understand it if the database had changed over the years, but the database page says that its last update was on "December 9th, 2013", while the code for reading the database was last updated on "June 7th, 2019".

First things first: the script looks for images directly in GroundTruth, but some of the experiments have multiple objects in motion, so the ground truth of such an experiment does not contain images but one folder of images per object. Examples are "bmx" and "penguin".

Second: the database contains raw images and an annotation for each of them. The raw images have different extensions (for example frog.png, girl.bmp), and so do the annotation images (for example bmx.png, cheetah.bmp). The code can only read PNG files and throws an error when it finds any other format. The fix would be rather simple.
Old code:

# Line 58
        for exp_fname in all_exp_fnames:
            current_filenames.append(os.path.join(self.image_dirs,
                                                  experiment, exp_fname + '.png'))
            assert os.path.isfile(current_filenames[-1]), \
                "Not found image {}".format(current_filenames[-1])
            current_annotations.append(os.path.join(self.annotation_dir,
                                                    experiment, exp_fname + '.png'))
            assert os.path.isfile(current_annotations[-1]), \
                "Not found image {}".format(current_annotations[-1])
            self.samples += 1

Fixed code:

# Line 58
        for exp_fname in all_exp_fnames:
            file_name = os.path.join(self.image_dirs, experiment, exp_fname + '.png')
            if not os.path.isfile(file_name):
                file_name = os.path.join(self.image_dirs, experiment, exp_fname + '.bmp')
            assert os.path.isfile(file_name), \
                "Not found image {}".format(file_name)
            current_filenames.append(file_name)

            annot_name = os.path.join(self.annotation_dir, experiment, exp_fname + '.png')
            if not os.path.isfile(annot_name):
                annot_name = os.path.join(self.annotation_dir, experiment, exp_fname + '.bmp')
            assert os.path.isfile(annot_name), \
                "Not found image {}".format(annot_name)
            current_annotations.append(annot_name)
            self.samples += 1
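
The same fallback can be written once and reused for both images and annotations. This is a minimal sketch, not the repository's code: `find_with_extension` is a hypothetical helper, and the default extension list only covers the two formats mentioned above.

```python
import os


def find_with_extension(directory, stem, extensions=(".png", ".bmp")):
    """Return the first existing file `stem + ext` in `directory`.

    Raises FileNotFoundError if none of the candidates exists.
    """
    for ext in extensions:
        candidate = os.path.join(directory, stem + ext)
        if os.path.isfile(candidate):
            return candidate
    raise FileNotFoundError(
        "Not found image {} with any of {}".format(
            os.path.join(directory, stem), extensions))
```

Extensions are tried in order, so a PNG is preferred when both formats exist for the same frame.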

Suppose you work around those problems (by creating an image list that contains only the experiments with a single moving object, and by updating the code as above). A new error is then thrown, which I believe also comes from the way the database is read. I have tracked it down to adversarial_learner.py, at results = sess.run(fetches, feed_dict={self.is_training: True}). The error looks like this:

Traceback (most recent call last):
  File "C:\...\Anaconda3\envs\tf37\lib\site-packages\tensorflow\python\client\session.py", line 1334, in _do_call
    return fn(*args)
  File "C:\...\Anaconda3\envs\tf37\lib\site-packages\tensorflow\python\client\session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "C:\...\Anaconda3\envs\tf37\lib\site-packages\tensorflow\python\client\session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected image (JPEG, PNG, or GIF), got unknown format starting with 'BM6\334\005\000\000\000\000\0006\000\000\000(\000'
	 [[{{node DecodeJpeg}}]]
	 [[{{node data_loading/IteratorGetNext}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/.../unsupervised_detection-master/train.py", line 46, in <module>
    main(sys.argv)
  File "C:/.../unsupervised_detection-master/train.py", line 43, in main
    _main()
  File "C:/.../unsupervised_detection-master/train.py", line 34, in _main
    trl.train(FLAGS)
  File "C:\...\unsupervised_detection-master\models\adversarial_learner.py", line 398, in train
    feed_dict={self.is_training: True})
  File "C:\...\Anaconda3\envs\tf37\lib\site-packages\tensorflow\python\client\session.py", line 929, in run
    run_metadata_ptr)
  File "C:\...\Anaconda3\envs\tf37\lib\site-packages\tensorflow\python\client\session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\...\Anaconda3\envs\tf37\lib\site-packages\tensorflow\python\client\session.py", line 1328, in _do_run
    run_metadata)
  File "C:\...\Anaconda3\envs\tf37\lib\site-packages\tensorflow\python\client\session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected image (JPEG, PNG, or GIF), got unknown format starting with 'BM6\334\005\000\000\000\000\0006\000\000\000(\000'
	 [[{{node DecodeJpeg}}]]
	 [[node data_loading/IteratorGetNext (defined at C:\...\unsupervised_detection-master\data\segtrackv2_data_utils.py:216) ]]

Process finished with exit code 1

Can you help me solve this issue and update the code so that it will be able to read the SegTrackV2 properly?
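
The bytes quoted in the error start with `BM`, which is the BMP file signature, so the JPEG decoder is being handed a BMP file. The sketch below checks the first bytes of a file against the common signatures; `sniff_image_format` is a hypothetical helper, useful for finding which files in a dataset would trip the decoder. Converting those files to PNG beforehand is one possible workaround, although it has not been verified against this repository.

```python
def sniff_image_format(header):
    """Guess an image format from the first bytes of a file."""
    if header.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if header.startswith((b"GIF87a", b"GIF89a")):
        return "gif"
    if header.startswith(b"BM"):
        return "bmp"
    return "unknown"


# First bytes as quoted in the error message (octal escapes kept as printed).
print(sniff_image_format(b"BM6\334\005\000\000"))
```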

Why does the model I trained perform much worse on DAVIS 2016 than the trained model provided on your project web page?

After training for 40 epochs, my test result on DAVIS 2016 is:
"The Average over the dataset: IoU is 0.49253097810762825 and MAE is 0.13433317588858826
The Average over sequences IoU is 0.4999786164052608"

The test result of the davis_best_model you provide, evaluated on my machine, is:
"The Average over the dataset: IoU is 0.598809920117678 and MAE is 0.06665062332569166
The Average over sequences IoU is 0.597369491694537"

I wonder what causes the difference.
Is it because you trained for more epochs, or are there other reasons?

I look forward to your reply.
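
For reference, the two metrics in the logs above can be sketched as follows for masks flattened into sequences. This assumes the usual definitions of IoU and MAE; the repository's evaluation code may threshold or average differently, so the functions are illustrative only.

```python
def iou(pred, gt):
    """Intersection over union of two binary masks (flat sequences of 0/1)."""
    inter = sum(1 for p, g in zip(pred, gt) if p and g)
    union = sum(1 for p, g in zip(pred, gt) if p or g)
    # Two empty masks agree perfectly.
    return inter / union if union else 1.0


def mae(pred, gt):
    """Mean absolute error between a soft mask and a binary ground truth."""
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(gt)
```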

Simple inference Script

I am new to TensorFlow 1.x and want to run inference directly on a video.
I have experience with PyTorch and Keras, but I am finding it a little hard to do this with TensorFlow 1.x.
Could you please provide a simple script that loads the model and runs inference on a single image?

Question about optical flow resize

Hello antonilo, thank you for such an outstanding job! At line 89 of adversarial_learner.py you resize the optical flow with tf.image.resize. Shouldn't the flow values also be multiplied by a scale factor after resizing, so that the displacements stay consistent with the new resolution? Here is some sample code:
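import torch


def resize_flow(flow, new_shape):
    # flow has shape (N, 2, H, W); channel 0 is rescaled by width, channel 1 by height.
    _, _, h, w = flow.shape
    new_h, new_w = new_shape
    flow = torch.nn.functional.interpolate(flow, (new_h, new_w),
                                           mode='bilinear', align_corners=True)
    # Rescale the displacements by the same ratio as the image size.
    scale_h, scale_w = h / float(new_h), w / float(new_w)
    flow[:, 0] /= scale_w
    flow[:, 1] /= scale_h
    return flow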
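
The arithmetic behind the question can be shown on a single flow vector. This is a minimal sketch with a hypothetical helper, `rescale_flow_vector`; whether the repository needs this correction depends on how the flow is used downstream, which the question leaves open.

```python
def rescale_flow_vector(u, v, old_size, new_size):
    """Rescale one flow vector (u, v) when the image is resized.

    old_size and new_size are (height, width) tuples. The horizontal
    component u scales with the width ratio and v with the height ratio.
    """
    old_h, old_w = old_size
    new_h, new_w = new_size
    return u * new_w / old_w, v * new_h / old_h


# Halving both dimensions halves the displacement measured in pixels.
print(rescale_flow_vector(8.0, 4.0, (100, 200), (50, 100)))
```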

Post-processing: soft masks at the original resolution?

Hello,
I'm trying to get the soft-score result masks from the post-processing script at the original resolution of the input DAVIS images. When I run it, I get the soft-score results at 384x192 and the CRF result (boolean 0-1) at the original resolution (854x480).
Does anyone know whether I can achieve that by editing the post_processing script?
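
If the script itself cannot be changed easily, the soft masks could be upsampled afterwards. The following is a minimal pure-Python sketch of bilinear resizing for a soft mask stored as a list of rows of floats; `resize_bilinear` is a hypothetical helper and is not taken from the post-processing code. In practice an image library would do the same job faster.

```python
def resize_bilinear(mask, new_h, new_w):
    """Resize a 2D soft mask (list of rows of floats) bilinearly.

    Uses the align-corners convention: the corner pixels of the input and
    the output coincide.
    """
    old_h, old_w = len(mask), len(mask[0])
    out = []
    for y in range(new_h):
        # Map the output row to a fractional input row.
        fy = y * (old_h - 1) / (new_h - 1) if new_h > 1 else 0.0
        y0 = int(fy)
        y1 = min(y0 + 1, old_h - 1)
        wy = fy - y0
        row = []
        for x in range(new_w):
            fx = x * (old_w - 1) / (new_w - 1) if new_w > 1 else 0.0
            x0 = int(fx)
            x1 = min(x0 + 1, old_w - 1)
            wx = fx - x0
            # Interpolate along x on both rows, then along y.
            top = mask[y0][x0] * (1 - wx) + mask[y0][x1] * wx
            bottom = mask[y1][x0] * (1 - wx) + mask[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out
```

Unlike nearest-neighbour sampling, this keeps the scores continuous, which matters when the soft mask is thresholded or fed to a CRF later.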

train_DAVIS2016.sh

Hello, I'm working with train_DAVIS2016.sh and would like to ask whether these parameters need to be modified:
root_dir: dataset location
flow_ckpt: location of pwcnet.ckpt-595000
checkpoint_dir: checkpoint location

I have modified them, but now these errors are reported:
--flow_normalizer=80.0: command not found
--test_temporal_shift=1: command not found

I would also like to ask what these parameters mean.
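
A "command not found" message for an option such as `--flow_normalizer=80.0` means the shell treated that line as a new command. This typically happens when the backslash at the end of the previous line is missing or is followed by whitespace. The sketch below flags such lines; `find_broken_continuations` is a hypothetical helper, and the example script text (including the `/data/DAVIS` path) is made up for illustration.

```python
def find_broken_continuations(script_text):
    """Return 1-based line numbers of option lines not joined to a command.

    A line starting with "--" is flagged when the previous line does not end
    with a backslash as its very last character.
    """
    lines = script_text.splitlines()
    broken = []
    for i, line in enumerate(lines):
        if line.lstrip().startswith("--"):
            if i == 0 or not lines[i - 1].endswith("\\"):
                broken.append(i + 1)
    return broken


example = (
    "python train.py \\\n"
    "--root_dir=/data/DAVIS \\ \n"  # trailing space after the backslash
    "--flow_normalizer=80.0 \\\n"
    "--test_temporal_shift=1\n"
)
print(find_broken_continuations(example))
```

In the example, the third line is flagged because the edited line above it ends in a space instead of a backslash.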

A bug in data/davis2016_data_utils.py

Lines 263-266 are:

first_fname_numbers.append(np.arange(N, N + len(fnames) - t_len,
                                    dtype=np.int32))
last_fname_numbers.append(np.arange(N + len(fnames) - t_len, N + len(fnames),
                                    dtype=np.int32))

The second call should be:

last_fname_numbers.append(np.arange(N + t_len, N + len(fnames),
                                    dtype=np.int32))

Otherwise first_fname_numbers and last_fname_numbers will not have the same number of elements.
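
The length mismatch can be checked without NumPy, since `range` yields the same integers as `np.arange` for integer arguments. This is a minimal sketch: `index_ranges` is a hypothetical helper that mirrors the two variants quoted above, and whether equal lengths are what the loader needs is the claim made in this report, not something verified here.

```python
def index_ranges(n_start, n_frames, t_len, fixed=True):
    """Return (first, last) frame-index ranges for one sequence.

    n_start plays the role of N and n_frames of len(fnames) in the
    snippet quoted above.
    """
    first = range(n_start, n_start + n_frames - t_len)
    if fixed:
        last = range(n_start + t_len, n_start + n_frames)
    else:
        last = range(n_start + n_frames - t_len, n_start + n_frames)
    return first, last


first, last_old = index_ranges(0, 10, 2, fixed=False)
_, last_new = index_ranges(0, 10, 2, fixed=True)
print(len(first), len(last_old), len(last_new))
```

With 10 frames and t_len = 2, the original second range has 2 elements while the first has 8; the proposed version has 8 and pairs each index i with i + t_len.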

Training new datasets

Hello,
I wanted to train on a new dataset and used the script train_DAVIS2016.sh to do so (I also filled the training set folder with annotations and images as in the DAVIS dataset). When I try to run it, I get the following error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: buffer_size must be greater than zero.

Has anyone tried to use a different dataset? If so, did you make any changes to the scripts?

EDIT:
The problem was that I had put the images directly inside the JPEG and annotations folders. Instead, I should have created a folder inside each of them and put the images inside that folder.
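
A quick check of the layout described in the edit: every sequence should be its own subfolder, and files placed directly in the root are not counted. This is a minimal sketch with a hypothetical helper, `list_sequences`; the link to the `buffer_size` error (an empty dataset because no sequence folders were found) is an assumption based on this report.

```python
import os


def list_sequences(image_root):
    """Map each sequence subfolder of image_root to its number of files."""
    sequences = {}
    for name in sorted(os.listdir(image_root)):
        path = os.path.join(image_root, name)
        # Files lying directly in image_root are skipped on purpose.
        if os.path.isdir(path):
            sequences[name] = len(os.listdir(path))
    return sequences
```

An empty result for the image or annotation root would indicate the misplacement described above.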

Post processing doesn't start

Hello,
when I run the post-processing script I get the following error:
import pyflow
ImportError: /home/lele/unsupervised_detection/post_processing/pyflow.so: undefined symbol: _Py_ZeroStruct
I tried installing pyflow with "pip3 install pyflow", but that does not solve the problem.

SOLVED:
The problem was the file pyflow.so in the folder. After deleting it the script runs, but there was another error:
pyflow has no attribute coarse2fine
This is because I had installed a different version of pyflow.
To get the exact version, I had to clone https://github.com/pathak22/pyflow
and then run:
sudo pip3 install pyflow/
where pyflow/ is the cloned folder.

Hope this helps others

make_initializable_iterator exception

Hello,
I'm trying to run the test_DAVIS2016_raw script. When I execute it, I get the following exception:

AttributeError: 'PrefetchDataset' object has no attribute '_make_initializable_iterator'

It is raised at line 2733 of /python3.8/site-packages/tensorflow/python/data/ops/dataset_ops.py. Has anyone had this issue?
I should note that I used tf_upgrade_v2 to update the project scripts to TensorFlow v2.

Out of memory

After everything is set up, the main training loop throws this exception:


Training 1 Recover and 3 Generator

2020-11-24 11:32:25.434730: E tensorflow/stream_executor/cuda/cuda_driver.cc:868] failed to alloc 4294967296 bytes on host: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2020-11-24 11:32:25.435620: W ./tensorflow/core/common_runtime/gpu/cuda_host_allocator.h:44] could not allocate pinned host memory of size: 4294967296

I have a 12 GB GPU, so I think it may be a settings issue rather than a hardware shortage.
Any ideas? Thanks a lot for any advice.
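
Note that the log reports a failed allocation "on host" of "pinned host memory", which refers to system RAM rather than the GPU's 12 GB. The size of the request can be converted as follows; this is plain arithmetic and the helper name is made up.

```python
def bytes_to_gib(n_bytes):
    """Convert a byte count to gibibytes (2**30 bytes)."""
    return n_bytes / 2 ** 30


# Size taken from the log line above.
print(bytes_to_gib(4294967296))  # 4.0
```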

results on FBMS

I tested the provided model on the FBMS test set and the result is 0.498. The script I used is test_DAVIS2016_raw.sh, modified accordingly for the FBMS dataset. I also tried the post-processing code, but the result is not good either. I am not sure what the problem is.

How to use pretrained checkpoints?

I am trying to use the pretrained checkpoints (your model and PWCNet) specified in README.md to test on DAVIS2016. I am modifying the file test_DAVIS2016_raw.sh.
Your model checkpoint has 3 files under davis_best_model:

  • model.best.data-00000-of-00001
  • model.best.index
  • model.best.meta

The PWCNet checkpoint has 4 files in the folder on Google Drive:

  • checkpoint
  • pwcnet.ckpt-595000.data-00000-of-00001
  • pwcnet.ckpt-595000.index
  • pwcnet.ckpt-595000.meta

Which one should I specify for CKPT_FILE and PWC_CKPT_FILE?
Thank you!
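
A TensorFlow checkpoint is normally referred to by the common prefix of its files, not by any single file. The sketch below recovers that prefix from a file listing; `checkpoint_prefixes` is a hypothetical helper, and it is an assumption that CKPT_FILE and PWC_CKPT_FILE expect such a prefix (with its folder path prepended) rather than a directory.

```python
def checkpoint_prefixes(filenames):
    """Return the checkpoint prefixes found in a list of file names.

    The prefix is what remains after stripping the ".index" suffix; the
    matching .data-* and .meta files share the same prefix.
    """
    suffix = ".index"
    return sorted(name[:-len(suffix)] for name in filenames
                  if name.endswith(suffix))


print(checkpoint_prefixes(["model.best.data-00000-of-00001",
                           "model.best.index",
                           "model.best.meta"]))
```

For the two listings above this gives model.best and pwcnet.ckpt-595000.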
