fasterrcnn_keras's Introduction

keras-frcnn

Keras implementation of Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.

USAGE:

  • Both theano and tensorflow backends are supported. However, compile times are very high in theano, so tensorflow is highly recommended.

  • train_frcnn.py can be used to train a model. To train on Pascal VOC data, simply do: python train_frcnn.py -p /path/to/pascalvoc/.

  • The Pascal VOC data set (images and bounding-box annotations for the classified objects) can be obtained from: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar

  • simple_parser.py provides an alternative way to input data, using a text file. Simply provide a text file, with each line containing:

    filepath,x1,y1,x2,y2,class_name

    For example:

    /data/imgs/img_001.jpg,837,346,981,456,cow

    /data/imgs/img_002.jpg,215,312,279,391,cat

    The classes will be inferred from the file. To use the simple parser instead of the default Pascal VOC style parser, use the command line option -o simple. For example: python train_frcnn.py -o simple -p my_data.txt.

  • Running train_frcnn.py writes the weights to an hdf5 file on disk and saves all the settings of the training run to a pickle file. These settings can then be loaded by test_frcnn.py for testing.

  • test_frcnn.py can be used to perform inference, given pretrained weights and a config file. Specify a path to the folder containing images: python test_frcnn.py -p /path/to/test_data/

  • Data augmentation can be applied by specifying --hf for horizontal flips, --vf for vertical flips, and --rot for 90 degree rotations.
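The text-file format accepted by the simple parser can be illustrated with a minimal sketch. This is not the code in simple_parser.py; the function name parse_simple_annotations and the returned structures are hypothetical and only show how lines of the form filepath,x1,y1,x2,y2,class_name could be read and how the classes could be inferred from the file.

```python
import csv
import io
from collections import defaultdict


def parse_simple_annotations(lines):
    """Group boxes by image path and count the class names seen in the file.

    Hypothetical helper for illustration; not part of the repository.
    """
    boxes_by_image = defaultdict(list)
    class_counts = defaultdict(int)
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        filepath, x1, y1, x2, y2, class_name = row
        boxes_by_image[filepath].append(
            {"class": class_name, "x1": int(x1), "y1": int(y1),
             "x2": int(x2), "y2": int(y2)}
        )
        class_counts[class_name] += 1
    return dict(boxes_by_image), dict(class_counts)


# The two example lines from the usage notes above.
sample = io.StringIO(
    "/data/imgs/img_001.jpg,837,346,981,456,cow\n"
    "/data/imgs/img_002.jpg,215,312,279,391,cat\n"
)
boxes, counts = parse_simple_annotations(sample)
print(sorted(counts))
```

Each image maps to a list of boxes, so several lines may share the same filepath when an image contains more than one object.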

NOTES:

  • config.py contains all settings for the train or test run. The default settings match those in the original Faster-RCNN paper. The anchor box sizes are [128, 256, 512] and the ratios are [1:1, 1:2, 2:1].
  • The theano backend by default uses a 7x7 pooling region, instead of 14x14 as in the frcnn paper. This cuts down compiling time slightly.
  • The tensorflow backend performs a resize on the pooling region, instead of max pooling. This is much more efficient and has little impact on results.
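As a rough illustration of the default anchor settings, the sketch below enumerates the nine (width, height) combinations obtained from three sizes and three ratios. The variable names and the way each ratio is applied (multiplying the size by each ratio component) are assumptions made for this example; config.py may store or scale the ratios differently.

```python
from itertools import product

# Values quoted in the notes above; the names are assumptions for this sketch.
anchor_box_scales = [128, 256, 512]
anchor_box_ratios = [(1, 1), (1, 2), (2, 1)]


def anchor_shapes(scales, ratios):
    """Return one (width, height) pair per scale/ratio combination.

    Assumption: width = scale * ratio[0], height = scale * ratio[1].
    This does not keep the anchor area constant across ratios.
    """
    return [(s * rw, s * rh) for s, (rw, rh) in product(scales, ratios)]


anchors = anchor_shapes(anchor_box_scales, anchor_box_ratios)
for width, height in anchors:
    print(width, height)
```

Three sizes and three ratios give nine anchors per feature-map location, which matches the count used in the paper.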

Example output:

(Four example output images: ex1, ex2, ex3, ex4.)

ISSUES:

  • If you get this error: ValueError: There is a negative shape in the graph!
    then update keras to the newest version.

  • Make sure to use python2, not python3. If you get this error: TypeError: unorderable types: dict() < dict(), you are using python3.

  • If you run out of memory, try reducing the number of ROIs that are processed simultaneously by passing a lower -n to train_frcnn.py. Alternatively, try reducing the image size from the default value of 600 (this setting is found in config.py).

fasterrcnn_keras's People

Contributors

small-yellow-duck, thomasjanssens, yhenon

fasterrcnn_keras's Issues

Batch_size Modification

It seems that when we use train_on_batch to train the Faster RCNN model, we cannot change the batch size directly as we can with model.fit(). So I simply concatenated a mini-batch of X, X2, Y, Y1 and Y2 and fed it into the training process. However, the results turned out to be quite bad. What is the reason for that, and how can I change the batch size when training the Faster RCNN model?

            ...
            if len(rpn_accuracy_rpn_monitor) == epoch_length and C.verbose:
                mean_overlapping_bboxes = float(sum(rpn_accuracy_rpn_monitor))/len(rpn_accuracy_rpn_monitor)
                rpn_accuracy_rpn_monitor = []
                print('Average number of overlapping bounding boxes from RPN = {} for {} previous iterations'.format(mean_overlapping_bboxes, epoch_length))
                if mean_overlapping_bboxes == 0:
                    print('RPN is not producing bounding boxes that overlap the ground truth boxes. Check RPN settings or keep training.')

#            X, Y, img_data = next(data_gen_train)

            batch_size = 4
            X = []
            Y = [[],[]]
            img_datas = []
            iter_batch = 0
            for iter_batch in range(batch_size):
                X_s, Y_s, img_data = next(data_gen_train)
                X.append(X_s[0])
                Y[0].append(Y_s[0][0])
                Y[1].append(Y_s[1][0])
                img_datas.append(img_data)
            X = np.array(X)
            Y = [np.array(Y[0]),np.array(Y[1])]

            loss_rpn = model_rpn.train_on_batch(X, Y)
            P_rpn = model_rpn.predict_on_batch(X)
            ...
            ...
            X2s = []
            Y1s = []
            Y2s = []
            X_classifier = []
            Y_classifier = []
            iter_batch = 0
            for iter_batch in range(batch_size):

                # note: calc_iou converts from (x1,y1,x2,y2) to (x,y,w,h) format
                X2, Y1, Y2 = roi_helpers.calc_iou(R[iter_batch], img_datas[iter_batch], C, class_mapping)

                if X2 is None:
                    rpn_accuracy_rpn_monitor.append(0)
                    rpn_accuracy_for_epoch.append(0)
                    continue

                neg_samples = np.where(Y1[0, :, -1] == 1)
                pos_samples = np.where(Y1[0, :, -1] == 0)

                if len(neg_samples) > 0:
                    neg_samples = neg_samples[0]
                else:
                    neg_samples = []

                if len(pos_samples) > 0:
                    pos_samples = pos_samples[0]
                else:
                    pos_samples = []

                rpn_accuracy_rpn_monitor.append(len(pos_samples))
                rpn_accuracy_for_epoch.append((len(pos_samples)))
                if C.num_rois > 1:
                    if len(pos_samples) < C.num_rois//2:
                        selected_pos_samples = pos_samples.tolist()
                    else:
                        selected_pos_samples = np.random.choice(pos_samples, C.num_rois//2, replace=False).tolist()
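                    try:
                        # sample negatives without replacement when enough are available
                        selected_neg_samples = np.random.choice(neg_samples, C.num_rois - len(selected_pos_samples), replace=False).tolist()
                    except ValueError:
                        # too few negatives: fall back to sampling with replacement
                        selected_neg_samples = np.random.choice(neg_samples, C.num_rois - len(selected_pos_samples), replace=True).tolist()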
    
                    sel_samples = selected_pos_samples + selected_neg_samples
                else:
                    # in the extreme case where num_rois = 1, we pick a random pos or neg sample
                    selected_pos_samples = pos_samples.tolist()
                    selected_neg_samples = neg_samples.tolist()
                    if np.random.randint(0, 2):
                        sel_samples = random.choice(neg_samples)
                    else:
                        sel_samples = random.choice(pos_samples)
                
                X2s.append(X2[:, sel_samples, :][0])
                Y1s.append(Y1[:, sel_samples, :][0])
                Y2s.append(Y2[:, sel_samples, :][0])

            X2s = np.array(X2s)
            Y1s = np.array(Y1s)
            Y2s = np.array(Y2s)

#            loss_class = model_classifier.train_on_batch([X, X2[:, sel_samples, :]], [Y1[:, sel_samples, :], Y2[:, sel_samples, :]])
            
            if X2s.shape[0] != batch_size:
                continue
            loss_class = model_classifier.train_on_batch([X, X2s], [Y1s, Y2s])
            ...

the mAP of voc 2007

What is the mAP of the model you trained on VOC 2007? I trained the model, but my mAP is 47.6.

Corresponding to Faster R-CNN Paper

Hi, I'd like to ask about some differences between the Faster R-CNN paper and the implementation here.

  1. Why does the number of kernels/filters for rpn_classifier equal num_of_anchor instead of num_of_anchor*2, as in the paper?
  2. Can you please explain the loss functions used for the RPN? I don't think they are the same as those described in the paper:
    return K.sum(y_true[:, :, :, :4 * num_anchors] * (x_bool * (0.5 * x * x) + (1 - x_bool) * (x_abs - 0.5))) / K.sum(epsilon + y_true[:, :, :, :4 * num_anchors])

    return K.sum(y_true[:, :, :, :num_anchors] * K.binary_crossentropy(y_pred[:, :, :, :], y_true[:, :, :, num_anchors:])) / K.sum(epsilon + y_true[:, :, :, :num_anchors])

Thank you
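The first expression quoted in this question has the shape of a masked smooth L1 loss. The NumPy sketch below restates it outside Keras so that the individual terms are easier to read. The function names are hypothetical, and the threshold of 1.0 used to build x_bool is an assumption (it is the usual smooth L1 definition but is not shown in the quoted line).

```python
import numpy as np


def smooth_l1(diff):
    """0.5 * d^2 where |d| <= 1, and |d| - 0.5 elsewhere."""
    abs_diff = np.abs(diff)
    # Assumption: x_bool marks elements with absolute difference <= 1.0.
    is_small = (abs_diff <= 1.0).astype(diff.dtype)
    return is_small * 0.5 * diff ** 2 + (1.0 - is_small) * (abs_diff - 0.5)


def masked_regression_loss(mask, targets, predictions, epsilon=1e-4):
    """Sum the smooth L1 terms where mask is 1, normalised by the mask sum.

    Mirrors the structure K.sum(mask * loss) / K.sum(epsilon + mask).
    """
    per_element = smooth_l1(targets - predictions)
    return np.sum(mask * per_element) / np.sum(epsilon + mask)


mask = np.array([1.0, 0.0])
targets = np.array([0.5, 2.0])
predictions = np.zeros(2)
print(smooth_l1(targets - predictions))
print(masked_regression_loss(mask, targets, predictions))
```

In this reading, the mask selects which anchors contribute to the loss, and the epsilon term only guards against division by zero when no anchor is selected.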

train_frcnn.py

inv_map = {v: k for k, v in class_mapping.iteritems()}
should be written as
inv_map = {v: k for k, v in class_mapping.items()}
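A minimal sketch of the suggested change, assuming a small hypothetical class_mapping: dict.items() exists in both Python 2 and Python 3, whereas dict.iteritems() exists only in Python 2.

```python
# Hypothetical mapping for illustration; the real one is built during training.
class_mapping = {"cow": 0, "cat": 1, "bg": 2}

# Invert the mapping from class name -> index to index -> class name.
inv_map = {index: name for name, index in class_mapping.items()}
print(inv_map[0])
```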

Exception generator has no attribute next

Hi

I encounter the following error on line 265 of train_frcnn.py

Exception generator has no attribute next

due to
X, Y, img_data = data_gen_train.next() on line 167

Could you please help me out :)
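For context, generator objects in Python 3 have no .next() method; the built-in next() function works in both Python 2.6+ and Python 3. The sketch below uses a hypothetical stand-in generator rather than the repository's data generator.

```python
def data_generator():
    """Hypothetical stand-in for a training data generator."""
    for sample_index in range(3):
        yield sample_index


data_gen_train = data_generator()

# Portable form: next(generator) instead of generator.next().
first = next(data_gen_train)
second = next(data_gen_train)
print(first, second)
```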

The test image can not detect object

In my experiment I have only one class besides background. After training on all images, the loss became very low, but when I test on other images, no objects are detected. I am confused.
Some information:
Training images per class:
{'bg': 0, 'shoe': 760}
Num classes (including bg) = 2

small target detection

What parameters of "faster rcnn" need to be adjusted when implementing small target detection?
