akshaylamba / fasterrcnn_keras

License: Apache License 2.0

Python 100.00%

fasterrcnn_keras's Introduction

keras-frcnn

Keras implementation of Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.

USAGE:

Both the theano and tensorflow backends are supported. However, compile times are very high in theano, so tensorflow is highly recommended.
train_frcnn.py can be used to train a model. To train on Pascal VOC data, run: python train_frcnn.py -p /path/to/pascalvoc/.
The Pascal VOC data set (images and bounding-box annotations for the classified objects) can be obtained from: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
simple_parser.py provides an alternative way to input data, using a text file. Simply provide a text file, with each line containing:

filepath,x1,y1,x2,y2,class_name

For example:

/data/imgs/img_001.jpg,837,346,981,456,cow

/data/imgs/img_002.jpg,215,312,279,391,cat

The classes will be inferred from the file. To use the simple parser instead of the default Pascal VOC style parser, use the command line option -o simple. For example: python train_frcnn.py -o simple -p my_data.txt.
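As a rough illustration of the text format described above, the following sketch reads lines of the form filepath,x1,y1,x2,y2,class_name and infers the class list from them. The function name parse_annotations and the returned structure are assumptions made for this example; they are not taken from simple_parser.py.

```python
import csv
from collections import defaultdict


def parse_annotations(lines):
    """Group boxes by image path and collect the class names seen.

    Hypothetical helper for illustration; not the repository's parser.
    """
    boxes_by_image = defaultdict(list)
    classes = set()
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        filepath, x1, y1, x2, y2, class_name = row
        boxes_by_image[filepath].append(
            {"class": class_name, "x1": int(x1), "y1": int(y1),
             "x2": int(x2), "y2": int(y2)}
        )
        classes.add(class_name)
    return dict(boxes_by_image), sorted(classes)


sample = [
    "/data/imgs/img_001.jpg,837,346,981,456,cow",
    "/data/imgs/img_002.jpg,215,312,279,391,cat",
]
boxes, class_names = parse_annotations(sample)
print(class_names)
```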
Running train_frcnn.py will write the weights to disk in an hdf5 file, and all the settings of the training run to a pickle file. These settings can then be loaded by test_frcnn.py for testing.
test_frcnn.py can be used to perform inference, given pretrained weights and a config file. Specify a path to the folder containing images: python test_frcnn.py -p /path/to/test_data/
Data augmentation can be applied by specifying --hf for horizontal flips, --vf for vertical flips and --rot for 90-degree rotations.
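When an image is flipped for augmentation, its bounding boxes have to be mirrored as well. The sketch below shows one way a horizontal flip could be applied to an image and its (x1, y1, x2, y2) boxes. The function horizontal_flip and the image size are hypothetical; the repository's own augmentation code may differ.

```python
import numpy as np


def horizontal_flip(image, boxes):
    """Flip an H x W x C image left-right and mirror its (x1, y1, x2, y2) boxes."""
    width = image.shape[1]
    flipped = image[:, ::-1]
    # After mirroring, the old right edge of a box becomes its new left edge.
    flipped_boxes = [(width - x2, y1, width - x1, y2) for (x1, y1, x2, y2) in boxes]
    return flipped, flipped_boxes


image = np.zeros((600, 1000, 3), dtype=np.uint8)  # assumed image size
flipped_image, flipped_boxes = horizontal_flip(image, [(837, 346, 981, 456)])
print(flipped_boxes)
```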

NOTES:

config.py contains all settings for the train or test run. The default settings match those in the original Faster-RCNN paper. The anchor box sizes are [128, 256, 512] and the ratios are [1:1, 1:2, 2:1].
The theano backend by default uses a 7x7 pooling region, instead of 14x14 as in the frcnn paper. This cuts down compiling time slightly.
The tensorflow backend performs a resize on the pooling region, instead of max pooling. This is much more efficient and has little impact on results.
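To make the anchor settings concrete, the sketch below enumerates the nine size and ratio combinations from the defaults listed above. Turning a ratio into a width and height by scaling the size by each factor is an assumption of this example; config.py and the data generators may compute anchor shapes differently.

```python
from itertools import product

anchor_sizes = [128, 256, 512]
anchor_ratios = [(1, 1), (1, 2), (2, 1)]  # (width factor, height factor), assumed layout

# One (width, height) pair per size/ratio combination.
anchors = [(size * rw, size * rh)
           for size, (rw, rh) in product(anchor_sizes, anchor_ratios)]

for width, height in anchors:
    print(width, height)
```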

ISSUES:

If you get this error: ValueError: There is a negative shape in the graph!
then update keras to the newest version.
Make sure to use python2, not python3. If you get this error: TypeError: unorderable types: dict() < dict(), you are using python3.
If you run out of memory, try reducing the number of ROIs that are processed simultaneously by passing a lower -n to train_frcnn.py. Alternatively, try reducing the image size from the default value of 600 (this setting is found in config.py).
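The "unorderable types" error above comes from Python 3 refusing to compare two dicts with "<", which Python 2 allowed. The sketch below reproduces that behaviour on made-up data and shows that sorting with an explicit key avoids the comparison. Where exactly this happens inside the repository is not stated here, so the records are purely illustrative.

```python
records = [{"class": "cow", "score": 0.9}, {"class": "cat", "score": 0.7}]

try:
    sorted(records)  # Python 3 cannot order dicts with "<"
    raised = False
except TypeError:
    raised = True

# Sorting on an explicit key never compares the dicts themselves.
by_score = sorted(records, key=lambda record: record["score"])
print(raised, [record["class"] for record in by_score])
```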

fasterrcnn_keras's Issues

Batch_size Modification

It seems that when we use train_on_batch to train the Faster R-CNN model, we cannot change the batch_size directly as we can in model.fit(). So I simply concatenated a mini-batch of X, X2, Y, Y1 and Y2 and fed it into the training process. However, the results turned out to be quite bad. What is the reason for that, and how can I modify the batch_size when training the Faster R-CNN model?

            ...
            if len(rpn_accuracy_rpn_monitor) == epoch_length and C.verbose:
                mean_overlapping_bboxes = float(sum(rpn_accuracy_rpn_monitor))/len(rpn_accuracy_rpn_monitor)
                rpn_accuracy_rpn_monitor = []
                print('Average number of overlapping bounding boxes from RPN = {} for {} previous iterations'.format(mean_overlapping_bboxes, epoch_length))
                if mean_overlapping_bboxes == 0:
                    print('RPN is not producing bounding boxes that overlap the ground truth boxes. Check RPN settings or keep training.')

#            X, Y, img_data = next(data_gen_train)

            batch_size = 4
            X = []
            Y = [[],[]]
            img_datas = []
            iter_batch = 0
            for iter_batch in range(batch_size):
                X_s, Y_s, img_data = next(data_gen_train)
                X.append(X_s[0])
                Y[0].append(Y_s[0][0])
                Y[1].append(Y_s[1][0])
                img_datas.append(img_data)
            X = np.array(X)
            Y = [np.array(Y[0]),np.array(Y[1])]

            loss_rpn = model_rpn.train_on_batch(X, Y)
            P_rpn = model_rpn.predict_on_batch(X)
            ...

            ...
            X2s = []
            Y1s = []
            Y2s = []
            X_classifier = []
            Y_classifier = []
            iter_batch = 0
            for iter_batch in range(batch_size):

                # note: calc_iou converts from (x1,y1,x2,y2) to (x,y,w,h) format
                X2, Y1, Y2 = roi_helpers.calc_iou(R[iter_batch], img_datas[iter_batch], C, class_mapping)

                if X2 is None:
                    rpn_accuracy_rpn_monitor.append(0)
                    rpn_accuracy_for_epoch.append(0)
                    continue

                neg_samples = np.where(Y1[0, :, -1] == 1)
                pos_samples = np.where(Y1[0, :, -1] == 0)

                if len(neg_samples) > 0:
                    neg_samples = neg_samples[0]
                else:
                    neg_samples = []

                if len(pos_samples) > 0:
                    pos_samples = pos_samples[0]
                else:
                    pos_samples = []

                rpn_accuracy_rpn_monitor.append(len(pos_samples))
                rpn_accuracy_for_epoch.append((len(pos_samples)))
                if C.num_rois > 1:
                    if len(pos_samples) < C.num_rois//2:
                        selected_pos_samples = pos_samples.tolist()
                    else:
                        selected_pos_samples = np.random.choice(pos_samples, C.num_rois//2, replace=False).tolist()
                    try:
                        selected_neg_samples = np.random.choice(neg_samples, C.num_rois - len(selected_pos_samples), replace=False).tolist()
                    except ValueError:
                        # too few negatives to sample without replacement
                        selected_neg_samples = np.random.choice(neg_samples, C.num_rois - len(selected_pos_samples), replace=True).tolist()
    
                    sel_samples = selected_pos_samples + selected_neg_samples
                else:
                    # in the extreme case where num_rois = 1, we pick a random pos or neg sample
                    selected_pos_samples = pos_samples.tolist()
                    selected_neg_samples = neg_samples.tolist()
                    if np.random.randint(0, 2):
                        sel_samples = random.choice(neg_samples)
                    else:
                        sel_samples = random.choice(pos_samples)
                
                X2s.append(X2[:, sel_samples, :][0])
                Y1s.append(Y1[:, sel_samples, :][0])
                Y2s.append(Y2[:, sel_samples, :][0])

            X2s = np.array(X2s)
            Y1s = np.array(Y1s)
            Y2s = np.array(Y2s)

#            loss_class = model_classifier.train_on_batch([X, X2[:, sel_samples, :]], [Y1[:, sel_samples, :], Y2[:, sel_samples, :]])
            
            if X2s.shape[0] != batch_size:
                continue
            loss_class = model_classifier.train_on_batch([X, X2s], [Y1s, Y2s])
            ...

the mAP of voc 2007

What is the mAP of the model you trained on VOC 2007? I have trained the model, but my mAP is 47.6.

Corresponding to Faster R-CNN Paper

Hi, I'd like to ask a few questions about how the implementation here relates to the Faster R-CNN paper.

Why does the number of kernels/filters for rpn_classifier equal num_of_anchor instead of num_of_anchor*2 as in the paper?
Can you please explain the loss function used for the RPN? I don't think it is the same as the one described in the paper:
return K.sum(y_true[:, :, :, :4 * num_anchors] * (x_bool * (0.5 * x * x) + (1 - x_bool) * (x_abs - 0.5))) / K.sum(epsilon + y_true[:, :, :, :4 * num_anchors])

return K.sum(y_true[:, :, :, :num_anchors] * K.binary_crossentropy(y_pred[:, :, :, :], y_true[:, :, :, num_anchors:])) / K.sum(epsilon + y_true[:, :, :, :num_anchors])

Thank You
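For readers following the regression expression quoted above, here is a NumPy sketch of what it appears to compute: a smooth L1 penalty on the difference between target and prediction, masked so that only selected anchors contribute, and normalised by the mask sum. The threshold of 1.0 that defines x_bool, the epsilon value, and the function names are assumptions, since the quoted line does not show them.

```python
import numpy as np


def smooth_l1(diff):
    """0.5 * x^2 where |x| <= 1 (assumed threshold), |x| - 0.5 elsewhere."""
    abs_diff = np.abs(diff)
    is_small = (abs_diff <= 1.0).astype(float)
    return is_small * 0.5 * diff ** 2 + (1.0 - is_small) * (abs_diff - 0.5)


def masked_regression_loss(mask, target, pred, epsilon=1e-4):
    """Sum the penalty over masked entries and normalise by the mask sum."""
    return np.sum(mask * smooth_l1(target - pred)) / np.sum(epsilon + mask)


penalties = smooth_l1(np.array([0.5, 2.0]))
loss = masked_regression_loss(
    mask=np.array([1.0, 0.0]),    # only the first entry counts
    target=np.array([0.5, 2.0]),
    pred=np.array([0.0, 0.0]),
)
print(penalties, loss)
```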

train_frcnn.py

inv_map = {v: k for k, v in class_mapping.iteritems()}
should be written as
inv_map = {v: k for k, v in class_mapping.items()}

Exception generator has no attribute next

I encounter the following error on line 265 of train_frcnn.py

Exception generator has no attribute next

due to
X, Y, img_data = data_gen_train.next() on line 167

Could you please help me out :)
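An error like the one reported above is what Python 3 produces when code calls the Python 2 style .next() method on a generator. The built-in next() works on both versions. The sketch below uses a hypothetical stand-in generator, not the repository's data generator.

```python
def batches():
    """Hypothetical stand-in for the training data generator."""
    for index in range(3):
        yield index


gen = batches()
first = next(gen)  # built-in next() works on Python 2.6+ and Python 3
has_next_method = hasattr(gen, "next")  # False on Python 3
print(first, has_next_method)
```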

iou

The test image can not detect object

After training on all the images (in my experiment there is only one class besides background), the loss became very low. But when I test on other images, no object is detected. I am so confused.
Some information:
Training images per class:
{'bg': 0, 'shoe': 760}
Num classes (including bg) = 2

small target detection

Which parameters of Faster R-CNN need to be adjusted for small target detection?
