
mcv_cnn_framework's Introduction

M5 Project: Scene Understanding for Autonomous Vehicles

The goal of this project is to learn the basic concepts and techniques for building deep neural networks that detect, segment and recognize specific objects, with a focus on self-driving cars. To address automatic image understanding, the tasks covered are object recognition, object detection and semantic segmentation on images recorded by an on-board vehicle camera.

Team members

Index

Applications

This repository provides a PyTorch-based framework for three tasks: object recognition, semantic segmentation and object detection.

Get Started

Object Recognition and Semantic Segmentation

Installation

Environment Set Up:

  • Python 3.7
  • PyTorch, with cudatoolkit and torchvision
pip install -r requirements.txt

Run the code

# --exp_name: directory where results are stored
# --config_file: file where the configuration for the code is set up
python3 main.py --exp_name dir_name --exp_folder ./ --config_file config/configFile.yml
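The command above passes three flags to main.py. As a minimal sketch of how such flags could be parsed, the snippet below uses argparse from the standard library. The function name build_parser, the defaults and the meaning given to --exp_folder are assumptions made for illustration; the actual main.py may handle its arguments differently.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags used in the command above."""
    parser = argparse.ArgumentParser(description="Run an experiment")
    # Directory where results are stored (as documented above).
    parser.add_argument("--exp_name", required=True)
    # Assumed meaning: parent folder that contains the experiment directory.
    parser.add_argument("--exp_folder", default="./")
    # File where the configuration for the code is set up.
    parser.add_argument("--config_file", required=True)
    return parser


# Parse the same arguments as the example command, given as an explicit list.
args = build_parser().parse_args(
    ["--exp_name", "dir_name",
     "--exp_folder", "./",
     "--config_file", "config/configFile.yml"]
)
print(args.exp_name, args.exp_folder, args.config_file)
```

Passing an explicit list to parse_args keeps the sketch independent of the real command line; a script would normally call parse_args() with no arguments.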

Object Detection

Installation

To run the object detection framework, follow the steps below. First, see the source repository.

1. Prerequisites

  • Python 3.6
  • PyTorch 1.0
  • CUDA 8 or higher

2. Data preparation

The framework requires the COCO and PASCAL VOC datasets to be available in order to work properly.

  • PASCAL_VOC 07+12: Please follow the instructions in py-faster-rcnn to prepare VOC datasets. After downloading the data, create softlinks in the folder object_detection/faster-rcnn.pytorch/data/.

  • COCO: Download it from the COCOAPI repository and store it in the folder object_detection/faster-rcnn.pytorch/data/

  • UDACITY and other nonVoc Datasets

    • First, create a folder inside the data folder with the name of the dataset.
    • Create a folder called annotations_cache
    • Create a folder called results
    • Create a folder called nameOfDatasetYear
    • Inside the nameOfDatasetYear folder, create the following structure:
    /Annotations 
    /ImageSets/Layout 
    /ImageSets/Main 
    /ImageSets/Segmentation 
    /JPEGImages 
    /test 
    /train 
    /valid 
    • Copy the images and the txt files of the dataset to the test, train and valid folders.
    • Copy all the images to the JPEGImages folder
    • Copy the convert_to_voc.py file to /nameofDataset/nameOfDatasetYear and run it with Python
    • Clone /lib/datasets/pascal_voc.py and modify the clone to fit your dataset
    • In /lib/datasets/factory.py, add a call to your clone of /lib/datasets/pascal_voc.py
    • Add the name of the dataset to the options in /test_net.py and /trainval_net.py
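
The folder layout described in the steps above can be created with a short script. This is a minimal sketch, not part of the repository: the function name make_dataset_skeleton and its parameters are hypothetical, and it assumes that annotations_cache, results and nameOfDatasetYear all live directly inside the dataset folder.

```python
from pathlib import Path

# Sub-folders listed in the steps above, relative to nameOfDatasetYear.
VOC_SUBDIRS = [
    "Annotations",
    "ImageSets/Layout",
    "ImageSets/Main",
    "ImageSets/Segmentation",
    "JPEGImages",
    "test",
    "train",
    "valid",
]


def make_dataset_skeleton(data_root, dataset_name, year_folder):
    """Create the empty folder structure for a non-VOC dataset.

    data_root    -- the data folder of the detection framework
    dataset_name -- folder named after the dataset
    year_folder  -- the nameOfDatasetYear folder
    """
    dataset_dir = Path(data_root) / dataset_name
    # Assumption: these two folders sit next to nameOfDatasetYear.
    for name in ("annotations_cache", "results"):
        (dataset_dir / name).mkdir(parents=True, exist_ok=True)
    year_dir = dataset_dir / year_folder
    for sub in VOC_SUBDIRS:
        (year_dir / sub).mkdir(parents=True, exist_ok=True)
    return year_dir


# Example call (dataset and year names are placeholders):
# make_dataset_skeleton("object_detection/faster-rcnn.pytorch/data",
#                       "nameOfDataset", "nameOfDatasetYear")
```

The script only creates empty folders; copying the images and running convert_to_voc.py remain manual steps.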

3. Pretrained Models

The framework uses VGG16 or ResNet101 as baseline architectures. The network weights, pretrained with Caffe, must be stored in the folder object_detection/framework/pretrained_models/

Link to download the models from the source repository:

4. Compilation

pip install -r requirements.txt

cd lib
python setup.py build develop

Run the code

Train

# Replace the placeholder values below with your own settings
LEARNING_RATE=lr
BATCH_SIZE=batchSize
DECAY_STEP=decayStep
DATASET=udacity_voc  # udacity_voc or pascal_voc
NETWORK=res101       # res101 or vgg16
EPOCHS=numberEpochs

python3 trainval_net.py --dataset $DATASET --net $NETWORK \
                       --bs $BATCH_SIZE --nw 1 \
                       --lr $LEARNING_RATE --lr_decay_step $DECAY_STEP \
                       --cuda --mGPUs --epochs $EPOCHS
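
When launching several training runs, it can help to assemble the command above from Python. The following is a minimal sketch under stated assumptions: build_train_command is a hypothetical helper, not part of the framework, and the numeric values in the example call are illustrative only, not recommended settings. The flags and the allowed dataset and network names are taken from the command above.

```python
import shlex

# Choices documented in the training snippet above.
DATASETS = {"udacity_voc", "pascal_voc"}
NETWORKS = {"res101", "vgg16"}


def build_train_command(dataset, net, batch_size, lr, decay_step, epochs,
                        multi_gpu=True):
    """Return the trainval_net.py call as a list of arguments."""
    if dataset not in DATASETS:
        raise ValueError("unknown dataset: " + dataset)
    if net not in NETWORKS:
        raise ValueError("unknown network: " + net)
    cmd = [
        "python3", "trainval_net.py",
        "--dataset", dataset, "--net", net,
        "--bs", str(batch_size), "--nw", "1",
        "--lr", str(lr), "--lr_decay_step", str(decay_step),
        "--cuda",
    ]
    if multi_gpu:
        cmd.append("--mGPUs")
    cmd += ["--epochs", str(epochs)]
    return cmd


# Illustrative values only.
cmd = build_train_command("udacity_voc", "res101",
                          batch_size=4, lr=0.001, decay_step=5, epochs=10)
print(" ".join(shlex.quote(part) for part in cmd))
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting issues.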

Test

python3 test_net.py --dataset  $DATASET --net $NETWORK \
                       --cuda --mGPUs --checksession $CHECK_SESSION --checkepoch $CHECK_EPOCH --checkpoint $CHECK_MODEL

Demo

This script loads the trained model and saves the resulting detection images in the folder object_detection/framework/images/

python demo.py --net res101 \
               --checksession $SESSION --checkepoch $EPOCH --checkpoint $CHECKPOINT --cuda --load_dir models/

Report

Object Recognition | Semantic Segmentation | Object Detection
Presentation       | Presentation          | Presentation

Complete Report

Overleaf Read-Access link

State of the Art publications

Weights Folder

Contributors

joselgomez, gvillalonga89, lauramorab, mgilar, richardseba
