Computer Vision Nanodegree

This repository contains my exercises and projects for the Computer Vision Nanodegree at Udacity.

Project 1: Facial Keypoint Detection

Facial Keypoint Detection Project

In this project, I build a facial keypoint detection system. The system consists of a face detector that uses Haar Cascades and a Convolutional Neural Network (CNN) that predicts the facial keypoints in the detected faces. It takes in any image with faces and predicts the locations of 68 distinguishing keypoints on each face.
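One step in a detector-plus-CNN pipeline like this is mapping keypoints predicted on a resized face crop back onto the full image. The sketch below illustrates that step only. The function name `keypoints_to_image_coords`, the `(x, y, w, h)` box layout, and the 224-pixel crop size are assumptions for illustration, not details taken from the project.

```python
def keypoints_to_image_coords(face_box, keypoints, crop_size=224):
    """Map keypoints from a square, resized face crop back to image pixels.

    face_box  -- (x, y, w, h) of the detected face in image pixels (assumed layout)
    keypoints -- list of (kx, ky) positions in crop pixels, each in 0..crop_size
    crop_size -- side length the crop was resized to before the CNN (assumed 224)
    """
    x, y, w, h = face_box
    scale_x = w / crop_size  # undo the horizontal resize
    scale_y = h / crop_size  # undo the vertical resize
    return [(x + kx * scale_x, y + ky * scale_y) for kx, ky in keypoints]


# A 112x112 face found at (100, 50); three keypoints in crop coordinates.
mapped = keypoints_to_image_coords((100, 50, 112, 112),
                                   [(0, 0), (224, 224), (112, 56)])
```

In a real pipeline the box would come from the face detector and the keypoints from the CNN output, after undoing whatever normalization was applied during training.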

Some results from my facial keypoint detection system:


The Udacity repository for this project: P1_Facial_Keypoints

Project 2: Image Captioning

Image Captioning Project

In this project, I design and train a CNN-RNN (Convolutional Neural Network - Recurrent Neural Network) model for automatically generating image captions. The network is trained on the Microsoft Common Objects in COntext (MS COCO) dataset. The image captioning model is displayed below.

Figure: Image Captioning Model (Image source)
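Once a CNN-RNN model is trained, a caption has to be decoded from it one word at a time. A common and simple strategy is greedy decoding, sketched below. This is an illustration under assumptions, not necessarily the decoding used in this project: `greedy_decode`, `toy_decoder`, the `<start>`/`<end>` tokens, and the score table are all hypothetical stand-ins for a real trained decoder.

```python
def greedy_decode(next_word_scores, start="<start>", end="<end>", max_len=20):
    """Build a caption by repeatedly taking the highest-scoring next word.

    next_word_scores -- callable that takes the words so far and returns a
                        dict {word: score}; it stands in for the RNN decoder.
    """
    words = [start]
    for _ in range(max_len):
        scores = next_word_scores(words)
        best = max(scores, key=scores.get)
        if best == end:
            break
        words.append(best)
    return words[1:]  # drop the start token


# Toy stand-in for a trained decoder: scores depend only on the last word.
TOY_TABLE = {
    "<start>": {"a": 0.9, "dog": 0.1},
    "a": {"dog": 0.7, "cat": 0.3},
    "dog": {"runs": 0.6, "<end>": 0.4},
    "runs": {"<end>": 0.8, "a": 0.2},
}


def toy_decoder(words):
    return TOY_TABLE[words[-1]]


caption = greedy_decode(toy_decoder)
```

Greedy decoding is cheap but can miss better captions; beam search is the usual alternative when caption quality matters more than speed.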

One good and one not-so-good sample generated by my model:

sample_171
sample_193

The Udacity repository for this project: CVND---Image-Captioning-Project

Project 3: Landmark Detection

Landmark Detection Project

In this project, I implement SLAM (Simultaneous Localization and Mapping) for a 2-dimensional world. Sensor and motion data gathered by a simulated robot are used to create a map of an environment. SLAM gives us a way to track the location of a robot in the world in real time and identify the locations of landmarks such as buildings, trees, and rocks.
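The idea of combining motion and sensor data into one map can be sketched with a one-dimensional Graph SLAM example. This is a simplified illustration, not the project's implementation: it assumes a single landmark, noise-free data, and equal constraint weights, and the names `solve`, `add_constraint`, and `slam_1d` are hypothetical. Each motion or measurement adds a linear constraint to a matrix `omega` and vector `xi`; solving `omega * mu = xi` gives the best estimate of all poses and the landmark.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def add_constraint(omega, xi, i, j, d, weight=1.0):
    """Add the relative constraint x_j - x_i = d between variables i and j."""
    omega[i][i] += weight
    omega[j][j] += weight
    omega[i][j] -= weight
    omega[j][i] -= weight
    xi[i] -= weight * d
    xi[j] += weight * d


def slam_1d(initial_pos, motions, measurements):
    """Estimate all poses and one landmark position in a 1D world.

    motions      -- list of displacements, one per time step
    measurements -- list of (pose_index, signed distance to the landmark)
    Returns [x_0, ..., x_T, landmark].
    """
    n_poses = len(motions) + 1
    n = n_poses + 1  # poses plus a single landmark
    omega = [[0.0] * n for _ in range(n)]
    xi = [0.0] * n
    omega[0][0] += 1.0  # anchor the first pose at its known position
    xi[0] += initial_pos
    for t, dx in enumerate(motions):
        add_constraint(omega, xi, t, t + 1, dx)
    for pose, dist in measurements:
        add_constraint(omega, xi, pose, n_poses, dist)
    return solve(omega, xi)


# Robot starts at -3, moves +5 then +3, and sees the landmark 10, 5 and 2 ahead.
estimate = slam_1d(-3.0, [5.0, 3.0], [(0, 10.0), (1, 5.0), (2, 2.0)])
```

With noisy data the constraints would no longer agree exactly, and the same solve would return the least-squares compromise between them.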

The Udacity repository for this project: Project_Landmark Detection

Exercises

  • Image Representation & Classification - In this exercise, I learn how images are represented numerically and implement image processing techniques, such as color masking and binary classification.
  • Convolutional Filters and Edge Detection - In this exercise, I learn about frequency in images and implement my own image filters for detecting edges and shapes in an image. I also use Haar cascade classifiers from the OpenCV library to perform face detection.
  • Types of Features & Image Segmentation - In this exercise, I program a corner detector and learn techniques, like k-means clustering, for segmenting an image into unique parts.
  • Feature Vectors - In this exercise, I learn how to describe objects and images using feature vectors (ORB, FAST, BRIEF, HOG).
  • CNN Layers and Feature Visualization - In this exercise, I define and train my own convolutional neural network for clothing recognition. I also learn to use feature visualization techniques to see what the network has learned.
  • YOLO - In this exercise, I learn about the YOLO (You Only Look Once) multi-object detection model and work with a YOLO implementation. I also set up YOLO to work with my webcam.
  • LSTMs - In this exercise, I learn about Long Short-Term Memory (LSTM) networks and similar architectures that can preserve long-term memory. I also implement a character-level LSTM model.
  • Attention Mechanisms - Todo.
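
As a small illustration of the edge-detection filters mentioned in the Convolutional Filters exercise, the sketch below applies a 3x3 Sobel kernel to a tiny grayscale image in pure Python. The helper name `filter2d` and the toy image are assumptions for illustration; the exercises themselves use OpenCV.

```python
# Sobel kernel that responds to horizontal intensity changes (vertical edges).
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]


def filter2d(image, kernel):
    """Apply a 3x3 kernel to a grayscale image given as a list of rows.

    Only positions where the kernel fits entirely inside the image are
    computed, so the output is 2 rows and 2 columns smaller than the input.
    The kernel is applied as-is (cross-correlation, no flipping).
    """
    h, w = len(image), len(image[0])
    out = []
    for r in range(1, h - 1):
        row = []
        for c in range(1, w - 1):
            acc = 0
            for dr in range(3):
                for dc in range(3):
                    acc += kernel[dr][dc] * image[r + dr - 1][c + dc - 1]
            row.append(acc)
        out.append(row)
    return out


# Dark region on the left, bright region on the right: one vertical edge.
image = [[0, 0, 10, 10, 10] for _ in range(4)]
edges = filter2d(image, SOBEL_X)
```

The response is large where the window straddles the dark-to-bright boundary and zero inside the uniform bright region.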

The Udacity repository for the exercises: CVND_Exercises

Localization Exercises

  • Optical Flow - In this exercise, I learn about and implement Optical Flow.
  • Robot Localization - In this exercise, I learn how to implement a Bayesian filter to locate a robot in space and represent uncertainty in robot motion.
  • Mini-project: 2D Histogram Filter - In this exercise, I write (and debug) sense and move functions for a 2D histogram filter.
  • Introduction to Kalman Filters - In this exercise, I learn the intuition behind the Kalman Filter, a vehicle tracking algorithm, and implement a one-dimensional tracker.
  • Representing State and Motion - In this exercise, I learn to represent the state of a car in a vector that can be modified using linear algebra.
  • Matrices and Transformation of State - In this exercise, I learn about the matrix operations that underlie multidimensional Kalman Filters.
  • Simultaneous Localization and Mapping (SLAM) - In this exercise, I learn how to implement SLAM: simultaneously localize an autonomous vehicle and create a map of landmarks in an environment.
  • Vehicle Motion and Calculus - In this exercise, I review some basic calculus and learn how to derive the x and y components of a self-driving car's motion from sensor measurements and other data.
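
The sense and move steps of a histogram filter can be sketched in one dimension. This is a simplified illustration of the idea, not the 2D mini-project code: it assumes a cyclic world of colored cells, and the parameter names and default probabilities (`p_hit`, `p_miss`, `p_exact`, `p_overshoot`, `p_undershoot`) are assumptions chosen for the example.

```python
def sense(p, world, measurement, p_hit=0.6, p_miss=0.2):
    """Measurement update: reweight each cell by how well it matches, then normalize."""
    q = [prob * (p_hit if cell == measurement else p_miss)
         for prob, cell in zip(p, world)]
    total = sum(q)
    return [v / total for v in q]


def move(p, step, p_exact=0.8, p_overshoot=0.1, p_undershoot=0.1):
    """Motion update on a cyclic world: shift the belief by `step` cells.

    Some probability mass lands one cell too far or one cell short,
    which models uncertainty in the robot's motion.
    """
    n = len(p)
    q = []
    for i in range(n):
        q.append(p_exact * p[(i - step) % n]
                 + p_overshoot * p[(i - step - 1) % n]
                 + p_undershoot * p[(i - step + 1) % n])
    return q


world = ["green", "red", "red", "green", "green"]
belief = [0.2] * 5                      # start with no idea where the robot is
belief = sense(belief, world, "red")    # the red cells become more likely
shifted = move([0.0, 1.0, 0.0, 0.0, 0.0], 1)
```

Sensing sharpens the belief and moving blurs it; alternating the two is what lets the filter localize the robot over time.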

The Udacity repository for the exercises: CVND_Localization_Exercises
