Contrastive Language-Image Forensic Search

Overview

CLIFS is a proof-of-concept for free-text search over videos: given a text query, it returns the video frames whose contents match. This is done using OpenAI's CLIP model, which is trained to match images with their corresponding captions and vice versa. The search first extracts features from the video frames using the CLIP image encoder, then computes the features of the search query using the CLIP text encoder. The two sets of features are matched by similarity, and the top results are returned if they score above a set threshold.
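
The matching step described above can be sketched in a few lines of Python. This is a minimal illustration, not the CLIFS implementation: it assumes cosine similarity as the similarity measure, uses small hand-written vectors in place of real CLIP features, and the names `search_frames`, `top_k` and `threshold` (and their values) are hypothetical.

```python
import numpy as np


def search_frames(frame_features, query_features, top_k=3, threshold=0.2):
    """Return (frame_index, score) pairs for the best-matching frames.

    Assumption: similarity is cosine similarity. The source only says that
    features are "matched by similarity".
    """
    # Normalize so that a dot product equals cosine similarity.
    frames = frame_features / np.linalg.norm(frame_features, axis=1, keepdims=True)
    query = query_features / np.linalg.norm(query_features)

    scores = frames @ query                    # one similarity score per frame
    order = np.argsort(scores)[::-1][:top_k]   # indices of the highest scores first

    # Keep only the top results that clear the threshold.
    return [(int(i), float(scores[i])) for i in order if scores[i] >= threshold]


# Toy stand-ins for CLIP embeddings (real ones have hundreds of dimensions).
frame_features = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
    [0.0, 0.0, 1.0],
])
query_features = np.array([1.0, 0.1, 0.0])

results = search_frames(frame_features, query_features, top_k=3, threshold=0.5)
for index, score in results:
    print(f"frame {index}: similarity {score:.3f}")
```

In this toy setup, frames 0 and 2 point in roughly the same direction as the query and are returned, while frame 1 ranks third but falls below the threshold and is dropped.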

To make the CLIFS backend easy to use, a simple web server running Django provides an interface to the search engine.

Examples

To give an idea of the capability of this model, a few examples are shown below, each search query followed by its result. These search queries are run against the 2-minute Sherbrooke video from the UrbanTracker Dataset. Only the top image result for each query is shown. Note that the model is in fact quite capable of OCR.

A truck with the text "odwalla"

[Image: top search result]

A white BMW car

[Image: top search result]

A truck with the text "JCN"

[Image: top search result]

A bicyclist with a blue shirt

[Image: top search result]

A blue SMART car

[Image: top search result]

Setup

  1. Run the setup.sh script to set up the folders and optionally download a video file for testing:

     ./setup.sh

  2. Put your own video files that you want to index in the data/input directory.

  3. Build and start the search engine and web server containers through docker-compose:

     docker-compose build && docker-compose up

     Optionally, a docker-compose file with GPU support can be used if the host environment has an NVIDIA GPU and is set up for Docker GPU support:

     docker-compose build && docker-compose -f docker-compose-gpu.yml up

  4. Once the features for the files in the data/input directory have been encoded, as shown in the log, navigate to 127.0.0.1:8000 and search away.
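
If you would rather not keep refreshing the browser after starting the containers, a small polling script can tell you when the web interface at 127.0.0.1:8000 starts answering. This is a sketch using only the Python standard library; `wait_for_server` and its `timeout` and `interval` parameters are hypothetical names, not part of CLIFS. It only checks that the web server responds, so whether feature encoding has finished still has to be read from the log.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url, timeout=60.0, interval=2.0):
    """Poll `url` until it answers or `timeout` seconds pass.

    Returns True as soon as any HTTP response arrives, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered with an error status, so it is reachable.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or timed out: not up yet, try again.
            pass
        time.sleep(interval)
    return False


# Example call (commented out so that importing this file does not block):
# wait_for_server("http://127.0.0.1:8000", timeout=300.0)
```

The address in the example call is the one given in the last setup step; the timeout value is an arbitrary choice.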

clifs's People

Contributors

johanmodin, jamesdconley
