
natural-language-youtube-search's Introduction

Natural Language YouTube Search


Use OpenAI's CLIP neural network to search inside YouTube videos. You can try it by running the notebook on Google Colab.


How it works

  1. Download the YouTube video
  2. Extract every N-th frame
  3. Encode all frames using CLIP
  4. Encode a natural language search query using CLIP
  5. Find the images that best match the search query

For more details see the notebook.
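As a rough illustration of steps 2 and 5, the sketch below uses plain Python lists in place of real CLIP embeddings. The function names (`every_nth`, `top_matches`) and the toy vectors are hypothetical and are not taken from the notebook; in practice the embeddings would come from CLIP's image and text encoders.

```python
import math


def every_nth(items, n):
    """Step 2: keep every n-th item (for example, video frames)."""
    return items[::n]


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_matches(frame_embeddings, query_embedding, k=3):
    """Step 5: return (frame_index, cosine_similarity) for the k best frames."""
    q = normalize(query_embedding)
    scored = []
    for idx, emb in enumerate(frame_embeddings):
        f = normalize(emb)
        scored.append((sum(a * b for a, b in zip(f, q)), idx))
    scored.sort(reverse=True)  # highest similarity first
    return [(idx, score) for score, idx in scored[:k]]


# Toy stand-ins for encoded frames (step 3) and an encoded query (step 4).
frames = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0], [-1.0, 0.0]]
query = [0.0, 2.0]
results = top_matches(frames, query, k=2)
sampled = every_nth(list(range(10)), 3)
```

Normalizing both sides first means the ranking depends only on the direction of the embeddings, not on their length.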

Examples

Here are some example searches from this YouTube video of a car driving around San Francisco.

"A fire truck"

Search results for "A fire truck"

"Road works"

Search results for "Road works"

"People crossing the street"

Search results for "People crossing the street"

"The Embarcadero"

Search results for "The Embarcadero"

"Waiting at the red light"

Search results for "Waiting at the red light"

"Green bike lane"

Search results for "Green bike lane"

"A street with tram tracks"

Search results for "A street with tram tracks"

"The Transamerica Pyramid"

Search results for "The Transamerica Pyramid"

Natural language search on Unsplash

You can also try my other project, which searches 2 million photos on Unsplash using natural language queries.

natural-language-youtube-search's People

Contributors

ak391, haltakov, matiastucci


natural-language-youtube-search's Issues

show seconds

Hi.
Would it be possible to show the second at which each of the three frames was found?
I don't know anything about programming, but I think it would be useful for the application.
Congratulations on the work.
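One possible way to derive such a timestamp, sketched under the assumption that frames are sampled at a fixed step of N and that the video's frame rate is known. The function name `frame_to_timestamp` and its parameters are hypothetical and do not come from the notebook.

```python
def frame_to_timestamp(sample_index, every_n, fps):
    """Map the index of a sampled frame back to a position in the video.

    sample_index: position of the frame among the sampled frames (0-based)
    every_n:      sampling step, i.e. every N-th frame was kept
    fps:          frame rate of the source video
    """
    seconds = sample_index * every_n / fps
    minutes, secs = divmod(int(seconds), 60)
    return seconds, f"{minutes:02d}:{secs:02d}"


# Example: the 45th sampled frame when every 30th frame of a 30 fps video is kept.
seconds, label = frame_to_timestamp(45, every_n=30, fps=30)
```

The label drops fractional seconds; the raw `seconds` value keeps them for cases where more precision is wanted.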

Error in YouTube(video_url).streams.... function

I get an error when going through this Colab notebook, in the cell where "streams =" is executed. I verified that the YouTube video exists.

HTTPError Traceback (most recent call last)

in ()
2
3 # Choose a video stream with resolution of 360p
----> 4 streams = YouTube(video_url).streams.filter(adaptive=True, subtype="mp4", resolution="360p", only_video=True)
5
6 # Check if there is a valid stream

12 frames

/usr/lib/python3.7/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
647 class HTTPDefaultErrorHandler(BaseHandler):
648 def http_error_default(self, req, fp, code, msg, hdrs):
--> 649 raise HTTPError(req.full_url, code, msg, hdrs, fp)
650
651 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 404: Not Found

Strange CLIP results

Hi, first of all many thanks for this wonderful script :)
I've been trying some searches and found some strange results. It's probably just how CLIP works, but I'm not sure.

If I search for "CAR" (there are a lot of cars in the video) and look at the similarity value of the best-matching frame, I get, e.g., 26.65.
Then I search for something nonsensical like "sdfsdflksdfj" and check the same value. I was expecting a near-zero value, but instead I get, e.g., 21.55.
Is this a bug, or is it the way CLIP works? Is there a way to detect how good the prediction is?
Many thanks!
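A hedged sketch of one way to read these numbers. It assumes the reported value is a cosine similarity multiplied by 100, as is common in CLIP usage examples; the notebook may compute it differently. Under that assumption, 26.65 and 21.55 correspond to cosine similarities of about 0.27 and 0.22, and such similarities tend not to drop to zero even for unrelated text. A relative comparison between competing queries, here via a softmax, can be more informative than the absolute value. The helper names below are hypothetical.

```python
import math


def to_cosine(score, scale=100.0):
    """Undo the assumed x100 scaling to recover a cosine similarity."""
    return score / scale


def softmax(scores):
    """Turn raw scores into relative weights that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Scores reported above, treated as competing queries for illustration.
scores = {"CAR": 26.65, "sdfsdflksdfj": 21.55}
probs = dict(zip(scores, softmax(list(scores.values()))))
```

With these two scores, nearly all of the softmax weight falls on "CAR", which suggests that the gap between queries carries more signal than the raw magnitude.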
