How to use Whisper to transcribe a YouTube video - Tutorial

In this tutorial you will learn how to use OpenAI's Whisper to transcribe a YouTube video.

[ Full written guide ]

What is Whisper?

Whisper is a state-of-the-art automatic speech recognition system from OpenAI that has been trained on 680,000 hours of multilingual and multitask supervised data collected from the web. This large and diverse dataset leads to improved robustness to accents, background noise and technical language. In addition, it enables transcription in multiple languages, as well as translation from those languages into English. Unlike DALL-E 2 and GPT-3, Whisper is a free and open-source model. OpenAI released the models and code to serve as a foundation for building useful applications that leverage speech recognition.

How to transcribe a YouTube video

In this tutorial we will use Whisper to transcribe a YouTube video. We will use the Python package "Pytube" to download the video's audio track as an MP4 file. You can find the repo of Pytube here

First, we need to install the Pytube library. You can do this by running the following command in a notebook cell (drop the leading "!" if you run it in a terminal):

!pip install --upgrade pytube

For this tutorial I'll be using this "Python in 100 Seconds" video.

Next, we need to import Pytube, provide the link to the YouTube video, and download its audio as an MP4 file:

# Import the Pytube library
import pytube

# Link to the "Python in 100 Seconds" YouTube video
video = "https://www.youtube.com/watch?v=x7X9w_GIm1s"
data = pytube.YouTube(video)

# Select the audio-only stream and download it as an MP4 file
audio = data.streams.get_audio_only()
audio.download()
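
If you want to control where the file ends up, the download step can be wrapped in a small helper. The following is only a sketch: the function name download_audio and the output_dir parameter are made up for illustration, and it assumes that Pytube's get_audio_only() may return None when no audio stream is available and that download() returns the path of the saved file.

```python
import os


def download_audio(youtube, output_dir="."):
    """Download the audio-only stream and return the path of the saved file.

    `youtube` is expected to be a pytube.YouTube object (hypothetical helper,
    not part of Pytube itself).
    """
    stream = youtube.streams.get_audio_only()
    if stream is None:
        # Assumption: Pytube yields None when no audio-only stream matches
        raise RuntimeError("No audio-only stream found for this video")
    # Make sure the target directory exists before downloading
    os.makedirs(output_dir, exist_ok=True)
    # download() writes the file and returns its full path
    return stream.download(output_path=output_dir)
```

With the variables from above, this could be called as download_audio(data, "downloads"), and the returned path can then be passed on to Whisper instead of typing the file name by hand.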

The output is a file named after the video title in your current directory. In our case, the file is named Python in 100 Seconds.mp4.

Now, the next step is to convert the audio into text. We can do this in three steps using Whisper. First, we install and import Whisper. Then we load the model, and finally we transcribe the audio file.

# Install and import the Whisper library

!pip install git+https://github.com/openai/whisper.git -q
import whisper

Next, we load the model. We'll use the "base" model for this tutorial. You can find more information about the models here. Each one of them trades accuracy against speed: larger models are generally more accurate but slower and need more compute.

# Load the "base" model
model = whisper.load_model("base")
# Transcribe the downloaded audio file
text = model.transcribe("Python in 100 Seconds.mp4")
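
A mistyped file name is an easy mistake at this step, so it can help to check that the file exists before handing it to Whisper. The sketch below is one way to do that. The helper name transcribe_file is hypothetical, and it only assumes that the model object has a transcribe() method returning a dictionary with a "text" key, as in the code above.

```python
from pathlib import Path


def transcribe_file(model, audio_path):
    """Transcribe one audio file and return the plain transcript text.

    `model` is expected to be a loaded Whisper model (hypothetical helper,
    not part of the Whisper package).
    """
    path = Path(audio_path)
    if not path.is_file():
        # Fail early with a clear message instead of a decoding error later
        raise FileNotFoundError(f"Audio file not found: {path}")
    result = model.transcribe(str(path))
    # The transcript may start with a space, so trim surrounding whitespace
    return result["text"].strip()
```

It would be used as transcribe_file(model, "Python in 100 Seconds.mp4") once the model has been loaded.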

And now we can print out the output.

# Print the transcript
print(text['text'])
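
Besides the full text, the result returned by transcribe() also contains a "segments" list, where each entry has "start" and "end" times in seconds and the "text" spoken in that span. As a rough sketch, the segments could be turned into timestamped lines like this. The helper names and the sample data are made up for illustration and are not real output from the video.

```python
def format_timestamp(seconds):
    """Format a number of seconds as MM:SS (fractions are truncated)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def format_segments(result):
    """Return one '[start - end] text' line per segment in a Whisper result."""
    lines = []
    for segment in result.get("segments", []):
        start = format_timestamp(segment["start"])
        end = format_timestamp(segment["end"])
        lines.append(f"[{start} - {end}] {segment['text'].strip()}")
    return lines


# Made-up sample in the shape of a Whisper result, for illustration only
sample = {
    "text": " first segment second segment",
    "segments": [
        {"start": 0.0, "end": 4.2, "text": " first segment"},
        {"start": 64.9, "end": 70.0, "text": " second segment"},
    ],
}
for line in format_segments(sample):
    print(line)
```

To apply it to the transcription from this tutorial, pass the text variable from above: format_segments(text).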

You can find the full code as Jupyter Notebook here

Thank you for reading. If you enjoyed this tutorial, you can find more and continue reading on our tutorial page - Fabian Stehle, Junior Data Scientist at New Native


Artificial Intelligence Hackathons, tutorials and Boilerplates

Join the LabLab Discord

On the lablab Discord, we discuss this repo and many other topics related to artificial intelligence! Check out upcoming Artificial Intelligence Hackathons events.

Accelerating innovation through acceleration
