theitguy
First things first, let's install the Pytube library. Open your terminal and run the following command:
Code:
pip install pytube
Now that Pytube is installed, let's move on to the next step.
Next, we need to import Pytube and provide the link to the YouTube video we want to transcribe. We'll use the following code to download the video's audio-only stream as an MP4 file:
Code:
# Importing the Pytube library
import pytube
# Reading the YouTube link
video = "https://www.youtube.com/watch?v=x7X9w_GIm1s"
data = pytube.YouTube(video)
# Selecting the audio-only stream and downloading it as an 'MP4' file (named after the video title)
audio = data.streams.get_audio_only()
audio.download()
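Since the transcription step needs the exact file name, it can help to pick the name yourself instead of relying on the video title. Here is a minimal sketch, assuming your installed Pytube version lets download() take output_path and filename arguments and return the saved path (worth checking against its documentation). download_audio is a hypothetical helper name, not part of Pytube:

```python
def download_audio(url, output_dir=".", filename="audio.mp4"):
    """Download the audio-only stream of a YouTube video and return the saved path.

    Hypothetical helper for illustration; it is not part of Pytube itself.
    """
    # Imported lazily so the function can be defined before Pytube is installed
    import pytube

    stream = pytube.YouTube(url).streams.get_audio_only()
    # download() returns the full path of the file it wrote
    return stream.download(output_path=output_dir, filename=filename)


# Example usage (needs network access):
# path = download_audio("https://www.youtube.com/watch?v=x7X9w_GIm1s")
# print(path)
```

The returned path can then be passed straight to the transcription step, so a changed video title won't break the script.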
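Now it's time to convert the audio into text using Whisper. We'll start by installing and importing the Whisper library. The leading "!" runs the command from a notebook cell; drop it if you're installing from a terminal: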
Code:
!pip install git+https://github.com/openai/whisper.git -q
Code:
import whisper
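One thing to watch for: Whisper reads audio files through the ffmpeg command-line tool, which pip does not install for you. A quick way to check for it, sketched here with only the standard library (ffmpeg_available is a hypothetical helper name):

```python
import shutil


def ffmpeg_available():
    """Return True if an ffmpeg executable can be found on the PATH."""
    return shutil.which("ffmpeg") is not None


if not ffmpeg_available():
    print("ffmpeg not found; install it before running Whisper")
```

If the message prints, install ffmpeg with your system's package manager before moving on.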
Next, we'll load the model. We'll use the "base" model for this tutorial; the Whisper GitHub repository lists the other available models. Each one trades accuracy against speed (and the compute needed).
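To give a rough feel for that tradeoff, here is a small sketch that picks the largest model under a size budget. The parameter counts are the approximate figures from the Whisper README and may change between releases; using parameter count as a stand-in for compute is my own simplification, and largest_model_within is a hypothetical helper, not part of Whisper:

```python
# Approximate parameter counts in millions, as listed in the Whisper README
MODEL_PARAMS_M = {"tiny": 39, "base": 74, "small": 244, "medium": 769, "large": 1550}


def largest_model_within(max_params_m):
    """Return the largest model whose parameter count fits the budget, or None."""
    fitting = [name for name, size in MODEL_PARAMS_M.items() if size <= max_params_m]
    return max(fitting, key=MODEL_PARAMS_M.get) if fitting else None


print(largest_model_within(100))  # prints "base"
```

The returned name can be passed to whisper.load_model(). Bigger models are generally more accurate but slower and need more memory.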
Finally, we'll transcribe the audio file using the following code:
Code:
model = whisper.load_model("base")
text = model.transcribe("Python in 100 Seconds.mp4")
And that's it! We can print out the transcribed text:
Code:
# Printing the transcription
print(text['text'])
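Besides the full text, the dictionary returned by transcribe() also has a "segments" list, where each entry carries "start" and "end" times in seconds plus its own "text". Here is a minimal sketch for printing timestamped lines. format_segments is a hypothetical helper, and the sample data is made up to mimic the shape of the real output:

```python
def format_segments(result):
    """Turn a Whisper-style result dict into '[start -> end] text' lines."""
    lines = []
    for seg in result.get("segments", []):
        lines.append(f"[{seg['start']:.2f}s -> {seg['end']:.2f}s] {seg['text'].strip()}")
    return lines


# Stand-in data with the same shape as the transcribe() output (made up for illustration)
sample = {
    "text": " Hello there. Welcome back.",
    "segments": [
        {"start": 0.0, "end": 2.5, "text": " Hello there."},
        {"start": 2.5, "end": 4.0, "text": " Welcome back."},
    ],
}

for line in format_segments(sample):
    print(line)
```

With the variable from this tutorial, you would call format_segments(text) instead of passing the sample data.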