YouTube Video Transcription with Whisper

  • Thread starter theitguy
Hey there, tech enthusiasts! In this Whisper tutorial, we'll dive into audio transcription using Python. We'll use the Pytube library to download a YouTube video's audio track as an MP4 file, and then use Whisper to transcribe that audio into text.



First things first, let's install the Pytube library. Open your terminal and run the following command:


Code:
pip install pytube


Now that Pytube is installed, let's move on to the next step.



Next, we need to import Pytube and give it the link to the YouTube video we want to transcribe. The following code picks the video's audio-only stream and downloads it as an MP4 file:



Code:
# Import the Pytube library
import pytube

# Point Pytube at the YouTube link
video = "https://www.youtube.com/watch?v=x7X9w_GIm1s"
data = pytube.YouTube(video)

# Select the audio-only stream and download it (MP4 container)
audio = data.streams.get_audio_only()
audio.download()


The output is a file in your current directory, named after the video title. In our case, the file is named "Python in 100 Seconds.mp4".
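
If you'd rather not depend on the video title for the file name, here's a minimal sketch that saves the audio under a name you choose and hands back the path. The function name download_audio and its default file name "audio.mp4" are my own picks for illustration. It assumes a Pytube version where download() accepts output_path and filename and returns the saved path; some versions expect the extension to be part of filename, so it's included here.

```python
from pathlib import Path


def download_audio(url: str, output_dir: str = ".", filename: str = "audio.mp4") -> Path:
    """Download a video's audio-only stream and return the saved file's path."""
    # Imported inside the function so this file still loads without Pytube.
    import pytube

    stream = pytube.YouTube(url).streams.get_audio_only()
    if stream is None:
        # get_audio_only() returns None when no matching stream is found.
        raise RuntimeError("no audio-only stream found for this video")

    saved = stream.download(output_path=output_dir, filename=filename)
    return Path(saved)


# Example call (needs network access), left commented out on purpose:
# path = download_audio("https://www.youtube.com/watch?v=x7X9w_GIm1s")
# print(path)
```

With a fixed file name you can pass the same path straight to Whisper later without worrying about spaces or special characters in the title.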



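Now, it's time to convert the audio into text using Whisper. We'll start by installing and importing the Whisper library. The leading "!" in the install command is for notebook cells, so drop it if you're running the command in a regular terminal. Note that the package on PyPI is called openai-whisper (a plain "pip install whisper" gets you an unrelated package), and that Whisper needs the ffmpeg command-line tool installed on your system to read audio files.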



Code:
!pip install git+https://github.com/openai/whisper.git -q

Code:
import whisper
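
Before going further, it can save some head-scratching to confirm that everything is actually in place. Here's a small sketch using only the standard library. The helper name is_importable is made up for this post, and the check only tells you that the modules and the ffmpeg executable can be found, not that they work.

```python
import importlib.util
import shutil


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Module names as they are imported in this tutorial.
for name in ("pytube", "whisper"):
    status = "found" if is_importable(name) else "missing"
    print(f"{name}: {status}")

# Whisper calls the ffmpeg executable to decode audio, so check PATH too.
print("ffmpeg:", "found" if shutil.which("ffmpeg") else "missing")
```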

Next, we'll load the model. We'll use the "base" model for this tutorial, but you can find more information about the available models in the Whisper repository's README. Each one trades accuracy against speed: the larger models are generally more accurate, but they are slower and need more memory and compute.
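
If you plan to try different model sizes, a tiny wrapper can catch typos before anything gets downloaded. This is just a sketch: load_whisper_model and KNOWN_SIZES are names I made up, and the size list is an assumption based on the names in the Whisper README, so treat whisper.available_models() as the authoritative list.

```python
# Size names taken from the Whisper README (assumption: the list may have
# grown since; whisper.available_models() reports the current one).
KNOWN_SIZES = ("tiny", "base", "small", "medium", "large")


def load_whisper_model(size: str = "base"):
    """Validate the size name, then load the matching Whisper model."""
    if size not in KNOWN_SIZES:
        raise ValueError(f"unknown model size {size!r}; expected one of {KNOWN_SIZES}")

    # Imported late so the name check works even before Whisper is installed.
    import whisper

    return whisper.load_model(size)


# Example call, commented out because it downloads model weights:
# model = load_whisper_model("tiny")
```

Starting with "tiny" or "base" is a reasonable way to test the whole pipeline quickly before switching to a bigger model.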



Finally, we'll transcribe the audio file using the following code:



Code:
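# Load the "base" model (the weights are downloaded on first use)
model = whisper.load_model("base")
# Transcribe the downloaded audio; the result is a dictionary
text = model.transcribe("Python in 100 Seconds.mp4")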


And that's it! The result is a dictionary, and the transcript itself is stored under its "text" key, so we can print it out:



Code:
# Print the transcribed text
print(text['text'])
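
Besides "text", the dictionary returned by transcribe() also has a "segments" list, where each entry carries "start" and "end" times in seconds plus its own "text". Here's a rough sketch of turning those into timestamped lines. The helper names format_timestamp and format_segments are mine, and the sample dictionary is a hand-written stand-in, not real Whisper output.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as MM:SS, dropping the fractional part."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def format_segments(result: dict) -> list:
    """Build one '[start - end] text' line per segment in a transcribe() result."""
    lines = []
    for seg in result.get("segments", []):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} - {end}] {seg['text'].strip()}")
    return lines


# Hand-written stand-in shaped like a transcribe() result (not real output).
sample = {
    "text": " Hello there. This is a test.",
    "segments": [
        {"start": 0.0, "end": 2.5, "text": " Hello there."},
        {"start": 2.5, "end": 65.2, "text": " This is a test."},
    ],
}

for line in format_segments(sample):
    print(line)
```

In the tutorial's code you would call format_segments(text), since text is the variable holding the transcribe() result.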
 