# Transcription and QA based on video

Scripts to download a video from a URL, transcribe it with an audio-to-text model (OpenAI's whisper-large-v3), and generate questions and answers from the transcript with an LLM (the ChatGPT API). The QA step answers questions based on the video content.

## Setup

1. Create the conda environment:

   ```bash
   conda create -n video-transcription python=3.12
   ```

2. Activate the environment:

   ```bash
   conda activate video-transcription
   ```

3. Install the requirements:

   ```bash
   pip install -r requirements.txt
   ```

4. Download the video, passing its URL as an argument (a sketch of this step follows the list):

   ```bash
   python scripts/video_downloader.py --url <URL>
   ```

5. Extract the audio track, passing the video filepath and the audio output path (sketched below):

   ```bash
   python scripts/audio_from_video.py --filepath download/video --output_path download/audio.wav
   ```

6. Transcribe the audio file with the OpenAI whisper-large-v3 model (sketched below):

   ```bash
   python scripts/audio_transcription.py --audio_path download/audio.wav --transcription_path download/transcription.json --timestamps True
   ```

7. Add OPENAI_API_KEY to the environment variables:

   ```bash
   export OPENAI_API_KEY=<API_KEY>
   ```

8. Ask questions based on the context of the transcription using the ChatGPT API (sketched below):

   ```bash
   python scripts/question_answer.py --transcription_path download/transcription.txt --questions_path download/questions.txt --answers_path download/answers.txt
   ```

9. Alternatively, run the entire pipeline with the transcribe_url.sh bash script:

   ```bash
   ./transcribe_url.sh <URL>
   ```
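
The script bodies are not shown on this page, so the following are illustrative sketches rather than the repository's actual code. First, a minimal downloader for step 4, assuming the yt-dlp package (the real scripts/video_downloader.py may use a different library):

```python
# Hypothetical sketch of scripts/video_downloader.py; assumes yt-dlp.
import argparse
from yt_dlp import YoutubeDL

def download(url: str, out_dir: str = "download") -> None:
    # Save the best available stream as download/video.<ext>
    opts = {"outtmpl": f"{out_dir}/video.%(ext)s"}
    with YoutubeDL(opts) as ydl:
        ydl.download([url])

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--url", required=True)
    args = parser.parse_args()
    download(args.url)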
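
A sketch of the audio-extraction step (step 5), assuming ffmpeg is installed and on PATH; the flags write 16 kHz mono WAV, the input format Whisper models expect:

```python
# Hypothetical sketch of scripts/audio_from_video.py; assumes ffmpeg.
import argparse
import subprocess

def extract_audio(filepath: str, output_path: str) -> None:
    # -vn drops the video stream; -ac 1 -ar 16000 gives 16 kHz mono audio.
    subprocess.run(
        ["ffmpeg", "-y", "-i", filepath,
         "-vn", "-ac", "1", "-ar", "16000", output_path],
        check=True,
    )

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--filepath", required=True)
    parser.add_argument("--output_path", required=True)
    args = parser.parse_args()
    extract_audio(args.filepath, args.output_path)
```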
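
A sketch of the transcription step (step 6) using the Hugging Face transformers ASR pipeline with openai/whisper-large-v3; how the repo loads the model and serializes timestamps is an assumption:

```python
# Hypothetical sketch of scripts/audio_transcription.py.
import json
from transformers import pipeline

# openai/whisper-large-v3 via the ASR pipeline; chunk_length_s enables
# long-form audio by transcribing 30-second windows.
asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v3",
    chunk_length_s=30,
)

# return_timestamps=True yields per-chunk start/end times alongside the text.
result = asr("download/audio.wav", return_timestamps=True)
with open("download/transcription.json", "w") as f:
    json.dump(result, f, indent=2)
```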
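
A sketch of the question-answering step (step 8) with the OpenAI Python client; the model name (gpt-4o-mini) and the prompt format are assumptions, not the repo's exact code:

```python
# Hypothetical sketch of scripts/question_answer.py.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("download/transcription.txt") as f:
    transcript = f.read()

question = "What is the video about?"
response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model choice
    messages=[
        {"role": "system",
         "content": "Answer only from the transcript below.\n\n" + transcript},
        {"role": "user", "content": question},
    ],
)
print(response.choices[0].message.content)
```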
