whispervtt

Runs Whisper over a directory of media files. For each mp4 and mp3 file, it generates a text transcript and a WebVTT caption file with FADGI metadata, using an optional CSV for Title and Media Identifier.
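The file-selection step described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the function names `find_media_files` and `output_paths` are hypothetical, and writing the outputs next to each media file with `.txt` and `.vtt` extensions is an assumption.

```python
from pathlib import Path

# Extensions the README says are processed
MEDIA_SUFFIXES = {".mp4", ".mp3"}


def find_media_files(directory):
    """Return the mp4 and mp3 files directly inside `directory`, sorted by path."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in MEDIA_SUFFIXES
    )


def output_paths(media_path):
    """Return assumed (transcript, captions) paths next to the media file."""
    return media_path.with_suffix(".txt"), media_path.with_suffix(".vtt")
```

Matching on the lowercased suffix means files named `.MP3` or `.Mp4` are picked up as well; whether the real script does this is not stated here.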

The CSV file should have these headers: "File", "Title", "Media Identifier". Put the name of each media file in the "File" column. (For best results, save the CSV with UTF-8 encoding.) For example:

| File | Title | Media Identifier |
| --- | --- | --- |
| filename.mp4 | media_title | uniqueID |
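One way to read such a CSV into a lookup keyed by file name is sketched below. The function name `load_metadata` is hypothetical and the script may handle the CSV differently; only the three column headers come from this README.

```python
import csv


def load_metadata(csv_path):
    """Map each "File" value to its "Title" and "Media Identifier"."""
    lookup = {}
    # utf-8-sig reads plain UTF-8 and also tolerates a byte-order mark,
    # which some spreadsheet exports add at the start of the file
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        for row in csv.DictReader(handle):
            name = (row.get("File") or "").strip()
            if not name:
                continue  # skip rows with no file name
            lookup[name] = {
                "Title": (row.get("Title") or "").strip(),
                "Media Identifier": (row.get("Media Identifier") or "").strip(),
            }
    return lookup
```

A media file with no matching row is simply absent from the lookup, which fits the CSV being optional.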

The script is customized for Emory Libraries. FADGI metadata is either generated by the script or matched from the CSV, as follows:

| Element | Source |
| --- | --- |
| Type | hard-coded |
| Language | auto-detected by Whisper or supplied by user |
| Responsible Party | hard-coded |
| Media Identifier | matched from CSV |
| Originating File | generated from media file |
| File Creator | hard-coded |
| File Creation Date | generated from script |
| Title | matched from CSV |
| Origin History | hard-coded |
| Local Usage Element: Software version | generated from script |
| Local Usage Element: Review history | hard-coded |
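The mapping above could be assembled into a WebVTT header roughly as sketched below. Everything beyond the element names is an assumption: the function name `build_header`, the `hard_coded` dictionary (the real Emory values are not given here), the `Key: value` layout directly under the `WEBVTT` line, and the way the two Local Usage Element entries are written. Check the FADGI guidelines and the script itself for the actual format.

```python
from datetime import date


def build_header(media_name, language, csv_row, hard_coded,
                 software_version, created=None):
    """Build an assumed FADGI-style header block for a WebVTT file."""
    created = created or date.today()  # File Creation Date: generated at run time
    pairs = [
        ("Type", hard_coded.get("Type", "")),
        ("Language", language),  # auto-detected or supplied by the user
        ("Responsible Party", hard_coded.get("Responsible Party", "")),
        ("Media Identifier", csv_row.get("Media Identifier", "")),
        ("Originating File", media_name),
        ("File Creator", hard_coded.get("File Creator", "")),
        ("File Creation Date", created.isoformat()),
        ("Title", csv_row.get("Title", "")),
        ("Origin History", hard_coded.get("Origin History", "")),
        ("Local Usage Element",
         f"Software version: {software_version}"),
        ("Local Usage Element",
         f"Review history: {hard_coded.get('Review history', '')}"),
    ]
    lines = ["WEBVTT"] + [f"{key}: {value}" for key, value in pairs]
    # A blank line ends the header; cues would follow it
    return "\n".join(lines) + "\n\n"
```

When no CSV row matches a media file, passing an empty dictionary as `csv_row` leaves Title and Media Identifier blank rather than failing.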
