27.5 openai transcribe

20240324

ml transcribe openai [-l LANGUAGE] [-f OUTPUT_FILE_FORMAT] [FILENAME]

The whisper package provides the functionality for the transcribe command, which we expose through the MLHub package. Its interface is consistent with the transcribe commands of other MLHub packages.

The input media file can be wav, mp4, mp3, or flac. By default the output is written to stdout as plain text. Other formats are also supported: json, srt, tsv, txt, and vtt. The format can be chosen with, for example, -f json or --format=json. The output can also be saved to a file with -o harvard.json or --output=harvard.json, in which case the output format is taken from the file name extension.
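As a rough illustration of the extension convention, the following shell sketch maps an output file name to the format it implies. The function name format_of is hypothetical and not part of MLHub; it only mimics the behaviour described above, using the formats listed in the text.

```shell
# Hypothetical helper (not part of MLHub): report the output format
# implied by a file name, limited to the formats listed in the text.
format_of() {
  ext="${1##*.}"                      # text after the last dot
  case "$ext" in
    json|srt|tsv|txt|vtt) printf '%s\n' "$ext" ;;
    *) printf 'unknown\n'; return 1 ;;
  esac
}

format_of harvard.json   # prints: json
format_of jokowi.srt     # prints: srt
```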

By default the output is written to stdout, one sentence per line.

wget https://github.com/realpython/python-speech-recognition/raw/master/audio_files/harvard.wav
ml transcribe openai harvard.wav

We can also try this out with a speech by Indonesia’s President Jokowi. Download a sample from Togaware:

wget https://access.togaware.com/jokowi.wav

The following command saves the output to the file jokowi.srt. An srt file contains timestamps and the text spoken during each timestamped period, which is the information needed to add subtitles to a video. The subtitle format is indicated implicitly by the filename extension.

ml transcribe openai --output jokowi.srt --lang id jokowi.wav
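An srt file is a sequence of numbered cues: an index line, a timestamp line of the form HH:MM:SS,mmm --> HH:MM:SS,mmm, one or more lines of text, and a blank separator line. As a sketch of working with that layout, the awk filter below drops the index and timestamp lines to recover just the spoken text. The function name srt_text is hypothetical, and the sample cues are placeholders rather than real output from the command.

```shell
# Hypothetical helper: print only the text lines of an srt stream on stdin.
# awk paragraph mode (RS="") treats each blank-line-separated cue as one
# record; with FS="\n" line 1 is the index and line 2 the timestamps.
srt_text() {
  awk 'BEGIN { RS = ""; FS = "\n" } { for (i = 3; i <= NF; i++) print $i }'
}

# Placeholder cues for illustration only, not real transcription output.
srt_text <<'EOF'
1
00:00:00,000 --> 00:00:03,500
first placeholder sentence

2
00:00:03,500 --> 00:00:07,000
second placeholder sentence
EOF
```

With a real transcript the same filter would be run as srt_text < jokowi.srt.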

If the output file already exists then it will not be overwritten unless --force is provided on the command line.
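A script that calls the command repeatedly may prefer to check for an existing output file itself rather than pass --force unconditionally. A minimal sketch, assuming ml is on the PATH; the wrapper name transcribe_once is hypothetical, and it uses only the --output and --lang options shown above.

```shell
# Hypothetical wrapper: transcribe only when the output file is absent,
# so an existing transcript is left untouched.
transcribe_once() {
  audio=$1
  out=$2
  lang=$3
  if [ -e "$out" ]; then
    printf 'skipping %s: already exists\n' "$out" >&2
    return 0
  fi
  ml transcribe openai --output "$out" --lang "$lang" "$audio"
}
```

It would be called as transcribe_once jokowi.wav jokowi.srt id.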

When no output file is given, the output format can be chosen with the -f or --format option. The following will transcribe the audio and write the output to stdout in srt format:

ml transcribe openai --format=srt --lang=id jokowi.wav
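Because the transcript goes to stdout, it can be redirected like any other command output. The sketch below applies this to every wav file in the current directory. The names srt_name and transcribe_all are hypothetical, and the loop assumes all files share one language. Note that a shell redirect replaces an existing file without asking.

```shell
# Hypothetical helper: name the subtitle file that belongs to an audio file
# by swapping the final extension for .srt.
srt_name() {
  printf '%s.srt\n' "${1%.*}"
}

# Hypothetical helper: transcribe each wav file in the current directory,
# redirecting the srt text from stdout. $1 is the language code, e.g. id.
transcribe_all() {
  for f in *.wav; do
    [ -e "$f" ] || continue        # no wav files: the glob stays literal
    ml transcribe openai --format=srt --lang="$1" "$f" > "$(srt_name "$f")"
  done
}
```

It would be called as transcribe_all id.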


Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0