22.6 azspeech transcribe
20221005
The transcribe command will, by default, listen for up to 15 seconds of speech from the microphone and then convert it to text, written to the console. The command can also be used to transcribe speech from an audio file (wav) provided as the FILENAME argument. The source language may be required, though several languages are automatically identified.
$ ml transcribe azspeech [FILENAME]
-l <lang> --lang=<lang>
A simple example, listening for the audio on the microphone:
$ ml transcribe azspeech
might result in:
The machine learning hub is useful for demonstrating capability of
models as well as providing command line tools.
The command can take an audio wav file, as an optional argument, and transcribe it to the console. For large audio files this can take some time. Currently only wav files are supported through the command line (though the cloud service also supports mp3, ogg, and flac).
$ wget https://github.com/realpython/python-speech-recognition/raw/master/audio_files/harvard.wav
$ ml transcribe azspeech harvard.wav
This will transcribe the audio as the following text:
The stale smell of old beer lingers it takes heat to bring out the odor.
A cold dip restore's health and Zest, a salt pickle taste fine with
Ham tacos, Al Pastore are my favorite a zestful food is the hot cross bun.
To convert from other audio formats to a suitable wav file see the section on converting audio formats in the GNU/Linux Desktop Survival Guide.
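As one possible approach to the conversion step, the sketch below wraps ffmpeg in a small shell function. The function name to_wav is hypothetical, ffmpeg is assumed to be installed, and the 16 kHz mono target is an assumption about what suits speech transcription rather than a requirement stated in this section.

```shell
# Sketch: convert an audio file (mp3, ogg, flac, ...) to a wav file.
# to_wav is a hypothetical helper name, not part of MLHub.
# Assumption: 16 kHz mono is a reasonable target for speech audio.
to_wav() {
    towav_src="$1"
    # Default output name: same basename with a .wav extension.
    towav_dst="${2:-${towav_src%.*}.wav}"
    if [ "$towav_src" = "$towav_dst" ]; then
        echo "to_wav: input and output are the same file" >&2
        return 1
    fi
    # -y overwrites an existing output, -ar sets the sample rate,
    # -ac sets the number of channels.
    ffmpeg -loglevel error -y -i "$towav_src" -ar 16000 -ac 1 "$towav_dst" &&
        printf '%s\n' "$towav_dst"
}
```

With the function defined, `to_wav talk.mp3` would write talk.wav and print its name, and the result can then be passed to `ml transcribe azspeech` as the FILENAME argument.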
To save the output to a text file simply use the shell redirect operator >.
$ ml transcribe azspeech harvard.wav > harvard.txt
$ cat harvard.txt
The stale smell of old beer lingers it takes heat to bring out the odor.
A cold dip restore's health and Zest, a salt pickle taste fine with
Ham tacos, Al Pastore are my favorite a zestful food is the hot cross bun.
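The same redirect can be applied to several files in a loop. The following is a minimal sketch, assuming a directory of wav files; the function name transcribe_dir is hypothetical and the only MLHub command it relies on is the `ml transcribe azspeech FILENAME` form shown above.

```shell
# Sketch: transcribe every wav file in a directory, saving each
# transcript to a .txt file alongside the audio.
# transcribe_dir is a hypothetical helper name, not part of MLHub.
transcribe_dir() {
    tdir="${1:-.}"
    for twav in "$tdir"/*.wav; do
        # Skip the literal pattern when no wav files match.
        [ -e "$twav" ] || continue
        ttxt="${twav%.wav}.txt"
        ml transcribe azspeech "$twav" > "$ttxt"
        printf '%s -> %s\n' "$twav" "$ttxt"
    done
}
```

Calling `transcribe_dir recordings` would then leave, for example, recordings/harvard.txt next to recordings/harvard.wav.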
The transcribe command will only record up to 15 seconds. To transcribe more than this from your own recorded voice, simply save the recording into a file and then transcribe that file. See the section on recording audio in the GNU/Linux Desktop Survival Guide for details.
A powerful graphical tool to record audio is audacity, but a simple command line application to record from the computer's microphone is arecord.
Terminate the recording session with Ctrl-C and then transcribe the recording by passing the saved wav file as the FILENAME argument.
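A record-then-transcribe workflow might be sketched as follows. This is an illustration rather than the book's own commands: the function name record_and_transcribe and the default file name recording.wav are hypothetical, and it uses arecord's -d option to stop after a fixed number of seconds instead of waiting for Ctrl-C, so that it can run unattended.

```shell
# Sketch: record from the microphone for a fixed time, then transcribe.
# record_and_transcribe and recording.wav are hypothetical names.
record_and_transcribe() {
    rat_secs="${1:-60}"            # recording length in seconds
    rat_wav="${2:-recording.wav}"  # where to save the audio
    # -q suppresses status messages, -f cd selects 16 bit 44.1 kHz stereo,
    # -d stops the recording after the given number of seconds.
    arecord -q -f cd -d "$rat_secs" "$rat_wav" || return 1
    ml transcribe azspeech "$rat_wav"
}
```

For example, `record_and_transcribe 120 talk.wav > talk.txt` would record two minutes of audio into talk.wav and save the transcript to talk.txt.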
Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0