Automation 360

IBM Watson Speech to Text package

Updated: 2021/09/02

This package supports the following audio file formats: flac, mpeg, mp3, ogg, pcm, wav, and webm. The following languages are supported: Arabic, Brazilian Portuguese, Chinese (Mandarin), English (United Kingdom and United States), French, German, Japanese, Korean, and Spanish (Argentinian, Castilian, Chilean, Colombian, Mexican, and Peruvian).

Important: This is a beta package and is currently not available with the Automation 360 Enterprise and Cloud editions.
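
Before passing a file to the package, it can help to confirm that its extension is one of the supported audio formats listed above. The following is a minimal Python sketch of such a pre-check. It is not part of the package: the function name `is_supported_audio` is hypothetical, and the check assumes that the file extension reflects the actual audio format.

```python
from pathlib import Path

# Audio formats listed in the package description.
SUPPORTED_FORMATS = {"flac", "mpeg", "mp3", "ogg", "pcm", "wav", "webm"}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension matches a listed format.

    Hypothetical helper: it inspects only the extension and does not
    examine the file contents.
    """
    extension = Path(path).suffix.lower().lstrip(".")
    return extension in SUPPORTED_FORMATS
```

For example, `is_supported_audio("call.WAV")` returns `True` because the comparison ignores case, while `is_supported_audio("notes.txt")` returns `False`.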


Feature	Description
Detect speakers	Identifies the individual speakers in a conversation between multiple people. Supports English, Japanese, and Spanish. Intended for conversations between two people; handles a maximum of six speakers. For best results, use an audio file that is at least one minute long. The output contains the words spoken by each speaker and the timestamp.
Keyword spotting	Detects specific strings in the transcript. The output contains the timestamp(s) for each keyword and a confidence score.
Smart formatting	Converts the following types of strings into more conventional representations to make the transcript easier to read: dates, times, series of digits and numbers, phone numbers, currency values, and email and web addresses. For examples, see Smart formatting results. This feature supports English, Japanese, and Spanish.
Profanity filter	Obscures profanity by replacing it with asterisks in the transcript.
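
Because keyword spotting reports a timestamp and a confidence score for each keyword, a common follow-up step is to keep only the high-confidence matches. The sketch below illustrates that step in Python. The record layout (`KeywordHit`), the sample values, and the 0.8 threshold are all assumptions made for illustration; the format in which the package actually returns its output is not described here and may differ.

```python
from typing import List, NamedTuple


class KeywordHit(NamedTuple):
    """Hypothetical record for one keyword match (assumed layout)."""
    keyword: str
    start_seconds: float  # timestamp of the match in the audio
    confidence: float     # score between 0.0 and 1.0


def confident_hits(hits: List[KeywordHit], threshold: float = 0.8) -> List[KeywordHit]:
    """Keep only the matches whose confidence meets the threshold."""
    return [hit for hit in hits if hit.confidence >= threshold]


# Made-up sample data, for illustration only.
hits = [
    KeywordHit("refund", 12.4, 0.93),
    KeywordHit("refund", 47.1, 0.52),
    KeywordHit("invoice", 30.0, 0.88),
]

for hit in confident_hits(hits):
    print(f"{hit.keyword} at {hit.start_seconds}s (confidence {hit.confidence})")
```

The threshold is a trade-off: a higher value drops more false matches but can also discard genuine mentions that were spoken unclearly.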