IBM Watson Speech to Text package

This package supports the following audio file formats: flac, mpeg, mp3, ogg, pcm, wav, and webm. The following languages are supported: Arabic, Brazilian Portuguese, Chinese (Mandarin), English (United Kingdom and United States), French, German, Japanese, Korean, and Spanish (Argentinian, Castilian, Chilean, Colombian, Mexican, and Peruvian).
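
As an illustration of the format list above, a bot developer might pre-check a file before passing it to the package. The following is a minimal sketch, not part of the package: the helper name `is_supported_audio` is hypothetical, and it only inspects the file extension, which does not guarantee that the file content is valid audio.

```python
from pathlib import Path

# Audio formats listed in the documentation above.
SUPPORTED_FORMATS = {"flac", "mpeg", "mp3", "ogg", "pcm", "wav", "webm"}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension matches one of the listed formats.

    Hypothetical helper for illustration; it checks the extension only.
    """
    suffix = Path(path).suffix.lower().lstrip(".")
    return suffix in SUPPORTED_FORMATS


print(is_supported_audio("call_recording.WAV"))  # True
print(is_supported_audio("call_recording.aac"))  # False
```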

Important: This is a beta package and is currently not available with the Automation 360 Enterprise and Cloud editions.
Feature: Description
Detect speakers: Identifies the individual speakers in a conversation between multiple people.
  • Supports English, Japanese, and Spanish.
  • Intended for conversations between two people; supports a maximum of six people.
  • For best results, use an audio file at least a minute long.
  The output contains the words spoken by each speaker and their timestamps.
Keyword spotting: Detects specific strings in the transcript. The output contains the timestamp(s) for each keyword and a confidence score.
Smart formatting: Converts the following types of strings into more conventional representations to make the transcript easier to read:
  • Dates
  • Times
  • Series of digits and numbers
  • Phone numbers
  • Currency values
  • Email and web addresses
  For examples, see Smart formatting results. This feature supports English, Japanese, and Spanish.
Profanity filter: Obscures profanity by replacing it with asterisks in the transcript.
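
To make two of the behaviors above more concrete, the following sketch shows what grouping timestamped words by speaker and masking profanity with asterisks could look like. It is an illustration under stated assumptions, not the package's implementation: the `(speaker, word, start_seconds)` tuple shape, the function names, and the choice to use one asterisk per masked character are all hypothetical.

```python
import re
from collections import defaultdict


def group_by_speaker(words):
    """Group (speaker, word, start_seconds) tuples by speaker, keeping order.

    The tuple shape is an assumption made for this example.
    """
    grouped = defaultdict(list)
    for speaker, word, start in words:
        grouped[speaker].append((start, word))
    return dict(grouped)


def mask_profanity(text, blocked_words):
    """Replace each blocked word with asterisks (whole words, case-insensitive).

    Using one asterisk per character is an assumption of this sketch.
    """
    if not blocked_words:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in blocked_words) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: "*" * len(m.group(0)), text)


words = [(0, "hello", 0.0), (1, "hi", 0.6), (0, "there", 1.1)]
print(group_by_speaker(words))
# {0: [(0.0, 'hello'), (1.1, 'there')], 1: [(0.6, 'hi')]}
print(mask_profanity("That darn printer", ["darn"]))
# That **** printer
```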