Whisper (speech recognition system)
| Whisper (speech recognition system) | |
|---|---|
| Original author | OpenAI |
| Initial release | September 21, 2022 |
| Written in | Python |
| License | MIT License |
| Repository | github |
Whisper is a machine learning model for speech recognition and transcription, created by OpenAI and first released as open-source software in September 2022.
It can transcribe speech in English and multiple other languages, and can translate speech from several non-English languages into English. Whisper is a weakly supervised deep learning acoustic model built on an encoder-decoder transformer architecture. OpenAI claims that the combination of different training data and post-training filtering used in its development has improved its recognition of speech with accents, background noise, and jargon compared to previous approaches. While the model does not outperform larger, more specialized models and still experiences AI hallucination, it has been shown to be useful for general sound recognition and has many applications across different industries.