Machine Learning News Hubb
Advertisement Banner
  • Home
  • Machine Learning
  • Artificial Intelligence
  • Big Data
  • Deep Learning
  • Edge AI
  • Neural Network
  • Contact Us
  • Home
  • Machine Learning
  • Artificial Intelligence
  • Big Data
  • Deep Learning
  • Edge AI
  • Neural Network
  • Contact Us
Machine Learning News Hubb
No Result
View All Result
Home Artificial Intelligence

Introducing Whisper

admin by admin
January 19, 2023
in Artificial Intelligence


We’ve trained and are open-sourcing a neural net called Whisper that approaches human level robustness and accuracy on English speech recognition.

Read Paper


View Code


View Model Card

Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. We show that the use of such a large and diverse dataset leads to improved robustness to accents, background noise and technical language. Moreover, it enables transcription in multiple languages, as well as translation from those languages into English. We are open-sourcing models and inference code to serve as a foundation for building useful applications and for further research on robust speech processing.

The Whisper architecture is a simple end-to-end approach, implemented as an encoder-decoder Transformer. Input audio is split into 30-second chunks, converted into a log-Mel spectrogram, and then passed into an encoder. A decoder is trained to predict the corresponding text caption, intermixed with special tokens that direct the single model to perform tasks such as language identification, phrase-level timestamps, multilingual speech transcription, and to-English speech translation.

Other existing approaches frequently use smaller, more closely paired audio-text training datasets, or use broad but unsupervised audio pretraining. Because Whisper was trained on a large and diverse dataset and was not fine-tuned to any specific one, it does not beat models that specialize in LibriSpeech performance, a famously competitive benchmark in speech recognition. However, when we measure Whisper’s zero-shot performance across many diverse datasets we find it is much more robust and makes 50% fewer errors than those models.

About a third of Whisper’s audio dataset is non-English, and it is alternately given the task of transcribing in the original language or translating to English. We find this approach is particularly effective at learning speech to text translation and outperforms the supervised SOTA on CoVoST2 to English translation zero-shot.

We hope Whisper’s high accuracy and ease of use will allow developers to add voice interfaces to a much wider set of applications. Check out the paper, model card, and code to learn more details and to try out Whisper.



Source link

Previous Post

Time Series Data: Analysis vs Forecasting

Next Post

AI Advent Calendar 2022

Next Post

AI Advent Calendar 2022

Fixing the Mismatch Between Skills and Jobs: A Pilot Project to Test Learning and Employment Records

How Artificial Intelligence Will Change Mobile Apps

Related Post

Artificial Intelligence

3 Ways to Build a Geographical Map in Python Altair | by Angelica Lo Duca | Jan, 2023

by admin
January 30, 2023
Machine Learning

Want to get a quick and profound overview of the 42 most common used Machine Learning Algorithms? | by Murat Durmus (CEO @AISOMA_AG) | Jan, 2023

by admin
January 30, 2023
Machine Learning

Scan Business Cards to Excel or Google Contacts

by admin
January 30, 2023
Artificial Intelligence

Amazon SageMaker built-in LightGBM now offers distributed training using Dask

by admin
January 30, 2023
Artificial Intelligence

Don’t blame a Data Scientist on failed projects! | by Darya Petrashka | Dec, 2022

by admin
January 30, 2023
Edge AI

BrainChip Tapes Out AKD1500 Chip in GlobalFoundries 22nm FD SOI Process

by admin
January 30, 2023

© 2023 Machine Learning News Hubb All rights reserved.

Use of these names, logos, and brands does not imply endorsement unless specified. By using this site, you agree to the Privacy Policy and Terms & Conditions.

Navigate Site

  • Home
  • Machine Learning
  • Artificial Intelligence
  • Big Data
  • Deep Learning
  • Edge AI
  • Neural Network
  • Contact Us

Newsletter Sign Up.

No Result
View All Result
  • Home
  • Machine Learning
  • Artificial Intelligence
  • Big Data
  • Deep Learning
  • Edge AI
  • Neural Network
  • Contact Us

© 2023 JNews - Premium WordPress news & magazine theme by Jegtheme.