Wav2Vec2 XLSR-53 eSpeak CV FT

meta audio
audio text

A Wav2Vec2 XLSR-53 model fine-tuned on Common Voice data for cross-lingual phoneme recognition, using eSpeak phoneme labels as the transcription targets.

released Aug 2021