ViT Base AudioMAE AS2M FT AS20K

gaunernst audio

A ViT-Base AudioMAE model, pretrained on AudioSet-2M (AS2M) and fine-tuned on AudioSet-20K (AS20K) for audio classification.

streaming
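
AudioSet classification is multi-label: a clip can carry several tags at once, so class scores are usually read through an independent sigmoid per class rather than a softmax. The sketch below shows that post-processing step only. It assumes you already have one raw logit per class from the model; the helper name `top_k_labels`, the four label strings, and the logit values are hypothetical placeholders, not output from this model.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def top_k_labels(logits, labels, k=3):
    """Return the k (label, probability) pairs with the highest scores.

    Each class is scored independently (multi-label), so the
    probabilities are not expected to sum to 1.
    """
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    scored = [(label, sigmoid(logit)) for label, logit in zip(labels, logits)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical example values, for illustration only.
example_labels = ["Speech", "Music", "Dog", "Siren"]
example_logits = [2.0, -1.0, 0.5, -3.0]

for label, prob in top_k_labels(example_logits, example_labels, k=2):
    print(f"{label}: {prob:.3f}")
```

In practice the label list would come from the AudioSet ontology and the logits from a forward pass on a log-mel spectrogram; both are outside the scope of this sketch.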