CLAP HTSAT Unfused

laion multimodal
text audio

A Contrastive Language-Audio Pretraining (CLAP) model that uses the HTSAT audio encoder without feature fusion, for zero-shot audio classification.
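As a rough illustration of how zero-shot classification with a contrastive model generally works, the sketch below scores one audio embedding against the text embeddings of candidate labels using cosine similarity followed by a softmax. It does not load the model: the vectors are made-up toy values, and the function names and the `scale` parameter are hypothetical stand-ins for whatever the actual encoders and learned temperature provide.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_scores(audio_emb, label_embs, scale=10.0):
    """Turn audio-to-label similarities into a probability per label.

    `scale` is an assumed temperature; a trained model would supply its own.
    """
    labels = list(label_embs)
    logits = [scale * cosine(audio_emb, label_embs[name]) for name in labels]
    # Subtract the max logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return {name: e / total for name, e in zip(labels, exps)}


# Toy embeddings (assumed values, standing in for real encoder outputs).
audio_emb = [0.9, 0.1, 0.2]
label_embs = {
    "dog barking": [1.0, 0.0, 0.1],
    "rain": [0.0, 1.0, 0.0],
    "piano": [0.1, 0.2, 1.0],
}

scores = zero_shot_scores(audio_emb, label_embs)
best = max(scores, key=scores.get)
print(best, scores)
```

Because the labels are supplied as text at inference time, the candidate set can be changed without retraining; only the label embeddings need to be recomputed.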

released May 2023