MCP Server Whisper
Advanced audio transcription and processing using OpenAI's Whisper and GPT-4o models.
Tools · 8
- list_audio_files Lists audio files with comprehensive filtering and sorting options: filter by regex pattern matching on filenames; filter by file size, duration, modification time, or format; sort by name, size, duration, modification time, or format. Returns type-safe FilePathSupportParams with full metadata.
- get_latest_audio Gets the most recently modified audio file with model support info
- convert_audio Converts audio files to supported formats (mp3 or wav). Returns AudioProcessingResult with output path.
- compress_audio Compresses audio files that exceed size limits. Returns AudioProcessingResult with output path.
- transcribe_audio Advanced transcription using OpenAI's models: supports whisper-1, gpt-4o-transcribe, and gpt-4o-mini-transcribe; custom prompts for guided transcription; optional timestamp granularities for word-level and segment-level timing; JSON response format option. Returns TranscriptionResult with text, usage data, and optional timestamps.
- chat_with_audio Interactive audio analysis using GPT-4o audio models: supports gpt-4o-audio-preview (recommended) and dated versions; accepts custom system and user prompts; provides conversational responses to audio content. Note: gpt-4o-mini-audio-preview has limitations with audio chat and is not recommended. Returns ChatResult with response text.
- transcribe_with_enhancement Enhanced transcription with specialized templates: detailed (includes tone, emotion, and background details); storytelling (transforms the transcript into a narrative form); professional (creates formal, business-appropriate transcriptions); analytical (adds analysis of speech patterns and key points). Returns TranscriptionResult with enhanced output.
- create_audio Generates text-to-speech audio using OpenAI's TTS API: supports gpt-4o-mini-tts (preferred) and other speech models; multiple voice options (alloy, ash, ballad, coral, echo, sage, shimmer, verse, marin, cedar); speed adjustment and custom instructions; customizable output file paths; handles texts of
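An MCP client invokes tools like the ones listed above by sending a JSON-RPC 2.0 `tools/call` request. The following is a minimal sketch of building such a request for transcribe_audio. The envelope (`jsonrpc`, `id`, `method`, `params.name`, `params.arguments`) follows the MCP convention, but the argument names `input_file_path` and `model` and the file path are assumptions for illustration; the tool's actual input schema is not given in this listing and should be read from the server's tool definitions.

```python
import json
from itertools import count

# Monotonic request ids; JSON-RPC only requires that each id be unique per session.
_request_ids = count(1)


def build_tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that asks an MCP server to run one tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical arguments: the real parameter names come from the server's schema.
request = build_tool_call(
    "transcribe_audio",
    {
        "input_file_path": "/path/to/meeting.mp3",  # assumed parameter name
        "model": "gpt-4o-mini-transcribe",          # model name taken from the listing
    },
)

print(json.dumps(request, indent=2))
```

In practice an MCP client library would construct and send this message; the sketch only shows the shape of the payload.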
Environment variables
OPENAI_API_KEY
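Since every tool here relies on OpenAI's APIs, the server presumably needs OPENAI_API_KEY in its environment at startup. The following is a minimal sketch of a pre-launch check that a wrapper script might run. The helper name `require_openai_api_key` and the error wording are hypothetical and not part of the server.

```python
import os
from typing import Mapping, Optional


def require_openai_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key from the environment, or raise if it is missing or blank."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before starting the server."
        )
    return key


# Example with an explicit mapping so the check can be exercised without a real key.
sample_env = {"OPENAI_API_KEY": "sk-example"}
key = require_openai_api_key(sample_env)
print("key found, length", len(key))
```

Passing the mapping explicitly keeps the check easy to test; calling it with no argument reads the real process environment.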
Links
★ 54 GitHub stars