{"id":9267,"library":"pyworld","title":"PyWorld: Python Wrapper for WORLD Vocoder","description":"PyWorld is a Python wrapper for the WORLD vocoder, a highly efficient and high-quality speech analysis, manipulation, and synthesis system. It allows users to extract fundamental frequency (f0), harmonic spectral envelope (sp), and aperiodic spectral envelope (ap) from speech, and subsequently synthesize speech from these parameters. The library is currently at version 0.3.5 and is actively maintained, though major releases have been infrequent. [1, 7, 9, 11]","status":"active","version":"0.3.5","language":"en","source_language":"en","source_url":"https://github.com/JeremyCCHsu/Python-Wrapper-for-World-Vocoder","tags":["speech processing","vocoder","audio analysis","audio synthesis"],"install":[{"cmd":"pip install pyworld","lang":"bash","label":"Install with pip"}],"dependencies":[{"reason":"Required for numerical array operations (audio waveforms, features).","package":"numpy","optional":false},{"reason":"Recommended for audio input/output operations. 
Can be replaced by scipy or librosa with careful data type handling.","package":"pysoundfile","optional":true}],"imports":[{"note":"Standard alias for brevity and convention.","symbol":"pyworld","correct":"import pyworld as pw"}],"quickstart":{"code":"import numpy as np\nimport pyworld as pw\n\n# Generate a synthetic mono waveform (a 200 Hz sine, 2 seconds at 44.1 kHz)\nfs = 44100 # Sampling frequency in Hz\nt = np.arange(0, 2.0, 1.0/fs) # Time vector in seconds\nf0_val = 200 # Hz\n# pyworld expects a float64 waveform, hence the explicit cast\nx = 0.5 * np.sin(2 * np.pi * f0_val * t).astype(np.float64)\n\n# Extract WORLD features\nf0, sp, ap = pw.wav2world(x, fs)\n\nprint(f\"Extracted f0 shape: {f0.shape}\")\nprint(f\"Extracted spectral envelope shape: {sp.shape}\")\nprint(f\"Extracted aperiodicity shape: {ap.shape}\")\n\n# Resynthesize a waveform from the extracted features (default 5 ms frame period)\ny_synthesized = pw.synthesize(f0, sp, ap, fs)\nprint(f\"Synthesized audio shape: {y_synthesized.shape}\")","lang":"python","description":"This quickstart generates a synthetic sine waveform, uses `pyworld.wav2world` to extract the fundamental frequency (f0), spectral envelope (sp), and aperiodicity (ap), and then resynthesizes audio from these features with `pyworld.synthesize`. [1, 7, 9]"},"warnings":[{"fix":"Ensure input audio has a sampling frequency of 16 kHz or greater. Resample lower-rate audio before processing with PyWorld.","message":"The WORLD vocoder, and thus PyWorld, is designed for speech sampled at 16 kHz or higher. Applying it to audio with a sampling rate below 16 kHz will result in failure or incorrect output. 
[1, 7]","severity":"breaking","affected_versions":"All versions"},{"fix":"When loading audio, explicitly cast the array: `audio_data = librosa.load(filename, sr=fs, dtype=np.float64)[0]` or `audio_data = scipy.io.wavfile.read(filename)[1].astype(np.float64)`.","message":"When loading audio with libraries like `scipy` or `librosa` for PyWorld processing, ensure the audio data is converted to `numpy.float64` (double precision). PyWorld's C backend expects this data type, and incorrect types can lead to errors or unexpected behavior. [1]","severity":"gotcha","affected_versions":"All versions"},{"fix":"Consider using `f0, t = pw.harvest(x, fs)` instead of `pw.dio(x, fs)` for noisy audio segments.","message":"For audio with a low signal-to-noise ratio (SNR), the `pyworld.dio` pitch extractor may perform poorly. The `pyworld.harvest` extractor is often a better alternative in such conditions. [1]","severity":"gotcha","affected_versions":"All versions"},{"fix":"Upgrade Cython to a sufficiently recent version, e.g., `pip install --upgrade \"Cython>=0.24\"` (quote the requirement so the shell does not treat `>` as a redirect).","message":"Build errors when installing `pyworld` from source can be caused by an outdated `Cython` version. [1]","severity":"gotcha","affected_versions":"<=0.3.5"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"On Debian/Ubuntu: `sudo apt-get install libsndfile1`. On macOS: `brew install libsndfile`. On Windows, this library is usually bundled with PySoundFile wheels or needs to be manually installed/configured if building from source.","cause":"The underlying C library `libsndfile` is not installed on your system. PySoundFile, a common dependency for audio I/O with PyWorld, relies on this system library. [1]","error":"library not found: sndfile error"},{"fix":"Ensure you have a C/C++ compiler (e.g., `build-essential` on Linux, Xcode command line tools on macOS, Visual Studio on Windows). 
Try updating Cython (`pip install --upgrade Cython`). If installing from source, ensure `git submodule update --init` completes successfully to fetch the C++ WORLD source. Consider `pip install pyworld-prebuilt` if platform-specific wheels are available and easier to install. [3]","cause":"This often indicates a problem during the compilation of the Cython extensions or the C++ WORLD vocoder. Common causes include missing C/C++ compilers, incorrect Cython versions, or environmental issues during the `git submodule update` step for the C++ WORLD source. [1, 12]","error":"Building wheel for pyworld (pyproject.toml) ... error"},{"fix":"Check your audio's sampling rate. If it's below 16 kHz, resample it to 16 kHz or higher before passing it to PyWorld functions. E.g., `resampled_x = librosa.resample(x, orig_sr=fs, target_sr=16000)`.","cause":"This error can occur if the sampling frequency (`fs`) is too low for PyWorld (e.g., < 16 kHz), leading to invalid internal parameter calculations. [1]","error":"ValueError: frame_period must be a positive value."}]}