pyopenjtalk-plus
pyopenjtalk-plus is a Python wrapper for OpenJTalk, a Japanese Text-to-Speech (TTS) system. It enhances the original pyopenjtalk with features like automatic dictionary and voice data downloads, customizable voice presets, and improved phoneme generation. The library is actively maintained, currently at version 0.4.1.post8, with frequent minor updates addressing bug fixes and improvements.
Common errors
- ModuleNotFoundError: No module named 'pyopenjtalk_plus'
  cause: The package is not installed, or the environment it was installed into is not activated.
  fix: Install the package with `pip install pyopenjtalk-plus`. If you use a virtual environment, make sure it is activated.
- FileNotFoundError: [Errno 2] No such file or directory: '.../OpenJTalk/dic/open_jtalk_dic_utf_8-1.11'
  cause: The OpenJTalk dictionary files were not found or were not downloaded successfully. This often occurs due to network issues, firewall blocks, or insufficient write permissions in the default data directory.
  fix: Check your internet connection and proxy settings. Ensure your Python environment has write permissions in its site-packages directory, or in the directory specified by the `OPENJTALK_DATA_DIR` environment variable. You can also pre-download the data manually if needed.
- ImportError: DLL load failed while importing _openjtalk
  cause: This error typically occurs on Windows when a native DLL required by the underlying `pyopenjtalk` library is missing or cannot be loaded. It is often related to missing Visual C++ Redistributables or a corrupted installation.
  fix: Ensure you have the latest Microsoft Visual C++ Redistributable installed. Try reinstalling `pyopenjtalk-plus` in a clean virtual environment: run `pip uninstall pyopenjtalk-plus pyopenjtalk`, then `pip install pyopenjtalk-plus`.
- OpenJTalkError: Cannot find voice: mei_normal
  cause: The specified voice preset (e.g., 'mei_normal') could not be found. This means the corresponding `.htsvoice` file was not downloaded or is missing.
  fix: Verify that the voice data was downloaded successfully, and ensure network connectivity during the first run. If you have moved the data, confirm that the `OPENJTALK_DATA_DIR` environment variable points to the correct location. Check the library's data directory for the voice file.
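Several of the fixes above come down to checking what is actually present in the data directory. The following is a minimal diagnostic sketch, not part of the library: the helper name `check_openjtalk_data` is hypothetical, the dictionary folder name is copied from the error message above, and the directory layout is an assumption, so the search is recursive rather than tied to fixed subfolders.

```python
import os
from pathlib import Path

# Folder name copied from the FileNotFoundError message above.
DIC_DIR_NAME = "open_jtalk_dic_utf_8-1.11"


def check_openjtalk_data(data_dir):
    """Report whether data_dir holds a dictionary folder and any .htsvoice files.

    Hypothetical helper: the real layout of the data directory is an
    assumption, so both lookups search recursively.
    """
    root = Path(data_dir)
    if not root.is_dir():
        return {"exists": False, "writable": False, "dictionary": None, "voices": []}
    # First directory anywhere under root whose name matches the dictionary folder.
    dictionary = next((p for p in root.rglob(DIC_DIR_NAME) if p.is_dir()), None)
    # File names of every voice file found under root.
    voices = sorted(p.name for p in root.rglob("*.htsvoice"))
    return {
        "exists": True,
        "writable": os.access(root, os.W_OK),
        "dictionary": str(dictionary) if dictionary else None,
        "voices": voices,
    }


if __name__ == "__main__":
    data_dir = os.environ.get("OPENJTALK_DATA_DIR")
    if data_dir is None:
        print("OPENJTALK_DATA_DIR is not set; pass the data directory explicitly.")
    else:
        print(check_openjtalk_data(data_dir))
```

A result with `"dictionary": None` or an empty `"voices"` list points to a failed or incomplete download, while `"writable": False` points to a permissions problem.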
Warnings
- gotcha pyopenjtalk-plus relies on native OpenJTalk binaries and data (dictionaries, voice files). While the library attempts automatic downloads, network issues, firewall restrictions, or insufficient write permissions in the default data directory can cause failures. This often manifests as 'No such file or directory' errors for `.dic` or `.htsvoice` files.
- gotcha For optimal phoneme generation accuracy, particularly with complex or ambiguous Japanese text, it's highly recommended to install `python-mecab-t5m` (or `mecab-python3`). Without it, the library falls back to a less sophisticated internal MeCab implementation, which might produce less accurate phoneme sequences.
- gotcha The `make_audio` function returns raw audio data as a NumPy array. It does not automatically play the sound or save it to a file. Users need to integrate additional libraries for these functionalities.
- gotcha On some systems (especially Windows), the underlying `pyopenjtalk` dependency might require specific Visual C++ Redistributables or other native build tools if pre-compiled wheels are not available for your Python version/architecture. This can lead to `ImportError: DLL load failed` or compilation errors during installation.
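Because the audio comes back as raw samples, saving it is a separate step. The sketch below writes mono samples to a 16-bit PCM WAV file using only the standard library. The helper name `write_wav` is hypothetical, and it assumes the samples are floats in the range [-1.0, 1.0]; the value range of the array returned by `make_audio` is not stated here, so check it and rescale first if it differs.

```python
import math
import os
import struct
import tempfile
import wave


def write_wav(path, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1.0, 1.0]) as 16-bit PCM."""
    frames = bytearray()
    for s in samples:
        # Clip out-of-range values instead of letting them wrap around.
        clipped = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(str(path), "wb") as wf:
        wf.setnchannels(1)   # mono
        wf.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wf.setframerate(int(sample_rate))
        wf.writeframes(bytes(frames))


if __name__ == "__main__":
    # Stand-in data: one second of a 440 Hz tone instead of synthesized speech.
    sr = 16000
    tone = [0.3 * math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]
    with tempfile.TemporaryDirectory() as tmp:
        out = os.path.join(tmp, "tone.wav")
        write_wav(out, tone, sr)
        print(f"Wrote {os.path.getsize(out)} bytes")
```

Iterating over a one-dimensional NumPy array yields one sample at a time, so an array can be passed as `samples` directly; for long clips, a vectorized library such as `soundfile` will be faster.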
Install
- pip install pyopenjtalk-plus
Imports
- pyopenjtalk_plus
import pyopenjtalk_plus
- make_phoneme
from pyopenjtalk_plus import make_phoneme
- make_audio
from pyopenjtalk_plus import make_audio
Quickstart
import os

import pyopenjtalk_plus

# Optional: point the library at a custom data directory.
# By default, pyopenjtalk-plus typically manages the data download automatically.
# os.environ['OPENJTALK_DATA_DIR'] = '/path/to/openjtalk_data'

text = "こんにちは、世界。私はAIです。"

# 1. Get phonemes
phonemes = pyopenjtalk_plus.make_phoneme(text)
print(f"Phonemes: {phonemes}")

# 2. Generate audio (returns the sample rate and a NumPy array of audio data).
# The library will attempt to download the necessary dictionaries and voice data on first run.
try:
    sr, y = pyopenjtalk_plus.make_audio(text)
    print(f"Audio generated successfully: Sample Rate={sr}, Data shape={y.shape}")
    # To save the audio to a file, you'd typically use a library like soundfile:
    # import soundfile as sf
    # sf.write("output.wav", y, sr)
    # print("Audio saved to output.wav")
except Exception as e:
    print(f"Error generating audio: {e}")
    print(
        "Ensure OpenJTalk dictionaries and voice files are accessible.\n"
        "The library usually downloads them automatically, "
        "but network or permissions issues can cause failures."
    )