Open-Unmix

1.3.0 · active · verified Sat Apr 11

Open-Unmix is a PyTorch-based music source separation toolkit that provides pre-trained models and a flexible framework for separating audio into its constituent stems (vocals, drums, bass, other). The current version is 1.3.0, and the library is actively maintained, with regular updates delivering bug fixes, performance improvements, and new model releases.

Warnings

Install
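Open-Unmix is published on PyPI under the name `openunmix`; a typical install (assuming pip and a working PyTorch environment) is:

```shell
pip install openunmix
```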

Imports

Quickstart

This quickstart demonstrates how to perform music source separation with `openunmix.predict.separate`. It simulates loading a stereo audio signal and returns a dictionary of separated stems, each as a PyTorch tensor. From version 1.2.1 onwards, the `umxl` model is used by default for inference.

import torch
import numpy as np
from openunmix import predict

# Simulate stereo audio data (e.g., 10 seconds at 44.1 kHz)
sr = 44100
duration = 10  # seconds
num_frames = sr * duration

# Create a dummy audio tensor: (channels, samples)
# In a real scenario, load an audio file using torchaudio.load() or similar.
audio_data_np = np.random.randn(2, num_frames).astype(np.float32)
audio_tensor = torch.from_numpy(audio_data_np)

# Separate the audio into stems; the sample rate is required so the input
# can be resampled to the model's rate if necessary.
# By default, the 'umxl' model is used from v1.2.1 onwards.
estimates = predict.separate(audio_tensor, rate=sr)

# 'estimates' is a dictionary with keys like 'vocals', 'drums', 'bass', 'other'
# Each value is a torch.Tensor representing the separated stem.
print("Separated stems and their shapes:")
for stem_name, stem_tensor in estimates.items():
    print(f"  {stem_name}: {stem_tensor.shape}")

# Example: access vocals
vocals = estimates['vocals']
# print(f"Vocals stem shape: {vocals.shape}")
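Because the final post-processing step amounts to soft-masking the mixture, the returned stems should sum back (approximately) to the input mix when all targets are separated. A minimal NumPy sketch of that invariant, using random masks as a stand-in for real model output so no model download is needed:

```python
import numpy as np

# Dummy "separation": random soft masks standing in for a real model.
rng = np.random.default_rng(0)
mix = rng.standard_normal((2, 1000)).astype(np.float32)  # (channels, samples)

# Four non-negative masks, one per stem, normalized so they sum to 1.
raw = rng.random((4, 2, 1000)).astype(np.float32)
masks = raw / raw.sum(axis=0, keepdims=True)

stems = {
    name: mask * mix
    for name, mask in zip(["vocals", "drums", "bass", "other"], masks)
}

# Since the masks sum to 1, the stems add back up to the original mixture.
reconstruction = sum(stems.values())
print(np.allclose(reconstruction, mix, atol=1e-5))  # True
```

The same check can be applied to real `estimates` as a quick sanity test, bearing in mind that exact reconstruction is only expected when every target is modeled.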
