{"id":4991,"library":"nemo-toolkit","title":"NVIDIA NeMo Toolkit","description":"NeMo is an open-source, PyTorch-based toolkit for developing state-of-the-art conversational AI models, including Automatic Speech Recognition (ASR), Text-to-Speech (TTS), Large Language Models (LLMs), and Natural Language Processing (NLP). It is currently at version 2.7.2 and receives frequent updates, typically bi-monthly or monthly, with patch releases for security and critical fixes.","status":"active","version":"2.7.2","language":"en","source_language":"en","source_url":"https://github.com/nvidia/nemo","tags":["AI","NLP","ASR","TTS","Speech","Conversational AI","Deep Learning","NVIDIA","PyTorch"],"install":[{"cmd":"pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu121\npip install \"nemo_toolkit[all]\"","lang":"bash","label":"Recommended (CUDA 12.1 example)"},{"cmd":"pip install \"nemo_toolkit[all]\"","lang":"bash","label":"Basic (requires pre-installed PyTorch)"}],"dependencies":[{"reason":"Core deep learning framework. NeMo requires a specific, user-installed PyTorch version compatible with your CUDA setup. Installing PyTorch *before* NeMo is crucial.","package":"torch","optional":false},{"reason":"Used for model training and research. NeMo builds on Lightning's capabilities.","package":"pytorch-lightning","optional":false}],"imports":[{"symbol":"nemo.collections.asr","correct":"import nemo.collections.asr as nemo_asr"},{"symbol":"nemo.collections.tts","correct":"import nemo.collections.tts as nemo_tts"},{"note":"The entire `nemo.collections.nlp` module was removed in NeMo v2.6.1. Code relying on it will break. Check the latest documentation for updated NLP functionality.","wrong":"import nemo.collections.nlp as nemo_nlp","symbol":"nemo.collections.nlp","correct":"This module was removed in NeMo 2.6.1. 
Refer to NeMo documentation for alternatives."}],"quickstart":{"code":"import os\n\nimport nemo.collections.asr as nemo_asr\n\n# Download (on first run, ~1.5 GB) and load the pretrained model from NVIDIA's NGC cloud.\n# ASRModel.from_pretrained lets NeMo pick the concrete model class for the checkpoint,\n# which suits a hybrid CTC/RNNT checkpoint better than hard-coding EncDecRNNTModel.\nasr_model = nemo_asr.models.ASRModel.from_pretrained(model_name=\"stt_en_fastconformer_hybrid_large_ctc_rnnt\")\n\n# Path to a 16 kHz mono .wav file. Replace with your own, or download a sample, e.g.:\n#   wget https://nemo-public.s3.us-east-2.amazonaws.com/example_samples/audio_0.wav\nfilepath = \"./audio_0.wav\"\n\n# For demonstration only: if the file is missing, write a 5-second 440 Hz tone.\n# A pure tone contains no speech, so its transcription will be empty or meaningless.\nif not os.path.exists(filepath):\n    try:\n        import torch\n        import torchaudio\n        sample_rate = 16000\n        duration_seconds = 5\n        t = torch.arange(sample_rate * duration_seconds) / sample_rate\n        waveform = torch.sin(2 * torch.pi * 440 * t).unsqueeze(0)\n        torchaudio.save(filepath, waveform, sample_rate)\n        print(f\"Created dummy audio file: {filepath}\")\n    except ImportError:\n        print(f\"Warning: '{filepath}' not found and torchaudio not installed to create a dummy file. Quickstart may fail.\")\n\ntranscriptions = asr_model.transcribe([filepath])\n# Depending on the NeMo version, each result is a plain string or a Hypothesis object with a `.text` attribute.\nfirst = transcriptions[0]\nprint(f\"Transcription: {getattr(first, 'text', first)}\")","lang":"python","description":"This quickstart demonstrates how to load a pre-trained ASR model and transcribe an audio file. The model will be downloaded automatically on the first run. Ensure you have a `.wav` audio file (preferably 16kHz mono) at the specified `filepath`."},"warnings":[{"fix":"Review the NeMo documentation for updated NLP capabilities or alternative approaches for versions 2.6.1 and later. 
Some functionality may have been migrated or may require different import paths.","message":"The entire `nemo.collections.nlp` module was removed in NeMo v2.6.1. Any code that imports or uses classes/functions from this module will break.","severity":"breaking","affected_versions":">= 2.6.1"},{"fix":"Always install PyTorch with the correct CUDA version first, then install `nemo_toolkit[all]`. Refer to the official NeMo installation guide for specific PyTorch/CUDA version compatibility.","message":"NeMo requires a PyTorch build compatible with your CUDA version (if using a GPU) to be installed *before* NeMo itself. Installing NeMo without pre-installing PyTorch can lead to dependency conflicts or incorrect CUDA setups.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Upgrade to NeMo v2.6.1 or newer to ensure full compatibility with NumPy 2.0. If unable to upgrade, you might need to pin NumPy to an older version (e.g., `numpy<2.0`).","message":"Some functionality, especially for ASR, had compatibility issues with NumPy 2.0 prior to NeMo v2.6.1.","severity":"gotcha","affected_versions":"< 2.6.1"},{"fix":"Ensure you are on NeMo v2.7.2 or newer, as specific fixes for these packages were introduced in recent releases. If issues persist, verify your CUDA toolkit installation and driver versions.","message":"Users of `numba-cuda` and `cuda-python` packages (often implicit dependencies for GPU acceleration) experienced installation and usage issues in earlier versions.","severity":"gotcha","affected_versions":"< 2.7.2"},{"fix":"Ensure you have adequate GPU VRAM (e.g., 8GB+ for many models) and sufficient disk space. For production, consider deploying on NVIDIA GPUs. For local development, be mindful of model sizes.","message":"NeMo models (especially large language models or pre-trained ASR/TTS models) require significant GPU memory, plus disk space for downloaded model checkpoints. 
Running on CPU is possible but much slower.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-12T00:00:00.000Z","next_check":"2026-07-11T00:00:00.000Z"}