MediaPipe
MediaPipe is an open-source framework from Google that provides cross-platform, customizable ML solutions for live and streaming media. It enables researchers and developers to build world-class machine learning applications for mobile, edge, cloud, and the web. The current version is 0.10.33, with frequent releases addressing bug fixes, performance improvements, and API enhancements.
Warnings
- breaking The `mediapipe.solutions` module has been removed in versions 0.10.30 and later. Code relying on this module will break.
- gotcha Official PyPI packages for MediaPipe on Windows currently lack full GPU acceleration support, and OpenGL is automatically disabled for Windows builds. While you can specify `delegate=python.BaseOptions.Delegate.GPU`, it might not leverage the GPU and could fall back to CPU, or the feature might be unavailable.
- gotcha MediaPipe often provides precompiled Python wheels for specific Python versions. Installing with very new Python versions (e.g., Python 3.13) might result in 'No matching distribution found' errors due to lack of compatible wheels.
- breaking Significant internal refactoring and migration to 'API3' has been an ongoing effort across several recent versions. While primarily affecting C++ users building custom calculators, advanced Python users interacting with lower-level framework components or custom graphs might need to adjust their code.
- gotcha Windows users, especially with newer MediaPipe versions, sometimes encounter `ImportError: DLL load failed while importing _framework_bindings` when importing the package.
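The warnings above can be turned into a quick environment check. The following is a minimal sketch, not an official MediaPipe tool: the helper names (`parse_version`, `preflight`, `installed_mediapipe_version`) are hypothetical, and the Python 3.13 threshold is an assumption taken only from the example in the wheel warning.

```python
import platform
import re
import sys
from importlib import metadata

# Thresholds copied from the warnings above; adjust them if the facts change.
SOLUTIONS_REMOVED_IN = (0, 10, 30)
RISKY_PYTHON_FROM = (3, 13)  # assumption: based only on the Python 3.13 example


def parse_version(text):
    """Return the first three numeric components of a version string as a tuple."""
    parts = [int(p) for p in re.findall(r"\d+", text)[:3]]
    return tuple(parts + [0] * (3 - len(parts)))


def preflight(mediapipe_version, python_version, system):
    """Collect human-readable notes for the environment described by the arguments."""
    notes = []
    if mediapipe_version is None:
        notes.append("mediapipe is not installed")
    elif parse_version(mediapipe_version) >= SOLUTIONS_REMOVED_IN:
        notes.append("mediapipe.solutions is unavailable; use mediapipe.tasks")
    if tuple(python_version[:2]) >= RISKY_PYTHON_FROM:
        notes.append("no prebuilt wheel may exist for this Python version")
    if system == "Windows":
        notes.append("GPU delegate may fall back to CPU on Windows")
    return notes


def installed_mediapipe_version():
    """Return the installed mediapipe version string, or None if it is missing."""
    try:
        return metadata.version("mediapipe")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for note in preflight(installed_mediapipe_version(), sys.version_info, platform.system()):
        print("note:", note)
```

Passing the versions and platform in as arguments keeps `preflight` easy to exercise without MediaPipe installed.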
Install
pip install mediapipe
Imports
- mediapipe
import mediapipe as mp
- tasks.python.vision
from mediapipe.tasks import python
from mediapipe.tasks.python import vision
- solutions (legacy; removed in versions 0.10.30 and later)
from mediapipe.solutions import ...
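Because the available import paths depend on the installed version, code that must run against several versions can probe for them at runtime. This is a minimal sketch using only the standard library; `module_available` and `pick_mediapipe_api` are hypothetical helper names, not part of MediaPipe.

```python
import importlib.util


def module_available(name):
    """Return True if `name` can be located.

    find_spec imports the parent packages of a dotted name, but not the
    target module itself, so a broken parent package also yields False.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def pick_mediapipe_api():
    """Return 'tasks', 'solutions', or None depending on what can be found."""
    if module_available("mediapipe.tasks.python.vision"):
        return "tasks"
    if module_available("mediapipe.solutions"):
        return "solutions"
    return None


if __name__ == "__main__":
    print("usable MediaPipe API:", pick_mediapipe_api())
```

Preferring the Tasks API first matches the direction of the warnings, since the legacy module is the one that has been removed.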
Quickstart
import mediapipe as mp
from mediapipe.tasks import python
from mediapipe.tasks.python import vision
import numpy as np
import os
# Placeholder for a real model file. Download a model (e.g., efficientdet_lite0.tflite)
# from MediaPipe's model zoo (https://developers.google.com/mediapipe/solutions/object_detector)
# or use your own. The file must exist locally before this script runs.
# Example: model_path = os.path.expanduser('~/mediapipe_models/efficientdet_lite0.tflite')
model_path = os.environ.get('MEDIAPIPE_MODEL_PATH', 'object_detector.tflite') # Replace with actual model path or env var
try:
    # Create a BaseOptions object with the model asset path.
    # For GPU acceleration on supported platforms, add delegate=python.BaseOptions.Delegate.GPU
    base_options = python.BaseOptions(model_asset_path=model_path)

    # Create an ObjectDetectorOptions object.
    options = vision.ObjectDetectorOptions(base_options=base_options,
                                           score_threshold=0.25,
                                           max_results=5)

    # Create an ObjectDetector.
    detector = vision.ObjectDetector.create_from_options(options)

    # Create a dummy image (e.g., a blank white image) for demonstration.
    # In a real application, you'd load an image from a file or camera.
    dummy_image_np = np.full((224, 224, 3), 255, dtype=np.uint8)  # White 224x224 image
    mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=dummy_image_np)

    # Perform object detection on the image.
    detection_result = detector.detect(mp_image)

    # Print the detection results.
    print("Detection results:")
    if detection_result.detections:
        for detection in detection_result.detections:
            for category in detection.categories:
                print(f" Category: {category.category_name}, Score: {category.score:.2f}")
            bbox = detection.bounding_box
            print(f" Bounding Box: (x:{bbox.origin_x}, y:{bbox.origin_y}, w:{bbox.width}, h:{bbox.height})")
    else:
        print(" No objects detected.")
except FileNotFoundError:
    print(f"Error: Model file not found at '{model_path}'. Please ensure the model exists or update MEDIAPIPE_MODEL_PATH environment variable.")
except Exception as e:
    print(f"An error occurred: {e}")
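The Quickstart mentions the GPU delegate, and the warnings note that it may be unavailable on some platforms. One way to handle this is to try GPU first and fall back to CPU. The sketch below assumes an unusable delegate surfaces as an exception when the detector is created, which may not hold everywhere (a silent CPU fallback would not be detected). `first_working`, `make_detector`, and `create_detector` are hypothetical helpers; only `BaseOptions`, its `delegate` argument, and `ObjectDetector.create_from_options` come from the Quickstart.

```python
def first_working(factories):
    """Try (label, factory) pairs in order; return (label, result) of the first success."""
    errors = []
    for label, factory in factories:
        try:
            return label, factory()
        except Exception as exc:  # broad on purpose: failure types vary by platform
            errors.append(f"{label}: {exc}")
    raise RuntimeError("all options failed: " + "; ".join(errors))


def make_detector(model_path, use_gpu):
    """Build an ObjectDetector for the requested delegate (imports MediaPipe lazily)."""
    from mediapipe.tasks import python
    from mediapipe.tasks.python import vision

    delegate = (python.BaseOptions.Delegate.GPU if use_gpu
                else python.BaseOptions.Delegate.CPU)
    base_options = python.BaseOptions(model_asset_path=model_path, delegate=delegate)
    options = vision.ObjectDetectorOptions(base_options=base_options)
    return vision.ObjectDetector.create_from_options(options)


def create_detector(model_path):
    """Return ('gpu' or 'cpu', detector), preferring the GPU delegate."""
    return first_working([
        ("gpu", lambda: make_detector(model_path, use_gpu=True)),
        ("cpu", lambda: make_detector(model_path, use_gpu=False)),
    ])
```

A call such as `label, detector = create_detector(model_path)` then reports which delegate was actually requested for the detector that got built.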