{"id":5054,"library":"segment-anything","title":"Segment Anything Model (SAM)","description":"The Segment Anything Model (SAM) from Meta AI is a foundation model for image segmentation, capable of cutting out any object in any image from a single click. It is designed as a general-purpose segmentation model applicable to a wide range of downstream tasks. The current stable PyPI version is 1.0; releases are tied to significant advancements rather than a frequent cadence.","status":"active","version":"1.0","language":"en","source_language":"en","source_url":"https://github.com/facebookresearch/segment-anything","tags":["AI","Computer Vision","Segmentation","ML","Foundation Model"],"install":[{"cmd":"pip install segment-anything","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"Core deep learning framework for SAM's operations.","package":"torch","optional":false},{"reason":"Utilities for vision tasks, companion library to torch.","package":"torchvision","optional":false},{"reason":"Commonly used for image loading, manipulation, and preprocessing.","package":"opencv-python","optional":true},{"reason":"Essential for array manipulation of image and mask data.","package":"numpy","optional":true},{"reason":"For visualizing segmentation masks and results.","package":"matplotlib","optional":true}],"imports":[{"symbol":"SamAutomaticMaskGenerator","correct":"from segment_anything import sam_model_registry, SamAutomaticMaskGenerator"},{"symbol":"SamPredictor","correct":"from segment_anything import sam_model_registry, SamPredictor"}],"quickstart":{"code":"import numpy as np\nimport torch\nimport os\n\n# NOTE: You must download a model checkpoint first (e.g., sam_vit_h_4b8939.pth)\n# via the checkpoint links in the official repository README:\n# https://github.com/facebookresearch/segment-anything\n# For this example, we'll assume a dummy path and model type.\nSAM_CHECKPOINT_PATH = os.environ.get('SAM_CHECKPOINT', 'sam_vit_h_4b8939.pth')\nMODEL_TYPE = os.environ.get('SAM_MODEL_TYPE', 'vit_h')  # one of 'vit_h', 'vit_l', 'vit_b'\n\n# Dummy image data (replace with actual image loading, e.g., using OpenCV)\n# Assuming a 1024x1024 RGB image for demonstration\nimage = np.zeros((1024, 1024, 3), dtype=np.uint8)\n# Simulate loading a real image:\n# import cv2\n# image_path = 'path/to/your/image.jpg'\n# image = cv2.imread(image_path)\n# image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # Important: convert BGR to RGB\n\n# Check that the checkpoint exists before loading the model\nif not os.path.exists(SAM_CHECKPOINT_PATH):\n    raise SystemExit(f\"Model checkpoint '{SAM_CHECKPOINT_PATH}' not found. \"\n                     \"Please download it via the official Segment Anything README.\")\n\nfrom segment_anything import sam_model_registry, SamPredictor\n\n# Initialize the SAM model from the registry\nsam = sam_model_registry[MODEL_TYPE](checkpoint=SAM_CHECKPOINT_PATH)\n\n# Set device: 'cuda' for GPU if available, else 'cpu'\ndevice = 'cuda' if torch.cuda.is_available() else 'cpu'\nsam.to(device=device)\nprint(f\"Using device: {device}\")\n\n# Create a predictor and embed the image (the expensive step)\npredictor = SamPredictor(sam)\npredictor.set_image(image)\n\n# Example: point prompt for a single object\ninput_point = np.array([[500, 375]])  # coordinates [x, y]\ninput_label = np.array([1])           # 1 = foreground, 0 = background\n\n# Predict masks\nmasks, scores, logits = predictor.predict(\n    point_coords=input_point,\n    point_labels=input_label,\n    multimask_output=True,\n)\n\nprint(f\"Generated {len(masks)} masks.\")\nprint(f\"Scores: {scores}\")\n# The 'masks' array contains boolean masks: True for foreground, False for background.\n# With multimask_output=True, pick the highest-scoring mask:\nbest_mask = masks[np.argmax(scores)]\n\n# For automatic (promptless) mask generation, use SamAutomaticMaskGenerator instead:\n# from segment_anything import SamAutomaticMaskGenerator\n# mask_generator = SamAutomaticMaskGenerator(sam)\n# auto_masks = mask_generator.generate(image)  # list of dicts with 'segmentation', 'area', ...\n","lang":"python","description":"This quickstart demonstrates how to initialize the Segment Anything Model (SAM) and use `SamPredictor` for point-based inference. 
It highlights the necessity of downloading a model checkpoint and of setting the device correctly. For automatic mask generation, `SamAutomaticMaskGenerator` would be used instead."},"warnings":[{"fix":"Download the desired checkpoint file (e.g., ViT-H, ViT-L, ViT-B) and provide its path when initializing the model: `sam_model_registry[model_type](checkpoint='path/to/checkpoint.pth')`.","message":"Model checkpoint download required. The `pip install segment-anything` command installs only the library code, not the large pre-trained model weights. Users MUST manually download a model checkpoint (e.g., `sam_vit_h_4b8939.pth`) via the links in the official repository README.","severity":"gotcha","affected_versions":"All versions (1.0+)"},{"fix":"After initializing `sam`, set the device: `device = 'cuda' if torch.cuda.is_available() else 'cpu'; sam.to(device=device)`.","message":"Device management for performance. SAM checkpoints load to CPU by default. For significantly faster inference, especially with larger models like ViT-H, explicitly move the model to a CUDA-enabled GPU when one is available.","severity":"gotcha","affected_versions":"All versions (1.0+)"},{"fix":"After loading an image with `cv2.imread()`, convert its color channels using `image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)` before passing it to `SamPredictor.set_image()`.","message":"Image color channel order. OpenCV (`cv2`) reads images in BGR order by default, while SAM models expect RGB. Failing to convert will lead to incorrect or degraded segmentation results.","severity":"gotcha","affected_versions":"All versions (1.0+)"},{"fix":"Always refer to the official documentation and examples for the `segment-anything` PyPI package (v1.0+) to ensure correct API usage. The PyPI package provides `sam_model_registry`, `SamPredictor`, and `SamAutomaticMaskGenerator`.","message":"API differences between the research repo and the PyPI package. The original research codebase (a direct GitHub clone) includes helper functions and class structures that differ from the stable `segment-anything` PyPI package (v1.0+). Relying on old examples from the research repo may lead to `ImportError` or `AttributeError`.","severity":"deprecated","affected_versions":"Prior to v1.0 (if using research repo code)"}],"env_vars":null,"last_verified":"2026-04-12T00:00:00.000Z","next_check":"2026-07-11T00:00:00.000Z"}