What is Roboflow Supervision?

Roboflow Supervision is an open-source Python library of reusable computer vision utilities. It handles dataset loading and format conversion, detection post-processing, annotation drawing, object tracking, and evaluation metrics.

Does Roboflow Supervision support GPU acceleration?

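Supervision itself works on NumPy arrays and OpenCV images on the CPU. GPU acceleration comes from the model framework that produces the predictions (for example PyTorch or ONNX Runtime), so follow that framework's CUDA setup instructions.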

What formats does Roboflow Supervision support?

For datasets, Supervision reads and writes COCO, YOLO, and Pascal VOC annotations. For model outputs, it converts predictions from libraries such as Ultralytics into its own Detections objects.

Roboflow Supervision: The Python Computer Vision Annotation Toolkit

Introduction #

Computer vision has become one of the most impactful applications of machine learning, powering everything from autonomous vehicles and quality inspection systems to medical imaging and retail analytics. But building production-grade CV systems requires more than just training models — it demands robust tools for data annotation, evaluation, visualization, and debugging.

Supervision by Roboflow is the answer to this need. With 43,972 GitHub stars, it has become the go-to toolkit for computer vision practitioners who need reusable, well-designed Python tools. Their tagline says it all: “We write your reusable computer vision tools.”

Figure: Architecture overview (source: dibi8.com)

What Is Supervision? #

Supervision is a Python library of reusable tools for computer vision tasks. It covers much of the CV pipeline: loading and converting annotated datasets, processing and evaluating model outputs, visualizing detection results, and working with video streams.

The library is built around a simple philosophy: make the most common CV operations trivially easy while keeping the door open for custom workflows. Whether you are annotating images for object detection, evaluating segmentation model outputs, or visualizing tracking results in a video, Supervision has you covered.

Core Features #

Supervision provides tools across the entire computer vision lifecycle:

Data Annotation #

Supervision provides utilities for creating, manipulating, and converting annotation formats. It supports COCO, YOLO, Pascal VOC, and custom formats, making it easy to work with different ML frameworks and pipelines.

import supervision as sv

# Load a COCO-format dataset (the images directory is a placeholder path)
dataset = sv.DetectionDataset.from_coco(
    images_directory_path="images",
    annotations_path="annotations/coco_format.json"
)

# Convert to YOLO format: one label file per image plus a data.yaml
dataset.as_yolo(
    images_directory_path="yolo/images",
    annotations_directory_path="yolo/labels",
    data_yaml_path="yolo/data.yaml"
)

# Inspect basic dataset statistics
total_objects = sum(len(d) for d in dataset.annotations.values())
print(f"Total objects: {total_objects}")
print(f"Classes: {dataset.classes}")
print(f"Images: {len(dataset)}")
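
To make the format conversion concrete, here is a minimal sketch of the box arithmetic behind a COCO-to-YOLO conversion. It is plain Python and not Supervision's own code; the helper names `coco_to_yolo_box` and `yolo_label_line` are made up for this illustration. COCO stores a box as pixel values `[x_min, y_min, width, height]`, while a YOLO label line stores a class id followed by a normalized `[x_center, y_center, width, height]`.

```python
def coco_to_yolo_box(bbox, image_width, image_height):
    """Convert a COCO box [x_min, y_min, width, height] in pixels
    to a YOLO box [x_center, y_center, width, height] normalized to 0..1."""
    x_min, y_min, width, height = bbox
    return [
        (x_min + width / 2) / image_width,
        (y_min + height / 2) / image_height,
        width / image_width,
        height / image_height,
    ]


def yolo_label_line(class_id, bbox, image_width, image_height):
    """Format one object as a single line of a YOLO label file."""
    values = coco_to_yolo_box(bbox, image_width, image_height)
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)


print(yolo_label_line(1, [100, 50, 200, 200], image_width=800, image_height=400))
# 1 0.250000 0.375000 0.250000 0.500000
```

Because YOLO coordinates are relative to the image size, the conversion needs each image's width and height, which is why dataset loaders ask for the images directory as well as the annotation file.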

Detection Processing #

Supervision provides powerful tools for processing detection model outputs, including confidence filtering, non-maximum suppression, and result visualization.

import cv2
import supervision as sv
from ultralytics import YOLO

# Run a detector and convert its output (here an Ultralytics YOLO model;
# connectors for other frameworks such as Detectron2 follow the same pattern)
image = cv2.imread("scene.jpg")
model = YOLO("yolov8n.pt")
detections = sv.Detections.from_ultralytics(model(image)[0])

# Keep a single class
detections = detections[detections.class_id == 0]

# Apply non-maximum suppression
detections = detections.with_nms(threshold=0.45)

# Filter by confidence
detections = detections[detections.confidence > 0.6]
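
If non-maximum suppression is new to you, the following is a minimal, dependency-free sketch of the classic greedy algorithm. It is meant to show the idea only: the function names are hypothetical, it ignores class labels, and a library implementation will be vectorized and may differ in details.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.45):
    """Return the indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with an already kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [[100, 100, 200, 200], [105, 105, 205, 205], [300, 300, 400, 400]]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores))  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.82, so it is suppressed; the third box does not overlap anything and survives.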

Visualization and Annotation Drawing #

One of Supervision’s strengths is its visualization toolkit. Drawing bounding boxes, segmentation masks, keypoints, and tracking IDs on images and video frames is straightforward:

# Create an annotator for drawing bounding boxes
box_annotator = sv.BoxAnnotator(
    thickness=2,
    color_lookup=sv.ColorLookup.INDEX
)

# Load image
image = cv2.imread("scene.jpg")

# Draw bounding boxes on a copy so the original image stays untouched
annotated_image = box_annotator.annotate(
    scene=image.copy(),
    detections=detections
)

# Draw segmentation masks
mask_annotator = sv.MaskAnnotator(
    opacity=0.5,
    color_lookup=sv.ColorLookup.INDEX
)
annotated_image = mask_annotator.annotate(
    scene=annotated_image,
    detections=detections
)

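# Draw class labels with confidence
label_annotator = sv.LabelAnnotator(
    text_scale=0.5,
    text_thickness=1,
    color_lookup=sv.ColorLookup.INDEX
)
# Labels are passed explicitly; here each one is "<class id> <confidence>"
labels = [
    f"{class_id} {confidence:.2f}"
    for class_id, confidence in zip(detections.class_id, detections.confidence)
]
annotated_image = label_annotator.annotate(
    scene=annotated_image,
    detections=detections,
    labels=labels
)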

# Save result
cv2.imwrite("annotated_scene.jpg", annotated_image)
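
The `color_lookup` argument used above decides which attribute of a detection selects its color. The sketch below is my own simplified model of that idea, not Supervision's code: `Lookup` is a hypothetical stand-in for `sv.ColorLookup`, and the three-color palette is purely illustrative.

```python
from enum import Enum


class Lookup(Enum):  # hypothetical stand-in for sv.ColorLookup
    INDEX = "index"
    CLASS = "class"
    TRACK = "track"


PALETTE = ["red", "green", "blue"]  # illustrative palette, not the library default


def pick_colors(class_ids, tracker_ids, lookup):
    """Choose one palette color per detection according to the lookup strategy."""
    if lookup is Lookup.INDEX:
        keys = list(range(len(class_ids)))
    elif lookup is Lookup.CLASS:
        keys = class_ids
    else:
        keys = tracker_ids
    # Wrap around when there are more keys than palette entries.
    return [PALETTE[key % len(PALETTE)] for key in keys]


class_ids = [2, 2, 0]
tracker_ids = [7, 8, 9]
print(pick_colors(class_ids, tracker_ids, Lookup.INDEX))  # ['red', 'green', 'blue']
print(pick_colors(class_ids, tracker_ids, Lookup.CLASS))  # ['blue', 'blue', 'red']
print(pick_colors(class_ids, tracker_ids, Lookup.TRACK))  # ['green', 'blue', 'red']
```

In practice, coloring by class keeps all objects of one category visually consistent, while coloring by tracker id makes it easier to follow individual objects across video frames.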

Tracking Support #

Supervision has first-class support for object tracking through its built-in ByteTrack implementation:

video_path = "traffic_camera.mp4"
video_info = sv.VideoInfo.from_video_path(video_path)

# Initialize the ByteTrack tracker at the video's frame rate
tracker = sv.ByteTrack(frame_rate=video_info.fps)
label_annotator = sv.LabelAnnotator()

# Track objects across video frames
for frame in sv.get_video_frames_generator(source_path=video_path):
    detections = detect_objects(frame)  # your detection model, returning sv.Detections
    detections = tracker.update_with_detections(detections)

    # Annotate the frame with tracking IDs
    labels = [f"#{tracker_id}" for tracker_id in detections.tracker_id]
    annotated_frame = label_annotator.annotate(
        scene=frame.copy(),
        detections=detections,
        labels=labels
    )
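
To build intuition for what a tracker does with each frame's detections, here is a deliberately naive sketch that assigns persistent IDs by greedy IoU matching against the previous frame. It is not ByteTrack and not part of Supervision: there is no motion model, no buffering of lost tracks, and no use of confidence scores. The class name `GreedyIoUTracker` is hypothetical.

```python
from itertools import count


def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


class GreedyIoUTracker:
    """Match each new box to the best-overlapping box of the previous frame."""

    def __init__(self, iou_threshold=0.3):
        self.iou_threshold = iou_threshold
        self.tracks = {}      # track_id -> box seen in the previous frame
        self._ids = count(1)  # source of fresh track IDs

    def update(self, boxes):
        """Return one track ID per input box, in input order."""
        unmatched = dict(self.tracks)
        new_tracks, assigned = {}, []
        for box in boxes:
            best_id = max(unmatched, key=lambda t: iou(unmatched[t], box), default=None)
            if best_id is not None and iou(unmatched[best_id], box) >= self.iou_threshold:
                del unmatched[best_id]     # each track can be claimed only once
            else:
                best_id = next(self._ids)  # no good match: start a new track
            new_tracks[best_id] = box
            assigned.append(best_id)
        self.tracks = new_tracks
        return assigned


tracker = GreedyIoUTracker()
print(tracker.update([[0, 0, 10, 10], [50, 50, 60, 60]]))  # [1, 2]
print(tracker.update([[52, 50, 62, 60], [1, 0, 11, 10]]))  # [2, 1]
print(tracker.update([[200, 200, 210, 210]]))              # [3]
```

Real trackers improve on this by predicting where each object will be in the next frame and by keeping lost tracks alive for a number of frames, which is what lets them survive occlusions and missed detections.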

Metric Computation #

Supervision provides tools for computing common CV evaluation metrics:

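# Build a confusion matrix from per-image predictions and ground truth.
# `predictions` and `targets` are lists of sv.Detections (one entry per image)
# and `class_names` is the list of class names; all three come from your own code.
confusion_matrix = sv.ConfusionMatrix.from_detections(
    predictions=predictions,
    targets=targets,
    classes=class_names,
    conf_threshold=0.3,
    iou_threshold=0.5
)

# Display the confusion matrix
confusion_matrix.plot(title="Model Performance")

# Derive precision, recall, and F1 per class from the raw matrix.
# Rows are ground truth and columns are predictions; the extra last
# row/column collects false positives and missed objects.
matrix = confusion_matrix.matrix
for i, class_name in enumerate(class_names):
    tp = matrix[i, i]
    precision = tp / max(matrix[:, i].sum(), 1)
    recall = tp / max(matrix[i, :].sum(), 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-9)
    print(f"{class_name}: precision={precision:.3f}, recall={recall:.3f}, f1={f1:.3f}")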

How It Works #

Supervision operates through a clean, consistent API that follows a few core design patterns:

Detections as Data Structures #

The heart of Supervision is the Detections class, which provides a unified representation for object detection and segmentation outputs: bounding boxes, optional masks, confidence scores, class IDs, and tracker IDs. Keypoints live in a separate KeyPoints class.

import numpy as np
import supervision as sv

# Create detections from scratch
detections = sv.Detections(
    xyxy=np.array([  # bounding boxes [x1, y1, x2, y2]
        [100, 50, 300, 250],
        [400, 100, 600, 300]
    ]),
    confidence=np.array([0.95, 0.87]),
    class_id=np.array([0, 2])
    # mask=... is optional: a boolean array of shape (N, H, W)
)

# Filter detections
person_detections = detections[detections.class_id == 0]
high_confidence = detections[detections.confidence > 0.8]

# Compute pairwise IoU between the boxes of two detection sets
ious = sv.box_iou_batch(person_detections.xyxy, high_confidence.xyxy)
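
The reason boolean filtering reads so naturally is that all fields are stored as parallel arrays and indexed together. The following is a small pure-Python sketch of that design; `MiniDetections` is a hypothetical, heavily simplified stand-in and not how the library is actually implemented.

```python
from dataclasses import dataclass


@dataclass
class MiniDetections:  # hypothetical, simplified stand-in for sv.Detections
    xyxy: list
    confidence: list
    class_id: list

    def __len__(self):
        return len(self.xyxy)

    def __getitem__(self, mask):
        """Keep the rows where the boolean mask is True, across all fields."""
        keep = [i for i, flag in enumerate(mask) if flag]
        return MiniDetections(
            xyxy=[self.xyxy[i] for i in keep],
            confidence=[self.confidence[i] for i in keep],
            class_id=[self.class_id[i] for i in keep],
        )


detections = MiniDetections(
    xyxy=[[100, 50, 300, 250], [400, 100, 600, 300]],
    confidence=[0.95, 0.87],
    class_id=[0, 2],
)
confident = detections[[c > 0.9 for c in detections.confidence]]
print(len(confident), confident.class_id)  # 1 [0]
```

Filtering returns a new object rather than mutating the original, so the unfiltered detections remain available for later steps.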

Pipeline Composition #

Supervision encourages composing operations into pipelines. Each step takes a Detections object and produces a new one:

import numpy as np

# Build a detection pipeline by chaining operations;
# each step returns a new Detections object
results = original_detections[original_detections.confidence > 0.5]
results = results.with_nms(threshold=0.45)
results = results[np.isin(results.class_id, [0, 1, 2])]

# Metrics such as AP50 are computed afterwards against ground-truth annotations
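
If you prefer a declarative list of steps, a pipeline runner is a few lines of ordinary Python. This sketch uses plain dictionaries instead of Detections objects so it runs without any dependencies; `filter_confidence`, `filter_class`, and `run_pipeline` are hypothetical helpers, not Supervision functions.

```python
from functools import reduce


def filter_confidence(threshold):
    """Build a step that keeps detections at or above the threshold."""
    return lambda dets: [d for d in dets if d["confidence"] >= threshold]


def filter_class(class_ids):
    """Build a step that keeps detections whose class is in class_ids."""
    allowed = set(class_ids)
    return lambda dets: [d for d in dets if d["class_id"] in allowed]


def run_pipeline(detections, steps):
    """Feed the output of each step into the next one."""
    return reduce(lambda current, step: step(current), steps, detections)


detections = [
    {"class_id": 0, "confidence": 0.9},
    {"class_id": 1, "confidence": 0.4},
    {"class_id": 5, "confidence": 0.8},
]
steps = [filter_confidence(0.5), filter_class([0, 1, 2])]
print(run_pipeline(detections, steps))  # [{'class_id': 0, 'confidence': 0.9}]
```

Each step is a function from detections to detections, so steps can be reordered, reused, or unit-tested in isolation.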

Installation #

Installing Supervision is simple:

# Install via pip
pip install supervision

# Verify installation
python -c "import supervision as sv; print(sv.__version__)"

# Optional extras vary by release; check the install docs for the current list.
# For example, the "assets" extra adds helpers for downloading sample media.
pip install "supervision[assets]"
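
When an import fails, it helps to check what is actually installed in the active environment. This small script uses only the Python standard library (3.8 or newer); the package list is an example and can be adjusted to your setup.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for name in ("supervision", "numpy"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```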

Installation with PyTorch #

For deep learning workflows, install with PyTorch:

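# Install with PyTorch (default PyPI build for your platform)
pip install supervision torch torchvision

# Install with PyTorch (CUDA 12.1 wheels). Install supervision from PyPI separately,
# because --index-url replaces the default package index.
pip install supervision
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121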

Colab Demo #

Roboflow provides an interactive Colab notebook for exploring Supervision’s capabilities:

# Open the interactive Colab demo
# https://colab.research.google.com/github/roboflow/supervision/blob/main/demo.ipynb

# Or run locally:
# Clone the repository to access the demo notebook
git clone https://github.com/roboflow/supervision.git
cd supervision
jupyter notebook demo.ipynb

Integration Patterns #

YOLO Integration #

Supervision has first-class integration with YOLO models:

# Integration with YOLOv8 (Ultralytics)
from ultralytics import YOLO
import supervision as sv

# Load YOLOv8 model
model = YOLO("yolov8n.pt")

# Run inference
results = model.predict("image.jpg", conf=0.25)

# Convert YOLO results to Supervision detections
detections = sv.Detections.from_ultralytics(results[0])

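# Visualize on a copy of the original image
# (results[0].plot() would already contain boxes drawn by Ultralytics)
annotator = sv.BoxAnnotator()
annotated_frame = annotator.annotate(
    scene=results[0].orig_img.copy(),
    detections=detections
)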

MediaPipe Integration #

For pose estimation and landmark detection:

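import cv2
import supervision as sv
from mediapipe import solutions

# Load MediaPipe pose model
pose = solutions.pose.Pose(static_image_mode=True)

# Run pose detection (MediaPipe expects RGB input; the file name is a placeholder)
image = cv2.imread("person.jpg")
results = pose.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

# Convert to Supervision keypoint format; the image size is needed
# to turn normalized landmarks into pixel coordinates
if results.pose_landmarks:
    height, width = image.shape[:2]
    keypoints = sv.KeyPoints.from_mediapipe(results, resolution_wh=(width, height))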
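
MediaPipe reports landmark positions as fractions of the image width and height, so any converter has to scale them back to pixels. A minimal sketch of that step, with a hypothetical helper name and made-up sample values:

```python
def landmarks_to_pixels(landmarks, image_width, image_height):
    """Scale normalized (x, y) landmarks in 0..1 to pixel coordinates."""
    return [(x * image_width, y * image_height) for x, y in landmarks]


normalized = [(0.5, 0.25), (0.25, 0.75)]
print(landmarks_to_pixels(normalized, image_width=640, image_height=480))
# [(320.0, 120.0), (160.0, 360.0)]
```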

ONNX Runtime Integration #

For optimized inference:

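import numpy as np
import supervision as sv
from onnxruntime import InferenceSession

# Load ONNX model
session = InferenceSession("model.onnx")

# Run inference
outputs = session.run(None, {session.get_inputs()[0].name: input_tensor})

# Output layouts are model-specific, so build Detections by hand. This assumes a
# hypothetical model returning (N, 6) rows of [x1, y1, x2, y2, confidence, class_id].
predictions = np.asarray(outputs[0]).reshape(-1, 6)
detections = sv.Detections(
    xyxy=predictions[:, :4],
    confidence=predictions[:, 4],
    class_id=predictions[:, 5].astype(int)
)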

Benchmarks and Performance #

Evaluation Speed #

Supervision’s evaluation functions are optimized for speed: