A Beginner's Guide to Object Detection in Python

Published 2024-11-02

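Object detection is one of the most exciting areas of computer vision: it lets machines identify and locate objects in images or video. This guide introduces object detection in Python and helps you build a basic detection pipeline with popular libraries. Whether you are a beginner or want to strengthen existing skills, this tutorial covers the essentials you need to get started.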

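What Is Object Detection?

Object detection involves two main tasks:

Image classification: determining which objects are present in the image.
Object localization: finding where each object is, using a bounding box.

This makes it more complex than plain image classification, where the model predicts only a class label. An object detector has to predict both the class and the location of every object in the image.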
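To make the two tasks concrete, here is a minimal sketch of how a single detection could be represented in Python. The `Detection` class, its field names, and the example values are hypothetical and chosen for illustration; they do not come from any library.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detected object: what it is (classification) and where it is (localization)."""
    label: str    # predicted class, e.g. "dog"
    score: float  # confidence in [0, 1]
    box: tuple    # (x_min, y_min, x_max, y_max) in pixels

    def area(self) -> int:
        """Area of the bounding box in square pixels."""
        x_min, y_min, x_max, y_max = self.box
        return max(0, x_max - x_min) * max(0, y_max - y_min)


# An image classifier would return only a label; a detector returns
# a list of labelled boxes, one per object it finds.
detections = [
    Detection(label="dog", score=0.92, box=(40, 60, 200, 220)),
    Detection(label="ball", score=0.71, box=(210, 180, 250, 220)),
]
for d in detections:
    print(f"{d.label}: score={d.score:.2f}, box={d.box}, area={d.area()}")
```

A classifier's output for the same image would collapse to a single label, which is why detection is the harder problem.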

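Popular Object Detection Algorithms

1. YOLO (You Only Look Once)

Known for its speed, YOLO is a real-time object detection system that predicts bounding boxes and class probabilities in a single pass.

2. SSD (Single Shot MultiBox Detector)

SSD also detects objects in a single pass, and it uses feature maps at several resolutions to handle objects of different scales.

3. Faster R-CNN

A two-stage model that first generates region proposals and then classifies them. It is typically more accurate than YOLO and SSD, but slower.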

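Setting Up Your Python Environment

To get started with object detection in Python, you need a few libraries.

Step 1: Install Python

Go to python.org and download a recent version of Python (3.8 or later). Check that the version you pick is supported by TensorFlow before installing.

Step 2: Install the Required Libraries

We will use OpenCV for image processing and TensorFlow for object detection.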

pip install opencv-python tensorflow

(Optional) Install Matplotlib to visualize the detection results.

pip install matplotlib
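After installing, it can help to confirm that the packages are visible to your interpreter. The snippet below is a small sketch that uses only the standard library; the helper name `missing_packages` is hypothetical. Note that the pip package `opencv-python` is imported under the name `cv2`.

```python
import importlib.util

# pip package name -> import name (the two are not always the same)
REQUIRED = {"opencv-python": "cv2", "tensorflow": "tensorflow"}
OPTIONAL = {"matplotlib": "matplotlib"}


def missing_packages(packages):
    """Return the pip names of packages whose import name cannot be found."""
    return [pip_name for pip_name, import_name in packages.items()
            if importlib.util.find_spec(import_name) is None]


print("Missing required packages:", missing_packages(REQUIRED) or "none")
print("Missing optional packages:", missing_packages(OPTIONAL) or "none")
```

If anything is reported as missing, rerun the matching pip command in the same environment you use to run Python.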

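Pre-trained Models for Object Detection

Rather than training from scratch, use a pre-trained model from TensorFlow's Object Detection API or from PyTorch. Pre-trained models save time and compute because they have already been trained on large datasets such as COCO (Common Objects in Context).

In this tutorial we will use TensorFlow's ssd_mobilenet_v2, a pre-trained model that is fast and reasonably accurate.

Object Detection with TensorFlow and OpenCV

Here is how to implement a simple object detection pipeline.

Step 1: Load the Pre-trained Model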

import tensorflow as tf

# Load the pre-trained model
model = tf.saved_model.load("ssd_mobilenet_v2_fpnlite_320x320/saved_model")

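You can download the model from the TensorFlow Model Zoo. Extract the archive so that the saved_model directory is at the path passed to tf.saved_model.load.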
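A common stumbling block is pointing the loader at the wrong directory. As a rough sketch, you could check for the SavedModel graph file before loading. The helper name `looks_like_saved_model` is hypothetical; it relies only on the fact that a SavedModel directory contains a `saved_model.pb` (or `saved_model.pbtxt`) file.

```python
from pathlib import Path


def looks_like_saved_model(model_dir):
    """Return True if model_dir contains a SavedModel graph file."""
    model_dir = Path(model_dir)
    return ((model_dir / "saved_model.pb").is_file()
            or (model_dir / "saved_model.pbtxt").is_file())


model_dir = "ssd_mobilenet_v2_fpnlite_320x320/saved_model"  # path used in Step 1
if not looks_like_saved_model(model_dir):
    print(f"No SavedModel found at {model_dir!r}; download and extract the model first.")
```

This is only a sanity check on the directory layout; it does not verify that the model itself is valid.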

Step 2: Load and Preprocess the Image

import cv2
import numpy as np

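# Load an image using OpenCV (imread returns None if the file cannot be read)
image_path = 'image.jpg'
image = cv2.imread(image_path)
if image is None:
    raise FileNotFoundError(f"Could not read image: {image_path}")

# OpenCV loads images in BGR order; the model expects RGB
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Convert the image to a uint8 tensor and add a batch dimension
input_tensor = tf.convert_to_tensor(image_rgb)
input_tensor = input_tensor[tf.newaxis, ...]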

Step 3: Run Object Detection

# Run inference on the image
detections = model(input_tensor)

# Extract relevant information like bounding boxes, classes, and scores
num_detections = int(detections.pop('num_detections'))
detections = {key: value[0, :num_detections].numpy() for key, value in detections.items()}
boxes = detections['detection_boxes']
scores = detections['detection_scores']
classes = detections['detection_classes'].astype(np.int64)
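The `classes` array holds numeric IDs rather than names. A minimal sketch of turning those IDs into readable labels is shown below. It assumes the model uses the COCO label map shipped with the TensorFlow Object Detection API, and it lists only a handful of entries; `COCO_LABELS`, `label_for`, and the example values are illustrative, not part of TensorFlow.

```python
# Partial COCO label map (assumption: IDs follow the label map used by the
# TensorFlow Object Detection API; only a few entries are listed here).
COCO_LABELS = {
    1: "person", 2: "bicycle", 3: "car", 4: "motorcycle",
    17: "cat", 18: "dog",
}


def label_for(class_id):
    """Map a numeric class ID to a readable name, with a fallback for unknown IDs."""
    return COCO_LABELS.get(int(class_id), f"class {int(class_id)}")


# Stand-in values shaped like the `classes` and `scores` arrays from Step 3.
example_classes = [18, 1, 99]
example_scores = [0.91, 0.64, 0.30]
for class_id, score in zip(example_classes, example_scores):
    print(f"{label_for(class_id)}: {score:.2f}")
```

In a real pipeline you would load the full label map that ships with the model instead of hard-coding a dictionary.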

Step 4: Visualize the Results

# Draw bounding boxes on the image
for i in range(num_detections):
    if scores[i] > 0.5:  # Confidence threshold
        box = boxes[i]
        h, w, _ = image.shape
        y_min, x_min, y_max, x_max = box

        start_point = (int(x_min * w), int(y_min * h))
        end_point = (int(x_max * w), int(y_max * h))

        # Draw rectangle
        cv2.rectangle(image, start_point, end_point, (0, 255, 0), 2)

# Display the image
cv2.imshow("Detections", image)
cv2.waitKey(0)
cv2.destroyAllWindows()

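This code loads an image, detects objects, and draws a bounding box around each one. The confidence threshold is set to 0.5 (50%), which filters out low-confidence detections.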
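The model returns boxes as normalized [y_min, x_min, y_max, x_max] values between 0 and 1, so they must be scaled by the image size before drawing. The sketch below isolates that conversion and the threshold filter in a plain Python function so the arithmetic is easy to check. The function name `boxes_to_pixels` and the example values are made up for illustration.

```python
def boxes_to_pixels(boxes, scores, image_width, image_height, threshold=0.5):
    """Convert normalized [y_min, x_min, y_max, x_max] boxes to pixel corners.

    Returns a list of (start_point, end_point, score) for detections whose
    score is above the threshold; points are (x, y) tuples in pixels.
    """
    results = []
    for box, score in zip(boxes, scores):
        if score <= threshold:
            continue  # drop low-confidence detections
        y_min, x_min, y_max, x_max = box
        start_point = (int(x_min * image_width), int(y_min * image_height))
        end_point = (int(x_max * image_width), int(y_max * image_height))
        results.append((start_point, end_point, float(score)))
    return results


# Made-up example values for a 640x480 image.
example_boxes = [[0.25, 0.5, 0.75, 1.0], [0.0, 0.0, 0.5, 0.5]]
example_scores = [0.9, 0.3]
print(boxes_to_pixels(example_boxes, example_scores, image_width=640, image_height=480))
# → [((320, 120), (640, 360), 0.9)]
```

The second box is dropped because its score of 0.3 is below the threshold. Raising the threshold gives fewer, more reliable boxes; lowering it shows more detections at the cost of more false positives.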