Apple MPS -> CPU NMS fallback strategy (#9600)

Until more ops are fully supported on MPS, this update allows seamless MPS inference by moving predictions to the CPU before NMS. The MPS-to-CPU transfer adds overhead, so NMS times are slower. Partially resolves https://github.com/ultralytics/yolov5/issues/9596

Signed-off-by: Glenn Jocher <glenn.jocher@ultralytics.com>

c4c0ee8f · Glenn Jocher · GitHub · bd9c0c42 · c4c0ee8f
--- a/utils/general.py
+++ b/utils/general.py
@@ -843,6 +843,8 @@ def non_max_suppression(
    if isinstance(prediction, (list, tuple)):  # YOLOv5 model in validation mode, output = (inference_out, loss_out)
        prediction = prediction[0]  # select only inference output

+    if 'mps' in prediction.device.type:  # MPS not fully supported yet, convert tensors to CPU before NMS
+        prediction = prediction.cpu()
    bs = prediction.shape[0]  # batch size
    nc = prediction.shape[2] - nm - 5  # number of classes
    xc = prediction[..., 4] > conf_thres  # candidates
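
The fallback added above can be sketched in isolation as follows. This is a minimal illustration, not the code in `utils/general.py`: the helper name `to_cpu_if_mps` and the `FakeDevice`/`FakeTensor` stand-ins are hypothetical, and exist only so the sketch runs without PyTorch or Apple hardware. A real `torch.Tensor` exposes the same `.device.type` attribute and `.cpu()` method, so it could be passed to the helper unchanged. The sketch uses an equality check where the commit uses a substring test (`'mps' in ...`); for the device type string `'mps'` both select the same tensors.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FakeDevice:
    """Hypothetical stand-in for torch.device (only the .type field)."""
    type: str


@dataclass(frozen=True)
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor (only .device and .cpu())."""
    device: FakeDevice

    def cpu(self):
        # Like torch.Tensor.cpu(): return a copy that lives on the CPU.
        return replace(self, device=FakeDevice('cpu'))


def to_cpu_if_mps(prediction):
    """Return `prediction` on the CPU if it is on an MPS device, else unchanged."""
    if prediction.device.type == 'mps':
        return prediction.cpu()
    return prediction


on_mps = to_cpu_if_mps(FakeTensor(FakeDevice('mps')))
on_cuda = to_cpu_if_mps(FakeTensor(FakeDevice('cuda')))
print(on_mps.device.type, on_cuda.device.type)  # cpu cuda
```

Tensors on other devices pass through untouched, so only MPS runs pay the transfer cost that the commit message mentions.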