Unverified commit d3ea0df8 authored by Glenn Jocher, committed by GitHub

New YOLOv5 Classification Models (#8956)

* Logger step fix: increment step with epochs (#8654)
* Allow training from scratch; update --img argument in train.py; fix image size from 640 to 128
* Support custom dataloaders and augmentation; don't augment the eval set; new naming convention for transforms
* Normalize images using cv2; update normalize values; revert ToTensorV2 removal
* Replace print with logger
* Allow logging models from GenericLogger (#8676); support final model logging
* Update dataset download; pass imgsz to classify_transforms()
* Cosine scheduler; remove unused args; add seed; add run() and main()
* Create YOLOv5 BaseModel class (#8829); Hub load device fix
* Weight decay 5e-5; update smart_optimizer console printout; Fashion-MNIST fix
* fuse, nc and names fixes for ResNet50; ONNX CPU inference fix
* New smart_inference_mode(); refactor into /classify dir
* Save run settings; new save_yaml() function; separate classification CI
* smartCrossEntropyLoss; skip validation on Hub load; autodownload with working-dir root
* Add ImageNet simple names to DetectMultiBackend; add validation speeds
* Update bash scripts, permissions and script arguments; TRT data fix; names generator fix
* Add local loading; reshape_classifier_outputs; named_children
* Add opt to checkpoint; update opt file printing; plot half bug fix; use NotImplementedError
* Fix export shape report; fix TRT load; add cls models; fix usage examples
* Update README; pre-commit.ci auto fixes (https://pre-commit.ci)

Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Parent 3456fe65
Diff collapsed.
......@@ -5,9 +5,9 @@ name: YOLOv5 CI
on:
push:
branches: [master]
branches: [ master ]
pull_request:
branches: [master]
branches: [ master ]
schedule:
- cron: '0 0 * * *' # runs at 00:00 UTC every day
......@@ -16,9 +16,9 @@ jobs:
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ubuntu-latest]
python-version: ['3.9'] # requires python<=3.9
model: [yolov5n]
os: [ ubuntu-latest ]
python-version: [ '3.9' ] # requires python<=3.9
model: [ yolov5n ]
steps:
- uses: actions/checkout@v3
- uses: actions/setup-python@v4
......@@ -47,9 +47,9 @@ jobs:
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest, macos-latest, windows-latest]
python-version: ['3.10']
model: [yolov5n]
os: [ ubuntu-latest, macos-latest, windows-latest ]
python-version: [ '3.10' ]
model: [ yolov5n ]
include:
- os: ubuntu-latest
python-version: '3.7' # '3.6.8' min
......@@ -87,7 +87,7 @@ jobs:
else
pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cpu
fi
shell: bash # required for Windows compatibility
shell: bash # for Windows compatibility
- name: Check environment
run: |
python -c "import utils; utils.notebook_init()"
......@@ -100,8 +100,8 @@ jobs:
python --version
pip --version
pip list
- name: Run tests
shell: bash
- name: Test detection
shell: bash # for Windows compatibility
run: |
# export PYTHONPATH="$PWD" # to run '$ python *.py' files in subdirectories
m=${{ matrix.model }} # official weights
......@@ -123,3 +123,13 @@ jobs:
model = torch.hub.load('.', 'custom', path=path, source='local')
print(model('data/images/bus.jpg'))
EOF
- name: Test classification
shell: bash # for Windows compatibility
run: |
m=${{ matrix.model }}-cls.pt # official weights
b=runs/train-cls/exp/weights/best.pt # best.pt checkpoint
python classify/train.py --imgsz 32 --model $m --data mnist2560 --epochs 1 # train
python classify/val.py --imgsz 32 --weights $b --data ../datasets/mnist2560 # val
python classify/predict.py --imgsz 32 --weights $b --source ../datasets/mnist2560/test/7/60.png # predict
python classify/predict.py --imgsz 32 --weights $m --source data/images/bus.jpg # predict
python export.py --weights $b --img 64 --imgsz 224 --include torchscript # export
......@@ -201,14 +201,6 @@ Get started in seconds with our verified environments. Click each icon below for
|:-:|:-:|:-:|:-:|
|Automatically compile and quantize YOLOv5 for better inference performance in one click at [Deci](https://bit.ly/yolov5-deci-platform)|Automatically track, visualize and even remotely train YOLOv5 using [ClearML](https://cutt.ly/yolov5-readme-clearml) (open-source!)|Label and export your custom datasets directly to YOLOv5 for training with [Roboflow](https://roboflow.com/?ref=ultralytics) |Automatically track and visualize all your YOLOv5 training runs in the cloud with [Weights & Biases](https://wandb.ai/site?utm_campaign=repo_yolo_readme)
<!-- ## <div align="center">Compete and Win</div>
We are super excited about our first-ever Ultralytics YOLOv5 🚀 EXPORT Competition with **$10,000** in cash prizes!
<p align="center">
<a href="https://github.com/ultralytics/yolov5/discussions/3213">
<img width="850" src="https://github.com/ultralytics/yolov5/releases/download/v1.0/banner-export-competition.png"></a>
</p> -->
## <div align="center">Why YOLOv5</div>
......@@ -254,6 +246,83 @@ We are super excited about our first-ever Ultralytics YOLOv5 🚀 EXPORT Competi
</details>
## <div align="center">Classification ⭐ NEW</div>
YOLOv5 [release v6.2](https://github.com/ultralytics/yolov5/releases) brings support for classification model training, validation, prediction and export! We've made training classifier models super simple. Click below to get started.
<details>
<summary>Classification Checkpoints (click to expand)</summary>
<br>
We trained YOLOv5-cls classification models on ImageNet for 90 epochs using a 4xA100 instance, and we trained ResNet and EfficientNet models alongside them with the same default training settings for comparison. We exported all models to ONNX FP32 for CPU speed tests and to TensorRT FP16 for GPU speed tests. We ran all speed tests on Google [Colab Pro](https://colab.research.google.com/signup) for easy reproducibility.
| Model | size<br><sup>(pixels) | acc<br><sup>top1 | acc<br><sup>top5 | Training<br><sup>90 epochs<br>4xA100 (hours) | Speed<br><sup>ONNX CPU<br>(ms) | Speed<br><sup>TensorRT V100<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>@224 (B) |
|----------------------------------------------------------------------------------------------------|-----------------------|------------------|------------------|----------------------------------------------|--------------------------------|-------------------------------------|--------------------|------------------------|
| [YOLOv5n-cls](https://github.com/ultralytics/yolov5/releases/download/v6.2/yolov5n-cls.pt) | 224 | 64.6 | 85.4 | 7:59 | **3.3** | **0.5** | **2.5** | **0.5** |
| [YOLOv5s-cls](https://github.com/ultralytics/yolov5/releases/download/v6.2/yolov5s-cls.pt) | 224 | 71.5 | 90.2 | 8:09 | 6.6 | 0.6 | 5.4 | 1.4 |
| [YOLOv5m-cls](https://github.com/ultralytics/yolov5/releases/download/v6.2/yolov5m-cls.pt) | 224 | 75.9 | 92.9 | 10:06 | 15.5 | 0.9 | 12.9 | 3.9 |
| [YOLOv5l-cls](https://github.com/ultralytics/yolov5/releases/download/v6.2/yolov5l-cls.pt) | 224 | 78.0 | 94.0 | 11:56 | 26.9 | 1.4 | 26.5 | 8.5 |
| [YOLOv5x-cls](https://github.com/ultralytics/yolov5/releases/download/v6.2/yolov5x-cls.pt) | 224 | **79.0** | **94.4** | 15:04 | 54.3 | 1.8 | 48.1 | 15.9 |
| |
| [ResNet18](https://github.com/ultralytics/yolov5/releases/download/v6.2/resnet18.pt) | 224 | 70.3 | 89.5 | **6:47** | 11.2 | 0.5 | 11.7 | 3.7 |
| [ResNet34](https://github.com/ultralytics/yolov5/releases/download/v6.2/resnet34.pt) | 224 | 73.9 | 91.8 | 8:33 | 20.6 | 0.9 | 21.8 | 7.4 |
| [ResNet50](https://github.com/ultralytics/yolov5/releases/download/v6.2/resnet50.pt) | 224 | 76.8 | 93.4 | 11:10 | 23.4 | 1.0 | 25.6 | 8.5 |
| [ResNet101](https://github.com/ultralytics/yolov5/releases/download/v6.2/resnet101.pt) | 224 | 78.5 | 94.3 | 17:10 | 42.1 | 1.9 | 44.5 | 15.9 |
| |
| [EfficientNet_b0](https://github.com/ultralytics/yolov5/releases/download/v6.2/efficientnet_b0.pt) | 224 | 75.1 | 92.4 | 13:03 | 12.5 | 1.3 | 5.3 | 1.0 |
| [EfficientNet_b1](https://github.com/ultralytics/yolov5/releases/download/v6.2/efficientnet_b1.pt) | 224 | 76.4 | 93.2 | 17:04 | 14.9 | 1.6 | 7.8 | 1.5 |
| [EfficientNet_b2](https://github.com/ultralytics/yolov5/releases/download/v6.2/efficientnet_b2.pt) | 224 | 76.6 | 93.4 | 17:10 | 15.9 | 1.6 | 9.1 | 1.7 |
| [EfficientNet_b3](https://github.com/ultralytics/yolov5/releases/download/v6.2/efficientnet_b3.pt) | 224 | 77.7 | 94.0 | 19:19 | 18.9 | 1.9 | 12.2 | 2.4 |
<details>
<summary>Table Notes (click to expand)</summary>
- All checkpoints are trained for 90 epochs with the SGD optimizer at lr0=0.001, image size 224 and all default settings. Runs logged to https://wandb.ai/glenn-jocher/YOLOv5-Classifier-v6-2.
- **Accuracy** values are for single-model single-scale on [ImageNet-1k](https://www.image-net.org/index.php) dataset.<br>Reproduce by `python classify/val.py --data ../datasets/imagenet --img 224`
- **Speed** averaged over 100 inference images using a Google [Colab Pro](https://colab.research.google.com/signup) V100 High-RAM instance.<br>Reproduce by `python classify/val.py --data ../datasets/imagenet --img 224 --batch 1`
- **Export** to ONNX at FP32 and TensorRT at FP16 done with `export.py`. <br>Reproduce by `python export.py --weights yolov5s-cls.pt --include engine onnx --imgsz 224`
</details>
</details>
<details>
<summary>Classification Usage Examples (click to expand)</summary>
### Train
YOLOv5 classification training supports auto-download of MNIST, Fashion-MNIST, CIFAR10, CIFAR100, Imagenette, Imagewoof, and ImageNet datasets with the `--data` argument. To start training on MNIST, for example, use `--data mnist`.
```bash
# Single-GPU
python classify/train.py --model yolov5s-cls.pt --data cifar100 --epochs 5 --img 224 --batch 128
# Multi-GPU DDP
python -m torch.distributed.run --nproc_per_node 4 --master_port 1 classify/train.py --model yolov5s-cls.pt --data imagenet --epochs 5 --img 224 --device 0,1,2,3
```
### Val
Validate the accuracy of a trained model. For example, to validate YOLOv5s-cls accuracy on ImageNet:
```bash
bash data/scripts/get_imagenet.sh --val # download ImageNet val split (6.3G, 50000 images)
python classify/val.py --weights yolov5s-cls.pt --data ../datasets/imagenet --img 224
```
### Predict
Run a classification prediction on an image.
```bash
python classify/predict.py --weights yolov5s-cls.pt --source data/images/bus.jpg
```
```python
model = torch.hub.load('ultralytics/yolov5', 'custom', 'yolov5s-cls.pt') # load from PyTorch Hub
```
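A fuller sketch of Hub inference (the dummy input and top-5 extraction here mirror `classify/predict.py`, shown later in this diff; a real input should be an ImageNet-normalized RGB tensor):

```python
import torch
import torch.nn.functional as F

model = torch.hub.load('ultralytics/yolov5', 'custom', 'yolov5s-cls.pt')  # load from PyTorch Hub
im = torch.zeros(1, 3, 224, 224)  # dummy BCHW input; real use: normalized RGB image tensor
p = F.softmax(model(im), dim=1)  # class probabilities
top5 = p.argsort(1, descending=True)[:, :5]  # top-5 class indices per image
```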
### Export
Export a group of trained YOLOv5-cls, ResNet and EfficientNet models to ONNX and TensorRT.
```bash
python export.py --weights yolov5s-cls.pt resnet50.pt efficientnet_b0.pt --include onnx engine --img 224
```
</details>
## <div align="center">Contribute</div>
We love your input! We want to make contributing to YOLOv5 as easy and transparent as possible. Please see our [Contributing Guide](CONTRIBUTING.md) to get started, and fill out the [YOLOv5 Survey](https://ultralytics.com/survey?utm_source=github&utm_medium=social&utm_campaign=Survey) to send us feedback on your experiences. Thank you to all our contributors!
......
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
"""
Run classification inference on images
Usage:
$ python classify/predict.py --weights yolov5s-cls.pt --source im.jpg
"""
import argparse
import os
import sys
from pathlib import Path
import cv2
import torch.nn.functional as F
FILE = Path(__file__).resolve()
ROOT = FILE.parents[1] # YOLOv5 root directory
if str(ROOT) not in sys.path:
sys.path.append(str(ROOT)) # add ROOT to PATH
ROOT = Path(os.path.relpath(ROOT, Path.cwd())) # relative
from classify.train import imshow_cls
from models.common import DetectMultiBackend
from utils.augmentations import classify_transforms
from utils.general import LOGGER, check_requirements, colorstr, increment_path, print_args
from utils.torch_utils import select_device, smart_inference_mode, time_sync
@smart_inference_mode()
def run(
weights=ROOT / 'yolov5s-cls.pt', # model.pt path(s)
source=ROOT / 'data/images/bus.jpg', # file/dir/URL/glob, 0 for webcam
imgsz=224, # inference size
device='', # cuda device, i.e. 0 or 0,1,2,3 or cpu
half=False, # use FP16 half-precision inference
dnn=False, # use OpenCV DNN for ONNX inference
show=True,
project=ROOT / 'runs/predict-cls', # save to project/name
name='exp', # save to project/name
exist_ok=False, # existing project/name ok, do not increment
):
file = str(source)
seen, dt = 1, [0.0, 0.0, 0.0]
device = select_device(device)
# Directories
save_dir = increment_path(Path(project) / name, exist_ok=exist_ok) # increment run
save_dir.mkdir(parents=True, exist_ok=True) # make dir
# Transforms
transforms = classify_transforms(imgsz)
# Load model
model = DetectMultiBackend(weights, device=device, dnn=dnn, fp16=half)
model.warmup(imgsz=(1, 3, imgsz, imgsz)) # warmup
# Image
t1 = time_sync()
im = cv2.cvtColor(cv2.imread(file), cv2.COLOR_BGR2RGB)
im = transforms(im).unsqueeze(0).to(device)
im = im.half() if model.fp16 else im.float()
t2 = time_sync()
dt[0] += t2 - t1
# Inference
results = model(im)
t3 = time_sync()
dt[1] += t3 - t2
p = F.softmax(results, dim=1) # probabilities
i = p.argsort(1, descending=True)[:, :5].squeeze() # top 5 indices
dt[2] += time_sync() - t3
LOGGER.info(f"image 1/1 {file}: {imgsz}x{imgsz} {', '.join(f'{model.names[j]} {p[0, j]:.2f}' for j in i)}")
# Print results
t = tuple(x / seen * 1E3 for x in dt) # speeds per image
shape = (1, 3, imgsz, imgsz)
LOGGER.info(f'Speed: %.1fms pre-process, %.1fms inference, %.1fms post-process per image at shape {shape}' % t)
if show:
imshow_cls(im, f=save_dir / Path(file).name, verbose=True)
LOGGER.info(f"Results saved to {colorstr('bold', save_dir)}")
return p
def parse_opt():
parser = argparse.ArgumentParser()
parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'yolov5s-cls.pt', help='model path(s)')
parser.add_argument('--source', type=str, default=ROOT / 'data/images/bus.jpg', help='file')
parser.add_argument('--imgsz', '--img', '--img-size', type=int, default=224, help='inference size (pixels)')
parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
parser.add_argument('--dnn', action='store_true', help='use OpenCV DNN for ONNX inference')
parser.add_argument('--project', default=ROOT / 'runs/predict-cls', help='save to project/name')
parser.add_argument('--name', default='exp', help='save to project/name')
parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
opt = parser.parse_args()
print_args(vars(opt))
return opt
def main(opt):
check_requirements(exclude=('tensorboard', 'thop'))
run(**vars(opt))
if __name__ == "__main__":
opt = parse_opt()
main(opt)
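Beyond the CLI, `run()` can be called programmatically since it returns the softmax probabilities; a minimal sketch, assuming the YOLOv5 repo root is the working directory:

```python
from classify.predict import run

p = run(weights='yolov5s-cls.pt', source='data/images/bus.jpg', show=False)  # (1, num_classes) probabilities
print(p.argmax(1))  # predicted class index
```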
Diff collapsed.
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
"""
Validate a classification model on a dataset
Usage:
$ python classify/val.py --weights yolov5s-cls.pt --data ../datasets/imagenet
"""
import argparse
import os
import sys
from pathlib import Path
import torch
from tqdm import tqdm
FILE = Path(__file__).resolve()
ROOT = FILE.parents[1] # YOLOv5 root directory
if str(ROOT) not in sys.path:
sys.path.append(str(ROOT)) # add ROOT to PATH
ROOT = Path(os.path.relpath(ROOT, Path.cwd())) # relative
from models.common import DetectMultiBackend
from utils.dataloaders import create_classification_dataloader
from utils.general import LOGGER, check_img_size, check_requirements, colorstr, increment_path, print_args
from utils.torch_utils import select_device, smart_inference_mode, time_sync
@smart_inference_mode()
def run(
data=ROOT / '../datasets/mnist', # dataset dir
weights=ROOT / 'yolov5s-cls.pt', # model.pt path(s)
batch_size=128, # batch size
imgsz=224, # inference size (pixels)
device='', # cuda device, i.e. 0 or 0,1,2,3 or cpu
workers=8, # max dataloader workers (per RANK in DDP mode)
verbose=False, # verbose output
project=ROOT / 'runs/val-cls', # save to project/name
name='exp', # save to project/name
exist_ok=False, # existing project/name ok, do not increment
half=True, # use FP16 half-precision inference
dnn=False, # use OpenCV DNN for ONNX inference
model=None,
dataloader=None,
criterion=None,
pbar=None,
):
# Initialize/load model and set device
training = model is not None
if training: # called by train.py
device, pt, jit, engine = next(model.parameters()).device, True, False, False # get model device, PyTorch model
half &= device.type != 'cpu' # half precision only supported on CUDA
model.half() if half else model.float()
else: # called directly
device = select_device(device, batch_size=batch_size)
# Directories
save_dir = increment_path(Path(project) / name, exist_ok=exist_ok) # increment run
save_dir.mkdir(parents=True, exist_ok=True) # make dir
# Load model
model = DetectMultiBackend(weights, device=device, dnn=dnn, fp16=half)
stride, pt, jit, engine = model.stride, model.pt, model.jit, model.engine
imgsz = check_img_size(imgsz, s=stride) # check image size
half = model.fp16 # FP16 supported on limited backends with CUDA
if engine:
batch_size = model.batch_size
else:
device = model.device
if not (pt or jit):
batch_size = 1 # export.py models default to batch-size 1
LOGGER.info(f'Forcing --batch-size 1 square inference (1,3,{imgsz},{imgsz}) for non-PyTorch models')
# Dataloader
data = Path(data)
test_dir = data / 'test' if (data / 'test').exists() else data / 'val' # data/test or data/val
dataloader = create_classification_dataloader(path=test_dir,
imgsz=imgsz,
batch_size=batch_size,
augment=False,
rank=-1,
workers=workers)
model.eval()
pred, targets, loss, dt = [], [], 0, [0.0, 0.0, 0.0]
n = len(dataloader) # number of batches
action = 'validating' if dataloader.dataset.root.stem == 'val' else 'testing'
desc = f"{pbar.desc[:-36]}{action:>36}" if pbar else f"{action}"
bar = tqdm(dataloader, desc, n, not training, bar_format='{l_bar}{bar:10}{r_bar}{bar:-10b}', position=0)
with torch.cuda.amp.autocast(enabled=device.type != 'cpu'):
for images, labels in bar:
t1 = time_sync()
images, labels = images.to(device, non_blocking=True), labels.to(device)
t2 = time_sync()
dt[0] += t2 - t1
y = model(images)
t3 = time_sync()
dt[1] += t3 - t2
pred.append(y.argsort(1, descending=True)[:, :5])
targets.append(labels)
if criterion:
loss += criterion(y, labels)
dt[2] += time_sync() - t3
loss /= n
pred, targets = torch.cat(pred), torch.cat(targets)
correct = (targets[:, None] == pred).float()
acc = torch.stack((correct[:, 0], correct.max(1).values), dim=1) # (top1, top5) accuracy
top1, top5 = acc.mean(0).tolist()
if pbar:
pbar.desc = f"{pbar.desc[:-36]}{loss:>12.3g}{top1:>12.3g}{top5:>12.3g}"
if verbose: # all classes
LOGGER.info(f"{'Class':>24}{'Images':>12}{'top1_acc':>12}{'top5_acc':>12}")
LOGGER.info(f"{'all':>24}{targets.shape[0]:>12}{top1:>12.3g}{top5:>12.3g}")
for i, c in enumerate(model.names):
aci = acc[targets == i]
top1i, top5i = aci.mean(0).tolist()
LOGGER.info(f"{c:>24}{aci.shape[0]:>12}{top1i:>12.3g}{top5i:>12.3g}")
# Print results
t = tuple(x / len(dataloader.dataset.samples) * 1E3 for x in dt) # speeds per image
shape = (1, 3, imgsz, imgsz)
LOGGER.info(f'Speed: %.1fms pre-process, %.1fms inference, %.1fms post-process per image at shape {shape}' % t)
LOGGER.info(f"Results saved to {colorstr('bold', save_dir)}")
return top1, top5, loss
def parse_opt():
parser = argparse.ArgumentParser()
parser.add_argument('--data', type=str, default=ROOT / '../datasets/mnist', help='dataset path')
parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'yolov5s-cls.pt', help='model.pt path(s)')
parser.add_argument('--batch-size', type=int, default=128, help='batch size')
parser.add_argument('--imgsz', '--img', '--img-size', type=int, default=224, help='inference size (pixels)')
parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--workers', type=int, default=8, help='max dataloader workers (per RANK in DDP mode)')
parser.add_argument('--verbose', nargs='?', const=True, default=True, help='verbose output')
parser.add_argument('--project', default=ROOT / 'runs/val-cls', help='save to project/name')
parser.add_argument('--name', default='exp', help='save to project/name')
parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
parser.add_argument('--dnn', action='store_true', help='use OpenCV DNN for ONNX inference')
opt = parser.parse_args()
print_args(vars(opt))
return opt
def main(opt):
check_requirements(exclude=('tensorboard', 'thop'))
run(**vars(opt))
if __name__ == "__main__":
opt = parse_opt()
main(opt)
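The `(top1, top5)` accuracy math in `run()` above can be checked in isolation with dummy tensors; a minimal sketch:

```python
import torch

targets = torch.tensor([3, 1, 7])  # true class per image
pred = torch.tensor([[3, 0, 1, 2, 4],  # top-5 predicted classes per image
                     [0, 1, 2, 3, 4],
                     [9, 8, 6, 5, 4]])
correct = (targets[:, None] == pred).float()  # (n, 5) hit matrix
acc = torch.stack((correct[:, 0], correct.max(1).values), dim=1)  # (top1, top5) per image
top1, top5 = acc.mean(0).tolist()
print(f'{top1:.3f} {top5:.3f}')  # 0.333 0.667
```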
Diff collapsed.
#!/bin/bash
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Download latest models from https://github.com/ultralytics/yolov5/releases
# Example usage: bash path/to/download_weights.sh
# Example usage: bash data/scripts/download_weights.sh
# parent
# └── yolov5
# ├── yolov5s.pt ← downloads here
......@@ -11,10 +11,11 @@
python - <<EOF
from utils.downloads import attempt_download
models = ['n', 's', 'm', 'l', 'x']
models.extend([x + '6' for x in models]) # add P6 models
p5 = ['n', 's', 'm', 'l', 'x'] # P5 models
p6 = [f'{x}6' for x in p5] # P6 models
cls = [f'{x}-cls' for x in p5] # classification models
for x in models:
attempt_download(f'yolov5{x}.pt')
for x in p5 + p6 + cls:
attempt_download(f'weights/yolov5{x}.pt')
EOF
......@@ -33,7 +33,7 @@ else
f='coco2017labels.zip' # 168 MB
fi
echo 'Downloading' $url$f ' ...'
curl -L $url$f -o $f && unzip -q $f -d $d && rm $f &
curl -L $url$f -o $f -# && unzip -q $f -d $d && rm $f &
# Download/unzip images
d='../datasets/coco/images' # unzip directory
......@@ -41,16 +41,16 @@ url=http://images.cocodataset.org/zips/
if [ "$train" == "true" ]; then
f='train2017.zip' # 19G, 118k images
echo 'Downloading' $url$f '...'
curl -L $url$f -o $f && unzip -q $f -d $d && rm $f &
curl -L $url$f -o $f -# && unzip -q $f -d $d && rm $f &
fi
if [ "$val" == "true" ]; then
f='val2017.zip' # 1G, 5k images
echo 'Downloading' $url$f '...'
curl -L $url$f -o $f && unzip -q $f -d $d && rm $f &
curl -L $url$f -o $f -# && unzip -q $f -d $d && rm $f &
fi
if [ "$test" == "true" ]; then
f='test2017.zip' # 7G, 41k images (optional)
echo 'Downloading' $url$f '...'
curl -L $url$f -o $f && unzip -q $f -d $d && rm $f &
curl -L $url$f -o $f -# && unzip -q $f -d $d && rm $f &
fi
wait # finish background tasks
......@@ -12,6 +12,6 @@ d='../datasets' # unzip directory
url=https://github.com/ultralytics/yolov5/releases/download/v1.0/
f='coco128.zip' # or 'coco128-segments.zip', 68 MB
echo 'Downloading' $url$f ' ...'
curl -L $url$f -o $f && unzip -q $f -d $d && rm $f &
curl -L $url$f -o $f -# && unzip -q $f -d $d && rm $f &
wait # finish background tasks
#!/bin/bash
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Download ILSVRC2012 ImageNet dataset https://image-net.org
# Example usage: bash data/scripts/get_imagenet.sh
# parent
# ├── yolov5
# └── datasets
# └── imagenet ← downloads here
# Arguments (optional) Usage: bash data/scripts/get_imagenet.sh --train --val
if [ "$#" -gt 0 ]; then
for opt in "$@"; do
case "${opt}" in
--train) train=true ;;
--val) val=true ;;
esac
done
else
train=true
val=true
fi
# Make dir
d='../datasets/imagenet' # unzip directory
mkdir -p $d && cd $d
# Download/unzip train
if [ "$train" == "true" ]; then
wget https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_train.tar # download 138G, 1281167 images
mkdir train && mv ILSVRC2012_img_train.tar train/ && cd train
tar -xf ILSVRC2012_img_train.tar && rm -f ILSVRC2012_img_train.tar
find . -name "*.tar" | while read NAME; do
mkdir -p "${NAME%.tar}"
tar -xf "${NAME}" -C "${NAME%.tar}"
rm -f "${NAME}"
done
cd ..
fi
# Download/unzip val
if [ "$val" == "true" ]; then
wget https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_val.tar # download 6.3G, 50000 images
mkdir val && mv ILSVRC2012_img_val.tar val/ && cd val && tar -xf ILSVRC2012_img_val.tar
wget -qO- https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh | bash # move into subdirs
fi
# Delete corrupted image (optional: PNG under JPEG name that may cause dataloaders to fail)
# rm train/n04266014/n04266014_10835.JPEG
# TFRecords (optional)
# wget https://raw.githubusercontent.com/tensorflow/models/master/research/slim/datasets/imagenet_lsvrc_2015_synsets.txt
......@@ -495,11 +495,9 @@ def run(
assert device.type != 'cpu' or coreml, '--half only compatible with GPU export, i.e. use --device 0'
assert not dynamic, '--half not compatible with --dynamic, i.e. use either --half or --dynamic but not both'
model = attempt_load(weights, device=device, inplace=True, fuse=True) # load FP32 model
nc, names = model.nc, model.names # number of classes, class names
# Checks
imgsz *= 2 if len(imgsz) == 1 else 1 # expand
assert nc == len(names), f'Model class count {nc} != len(names) {len(names)}'
if optimize:
assert device.type == 'cpu', '--optimize not compatible with cuda devices, i.e. use --device cpu'
......@@ -520,7 +518,7 @@ def run(
y = model(im) # dry runs
if half and not coreml:
im, model = im.half(), model.half() # to FP16
shape = tuple(y[0].shape) # model output shape
shape = tuple((y[0] if isinstance(y, tuple) else y).shape) # model output shape
LOGGER.info(f"\n{colorstr('PyTorch:')} starting from {file} with output shape {shape} ({file_size(file):.1f} MB)")
# Exports
......
......@@ -17,13 +17,12 @@ import pandas as pd
import requests
import torch
import torch.nn as nn
import yaml
from PIL import Image
from torch.cuda import amp
from utils.dataloaders import exif_transpose, letterbox
from utils.general import (LOGGER, check_requirements, check_suffix, check_version, colorstr, increment_path,
make_divisible, non_max_suppression, scale_coords, xywh2xyxy, xyxy2xywh)
from utils.general import (LOGGER, ROOT, check_requirements, check_suffix, check_version, colorstr, increment_path,
make_divisible, non_max_suppression, scale_coords, xywh2xyxy, xyxy2xywh, yaml_load)
from utils.plots import Annotator, colors, save_one_box
from utils.torch_utils import copy_attr, smart_inference_mode, time_sync
......@@ -322,13 +321,10 @@ class DetectMultiBackend(nn.Module):
super().__init__()
w = str(weights[0] if isinstance(weights, list) else weights)
pt, jit, onnx, xml, engine, coreml, saved_model, pb, tflite, edgetpu, tfjs = self.model_type(w) # get backend
pt, jit, onnx, xml, engine, coreml, saved_model, pb, tflite, edgetpu, tfjs = self._model_type(w) # get backend
w = attempt_download(w) # download if not local
fp16 &= (pt or jit or onnx or engine) and device.type != 'cpu' # FP16
stride, names = 32, [f'class{i}' for i in range(1000)] # assign defaults
if data: # assign class names (optional)
with open(data, errors='ignore') as f:
names = yaml.safe_load(f)['names']
fp16 &= pt or jit or onnx or engine # FP16
stride = 32 # default stride
if pt: # PyTorch
model = attempt_load(weights if isinstance(weights, list) else w, device=device, inplace=True, fuse=fuse)
......@@ -350,7 +346,7 @@ class DetectMultiBackend(nn.Module):
net = cv2.dnn.readNetFromONNX(w)
elif onnx: # ONNX Runtime
LOGGER.info(f'Loading {w} for ONNX Runtime inference...')
cuda = torch.cuda.is_available()
cuda = torch.cuda.is_available() and device.type != 'cpu'
check_requirements(('onnx', 'onnxruntime-gpu' if cuda else 'onnxruntime'))
import onnxruntime
providers = ['CUDAExecutionProvider', 'CPUExecutionProvider'] if cuda else ['CPUExecutionProvider']
......@@ -380,6 +376,8 @@ class DetectMultiBackend(nn.Module):
LOGGER.info(f'Loading {w} for TensorRT inference...')
import tensorrt as trt # https://developer.nvidia.com/nvidia-tensorrt-download
check_version(trt.__version__, '7.0.0', hard=True) # require tensorrt>=7.0.0
if device.type == 'cpu':
device = torch.device('cuda:0')
Binding = namedtuple('Binding', ('name', 'dtype', 'shape', 'data', 'ptr'))
logger = trt.Logger(trt.Logger.INFO)
with open(w, 'rb') as f, trt.Runtime(logger) as runtime:
......@@ -398,8 +396,8 @@ class DetectMultiBackend(nn.Module):
if dtype == np.float16:
fp16 = True
shape = tuple(context.get_binding_shape(index))
data = torch.from_numpy(np.empty(shape, dtype=np.dtype(dtype))).to(device)
bindings[name] = Binding(name, dtype, shape, data, int(data.data_ptr()))
im = torch.from_numpy(np.empty(shape, dtype=dtype)).to(device)
bindings[name] = Binding(name, dtype, shape, im, int(im.data_ptr()))
binding_addrs = OrderedDict((n, d.ptr) for n, d in bindings.items())
batch_size = bindings['images'].shape[0] # if dynamic, this is instead max batch size
elif coreml: # CoreML
......@@ -445,9 +443,16 @@ class DetectMultiBackend(nn.Module):
input_details = interpreter.get_input_details() # inputs
output_details = interpreter.get_output_details() # outputs
elif tfjs:
raise Exception('ERROR: YOLOv5 TF.js inference is not supported')
raise NotImplementedError('ERROR: YOLOv5 TF.js inference is not supported')
else:
raise Exception(f'ERROR: {w} is not a supported format')
raise NotImplementedError(f'ERROR: {w} is not a supported format')
# class names
if 'names' not in locals():
names = yaml_load(data)['names'] if data else [f'class{i}' for i in range(999)]
if names[0] == 'n01440764' and len(names) == 1000: # ImageNet
names = yaml_load(ROOT / 'data/ImageNet.yaml')['names'] # human-readable names
self.__dict__.update(locals()) # assign all variables to self
def forward(self, im, augment=False, visualize=False, val=False):
......@@ -457,7 +462,9 @@ class DetectMultiBackend(nn.Module):
im = im.half() # to FP16
if self.pt: # PyTorch
y = self.model(im, augment=augment, visualize=visualize)[0]
y = self.model(im, augment=augment, visualize=visualize) if augment or visualize else self.model(im)
if isinstance(y, tuple):
y = y[0]
elif self.jit: # TorchScript
y = self.model(im)[0]
elif self.dnn: # ONNX OpenCV DNN
......@@ -526,7 +533,7 @@ class DetectMultiBackend(nn.Module):
self.forward(im) # warmup
@staticmethod
def model_type(p='path/to/model.pt'):
def _model_type(p='path/to/model.pt'):
# Return model type from model path, i.e. path='path/to/model.onnx' -> type=onnx
from export import export_formats
suffixes = list(export_formats().Suffix) + ['.xml'] # export suffixes
......@@ -540,8 +547,7 @@ class DetectMultiBackend(nn.Module):
@staticmethod
def _load_metadata(f='path/to/meta.yaml'):
# Load metadata from meta.yaml if it exists
with open(f, errors='ignore') as f:
d = yaml.safe_load(f)
d = yaml_load(f)
return d['stride'], d['names'] # assign stride, names
......@@ -753,10 +759,13 @@ class Classify(nn.Module):
# Classification head, i.e. x(b,c1,20,20) to x(b,c2)
def __init__(self, c1, c2, k=1, s=1, p=None, g=1): # ch_in, ch_out, kernel, stride, padding, groups
super().__init__()
self.aap = nn.AdaptiveAvgPool2d(1) # to x(b,c1,1,1)
self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g) # to x(b,c2,1,1)
self.flat = nn.Flatten()
c_ = 1280 # efficientnet_b0 size
self.conv = Conv(c1, c_, k, s, autopad(k, p), g)
self.pool = nn.AdaptiveAvgPool2d(1) # to x(b,c_,1,1)
self.drop = nn.Dropout(p=0.0, inplace=True)
self.linear = nn.Linear(c_, c2) # to x(b,c2)
def forward(self, x):
z = torch.cat([self.aap(y) for y in (x if isinstance(x, list) else [x])], 1) # cat if list
return self.flat(self.conv(z)) # flatten to x(b,c2)
if isinstance(x, list):
x = torch.cat(x, 1)
return self.linear(self.drop(self.pool(self.conv(x)).flatten(1)))
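A shape walkthrough of the new head; a minimal sketch, assuming the YOLOv5 repo root is on PYTHONPATH:

```python
import torch
from models.common import Classify

m = Classify(c1=512, c2=1000)  # 512-channel feature map in, 1000 classes out
x = torch.zeros(2, 512, 20, 20)  # x(b,c1,20,20)
y = m(x)  # conv -> (2,1280,20,20), pool -> (2,1280,1,1), flatten -> (2,1280), linear -> (2,1000)
print(y.shape)  # torch.Size([2, 1000])
```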
......@@ -79,7 +79,9 @@ def attempt_load(weights, device=None, inplace=True, fuse=True):
for w in weights if isinstance(weights, list) else [weights]:
ckpt = torch.load(attempt_download(w), map_location='cpu') # load
ckpt = (ckpt.get('ema') or ckpt['model']).to(device).float() # FP32 model
model.append(ckpt.fuse().eval() if fuse else ckpt.eval()) # fused or un-fused model in eval mode
if not hasattr(ckpt, 'stride'):
ckpt.stride = torch.tensor([32.]) # compatibility update for ResNet etc.
model.append(ckpt.fuse().eval() if fuse and hasattr(ckpt, 'fuse') else ckpt.eval()) # model in eval mode
# Compatibility updates
for m in model.modules():
......@@ -92,11 +94,14 @@ def attempt_load(weights, device=None, inplace=True, fuse=True):
elif t is nn.Upsample and not hasattr(m, 'recompute_scale_factor'):
m.recompute_scale_factor = None # torch 1.11.0 compatibility
# Return model
if len(model) == 1:
return model[-1] # return model
return model[-1]
# Return detection ensemble
print(f'Ensemble created with {weights}\n')
for k in 'names', 'nc', 'yaml':
setattr(model, k, getattr(model[0], k))
model.stride = model[torch.argmax(torch.tensor([m.stride.max() for m in model])).int()].stride # max stride
assert all(model[0].nc == m.nc for m in model), f'Models have different class counts: {[m.nc for m in model]}'
return model # return ensemble
return model
......@@ -90,8 +90,64 @@ class Detect(nn.Module):
return grid, anchor_grid
class Model(nn.Module):
# YOLOv5 model
class BaseModel(nn.Module):
# YOLOv5 base model
def forward(self, x, profile=False, visualize=False):
return self._forward_once(x, profile, visualize) # single-scale inference, train
def _forward_once(self, x, profile=False, visualize=False):
y, dt = [], [] # outputs
for m in self.model:
if m.f != -1: # if not from previous layer
x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f] # from earlier layers
if profile:
self._profile_one_layer(m, x, dt)
x = m(x) # run
y.append(x if m.i in self.save else None) # save output
if visualize:
feature_visualization(x, m.type, m.i, save_dir=visualize)
return x
def _profile_one_layer(self, m, x, dt):
c = m == self.model[-1] # is final layer, copy input as inplace fix
o = thop.profile(m, inputs=(x.copy() if c else x,), verbose=False)[0] / 1E9 * 2 if thop else 0 # FLOPs
t = time_sync()
for _ in range(10):
m(x.copy() if c else x)
dt.append((time_sync() - t) * 100)
if m == self.model[0]:
LOGGER.info(f"{'time (ms)':>10s} {'GFLOPs':>10s} {'params':>10s} module")
LOGGER.info(f'{dt[-1]:10.2f} {o:10.2f} {m.np:10.0f} {m.type}')
if c:
LOGGER.info(f"{sum(dt):10.2f} {'-':>10s} {'-':>10s} Total")
def fuse(self): # fuse model Conv2d() + BatchNorm2d() layers
LOGGER.info('Fusing layers... ')
for m in self.model.modules():
if isinstance(m, (Conv, DWConv)) and hasattr(m, 'bn'):
m.conv = fuse_conv_and_bn(m.conv, m.bn) # update conv
delattr(m, 'bn') # remove batchnorm
m.forward = m.forward_fuse # update forward
self.info()
return self
def info(self, verbose=False, img_size=640): # print model information
model_info(self, verbose, img_size)
def _apply(self, fn):
# Apply to(), cpu(), cuda(), half() to model tensors that are not parameters or registered buffers
self = super()._apply(fn)
m = self.model[-1] # Detect()
if isinstance(m, Detect):
m.stride = fn(m.stride)
m.grid = list(map(fn, m.grid))
if isinstance(m.anchor_grid, list):
m.anchor_grid = list(map(fn, m.anchor_grid))
return self
class DetectionModel(BaseModel):
# YOLOv5 detection model
def __init__(self, cfg='yolov5s.yaml', ch=3, nc=None, anchors=None): # model, input channels, number of classes
super().__init__()
if isinstance(cfg, dict):
......@@ -149,19 +205,6 @@ class Model(nn.Module):
y = self._clip_augmented(y) # clip augmented tails
return torch.cat(y, 1), None # augmented inference, train
def _forward_once(self, x, profile=False, visualize=False):
y, dt = [], [] # outputs
for m in self.model:
if m.f != -1: # if not from previous layer
x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f] # from earlier layers
if profile:
self._profile_one_layer(m, x, dt)
x = m(x) # run
y.append(x if m.i in self.save else None) # save output
if visualize:
feature_visualization(x, m.type, m.i, save_dir=visualize)
return x
def _descale_pred(self, p, flips, scale, img_size):
# de-scale predictions following augmented inference (inverse operation)
if self.inplace:
......@@ -190,19 +233,6 @@ class Model(nn.Module):
y[-1] = y[-1][:, i:] # small
return y
def _profile_one_layer(self, m, x, dt):
c = isinstance(m, Detect) # is final layer, copy input as inplace fix
o = thop.profile(m, inputs=(x.copy() if c else x,), verbose=False)[0] / 1E9 * 2 if thop else 0 # FLOPs
t = time_sync()
for _ in range(10):
m(x.copy() if c else x)
dt.append((time_sync() - t) * 100)
if m == self.model[0]:
LOGGER.info(f"{'time (ms)':>10s} {'GFLOPs':>10s} {'params':>10s} module")
LOGGER.info(f'{dt[-1]:10.2f} {o:10.2f} {m.np:10.0f} {m.type}')
if c:
LOGGER.info(f"{sum(dt):10.2f} {'-':>10s} {'-':>10s} Total")
def _initialize_biases(self, cf=None): # initialize biases into Detect(), cf is class frequency
# https://arxiv.org/abs/1708.02002 section 3.3
# cf = torch.bincount(torch.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1.
......@@ -213,41 +243,34 @@ class Model(nn.Module):
b[:, 5:] += math.log(0.6 / (m.nc - 0.999999)) if cf is None else torch.log(cf / cf.sum()) # cls
mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True)
def _print_biases(self):
m = self.model[-1] # Detect() module
for mi in m.m: # from
b = mi.bias.detach().view(m.na, -1).T # conv.bias(255) to (3,85)
LOGGER.info(
('%6g Conv2d.bias:' + '%10.3g' * 6) % (mi.weight.shape[1], *b[:5].mean(1).tolist(), b[5:].mean()))
# def _print_weights(self):
# for m in self.model.modules():
# if type(m) is Bottleneck:
# LOGGER.info('%10.3g' % (m.w.detach().sigmoid() * 2)) # shortcut weights
Model = DetectionModel # retain YOLOv5 'Model' class for backwards compatibility
def fuse(self): # fuse model Conv2d() + BatchNorm2d() layers
LOGGER.info('Fusing layers... ')
for m in self.model.modules():
if isinstance(m, (Conv, DWConv)) and hasattr(m, 'bn'):
m.conv = fuse_conv_and_bn(m.conv, m.bn) # update conv
delattr(m, 'bn') # remove batchnorm
m.forward = m.forward_fuse # update forward
self.info()
return self
def info(self, verbose=False, img_size=640): # print model information
model_info(self, verbose, img_size)
def _apply(self, fn):
# Apply to(), cpu(), cuda(), half() to model tensors that are not parameters or registered buffers
self = super()._apply(fn)
m = self.model[-1] # Detect()
if isinstance(m, Detect):
m.stride = fn(m.stride)
m.grid = list(map(fn, m.grid))
if isinstance(m.anchor_grid, list):
m.anchor_grid = list(map(fn, m.anchor_grid))
return self
class ClassificationModel(BaseModel):
# YOLOv5 classification model
def __init__(self, cfg=None, model=None, nc=1000, cutoff=10): # yaml, model, number of classes, cutoff index
super().__init__()
self._from_detection_model(model, nc, cutoff) if model is not None else self._from_yaml(cfg)
def _from_detection_model(self, model, nc=1000, cutoff=10):
# Create a YOLOv5 classification model from a YOLOv5 detection model
if isinstance(model, DetectMultiBackend):
model = model.model # unwrap DetectMultiBackend
model.model = model.model[:cutoff] # backbone
m = model.model[-1] # last layer
ch = m.conv.in_channels if hasattr(m, 'conv') else m.cv1.conv.in_channels # ch into module
c = Classify(ch, nc) # Classify()
c.i, c.f, c.type = m.i, m.f, 'models.common.Classify' # index, from, type
model.model[-1] = c # replace
self.model = model.model
self.stride = model.stride
self.save = []
self.nc = nc
def _from_yaml(self, cfg):
# Create a YOLOv5 classification model from a *.yaml file
self.model = None
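A sketch of building a classifier from a pretrained detection checkpoint via `_from_detection_model()`; assumes the repo root on PYTHONPATH and a local yolov5s.pt:

```python
import torch
from models.experimental import attempt_load
from models.yolo import ClassificationModel

det = attempt_load('yolov5s.pt', device=torch.device('cpu'))  # FP32 detection model
cls = ClassificationModel(model=det, nc=10)  # backbone[:cutoff] + new Classify head
print(cls(torch.zeros(1, 3, 224, 224)).shape)  # torch.Size([1, 10]) class logits
```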
def parse_model(d, ch): # model_dict, input_channels(3)
......@@ -321,7 +344,7 @@ if __name__ == '__main__':
# Options
if opt.line_profile: # profile layer by layer
_ = model(im, profile=True)
model(im, profile=True)
elif opt.profile: # profile forward-backward
results = profile(input=im, ops=[model], n=3)
......
......@@ -47,7 +47,7 @@ from utils.downloads import attempt_download, is_url
from utils.general import (LOGGER, check_amp, check_dataset, check_file, check_git_status, check_img_size,
check_requirements, check_suffix, check_yaml, colorstr, get_latest_run, increment_path,
init_seeds, intersect_dicts, labels_to_class_weights, labels_to_image_weights, methods,
one_cycle, print_args, print_mutation, strip_optimizer)
one_cycle, print_args, print_mutation, strip_optimizer, yaml_save)
from utils.loggers import Loggers
from utils.loggers.wandb.wandb_utils import check_wandb_resume
from utils.loss import ComputeLoss
......@@ -81,10 +81,8 @@ def train(hyp, opt, device, callbacks): # hyp is path/to/hyp.yaml or hyp dictio
# Save run settings
if not evolve:
with open(save_dir / 'hyp.yaml', 'w') as f:
yaml.safe_dump(hyp, f, sort_keys=False)
with open(save_dir / 'opt.yaml', 'w') as f:
yaml.safe_dump(vars(opt), f, sort_keys=False)
yaml_save(save_dir / 'hyp.yaml', hyp)
yaml_save(save_dir / 'opt.yaml', vars(opt))
# Loggers
data_dict = None
......@@ -484,7 +482,7 @@ def main(opt, callbacks=Callbacks()):
if RANK in {-1, 0}:
print_args(vars(opt))
check_git_status()
check_requirements(exclude=['thop'])
check_requirements()
# Resume
if opt.resume and not (check_wandb_resume(opt) or opt.evolve): # resume from specified or most recent last.pt
......
......@@ -8,15 +8,21 @@ import random
import cv2
import numpy as np
import torchvision.transforms as T
import torchvision.transforms.functional as TF
from utils.general import LOGGER, check_version, colorstr, resample_segments, segment2box
from utils.metrics import bbox_ioa
IMAGENET_MEAN = 0.485, 0.456, 0.406 # RGB mean
IMAGENET_STD = 0.229, 0.224, 0.225 # RGB standard deviation
class Albumentations:
# YOLOv5 Albumentations class (optional, only used if package is installed)
def __init__(self):
self.transform = None
prefix = colorstr('albumentations: ')
try:
import albumentations as A
check_version(A.__version__, '1.0.3', hard=True) # version requirement
......@@ -31,11 +37,11 @@ class Albumentations:
A.ImageCompression(quality_lower=75, p=0.0)] # transforms
self.transform = A.Compose(T, bbox_params=A.BboxParams(format='yolo', label_fields=['class_labels']))
LOGGER.info(colorstr('albumentations: ') + ', '.join(f'{x}' for x in self.transform.transforms if x.p))
LOGGER.info(prefix + ', '.join(f'{x}'.replace('always_apply=False, ', '') for x in T if x.p))
except ImportError: # package not installed, skip
pass
except Exception as e:
LOGGER.info(colorstr('albumentations: ') + f'{e}')
LOGGER.info(f'{prefix}{e}')
def __call__(self, im, labels, p=1.0):
if self.transform and random.random() < p:
......@@ -44,6 +50,18 @@ class Albumentations:
return im, labels
def normalize(x, mean=IMAGENET_MEAN, std=IMAGENET_STD, inplace=False):
# Normalize RGB images x per ImageNet stats in BCHW format, i.e. = (x - mean) / std
return TF.normalize(x, mean, std, inplace=inplace)
def denormalize(x, mean=IMAGENET_MEAN, std=IMAGENET_STD):
# Denormalize RGB images x per ImageNet stats in BCHW format, i.e. = x * std + mean
for i in range(3):
x[:, i] = x[:, i] * std[i] + mean[i]
return x
def augment_hsv(im, hgain=0.5, sgain=0.5, vgain=0.5):
# HSV color-space augmentation
if hgain or sgain or vgain:
......@@ -282,3 +300,48 @@ def box_candidates(box1, box2, wh_thr=2, ar_thr=100, area_thr=0.1, eps=1e-16):
w2, h2 = box2[2] - box2[0], box2[3] - box2[1]
ar = np.maximum(w2 / (h2 + eps), h2 / (w2 + eps)) # aspect ratio
return (w2 > wh_thr) & (h2 > wh_thr) & (w2 * h2 / (w1 * h1 + eps) > area_thr) & (ar < ar_thr) # candidates
def classify_albumentations(augment=True,
size=224,
scale=(0.08, 1.0),
hflip=0.5,
vflip=0.0,
jitter=0.4,
mean=IMAGENET_MEAN,
std=IMAGENET_STD,
auto_aug=False):
# YOLOv5 classification Albumentations (optional, only used if package is installed)
prefix = colorstr('albumentations: ')
try:
import albumentations as A
from albumentations.pytorch import ToTensorV2
check_version(A.__version__, '1.0.3', hard=True) # version requirement
if augment: # Resize and crop
T = [A.RandomResizedCrop(height=size, width=size, scale=scale)]
if auto_aug:
# TODO: implement AugMix, AutoAug & RandAug in albumentation
LOGGER.info(f'{prefix}auto augmentations are currently not supported')
else:
if hflip > 0:
T += [A.HorizontalFlip(p=hflip)]
if vflip > 0:
T += [A.VerticalFlip(p=vflip)]
if jitter > 0:
color_jitter = (float(jitter),) * 3 # repeat value for brightness, contrast, saturation, 0 hue
T += [A.ColorJitter(*color_jitter, 0)]
else: # Use fixed crop for eval set (reproducibility)
T = [A.SmallestMaxSize(max_size=size), A.CenterCrop(height=size, width=size)]
T += [A.Normalize(mean=mean, std=std), ToTensorV2()] # Normalize and convert to Tensor
LOGGER.info(prefix + ', '.join(f'{x}'.replace('always_apply=False, ', '') for x in T if x.p))
return A.Compose(T)
except ImportError: # package not installed, skip
pass
except Exception as e:
LOGGER.info(f'{prefix}{e}')
def classify_transforms(size=224):
# Transforms to apply if albumentations not installed
return T.Compose([T.ToTensor(), T.Resize(size), T.CenterCrop(size), T.Normalize(IMAGENET_MEAN, IMAGENET_STD)])
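A usage sketch for this torchvision fallback (assumes a local image `im.jpg`):

```python
from PIL import Image
from utils.augmentations import classify_transforms

t = classify_transforms(224)
x = t(Image.open('im.jpg'))  # float tensor, shape (3, 224, 224), ImageNet-normalized
```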
......@@ -22,12 +22,14 @@ from zipfile import ZipFile
import numpy as np
import torch
import torch.nn.functional as F
import torchvision
import yaml
from PIL import ExifTags, Image, ImageOps
from torch.utils.data import DataLoader, Dataset, dataloader, distributed
from tqdm import tqdm
from utils.augmentations import Albumentations, augment_hsv, copy_paste, letterbox, mixup, random_perspective
from utils.augmentations import (Albumentations, augment_hsv, classify_albumentations, classify_transforms, copy_paste,
letterbox, mixup, random_perspective)
from utils.general import (DATASETS_DIR, LOGGER, NUM_THREADS, check_dataset, check_requirements, check_yaml, clean_str,
cv2, is_colab, is_kaggle, segments2boxes, xyn2xy, xywh2xyxy, xywhn2xyxy, xyxy2xywhn)
from utils.torch_utils import torch_distributed_zero_first
......@@ -870,7 +872,7 @@ def flatten_recursive(path=DATASETS_DIR / 'coco128'):
def extract_boxes(path=DATASETS_DIR / 'coco128'): # from utils.dataloaders import *; extract_boxes()
# Convert detection dataset into classification dataset, with one directory per class
path = Path(path) # images dir
shutil.rmtree(path / 'classifier') if (path / 'classifier').is_dir() else None # remove existing
shutil.rmtree(path / 'classification') if (path / 'classification').is_dir() else None # remove existing
files = list(path.rglob('*.*'))
n = len(files) # number of files
for im_file in tqdm(files, total=n):
......@@ -1090,3 +1092,65 @@ class HUBDatasetStats():
pass
print(f'Done. All images saved to {self.im_dir}')
return self.im_dir
# Classification dataloaders -------------------------------------------------------------------------------------------
class ClassificationDataset(torchvision.datasets.ImageFolder):
"""
YOLOv5 Classification Dataset.
Arguments
root: dataset path
augment: use Albumentations augmentation transforms if installed, else torchvision transforms
imgsz: target image size
cache: cache images in RAM (True or 'ram') or on disk ('disk')
"""
def __init__(self, root, augment, imgsz, cache=False):
super().__init__(root=root)
self.torch_transforms = classify_transforms(imgsz)
self.album_transforms = classify_albumentations(augment, imgsz) if augment else None
self.cache_ram = cache is True or cache == 'ram'
self.cache_disk = cache == 'disk'
self.samples = [list(x) + [Path(x[0]).with_suffix('.npy'), None] for x in self.samples] # file, index, npy, im
def __getitem__(self, i):
f, j, fn, im = self.samples[i] # filename, index, filename.with_suffix('.npy'), image
if self.album_transforms:
if self.cache_ram and im is None:
im = self.samples[i][3] = cv2.imread(f)
elif self.cache_disk:
if not fn.exists(): # load npy
np.save(fn.as_posix(), cv2.imread(f))
im = np.load(fn)
else: # read image
im = cv2.imread(f) # BGR
sample = self.album_transforms(image=cv2.cvtColor(im, cv2.COLOR_BGR2RGB))["image"]
else:
sample = self.torch_transforms(self.loader(f))
return sample, j
def create_classification_dataloader(path,
imgsz=224,
batch_size=16,
augment=True,
cache=False,
rank=-1,
workers=8,
shuffle=True):
# Returns Dataloader object to be used with YOLOv5 Classifier
with torch_distributed_zero_first(rank): # init dataset *.cache only once if DDP
dataset = ClassificationDataset(root=path, imgsz=imgsz, augment=augment, cache=cache)
batch_size = min(batch_size, len(dataset))
nd = torch.cuda.device_count()
nw = min([os.cpu_count() // max(nd, 1), batch_size if batch_size > 1 else 0, workers])
sampler = None if rank == -1 else distributed.DistributedSampler(dataset, shuffle=shuffle)
generator = torch.Generator()
generator.manual_seed(0)
return InfiniteDataLoader(dataset,
batch_size=batch_size,
shuffle=shuffle and sampler is None,
num_workers=nw,
sampler=sampler,
pin_memory=True,
worker_init_fn=seed_worker,
generator=generator) # or DataLoader(persistent_workers=True)
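A usage sketch for the new dataloader (the MNIST path is hypothetical and assumes the dataset was auto-downloaded to `../datasets/mnist`):

```python
from utils.dataloaders import create_classification_dataloader

loader = create_classification_dataloader(path='../datasets/mnist/train', imgsz=224, batch_size=64, augment=True)
images, labels = next(iter(loader))
print(images.shape, labels.shape)  # torch.Size([64, 3, 224, 224]) torch.Size([64])
```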
......@@ -217,7 +217,11 @@ def print_args(args: Optional[dict] = None, show_file=True, show_fcn=False):
if args is None: # get args automatically
args, _, _, frm = inspect.getargvalues(x)
args = {k: v for k, v in frm.items() if k in args}
s = (f'{Path(file).stem}: ' if show_file else '') + (f'{fcn}: ' if show_fcn else '')
try:
file = Path(file).resolve().relative_to(ROOT).with_suffix('')
except ValueError:
file = Path(file).stem
s = (f'{file}: ' if show_file else '') + (f'{fcn}: ' if show_fcn else '')
LOGGER.info(colorstr(s) + ', '.join(f'{k}={v}' for k, v in args.items()))
......@@ -345,7 +349,7 @@ def check_version(current='0.0.0', minimum='0.0.0', name='version ', pinned=Fals
@try_except
def check_requirements(requirements=ROOT / 'requirements.txt', exclude=(), install=True, cmds=()):
# Check installed dependencies meet requirements (pass *.txt file or list of packages)
# Check installed dependencies meet YOLOv5 requirements (pass *.txt file or list of packages)
prefix = colorstr('red', 'bold', 'requirements:')
check_python() # check python version
if isinstance(requirements, (str, Path)): # requirements.txt file
......@@ -549,6 +553,18 @@ def check_amp(model):
return False
def yaml_load(file='data.yaml'):
# Single-line safe yaml loading
with open(file, errors='ignore') as f:
return yaml.safe_load(f)
def yaml_save(file='data.yaml', data={}):
# Single-line safe yaml saving
with open(file, 'w') as f:
yaml.safe_dump({k: str(v) if isinstance(v, Path) else v for k, v in data.items()}, f, sort_keys=False)
def url2file(url):
# Convert URL to filename, i.e. https://url.com/file.txt?auth -> file.txt
url = str(Path(url)).replace(':/', '://') # Pathlib turns :// -> :/
......
......@@ -5,6 +5,7 @@ Logging utils
import os
import warnings
from pathlib import Path
import pkg_resources as pkg
import torch
@@ -76,7 +77,7 @@ class Loggers():
             self.logger.info(s)
         if not clearml:
             prefix = colorstr('ClearML: ')
-            s = f"{prefix}run 'pip install clearml' to automatically track, visualize and remotely train YOLOv5 🚀 runs in ClearML"
+            s = f"{prefix}run 'pip install clearml' to automatically track, visualize and remotely train YOLOv5 🚀 in ClearML"
             self.logger.info(s)

         # TensorBoard
@@ -121,11 +122,8 @@ class Loggers():
         # Callback runs on train batch end
         # ni: number integrated batches (since train start)
         if plots:
-            if ni == 0:
-                if self.tb and not self.opt.sync_bn:  # --sync known issue https://github.com/ultralytics/yolov5/issues/3754
-                    with warnings.catch_warnings():
-                        warnings.simplefilter('ignore')  # suppress jit trace warning
-                        self.tb.add_graph(torch.jit.trace(de_parallel(model), imgs[0:1], strict=False), [])
+            if ni == 0 and not self.opt.sync_bn and self.tb:
+                log_tensorboard_graph(self.tb, model, imgsz=list(imgs.shape[2:4]))
             if ni < 3:
                 f = self.save_dir / f'train_batch{ni}.jpg'  # filename
                 plot_images(imgs, targets, paths, f)
@@ -233,3 +231,78 @@ class Loggers():
         # params: A dict containing {param: value} pairs
         if self.wandb:
             self.wandb.wandb_run.config.update(params, allow_val_change=True)
class GenericLogger:
    """
    YOLOv5 general-purpose logger for non-task-specific logging
    Usage: from utils.loggers import GenericLogger; logger = GenericLogger(...)
    Arguments:
        opt: Run arguments
        console_logger: Console logger
        include: loggers to include
    """

    def __init__(self, opt, console_logger, include=('tb', 'wandb')):
        # init default loggers
        self.save_dir = opt.save_dir
        self.include = include
        self.console_logger = console_logger
        self.tb = None  # default so log_metrics()/log_graph() are safe when 'tb' is excluded
        if 'tb' in self.include:
            prefix = colorstr('TensorBoard: ')
            self.console_logger.info(
                f"{prefix}Start with 'tensorboard --logdir {self.save_dir.parent}', view at http://localhost:6006/")
            self.tb = SummaryWriter(str(self.save_dir))

        if wandb and 'wandb' in self.include:
            self.wandb = wandb.init(project="YOLOv5-Classifier" if opt.project == "runs/train" else opt.project,
                                    name=None if opt.name == "exp" else opt.name,
                                    config=opt)
        else:
            self.wandb = None
    def log_metrics(self, metrics_dict, epoch):
        # Log metrics dictionary to all loggers
        if self.tb:
            for k, v in metrics_dict.items():
                self.tb.add_scalar(k, v, epoch)
        if self.wandb:
            self.wandb.log(metrics_dict, step=epoch)

    def log_images(self, files, name='Images', epoch=0):
        # Log images to all loggers
        files = [Path(f) for f in (files if isinstance(files, (tuple, list)) else [files])]  # to Path
        files = [f for f in files if f.exists()]  # filter by exists
        if self.tb:
            for f in files:
                self.tb.add_image(f.stem, cv2.imread(str(f))[..., ::-1], epoch, dataformats='HWC')
        if self.wandb:
            self.wandb.log({name: [wandb.Image(str(f), caption=f.name) for f in files]}, step=epoch)

    def log_graph(self, model, imgsz=(640, 640)):
        # Log model graph to all loggers
        if self.tb:
            log_tensorboard_graph(self.tb, model, imgsz)

    def log_model(self, model_path, epoch=0, metadata={}):
        # Log model to all loggers
        if self.wandb:
            art = wandb.Artifact(name=f"run_{wandb.run.id}_model", type="model", metadata=metadata)
            art.add_file(str(model_path))
            wandb.log_artifact(art)
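A short usage sketch of the class above (the SimpleNamespace stand-in for the argparse namespace and all paths are hypothetical):

# Hypothetical GenericLogger usage
from types import SimpleNamespace

opt = SimpleNamespace(save_dir=Path('runs/train-cls/exp'), project='runs/train', name='exp')
logger = GenericLogger(opt, console_logger=LOGGER, include=('tb',))  # TensorBoard only
logger.log_metrics({'train/loss': 0.42, 'metrics/accuracy_top1': 0.91}, epoch=0)
logger.log_model(Path('runs/train-cls/exp/weights/best.pt'), epoch=0, metadata={'epochs': 10})  # no-op without wandb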
def log_tensorboard_graph(tb, model, imgsz=(640, 640)):
    # Log model graph to TensorBoard
    try:
        p = next(model.parameters())  # for device, type
        imgsz = (imgsz, imgsz) if isinstance(imgsz, int) else imgsz  # expand
        im = torch.zeros((1, 3, *imgsz)).to(p.device).type_as(p)  # input image
        with warnings.catch_warnings():
            warnings.simplefilter('ignore')  # suppress jit trace warning
            tb.add_graph(torch.jit.trace(de_parallel(model), im, strict=False), [])
    except Exception:
        print('WARNING: TensorBoard graph visualization failure')
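A standalone invocation might look like this (the writer path is hypothetical, and `model` is assumed to be any nn.Module):

# Hypothetical standalone call
from torch.utils.tensorboard import SummaryWriter

tb = SummaryWriter('runs/exp')
log_tensorboard_graph(tb, model, imgsz=640)  # an int is expanded to (640, 640)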
@@ -388,6 +388,35 @@ def plot_labels(labels, names=(), save_dir=Path('')):
    plt.close()
def imshow_cls(im, labels=None, pred=None, names=None, nmax=25, verbose=False, f=Path('images.jpg')):
    # Show classification image grid with labels (optional) and predictions (optional)
    from utils.augmentations import denormalize

    names = names or [f'class{i}' for i in range(1000)]
    blocks = torch.chunk(denormalize(im.clone()).cpu().float(), len(im), dim=0)  # split batch into single-image blocks
    n = min(len(blocks), nmax)  # number of plots
    m = min(8, round(n ** 0.5))  # 8 x 8 default
    fig, ax = plt.subplots(math.ceil(n / m), m)  # 8 rows x n/8 cols
    ax = ax.ravel() if m > 1 else [ax]
    # plt.subplots_adjust(wspace=0.05, hspace=0.05)
    for i in range(n):
        ax[i].imshow(blocks[i].squeeze().permute((1, 2, 0)).numpy().clip(0.0, 1.0))
        ax[i].axis('off')
        if labels is not None:
            s = names[labels[i]] + (f'—{names[pred[i]]}' if pred is not None else '')
            ax[i].set_title(s, fontsize=8, verticalalignment='top')
    plt.savefig(f, dpi=300, bbox_inches='tight')
    plt.close()
    if verbose:
        LOGGER.info(f"Saving {f}")
        if labels is not None:
            LOGGER.info('True:     ' + ' '.join(f'{names[i]:3s}' for i in labels[:nmax]))
        if pred is not None:
            LOGGER.info('Predicted:' + ' '.join(f'{names[i]:3s}' for i in pred[:nmax]))
    return f
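A hypothetical call tying this to the classification dataloader sketched earlier (`loader` and `names` are illustrative variables, not from the PR):

# Hypothetical: visualize the first batch with its true labels
images, labels = next(iter(loader))
imshow_cls(images[:25], labels=labels[:25], names=names, f=Path('val_batch0.jpg'), verbose=True)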
def plot_evolve(evolve_csv='path/to/evolve.csv'):  # from utils.plots import *; plot_evolve()
    # Plot evolve.csv hyp evolution results
    evolve_csv = Path(evolve_csv)
......
@@ -42,6 +42,16 @@ def smart_inference_mode(torch_1_9=check_version(torch.__version__, '1.9.0')):
    return decorate


def smartCrossEntropyLoss(label_smoothing=0.0):
    # Returns nn.CrossEntropyLoss with label smoothing enabled for torch>=1.10.0
    if check_version(torch.__version__, '1.10.0'):
        return nn.CrossEntropyLoss(label_smoothing=label_smoothing)  # loss function
    if label_smoothing > 0:
        LOGGER.warning(f'WARNING: label smoothing {label_smoothing} requires torch>=1.10.0')
    return nn.CrossEntropyLoss()  # loss function
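Behavior sketch (values hypothetical):

# On torch>=1.10 this returns a smoothed loss; older torch falls back to plain CE with a warning
criterion = smartCrossEntropyLoss(label_smoothing=0.1)
loss = criterion(torch.randn(4, 10), torch.tensor([1, 0, 3, 9]))  # logits [N, C], integer class targets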
def smart_DDP(model):
    # Model DDP creation with checks
    assert not check_version(torch.__version__, '1.12.0', pinned=True), \
@@ -53,6 +63,28 @@ def smart_DDP(model):
    return DDP(model, device_ids=[LOCAL_RANK], output_device=LOCAL_RANK)
def reshape_classifier_output(model, n=1000):
    # Update a TorchVision classification model to class count 'n' if required
    from models.common import Classify
    name, m = list((model.model if hasattr(model, 'model') else model).named_children())[-1]  # last module
    if isinstance(m, Classify):  # YOLOv5 Classify() head
        if m.linear.out_features != n:
            m.linear = nn.Linear(m.linear.in_features, n)
    elif isinstance(m, nn.Linear):  # ResNet, EfficientNet
        if m.out_features != n:
            setattr(model, name, nn.Linear(m.in_features, n))
    elif isinstance(m, nn.Sequential):
        types = [type(x) for x in m]
        if nn.Linear in types:
            i = types.index(nn.Linear)  # nn.Linear index
            if m[i].out_features != n:
                m[i] = nn.Linear(m[i].in_features, n)
        elif nn.Conv2d in types:
            i = types.index(nn.Conv2d)  # nn.Conv2d index
            if m[i].out_channels != n:
                m[i] = nn.Conv2d(m[i].in_channels, n, m[i].kernel_size, m[i].stride, bias=m[i].bias is not None)  # bias kwarg expects bool
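For example, adapting a torchvision backbone to a 10-class dataset (model choice hypothetical):

# Hypothetical: resnet18's last child is nn.Linear(512, 1000), replaced in place
import torchvision

model = torchvision.models.resnet18(pretrained=True)
reshape_classifier_output(model, n=10)  # model.fc is now nn.Linear(512, 10)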
@contextmanager
def torch_distributed_zero_first(local_rank: int):
    # Context manager to make all processes in distributed training wait for each local_master to do something
@@ -117,14 +149,13 @@ def time_sync():
 def profile(input, ops, n=10, device=None):
-    # YOLOv5 speed/memory/FLOPs profiler
-    #
-    # Usage:
-    #     input = torch.randn(16, 3, 640, 640)
-    #     m1 = lambda x: x * torch.sigmoid(x)
-    #     m2 = nn.SiLU()
-    #     profile(input, [m1, m2], n=100)  # profile over 100 iterations
+    """ YOLOv5 speed/memory/FLOPs profiler
+    Usage:
+        input = torch.randn(16, 3, 640, 640)
+        m1 = lambda x: x * torch.sigmoid(x)
+        m2 = nn.SiLU()
+        profile(input, [m1, m2], n=100)  # profile over 100 iterations
+    """
     results = []
     if not isinstance(device, torch.device):
         device = select_device(device)
@@ -313,6 +344,18 @@ def smart_optimizer(model, name='Adam', lr=0.001, momentum=0.9, decay=1e-5):
    return optimizer


def smart_hub_load(repo='ultralytics/yolov5', model='yolov5s', **kwargs):
    # YOLOv5 torch.hub.load() wrapper with smart error/issue handling
    if check_version(torch.__version__, '1.9.1'):
        kwargs['skip_validation'] = True  # validation causes GitHub API rate limit errors
    if check_version(torch.__version__, '1.12.0'):
        kwargs['trust_repo'] = True  # argument required starting in torch 1.12
    try:
        return torch.hub.load(repo, model, **kwargs)
    except Exception:
        return torch.hub.load(repo, model, force_reload=True, **kwargs)  # retry with a fresh hub cache
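A minimal call through the wrapper (kwargs are forwarded to the hub entrypoint; 'pretrained' is assumed to be accepted by the yolov5s entrypoint):

# Hypothetical: load pretrained YOLOv5s, retrying with force_reload on a corrupt cache
model = smart_hub_load(repo='ultralytics/yolov5', model='yolov5s', pretrained=True)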
def smart_resume(ckpt, optimizer, ema=None, weights='yolov5s.pt', epochs=300, resume=True):
    # Resume training from a partially trained checkpoint
    best_fitness = 0.0
......