    [WIP] Feature/ddp fixed (#401) · 4102fcc9
    yzchen committed
    * Squashed commit of the following:
    
    commit d738487089e41c22b3b1cd73aa7c1c40320a6ebf
    Author: NanoCode012 <kevinvong@rocketmail.com>
    Date:   Tue Jul 14 17:33:38 2020 +0700
    
        Adding world_size
    
        Reduce calls to torch.distributed. For use in create_dataloader.
    
    commit e742dd9619d29306c7541821238d3d7cddcdc508
    Author: yizhi.chen <chenyzsjtu@outlook.com>
    Date:   Tue Jul 14 15:38:48 2020 +0800
    
        Make SyncBN a choice
    
    commit e90d4004387e6103fecad745f8cbc2edc918e906
    Merge: 5bf8beb cd90360
    Author: yzchen <Chenyzsjtu@gmail.com>
    Date:   Tue Jul 14 15:32:10 2020 +0800
    
        Merge pull request #6 from NanoCode012/patch-5
    
        Update train.py
    
    commit cd9036017e7f8bd519a8b62adab0f47ea67f4962
    Author: NanoCode012 <kevinvong@rocketmail.com>
    Date:   Tue Jul 14 13:39:29 2020 +0700
    
        Update train.py
    
        Remove redundant `opt.` prefix.
    
    commit 5bf8bebe8873afb18b762fe1f409aca116fac073
    Merge: c9558a9 a1c8406a
    Author: yizhi.chen <chenyzsjtu@outlook.com>
    Date:   Tue Jul 14 14:09:51 2020 +0800
    
        Merge branch 'master' of https://github.com/ultralytics/yolov5 into feature/DDP_fixed
    
    commit c9558a9b51547febb03d9c1ca42e2ef0fc15bb31
    Author: yizhi.chen <chenyzsjtu@outlook.com>
    Date:   Tue Jul 14 13:51:34 2020 +0800
    
        Add device allocation for loss compute
    
    commit 4f08c692fb5e943a89e0ee354ef6c80a50eeb28d
    Author: yizhi.chen <chenyzsjtu@outlook.com>
    Date:   Thu Jul 9 11:16:27 2020 +0800
    
        Revert drop_last
    
    commit 1dabe33a5a223b758cc761fc8741c6224205a34b
    Merge: a1ce9b1 4b8450b
    Author: yizhi.chen <chenyzsjtu@outlook.com>
    Date:   Thu Jul 9 11:15:49 2020 +0800
    
        Merge branch 'feature/DDP_fixed' of https://github.com/MagicFrogSJTU/yolov5 into feature/DDP_fixed
    
    commit a1ce9b1e96b71d7fcb9d3e8143013eb8cebe5e27
    Author: yizhi.chen <chenyzsjtu@outlook.com>
    Date:   Thu Jul 9 11:15:21 2020 +0800
    
        fix lr warning
    
    commit 4b8450b46db76e5e58cd95df965d4736077cfb0e
    Merge: b9a50ae 02c63ef
    Author: yzchen <Chenyzsjtu@gmail.com>
    Date:   Wed Jul 8 21:24:24 2020 +0800
    
        Merge pull request #4 from NanoCode012/patch-4
    
        Add drop_last for multi gpu
    
    commit 02c63ef81cf98b28b10344fe2cce08a03b143941
    Author: NanoCode012 <kevinvong@rocketmail.com>
    Date:   Wed Jul 8 10:08:30 2020 +0700
    
        Add drop_last for multi gpu
    
    commit b9a50aed48ab1536f94d49269977e2accd67748f
    Merge: ec2dc6c 121d90b3
    Author: yizhi.chen <chenyzsjtu@outlook.com>
    Date:   Tue Jul 7 19:48:04 2020 +0800
    
        Merge branch 'master' of https://github.com/ultralytics/yolov5 into feature/DDP_fixed
    
    commit ec2dc6cc56de43ddff939e14c450672d0fbf9b3d
    Merge: d0326e3 82a6182
    Author: yizhi.chen <chenyzsjtu@outlook.com>
    Date:   Tue Jul 7 19:34:31 2020 +0800
    
        Merge branch 'feature/DDP_fixed' of https://github.com/MagicFrogSJTU/yolov5 into feature/DDP_fixed
    
    commit d0326e398dfeeeac611ccc64198d4fe91b7aa969
    Author: yizhi.chen <chenyzsjtu@outlook.com>
    Date:   Tue Jul 7 19:31:24 2020 +0800
    
        Add SyncBN
    
    commit 82a6182b3ad0689a4432b631b438004e5acb3b74
    Merge: 96fa40a 050b2a5
    Author: yzchen <Chenyzsjtu@gmail.com>
    Date:   Tue Jul 7 19:21:01 2020 +0800
    
        Merge pull request #1 from NanoCode012/patch-2
    
        Convert BatchNorm to SyncBatchNorm
    
    commit 050b2a5a79a89c9405854d439a1f70f892139b1c
    Author: NanoCode012 <kevinvong@rocketmail.com>
    Date:   Tue Jul 7 12:38:14 2020 +0700
    
        Add cleanup for process_group
    
    commit 2aa330139f3cc1237aeb3132245ed7e5d6da1683
    Author: NanoCode012 <kevinvong@rocketmail.com>
    Date:   Tue Jul 7 12:07:40 2020 +0700
    
        Remove apex.parallel. Use torch.nn.parallel
    
        For future compatibility
    
    commit 77c8e27e603bea9a69e7647587ca8d509dc1990d
    Author: NanoCode012 <kevinvong@rocketmail.com>
    Date:   Tue Jul 7 01:54:39 2020 +0700
    
        Convert BatchNorm to SyncBatchNorm
    
    commit 96fa40a3a925e4ffd815fe329e1b5181ec92adc8
    Author: yizhi.chen <chenyzsjtu@outlook.com>
    Date:   Mon Jul 6 21:53:56 2020 +0800
    
        Fix the dataset inconsistency problem
    
    commit 16e7c269d062c8d16c4d4ff70cc80fd87935dc95
    Author: yizhi.chen <chenyzsjtu@outlook.com>
    Date:   Mon Jul 6 11:34:03 2020 +0800
    
        Add loss multiplication to preserve the single-process performance
    
    commit e83805563065ffd2e38f85abe008fc662cc17909
    Merge: 625bb49 3bdea3f6
    Author: yizhi.chen <chenyzsjtu@outlook.com>
    Date:   Fri Jul 3 20:56:30 2020 +0800
    
        Merge branch 'master' of https://github.com/ultralytics/yolov5 into feature/DDP_fixed
    
    commit 625bb49f4e52d781143fea0af36d14e5be8b040c
    Author: yizhi.chen <chenyzsjtu@outlook.com>
    Date:   Thu Jul 2 22:45:15 2020 +0800
    
        DDP established
    
    * Squashed commit of the following:
    
    commit 94147314e559a6bdd13cb9de62490d385c27596f
    Merge: 65157e2 37acbdc0
    Author: yizhi.chen <chenyzsjtu@outlook.com>
    Date:   Thu Jul 16 14:00:17 2020 +0800
    
        Merge branch 'master' of https://github.com/ultralytics/yolov5 into feature/DDP_fixed
    
    commit 37acbdc0
    Author: Glenn Jocher <glenn.jocher@ultralytics.com>
    Date:   Wed Jul 15 20:03:41 2020 -0700
    
        update test.py --save-txt
    
    commit b8c2da4a
    Author: Glenn Jocher <glenn.jocher@ultralytics.com>
    Date:   Wed Jul 15 20:00:48 2020 -0700
    
        update test.py --save-txt
    
    commit 65157e2fc97d371bc576e18b424e130eb3026917
    Author: yizhi.chen <chenyzsjtu@outlook.com>
    Date:   Wed Jul 15 16:44:13 2020 +0800
    
        Revert the README.md removal
    
    commit 1c802bfa503623661d8617ca3f259835d27c5345
    Merge: cd55b44 0f3b8bb
    Author: yizhi.chen <chenyzsjtu@outlook.com>
    Date:   Wed Jul 15 16:43:38 2020 +0800
    
        Merge branch 'feature/DDP_fixed' of https://github.com/MagicFrogSJTU/yolov5 into feature/DDP_fixed
    
    commit cd55b445c4dcd8003ff4b0b46b64adf7c16e5ce7
    Author: yizhi.chen <chenyzsjtu@outlook.com>
    Date:   Wed Jul 15 16:42:33 2020 +0800
    
        fix the DDP performance deterioration bug.
    
    commit 0f3b8bb1fae5885474ba861bbbd1924fb622ee93
    Author: Glenn Jocher <glenn.jocher@ultralytics.com>
    Date:   Wed Jul 15 00:28:53 2020 -0700
    
        Delete README.md
    
    commit f5921ba1e35475f24b062456a890238cb7a3cf94
    Merge: 85ab2f3 bd3fdbb
    Author: yizhi.chen <chenyzsjtu@outlook.com>
    Date:   Wed Jul 15 11:20:17 2020 +0800
    
        Merge branch 'feature/DDP_fixed' of https://github.com/MagicFrogSJTU/yolov5 into feature/DDP_fixed
    
    commit bd3fdbbf1b08ef87931eef49fa8340621caa7e87
    Author: Glenn Jocher <glenn.jocher@ultralytics.com>
    Date:   Tue Jul 14 18:38:20 2020 -0700
    
        Update README.md
    
    commit c1a97a7767ccb2aa9afc7a5e72fd159e7c62ec02
    Merge: 2bf86b8 f796708b
    Author: Glenn Jocher <glenn.jocher@ultralytics.com>
    Date:   Tue Jul 14 18:36:53 2020 -0700
    
        Merge branch 'master' into feature/DDP_fixed
    
    commit 2bf86b892fa2fd712f6530903a0d9b8533d7447a
    Author: NanoCode012 <kevinvong@rocketmail.com>
    Date:   Tue Jul 14 22:18:15 2020 +0700
    
        Fixed world_size not found when called from test
    
    commit 85ab2f38cdda28b61ad15a3a5a14c3aafb620dc8
    Merge: 5a19011 c8357ad
    Author: yizhi.chen <chenyzsjtu@outlook.com>
    Date:   Tue Jul 14 22:19:58 2020 +0800
    
        Merge branch 'feature/DDP_fixed' of https://github.com/MagicFrogSJTU/yolov5 into feature/DDP_fixed
    
    commit 5a19011949398d06e744d8d5521ab4e6dfa06ab7
    Author: yizhi.chen <chenyzsjtu@outlook.com>
    Date:   Tue Jul 14 22:19:15 2020 +0800
    
        Add assertion for <=2 gpus DDP
    
    commit c8357ad5b15a0e6aeef4d7fe67ca9637f7322a4d
    Merge: e742dd9 787582f
    Author: yzchen <Chenyzsjtu@gmail.com>
    Date:   Tue Jul 14 22:10:02 2020 +0800
    
        Merge pull request #8 from MagicFrogSJTU/NanoCode012-patch-1
    
        Modify number of dataloaders' workers
    
    commit 787582f97251834f955ef05a77072b8c673a8397
    Author: NanoCode012 <kevinvong@rocketmail.com>
    Date:   Tue Jul 14 20:38:58 2020 +0700
    
        Fixed issue with single gpu not having world_size
    
    commit 63648925288d63a21174a4dd28f92dbfebfeb75a
    Author: NanoCode012 <kevinvong@rocketmail.com>
    Date:   Tue Jul 14 19:16:15 2020 +0700
    
        Add assert message for clarification
    
        Clarify why assertion was thrown to users
    
    commit 69364d6050e048d0d8834e0f30ce84da3f6a13f3
    Author: NanoCode012 <kevinvong@rocketmail.com>
    Date:   Tue Jul 14 17:36:48 2020 +0700
    
        Changed number of workers check
    
    * Fixed destroy_process_group in DP mode
    
    * Update torch_utils.py
    
    * Update utils.py
    
    Revert build_targets() to current master.
    
    * Update datasets.py
    
    * Fixed world_size attribute not found

    Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
    Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
train.py 26.0 KB