Unverified commit f2d97ebb, authored by Glenn Jocher, committed by GitHub

Remove DDP MultiHeadAttention fix (#3768)

Parent 37495731
@@ -252,9 +252,7 @@ def train(hyp,  # path/to/hyp.yaml or hyp dictionary
     # DDP mode
     if cuda and RANK != -1:
-        model = DDP(model, device_ids=[LOCAL_RANK], output_device=LOCAL_RANK,
-                    # nn.MultiheadAttention incompatibility with DDP https://github.com/pytorch/pytorch/issues/26698
-                    find_unused_parameters=any(isinstance(layer, nn.MultiheadAttention) for layer in model.modules()))
+        model = DDP(model, device_ids=[LOCAL_RANK], output_device=LOCAL_RANK)
     # Model parameters
     hyp['box'] *= 3. / nl  # scale to layers
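
For context, here is a minimal sketch of what the removed `find_unused_parameters` expression evaluated. It was `True` only when the model contained at least one `nn.MultiheadAttention` layer; with the argument dropped, DDP falls back to its default of `find_unused_parameters=False`. The helper name `has_multihead_attention` and the two toy models are hypothetical and are not part of the repository.

```python
import torch.nn as nn


def has_multihead_attention(model: nn.Module) -> bool:
    """Return True if any submodule is an nn.MultiheadAttention layer.

    Same expression as the removed DDP argument; the function name is
    hypothetical and only used for illustration.
    """
    return any(isinstance(layer, nn.MultiheadAttention) for layer in model.modules())


# Toy models (assumptions for illustration, never run forward here)
plain_model = nn.Sequential(nn.Linear(8, 8), nn.ReLU())
attention_model = nn.Sequential(
    nn.Linear(8, 8),
    nn.MultiheadAttention(embed_dim=8, num_heads=2),
)

print(has_multihead_attention(plain_model))
print(has_multihead_attention(attention_model))
```

`model.modules()` walks the module tree recursively, so the check also finds attention layers nested inside other containers.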