Unverified commit f2d97ebb, authored by Glenn Jocher, committed by GitHub

Remove DDP MultiHeadAttention fix (#3768)

Parent 37495731
@@ -252,9 +252,7 @@ def train(hyp,  # path/to/hyp.yaml or hyp dictionary
     # DDP mode
     if cuda and RANK != -1:
-        model = DDP(model, device_ids=[LOCAL_RANK], output_device=LOCAL_RANK,
-                    # nn.MultiheadAttention incompatibility with DDP https://github.com/pytorch/pytorch/issues/26698
-                    find_unused_parameters=any(isinstance(layer, nn.MultiheadAttention) for layer in model.modules()))
+        model = DDP(model, device_ids=[LOCAL_RANK], output_device=LOCAL_RANK)
     # Model parameters
     hyp['box'] *= 3. / nl  # scale to layers
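
For context, the removed argument turned on `find_unused_parameters` only when the model contained an `nn.MultiheadAttention` layer. The following is a minimal sketch of that check in isolation, assuming PyTorch is installed. The helper name `needs_unused_param_search` and the toy models are hypothetical and not part of the repository; no DDP process group is created here.

```python
import torch.nn as nn


def needs_unused_param_search(model: nn.Module) -> bool:
    """Return True if any submodule is an nn.MultiheadAttention layer.

    Hypothetical helper: it wraps the same expression that this commit
    removes from the DDP(...) call.
    """
    return any(isinstance(layer, nn.MultiheadAttention) for layer in model.modules())


class ToyAttention(nn.Module):
    """Hypothetical block that holds a single attention layer."""

    def __init__(self):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=8, num_heads=2)


# Two illustrative models: one without attention, one with it nested inside.
conv_only = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU())
with_attention = nn.Sequential(nn.Linear(8, 8), ToyAttention())

print(needs_unused_param_search(conv_only))       # False
print(needs_unused_param_search(with_attention))  # True
```

Because `model.modules()` walks the module tree recursively, the check also finds attention layers nested inside custom blocks. After this commit the DDP wrapper no longer runs it and always uses the default `find_unused_parameters=False`.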