    Fix warmup `accumulate` (#3722) · 3974d725
    Committed by yellowdolphin
    * gradient accumulation during warmup in train.py
    
    Context:
    `accumulate` is the number of batches/gradients accumulated before calling the next optimizer.step().
    During warmup, it is ramped up from 1 to the final value nbs / batch_size.
    Although I have not seen this in other libraries, I like the idea: during warmup, while gradients are large, overly large steps are more of an issue than the gradient noise caused by small steps.
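    The ramp described above can be sketched in pure Python (a minimal sketch mirroring an np.interp-based linear ramp; the helper name `warmup_accumulate` and the defaults nbs=64, batch_size=16 are illustrative assumptions, not train.py code):

    ```python
    def warmup_accumulate(ni, nw, nbs=64, batch_size=16):
        """Ramp `accumulate` linearly from 1 to nbs / batch_size over warmup.

        ni: integrated batch counter since training start
        nw: number of warmup iterations
        Equivalent to round(np.interp(ni, [0, nw], [1, nbs / batch_size])).
        """
        if ni >= nw:  # warmup finished: use the final value
            return max(round(nbs / batch_size), 1)
        # linear interpolation between 1 (at ni=0) and nbs/batch_size (at ni=nw)
        return max(1, round(1 + (nbs / batch_size - 1) * ni / nw))
    ```

    With the defaults, `accumulate` grows from 1 at the first batch to 64 / 16 = 4 by the end of warmup.
    
    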
    
    The bug:
    The condition that triggers the optimizer step is wrong:
    > if ni % accumulate == 0:
    This produces irregular intervals between optimizer steps whenever `accumulate` is not constant. It becomes relevant when batch_size is small and `accumulate` changes many times during warmup.
    
    This demo also shows the proposed solution: use a ">=" condition instead:
    https://colab.research.google.com/drive/1MA2z2eCXYB_BC5UZqgXueqL_y1Tz_XVq?usp=sharing
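    Independent of the Colab demo, the effect can be reproduced with a small simulation (a sketch: the schedule `1 + ni // 10` is an arbitrary stand-in for the warmup ramp, and `simulate_steps` is a hypothetical helper, not train.py code):

    ```python
    def simulate_steps(n_iters, acc_fn, use_fix):
        """Return the iteration indices at which optimizer.step() would fire."""
        steps, last_opt_step = [], -1
        for ni in range(n_iters):
            accumulate = acc_fn(ni)
            if use_fix:
                # proposed fix: step once at least `accumulate` batches have passed
                if ni - last_opt_step >= accumulate:
                    steps.append(ni)
                    last_opt_step = ni
            else:
                # buggy condition: misfires when `accumulate` changes over time
                if ni % accumulate == 0:
                    steps.append(ni)
        return steps

    acc_fn = lambda ni: 1 + ni // 10  # toy ramp: accumulate grows from 1 to 10
    buggy = simulate_steps(100, acc_fn, use_fix=False)
    fixed = simulate_steps(100, acc_fn, use_fix=True)
    # With the ">=" fix, consecutive steps are never further apart than the
    # current `accumulate`; with the modulo condition, some gaps overshoot it
    # (here, a gap of 9 iterations while accumulate is only 6).
    ```

    The key point is that `ni % accumulate == 0` can skip every multiple of the new `accumulate` for a long stretch after the value changes, so some gradients are accumulated far longer than intended.
    
    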
    
    Further, I propose not restricting the number of warmup iterations to >= 1000. If the user changes hyp['warmup_epochs'], this restriction causes unexpected behavior. It also makes evolution unstable if this parameter were to be optimized.
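    To make the surprise concrete, the clamp behaves roughly like this (a sketch; `warmup_iters` is a hypothetical helper written for illustration, not the train.py code):

    ```python
    def warmup_iters(warmup_epochs, nb, min_iters=1000):
        """Number of warmup iterations for warmup_epochs epochs of nb batches each.

        The min_iters clamp means that even hyp['warmup_epochs'] = 0 still
        yields min_iters warmup iterations -- the unexpected behavior above.
        """
        return max(round(warmup_epochs * nb), min_iters)
    ```

    For example, with 500 batches per epoch, setting warmup_epochs to 0 still produces 1000 warmup iterations, i.e. two full epochs of warmup the user did not ask for.
    
    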
    
    * replace last_opt_step tracking by do_step(ni)
    
    * add docstrings
    
    * move down nw
    
    * Update train.py
    
    * revert math import move
    Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>