Reproducibility and performance in PyTorch
Question
The documentation states:
Deterministic mode can have a performance impact, depending on your model.
My question is what "performance" means here: processing speed or model quality (i.e. minimal loss)? In other words, when I set manual seeds and make the model behave deterministically, does training take longer to reach the minimal loss, or is the minimal loss itself worse than when the model is non-deterministic?
For completeness' sake, I manually make the model deterministic by setting all of these properties:
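import os
import random

import numpy as np
import torch


def set_seed(seed):
    torch.manual_seed(seed)            # PyTorch RNG
    torch.cuda.manual_seed_all(seed)   # RNGs on all CUDA devices
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    np.random.seed(seed)               # NumPy global RNG
    random.seed(seed)                  # Python's built-in RNG
    # Only affects Python processes started after this point; the running
    # interpreter's hash seed was fixed at startup.
    os.environ['PYTHONHASHSEED'] = str(seed)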
Recommended answer
Performance here refers to run time. cuDNN has several implementations of its operations; when cudnn.deterministic is set to True, you are telling cuDNN that you only want the deterministic implementations (or what we believe them to be). In a nutshell, with this setting you should expect the same results on the CPU or the GPU on the same system when feeding the same inputs. Why would it affect performance? cuDNN uses heuristics to choose an implementation, so how it behaves actually depends on your model. Restricting it to deterministic implementations may increase the run time, because a faster non-deterministic implementation might otherwise have been chosen at the same point in the run.
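The run-time effect described above can be measured directly. The following is a minimal sketch, not a rigorous benchmark: the helper name time_conv, the layer shape, the batch size, and the step count are all hypothetical choices made for illustration. It times a small convolution with the cuDNN flags set both ways and falls back to the CPU when no GPU is available, in which case the flags have no effect and the two timings should be roughly equal.

```python
import time

import torch
import torch.nn as nn


def time_conv(deterministic, steps=5):
    """Return mean seconds per forward/backward pass of a small conv layer.

    Hypothetical helper for illustration; sizes are arbitrary.
    """
    torch.backends.cudnn.deterministic = deterministic
    torch.backends.cudnn.benchmark = not deterministic
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    torch.manual_seed(0)
    conv = nn.Conv2d(3, 8, kernel_size=3, padding=1).to(device)
    x = torch.randn(4, 3, 32, 32, device=device)

    def step():
        conv.zero_grad()
        conv(x).sum().backward()

    step()  # warm-up pass, so one-time setup cost is not timed
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for queued GPU work before timing

    start = time.perf_counter()
    for _ in range(steps):
        step()
    if device.type == "cuda":
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / steps


# Maps the deterministic flag to the measured mean step time in seconds.
timings = {flag: time_conv(flag) for flag in (False, True)}
```

Whether timings[True] is noticeably larger than timings[False] depends on the model, the input shapes, and the GPU, which is exactly the "depending on your model" caveat in the documentation.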
Concerning your snippet, I use exactly the same seeding, and it has worked well (in terms of reproducibility) across 100+ DL experiments.
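One way to check that this kind of seeding is reproducible on your own setup is to train the same tiny model twice and compare the final losses. This is a sketch under stated assumptions: seed_everything is a trimmed-down, hypothetical variant of the set_seed function from the question (it omits the NumPy and PYTHONHASHSEED lines because nothing here uses them), and train_once, the model, and the random data are made up for illustration.

```python
import random

import torch
import torch.nn as nn


def seed_everything(seed):
    """Hypothetical, trimmed-down variant of set_seed from the question."""
    random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # silently ignored without CUDA
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


def train_once(seed, steps=20):
    """Train a tiny linear model on random data and return the final loss."""
    seed_everything(seed)
    model = nn.Linear(10, 1)  # weights are drawn from the seeded RNG
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    x, y = torch.randn(64, 10), torch.randn(64, 1)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()
    return loss.item()


# Same seed on the same system should reproduce the run exactly.
same_seed_matches = train_once(42) == train_once(42)
# A different seed gives different weights and data, hence a different loss.
different_seed_differs = train_once(42) != train_once(7)
```

The comparison uses exact equality on purpose: a deterministic run on the same system should match bit for bit, not just approximately.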