PyTorch: how to set .requires_grad to False
Problem Description
I want to freeze part of my model. Following the official docs:
import torch
import torch.nn as nn

with torch.no_grad():
    linear = nn.Linear(1, 1)
    linear.eval()
    print(linear.weight.requires_grad)
But it prints True instead of False. If I want to set the model to eval mode, what should I do?
Recommended Answer
requires_grad=False

If you want to freeze part of your model and train the rest, you can set requires_grad of the parameters you want to freeze to False.
For example, if you only want to keep the convolutional part of VGG16 fixed:
import torchvision

model = torchvision.models.vgg16(pretrained=True)
for param in model.features.parameters():
    param.requires_grad = False
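A common follow-up, not part of the original answer, is to give only the still-trainable parameters to the optimizer so the frozen weights are never updated. A minimal sketch, assuming you want to train the classifier head of that same VGG16:

import torch
import torchvision

model = torchvision.models.vgg16(pretrained=True)
for param in model.features.parameters():
    param.requires_grad = False  # freeze the convolutional backbone

# Only parameters that still require gradients are passed to the optimizer,
# so optimizer.step() never touches the frozen backbone.
trainable_params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable_params, lr=1e-3)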
By switching the requires_grad flag to False, no intermediate buffers will be saved until the computation reaches a point where one of the inputs of the operation requires the gradient.
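To make the difference from torch.no_grad concrete, here is a small sketch of my own (not from the original answer): when only the parameters of a middle layer are frozen, the gradient still flows through that layer back to the earlier, trainable layers.

import torch
import torch.nn as nn

lin0 = nn.Linear(2, 2)
lin1 = nn.Linear(2, 2)
for param in lin1.parameters():
    param.requires_grad = False  # freeze only lin1's weights

x = torch.randn(2, 2)
out = lin1(lin0(x)).sum()
out.backward()

# lin1 itself gets no gradient, but the gradient still reaches lin0,
# unlike the no_grad example shown below.
print(lin0.weight.grad is None)  # False
print(lin1.weight.grad is None)  # True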
Using the context manager torch.no_grad is a different way to achieve that goal: in the no_grad context, all the results of the computations will have requires_grad=False, even if the inputs have requires_grad=True. Notice that you won't be able to backpropagate the gradient to layers before the no_grad block. For example:
import torch
import torch.nn as nn

x = torch.randn(2, 2)
x.requires_grad = True

lin0 = nn.Linear(2, 2)
lin1 = nn.Linear(2, 2)
lin2 = nn.Linear(2, 2)

x1 = lin0(x)
with torch.no_grad():
    x2 = lin1(x1)
x3 = lin2(x2)

x3.sum().backward()
print(lin0.weight.grad, lin1.weight.grad, lin2.weight.grad)
Output:
(None, None, tensor([[-1.4481, -1.1789],
[-1.4481, -1.1789]]))
Here lin1.weight.requires_grad was True, but the gradient wasn't computed because the operation was done in the no_grad context.
If your goal is not to finetune, but to set your model in inference mode, the most convenient way is to use the torch.no_grad context manager. In this case you also have to set your model to evaluation mode; this is achieved by calling eval() on the nn.Module, for example:
import torchvision

model = torchvision.models.vgg16(pretrained=True)
model.eval()
This operation sets the attribute self.training of the layers to False; in practice this will change the behavior of operations like Dropout or BatchNorm that must behave differently at training and test time.
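Putting both pieces together, here is a minimal inference sketch of my own, using a hypothetical small model only for illustration: eval() switches layers such as Dropout to their test-time behavior, while torch.no_grad() avoids building the autograd graph.

import torch
import torch.nn as nn

# Hypothetical small model used only for illustration.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5), nn.Linear(4, 2))

model.eval()  # Dropout now passes activations through unchanged

with torch.no_grad():  # no intermediate buffers, no gradient tracking
    x = torch.randn(1, 4)
    out = model(x)

print(out.requires_grad)  # False
print(model.training)     # False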