The initialization part of Linear:

    class Linear(Module):
        ...
        __constants__ = ['bias']

        def __init__(self, in_features, out_features, bias=True):
            super(Linear, self).__init__()
            self.in_features = in_features
            self.out_features = out_features
            self.weight = Parameter(torch.Tensor(out_features, in_features))
            if bias:
                self.bias = Parameter(torch.Tensor(out_features))
            else:
                self.register_parameter('bias', None)
            self.reset_parameters()
        ...
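The two branches on `bias` in `__init__` can be observed from outside the class. The following is a minimal sketch, assuming a recent PyTorch install; the sizes 4 and 2 and the variable names are arbitrary choices for illustration.

```python
import torch
from torch import nn

# With bias=True (the default), both weight and bias are registered parameters.
with_bias = nn.Linear(4, 2)
names_with_bias = [name for name, _ in with_bias.named_parameters()]

# With bias=False, register_parameter('bias', None) keeps the attribute,
# but it holds None and is skipped when the parameters are listed.
no_bias = nn.Linear(4, 2, bias=False)
names_no_bias = [name for name, _ in no_bias.named_parameters()]

print(names_with_bias)
print(names_no_bias)
print(no_bias.bias is None)

# weight is stored as (out_features, in_features).
print(tuple(no_bias.weight.shape))
```

Registering `bias` as `None` means `self.bias` always exists, so `forward` can pass it along without checking whether the attribute was ever created.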

y = xA^T + b

    @weak_script_method
    def forward(self, input):
        return F.linear(input, self.weight, self.bias)
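To connect `forward` with the formula y = xA^T + b, the sketch below compares three ways of computing the output: calling the module, calling `F.linear` directly, and writing the matrix product by hand. The seed, the sizes, and the variable names are assumptions made for illustration, and the tolerance only absorbs float32 rounding differences.

```python
import torch
import torch.nn.functional as F
from torch import nn

torch.manual_seed(0)  # arbitrary seed, only for repeatability

layer = nn.Linear(100, 50)
x = torch.randn(140, 100)

# What forward() does: delegate to F.linear with the layer's own parameters.
out_module = layer(x)
out_functional = F.linear(x, layer.weight, layer.bias)

# The same computation written out. weight is stored as [50, 100],
# so it is transposed to [100, 50] before the matrix product.
out_manual = x @ layer.weight.t() + layer.bias

print(torch.allclose(out_module, out_functional))
print(torch.allclose(out_functional, out_manual, atol=1e-5))
print(out_manual.shape)
```

Here A in the formula corresponds to `layer.weight`, which is why its shape is (out_features, in_features) rather than (in_features, out_features).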

weight: the learnable weights of the module of shape
:math:`(\text{out\_features}, \text{in\_features})`. The values are
initialized from :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})`, where
:math:`k = \frac{1}{\text{in\_features}}`

bias: the learnable bias of the module of shape :math:`(\text{out\_features})`.
If :attr:`bias` is ``True``, the values are initialized from
:math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})` where
:math:`k = \frac{1}{\text{in\_features}}`

    >>> import torch
    >>> nn1 = torch.nn.Linear(100, 50)
    >>> input1 = torch.randn(140, 100)
    >>> output1 = nn1(input1)
    >>> output1.size()
    torch.Size([140, 50])

[140, 100] × [100, 50] = [140, 50]
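The initialization range quoted in the docstring, U(-sqrt(k), sqrt(k)) with k = 1/in_features, can be checked empirically on a freshly constructed layer. This is a rough sketch rather than a proof: it only confirms that the sampled values fall inside the stated interval, and the `eps` tolerance is an assumption added to absorb float32 rounding at the interval edges.

```python
import math
import torch
from torch import nn

in_features, out_features = 100, 50
layer = nn.Linear(in_features, out_features)

# Bound from the docstring: sqrt(k) with k = 1 / in_features,
# which is 0.1 for in_features = 100.
bound = math.sqrt(1.0 / in_features)

# Small tolerance for float32 rounding at the edges of the interval.
eps = 1e-6
weight_in_range = bool((layer.weight.abs() <= bound + eps).all())
bias_in_range = bool((layer.bias.abs() <= bound + eps).all())

print(bound, weight_in_range, bias_in_range)
```

Note that the bound depends only on `in_features`, so the bias uses the same interval as the weight even though its shape is (out_features).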