Okay, let’s get to coding the Generator class and any related helper classes.

Generator Class

This time around the generator is not going to be a mirror image of the discriminator. Instead there will be a few downsampling blocks to encode the input image, then a number of residual blocks to transform it, and finally a few upsampling blocks to decode it into the output image. That upsampling portion will roughly mirror the discriminator. And, of course, the downsampling blocks will use a convolutional layer while the upsampling blocks will use a transposed convolutional layer.
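
In other words, the data flow through the generator will look something like the following sketch. This is purely illustrative (shapes are in (channels, height, width) order; the 3-channel 256 x 256 images and the channel counts are pulled from the CycleGAN paper, not from any code we have written); the real numbers will get sorted out when we actually code the Generator class.

# encode: strided convolutions downsample, the channel count grows
#   (3, 256, 256) -> (64, 256, 256) -> (128, 128, 128) -> (256, 64, 64)
# transform: a stack of residual blocks, shapes unchanged
#   (256, 64, 64) -> ... -> (256, 64, 64)
# decode: transposed convolutions upsample, the channel count shrinks
#   (256, 64, 64) -> (128, 128, 128) -> (64, 256, 256) -> (3, 256, 256)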

I have not been able to find a post or tutorial that explains the need for that process in language I could understand. So, I am going to take the word of the authors of the tutorials I am using to guide me in coding this CycleGAN.

Well, as I was reviewing this post for publication, I decided to google residual blocks. I only read one item in the search list: Residual Block. But it said “Having skip connections allows the network to more easily learn identity-like mappings.” And there you go: we want the generator to learn just such an identity-like mapping. So that explains a goodly portion of the differences in this generator network/model.

I will let you google skip connections.
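
But in a nutshell (my own summary, not from that article): a block with a skip connection computes

$$y = F(x) + x$$

where \(F(x)\) is the output of the block’s convolutional layers. So for the block to behave like an identity mapping, those layers only have to learn \(F(x) \approx 0\), which is a good deal easier than learning \(F(x) = x\) from scratch.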

Let’s start with a helper class for the convolutional blocks. The residual block and generator classes will both use this helper.

Generator Convolutional Block Helper Class

Because the generator will have both downsampling and upsampling blocks, we will need to account for this in our class. The constructor will include a parameter determining which direction the block should go. Other than that, the code should look familiar.

I am, for this helper class, using a similar name to the block helper class for the discriminator. Just changing the \(D\) to a \(G\).

class GConvBlock(nn.Module):
  def __init__(self, c_in, c_out, k_sz, s_sz=1, ip_sz=0, op_sz=1, bias=True, p_mode="reflect", activation=True, normalize=True, upsample=False):
    super().__init__()

    # upsampling blocks use a transposed convolution, downsampling blocks a plain convolution
    if upsample:
      c_layer = nn.ConvTranspose2d(c_in, c_out, kernel_size=k_sz, stride=s_sz, padding=ip_sz, output_padding=op_sz, bias=bias)
    else:
      c_layer = nn.Conv2d(c_in, c_out, kernel_size=k_sz, stride=s_sz, padding=ip_sz, padding_mode=p_mode, bias=bias)

    # convolution, then optional instance normalization and ReLU activation
    self.layers = nn.Sequential(
      c_layer,
      nn.InstanceNorm2d(c_out) if normalize else nn.Identity(),
      nn.ReLU(inplace=True) if activation else nn.Identity()
    )


  def forward(self, x):
    return self.layers(x)

And a wee test.

if __name__ == "__main__":
  torch.manual_seed(cfg.pt_seed)

  tst_GN = False
  tst_DCB = False
  tst_D = False
  tst_GCB = True
... ...
  if tst_GCB:
    cb_dw = GConvBlock(256, 256, 3, s_sz=2, ip_sz=1)
    print("\ndownsampling block")
    print(cb_dw)
    cb_up = GConvBlock(256, 128, 3, s_sz=2, ip_sz=1, upsample=True)
    print("\nupsampling block")
    print(cb_up)
(mclp-3.12) PS F:\learn\mcl_pytorch\proj6> python models.py

downsampling block
GConvBlock(
  (layers): Sequential(
    (0): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), padding_mode=reflect)
    (1): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (2): ReLU(inplace=True)
  )
)

upsampling block
GConvBlock(
  (layers): Sequential(
    (0): ConvTranspose2d(256, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
    (1): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (2): ReLU(inplace=True)
  )
)
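
Worth a quick note is where that output_padding default of 1 comes in. For a transposed convolutional layer (ignoring dilation), the output size is

$$n_{out} = (n_{in} - 1) \cdot s - 2p + k + op$$

where \(op\) is the output padding and the other symbols are as in the convolutional layer formula we will revisit below. With a kernel of 3, a stride of 2, a padding of 1 and an output padding of 1, a 16 x 16 feature map comes back out at \((16 - 1) \cdot 2 - 2 + 3 + 1 = 32\), exactly undoing a stride 2 downsampling block. A hypothetical extra check (the dummy tensor and prints are mine, not part of the test or output above) would confirm the shapes:

    # push a dummy batch through both blocks to confirm the shapes
    x = torch.rand(1, 256, 32, 32)
    print(f"\ndown: {cb_dw(x).shape}")   # torch.Size([1, 256, 16, 16])
    print(f"up: {cb_up(cb_dw(x)).shape}")   # torch.Size([1, 128, 32, 32])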

Residual Block Helper Class

Okay, let’s move on to the helper class for the residual blocks. Again, the code will likely look somewhat familiar.

A residual block consists of a few standard convolutional layers followed by batch normalization and ReLU activation. The defining feature is the addition of the input, or a downsampled version of it, to the output of the convolutional layers. This “shortcut connection” helps mitigate the vanishing gradient problem in deep networks and allows for the training of very deep networks.

Building a Customized Residual CNN with PyTorch, by Chen-Yu Chang

The articles I have looked at used two convolutional layers in each residual block. Let’s have a look at an input feature map of 16 x 16 with a kernel size of 3, a stride of 1 and a padding of 1. You will recall that for a convolutional layer:

$$n_{out} = \left[\frac{n_{in} + 2p - k}{s}\right] + 1$$

where: \(n_{out}\) is our output matrix dimension size,
\(n_{in}\) is the input matrix dimension size,
\(p\) is the convolution padding size,
\(k\) is the kernel dimension size, and
\(s\) is the stride size.

For the above we get:

$$n_{out} = \left[\frac{16 + 2 \cdot 1 - 3}{1}\right] + 1 = 16$$

So if we apply two convolutional layers with those parameters, the output feature map will be the same size as the input. That allows us to add the output to the original input without needing to mess with the shape of that input in any way.

class ResidualBlock(nn.Module):
  def __init__(self, features, k_sz=3, s_sz=1, p_sz=1, normalize=True):
    super().__init__()

    # two convolutional blocks; the second omits the ReLU because the
    # activation is applied after the skip connection in forward()
    self.layers = nn.Sequential(
      GConvBlock(features, features, k_sz=k_sz, s_sz=s_sz, ip_sz=p_sz, normalize=normalize),
      GConvBlock(features, features, k_sz=k_sz, s_sz=s_sz, ip_sz=p_sz, activation=False, normalize=normalize)
    )


  def forward(self, x):
    # skip connection: add the block's input to its output, then activate
    return F.relu(self.layers(x) + x)

And the usual quick test, which, as you can see below, appears to work just fine.

if __name__ == "__main__":
  torch.manual_seed(cfg.pt_seed)

  tst_GN = False
  tst_DCB = False
  tst_D = False
  tst_GCB = False
  tst_RB = True
... ...
  if tst_RB:
    rb = ResidualBlock(1, k_sz=3, s_sz=1, p_sz=1)
    print(f"\n{rb}")
    # unbatched input: 1 channel, 16 x 16 feature map
    rb_in = torch.rand(1, 16, 16, requires_grad=False)
    rb_out = rb(rb_in)
    print(f"\nrb_in: {rb_in.shape}, rb_out: {rb_out.shape}")
(mclp-3.12) PS F:\learn\mcl_pytorch\proj6> python models.py

ResidualBlock(
  (layers): Sequential(
    (0): GConvBlock(
      (layers): Sequential(
        (0): Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), padding_mode=reflect)
        (1): InstanceNorm2d(1, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
        (2): ReLU(inplace=True)
      )
    )
    (1): GConvBlock(
      (layers): Sequential(
        (0): Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), padding_mode=reflect)
        (1): InstanceNorm2d(1, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
        (2): Identity()
      )
    )
  )
)

rb_in: torch.Size([1, 16, 16]), rb_out: torch.Size([1, 16, 16])

Each convolutional layer in the residual block returned an output the same size as the input. The output of the last convolutional block could then be added to the original input and, after applying an activation function to that sum, returned from the call to the forward method.

Done?

This is starting to become a somewhat lengthy post. So, even though I haven’t gotten around to coding the Generator class itself, I think I will consider this one finished. We will tackle the generator class next time. And perhaps look at getting started on the code for the training loop.

Until then, happy fingers.

Resources