Okay, let’s see how far we can get coding the discriminator and generator classes. I will, at least for now, be putting them all in a single new module, models.py.
There will, for each network/model, be one or more helper classes to assist in coding the hidden network layers.
Let’s start with the discriminator as it is very much the same as we have coded in previous projects.
Discriminator Class
One thing that will be different this time is that we will be adding noise to the input to the discriminator. We could apparently use a gradient penalty or other regularization methods instead. But, hey, something new to try.
This will be done via a custom PyTorch Module, similar to what we did when reshaping tensors in a previous project.
Custom Gaussian Noise Module
Adding Gaussian noise to Discriminator layers in a GAN really helps stabilize training.
The reason as to why this works is that this essentially weakens the discriminator, which can be really helpful at the beginning of training. A powerful discriminator looks like a step function, and as a result, there’s no useful gradient signal for updating the generator.
mfarahmand98
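To see what that quote is getting at, here is a tiny illustration of my own (not from the post or the tutorials I followed): when a confident, step-function-like discriminator pushes its raw (pre-sigmoid) output far from zero, the sigmoid saturates and the gradient available to update the generator all but disappears.

import torch

# The sigmoid's derivative is sigmoid(x) * (1 - sigmoid(x)).
# The further the raw discriminator output is from 0, the closer
# that derivative (and the generator's gradient signal) gets to 0.
for raw in [0.0, 2.0, 8.0]:
    x = torch.tensor(raw, requires_grad=True)
    torch.sigmoid(x).backward()
    print(f"raw output {raw:4.1f} -> d(sigmoid)/dx = {x.grad.item():.6f}")
    # roughly 0.25, 0.105 and 0.0003 respectively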
I won’t bother with the imports in the new module. But here’s the code for that custom network module.
class GaussianNoise(nn.Module):
    def __init__(self, rate=0.0) -> None:
        super().__init__()
        self.rate = rate

    def forward(self, x):
        # NOTE: torch.rand_like() samples uniformly on [0, 1) and already creates
        # the noise tensor on x's device; torch.randn_like() would give zero-mean Gaussian noise
        return x + (self.rate * torch.rand_like(x, requires_grad=False))
And, a little test code.
if __name__ == "__main__":
    torch.manual_seed(cfg.pt_seed)
    tst_GN = True

    if tst_GN:
        # test GaussianNoise
        gn = GaussianNoise(rate=1.0)
        tst_tensor = torch.rand(2, 3, requires_grad=False).to(cfg.device)
        print(tst_tensor)
        gn_tensor = gn(tst_tensor)
        print(gn_tensor.detach())
        print(gn_tensor - tst_tensor.detach())
And the terminal output seems to indicate it works as desired.
(mclp-3.12) PS F:\learn\mcl_pytorch\proj6> python models.py
tensor([[0.5286, 0.1616, 0.8870],
        [0.6216, 0.0459, 0.3856]], device='cuda:1')
tensor([[0.6558, 0.7668, 1.2437],
        [0.6826, 0.3745, 0.6171]], device='cuda:1')
tensor([[0.1272, 0.6051, 0.3567],
        [0.0610, 0.3286, 0.2315]], device='cuda:1')
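As an aside, notice that every value in the difference tensor is positive. That is because torch.rand_like() samples uniformly on [0, 1), so despite the class name the noise is not actually Gaussian. If zero-mean Gaussian noise is wanted, torch.randn_like() is the call to use. A minimal comparison, my own sketch rather than project code:

import torch

x = torch.zeros(2, 3)
print(torch.rand_like(x))   # uniform on [0, 1): all values non-negative
print(torch.randn_like(x))  # standard normal: mean 0, values can be negative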
Convolutional Block Generator Class
Okay, there will be a few convolutional layer blocks, currently looking at five, in the discriminator network, though the middle three will have a different configuration from the first and last.
So let’s create a custom module class to generate them as necessary. This is a bit of a change from my past approach, but it probably allows for more flexibility and control. We will need to pass in all the information required to generate each one. We’ve seen these parameters before, e.g. input and output channels, kernel size, stride and padding for the convolutional layer, and whether or not to include activation and normalization layers in the block. And, who knows, perhaps more before the coding is done.
There are a few more parameters in the constructor than I originally expected.
class DConvBlock(nn.Module):
    def __init__(self, c_in, c_out, ksz, strd=1, padg=0, bias=True, p_mode="reflect",
                 activation=True, lr_slp=0.2, normalize=True):
        super().__init__()
        self.conv_block = nn.Sequential(
            nn.Conv2d(c_in, c_out, kernel_size=ksz, stride=strd, padding=padg,
                      bias=bias, padding_mode=p_mode),
            nn.InstanceNorm2d(c_out) if normalize else nn.Identity(),
            nn.LeakyReLU(lr_slp, inplace=True) if activation else nn.Identity()
        )

    def forward(self, x):
        return self.conv_block(x)
As a test I will instantiate a block and print it.
if __name__ == "__main__":
    torch.manual_seed(cfg.pt_seed)
    tst_GN = False
    tst_DCB = True
    ... ...
    if tst_DCB:
        # test DConvBlock
        cb = DConvBlock(3, 64, ksz=4, strd=2, padg=1, normalize=False)
        print(cb)
And the terminal output was as follows, which pretty much looks like what I was expecting: Identity() instead of InstanceNorm2d(), and LeakyReLU() for the activation function/layer.
(mclp-3.12) PS F:\learn\mcl_pytorch\proj6> python models.py
DConvBlock(
  (conv_block): Sequential(
    (0): Conv2d(3, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), padding_mode=reflect)
    (1): Identity()
    (2): LeakyReLU(negative_slope=0.2, inplace=True)
  )
)
Discriminator Class
I will be using a value of 64 output features for the first block/layer, doubling for each of the next three blocks. The final block will output a single channel: the probability that the image is from the original dataset. Why 64? I am going by the tutorials/blog posts I looked at. I am uncertain at this point how one goes about determining the best value(s) for input and output features, with the exception of the very first number of input channels.
I expect it is in the end related to the size of the input tensors. But…
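One thing we can trace, at least, is how the feature maps shrink spatially: each convolution takes the width from W to floor((W - K + 2P) / S) + 1. A rough sketch of mine, assuming a 256x256 input (the actual training image size may well differ):

# Trace feature-map spatial sizes through the five discriminator blocks.
# Assumes a 256x256 input; adjust w for the real image size.
def conv_out(w, k, s, p):
    return (w - k + 2 * p) // s + 1

w = 256
for k, s, p in zip([4, 4, 4, 4, 4], [2, 2, 2, 2, 1], [1, 0, 0, 0, 1]):
    w = conv_out(w, k, s, p)
    print(w)   # 128, 63, 30, 14, 13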
# For now not including all parameters needed by DConvBlock in Discriminator constructor
class Discriminator(nn.Module):
    def __init__(self, c_in, f_init, gn_rt, k_szs, s_szs, p_szs, lr_slp=0.2):
        super().__init__()
        self.gNoise = GaussianNoise(rate=gn_rt)
        self.cb1 = DConvBlock(c_in, f_init, k_szs[0], s_szs[0], p_szs[0], lr_slp=lr_slp, normalize=False)
        self.cb2 = DConvBlock(f_init, f_init * 2, k_szs[1], s_szs[1], p_szs[1], lr_slp=lr_slp)
        self.cb3 = DConvBlock(f_init * 2, f_init * 4, k_szs[2], s_szs[2], p_szs[2], lr_slp=lr_slp)
        self.cb4 = DConvBlock(f_init * 4, f_init * 8, k_szs[3], s_szs[3], p_szs[3], lr_slp=lr_slp)
        self.cb5 = DConvBlock(f_init * 8, 1, k_szs[4], s_szs[4], p_szs[4], activation=False, normalize=False)

    def forward(self, x):
        x = self.gNoise(x)
        x = self.cb1(x)
        x = self.cb2(x)
        x = self.cb3(x)
        x = self.cb4(x)
        x = self.cb5(x)
        return torch.sigmoid(x)
if __name__ == "__main__":
    torch.manual_seed(cfg.pt_seed)
    tst_GN = False
    tst_DCB = False
    tst_D = True
    ... ...
    if tst_D:
        # test Discriminator
        dsc = Discriminator(3, 64, 1.0, [4, 4, 4, 4, 4], [2, 2, 2, 2, 1], [1, 0, 0, 0, 1])
        print(dsc)
And the output from the test was as follows, which looks to match the arguments passed to the constructor.
(mclp-3.12) PS F:\learn\mcl_pytorch\proj6> python models.py
Discriminator(
  (gNoise): GaussianNoise()
  (cb1): DConvBlock(
    (conv_block): Sequential(
      (0): Conv2d(3, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), padding_mode=reflect)
      (1): Identity()
      (2): LeakyReLU(negative_slope=0.2, inplace=True)
    )
  )
  (cb2): DConvBlock(
    (conv_block): Sequential(
      (0): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding_mode=reflect)
      (1): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      (2): LeakyReLU(negative_slope=0.2, inplace=True)
    )
  )
  (cb3): DConvBlock(
    (conv_block): Sequential(
      (0): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding_mode=reflect)
      (1): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      (2): LeakyReLU(negative_slope=0.2, inplace=True)
    )
  )
  (cb4): DConvBlock(
    (conv_block): Sequential(
      (0): Conv2d(256, 512, kernel_size=(4, 4), stride=(2, 2), padding_mode=reflect)
      (1): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      (2): LeakyReLU(negative_slope=0.2, inplace=True)
    )
  )
  (cb5): DConvBlock(
    (conv_block): Sequential(
      (0): Conv2d(512, 1, kernel_size=(4, 4), stride=(1, 1), padding=(1, 1), padding_mode=reflect)
      (1): Identity()
      (2): Identity()
    )
  )
)
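As one extra sanity check, not part of the original test code, we could push a dummy batch through the network and look at the output shape. With these kernel/stride/padding choices the final layer produces a small single-channel map (roughly one value per overlapping patch of the input) rather than a lone scalar, and the sigmoid squashes each value into a probability. A sketch, assuming 256x256 RGB inputs:

# Hypothetical extra check, not in the original test code.
# Assumes 256x256 RGB inputs; the real training size may differ.
dsc = Discriminator(3, 64, 1.0, [4, 4, 4, 4, 4], [2, 2, 2, 2, 1], [1, 0, 0, 0, 1]).to(cfg.device)
dummy = torch.rand(1, 3, 256, 256).to(cfg.device)
print(dsc(dummy).shape)   # expect something like torch.Size([1, 1, 13, 13])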
Done
You know, I think that is it for this one. Maybe not much content, but a bit of code to digest. And mostly focused on the subject at hand. Next time I will tackle the generator classes.
Until then, enjoy your time working on the machine learning models of your choice.
Resources
- torch.nn.ReflectionPad2d