Decided to go for a 10-epoch training run.
Ten Epochs of Training
Here’s the terminal window output. Took about ½ hour. GPU never really got pushed; never heard the fans crank up the pace like they did while trying to train the CycleGAN and some of the other models. Wonder why?
(mclp-3.12) PS F:\learn\mcl_pytorch\proj7> python vae.py -rn rek_4 -bs 16 -ep 10
{'run_nm': 'rek_4', 'dataset_nm': 'no_nm', 'sv_img_cyc': 150, 'sv_chk_cyc': 50, 'resume': False, 'start_ep': 0, 'epochs': 10, 'batch_sz': 16, 'num_res_blks': 9, 'x_disc': 1, 'x_genr': 1, 'x_eps': 0, 'use_lrs': False, 'lrs_unit': 'batch', 'lrs_eps': 5, 'lrs_init': 0.01, 'lrs_steps': 25, 'lrs_wmup': 0}
image and checkpoint directories created: runs\rek_4_img & runs\rek_4_sv
batch size: 16, nbr batches: 278
logger: {'vae': []}
epoch 1: 100%|███████████████████████████████████████████████████████████████████████| 278/278 [02:51<00:00, 1.62it/s]
epoch 2: 100%|███████████████████████████████████████████████████████████████████████| 278/278 [02:52<00:00, 1.61it/s]
epoch 3: 100%|███████████████████████████████████████████████████████████████████████| 278/278 [02:52<00:00, 1.61it/s]
epoch 4: 100%|███████████████████████████████████████████████████████████████████████| 278/278 [02:53<00:00, 1.60it/s]
epoch 5: 100%|███████████████████████████████████████████████████████████████████████| 278/278 [02:53<00:00, 1.60it/s]
epoch 6: 100%|███████████████████████████████████████████████████████████████████████| 278/278 [02:53<00:00, 1.60it/s]
epoch 7: 100%|███████████████████████████████████████████████████████████████████████| 278/278 [02:52<00:00, 1.61it/s]
epoch 8: 100%|███████████████████████████████████████████████████████████████████████| 278/278 [02:52<00:00, 1.61it/s]
epoch 9: 100%|███████████████████████████████████████████████████████████████████████| 278/278 [02:53<00:00, 1.60it/s]
epoch 10: 100%|██████████████████████████████████████████████████████████████████████| 278/278 [02:52<00:00, 1.61it/s]
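As for the quiet fans: one guess (mine, not verified) is that this VAE is a much smaller network than the CycleGAN’s paired generators and discriminators, so each batch spends proportionally more time in data loading than in GPU compute. A quick way to test that theory would be to time the dataloader on its own; note the loader name ldr is an assumption on my part, not a name from the module.
# hypothetical check: time one pass over the dataloader with no model work;
# if this is a big share of the ~2:52 per epoch, the run is input-bound
# rather than GPU-bound ('ldr' is an assumed name for the training DataLoader)
import time
t0 = time.perf_counter()
for b_img, _ in ldr:
    pass
print(f"one pass over the dataloader: {time.perf_counter() - t0:.1f} seconds")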
Here’s the plot of the training losses. Fair improvement over what we saw with a single epoch of training.

And here are the images (training and regenerated) saved after five and ten epochs of training.


Perhaps not great, but I think that’s all the training I will do for now. It gets the idea across.
Playing With the Latent Space
Okay, time to see what we can do with the latent space vectors.
I won’t show the code or the images generated, but I did test loading the model from the checkpoint and running a small sample of images through it, then plotting the real and regenerated images to be sure things were working as expected.
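For completeness, here’s a minimal sketch of that sort of round-trip check. This is my reconstruction, not the post’s actual code; it assumes the model has already been loaded and put in eval mode (as in the next section), and reuses imgs, vae.ncdr, vae.dcdr and utl.image_grid.
# hypothetical round-trip check: encode a few training images, decode them,
# and plot the originals next to the reconstructions
with torch.no_grad():
    r_img = torch.stack([imgs[i][0] for i in range(8)]).to(cfg.device)
    _, _, z = vae.ncdr(r_img)
    g_img = vae.dcdr(z).cpu()
utl.image_grid(torch.cat((r_img.cpu(), g_img), 0), 8, i_show=True,
               epoch=cfg.epochs, b_sz=16, img_cl='real and regenerated images', norm=False)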
Load Model from File
Okay, I will load the trained model and put it in evaluation mode.
# the following goes just after the code loading the training data
# (which was already in the module)
# instantiate model and optimizer
l_dims = 100
vae = VAE(l_dims).to(cfg.device)
... ...
# if block for the new experimental code
if tst_model:
    utl.ld_chkpt(cfg.sv_dir / f"vae_{cfg.epochs}.pt", vae, opt)
    vae.eval()
Random Feature Vectors
First thing I am going to try is “prior sampling”. That is, use a standard normal distribution to generate the latent space vectors. Because this will generate vectors likely never seen by the model during training, the generated images will be fairly poor, especially given how little training I gave the model.
if do_rand:
    # let's try generating some random latent space vectors and see what we get
    with torch.no_grad():
        noise = torch.randn(16, l_dims).to(cfg.device)
        p_img = vae.dcdr(noise).cpu()
    utl.image_grid(p_img, 4, i_show=True, epoch=cfg.epochs, b_sz=cfg.batch_sz, img_cl='random latent space vectors')
And a sample image.

Traversal of Latent Space
The idea is to take a given image, generate its feature vector, then for one or more “features” generate images over the range of possible values for the chosen feature(s). Each dimension of the latent vector represents a normal distribution. The range of values for a dimension is controlled by the mean and variance of that dimension, all information we get from the encoder.
I am going to take the mean for each chosen dimension and generate a range of values between the mean minus three standard deviations and the mean plus three standard deviations.
Now, I don’t expect that this will generate any significantly different images in this case. The mean, standard deviation and feature vector returned from the encoder are specific to one person’s image. They in no way define the full latent space. But I thought it might be worth a look.
A couple of functions to do the heavy lifting.
# generate a sample of feature values for a single dimension of the feature
# vector, using the supplied distribution values
def gen_z_dim(mu, std, nbr_smpl):
    # sample evenly over mean ± 3 standard deviations
    d_lw, d_hg = mu - 3*std, mu + 3*std
    smpl = torch.linspace(d_lw, d_hg, nbr_smpl)
    return smpl
def l_spc_traverse(decdr, mu, std, z_in, z_ndx, n_smpl=11):
    """
    decdr: vae decoder
    mu, std: distribution parameters returned by the encoder
    z_in: latent space vector for image to traverse
    z_ndx: if integer: index of feature to traverse
           if iterable: index of first feature and number of features to traverse
    n_smpl: nbr sample images to generate
            default to 11, makes 12 with original image
            makes for 2x6 or 3x4 image grid
    """
    if isinstance(z_ndx, int):
        t_ndx, t_sz = z_ndx, 1
    else:
        # [start ndx, number of features to modify]
        t_ndx, t_sz = z_ndx
    # Replicate the image's latent vector, one row per sample value
    z = z_in.repeat(n_smpl, 1).to(cfg.device)
    for i in range(t_sz):
        tmp_x = t_ndx + i
        z_vals = gen_z_dim(mu[0, tmp_x], std[0, tmp_x], n_smpl)
        # Assign the traversed values to the specified dimension
        z[:, tmp_x] = z_vals[:]
    # Decode the latent vectors
    with torch.no_grad():
        samples = decdr(z.to(cfg.device))
    return samples
And the code to generate a set of images (in a new if block of course). I am going to traverse the features at indexes 20-24.
if do_traverse:
    # let's use the distributions for one image and traverse the latent space for one or more features
    dd_img, _ = imgs[10]
    dd_img = dd_img.unsqueeze(0).to(cfg.device)
    with torch.no_grad():
        mu, std, z = vae.ncdr(dd_img)
    t_img = l_spc_traverse(vae.dcdr, mu, std, z, [20, 5], n_smpl=11)
    lst_img = torch.cat((dd_img, t_img), 0)
    # print(lst_img.shape)
    utl.image_grid(lst_img, 4, i_show=True, epoch=cfg.epochs, b_sz=12,
                   img_cl='traverse latent space for one or more features of image', norm=False)

Personally, I don’t see any real change in the modified images.
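That’s probably not surprising with so little training. One hypothetical way to pick more promising features to traverse, and not something from the original code, would be to rank the latent dimensions by their per-dimension KL term (the same term the VAE loss uses) and traverse the biggest ones. A quick sketch using the mu, std and z from the block above:
# hypothetical: rank latent dimensions by their contribution to the KL loss,
# KL_i = 0.5 * (mu_i^2 + var_i - ln(var_i) - 1); dimensions with larger values
# carry more image-specific information and may show more change when traversed
kl_per_dim = 0.5 * (mu.pow(2) + std.pow(2) - 2 * std.log() - 1).squeeze(0)
top_dims = torch.topk(kl_per_dim, k=5).indices
print(f"most informative dimensions: {top_dims.tolist()}")
# traverse the single most informative feature (z_ndx as a plain int)
t_img = l_spc_traverse(vae.dcdr, mu, std, z, int(top_dims[0]), n_smpl=11)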
Linear Interpolation
Okay, let’s, as done with an earlier GAN, take a start and an end image, then generate a series of images interpolating between the two. We are essentially interpolating between two points in the trained latent space.
Once again, a function to do the interpolation given the start and ending feature vectors and the number of steps to take between the two.
def lnr_ntrpln(st, nd, stps):
    # Create a linear path from start to end
    # stps should be 2 less than total number of images to display
    mlts = torch.linspace(0, 1, stps).to(cfg.device)
    z = st
    for mlt in mlts[:, None]:
        with torch.no_grad():
            nt = (st * (1 - mlt)) + (nd * mlt)
        # print(mlt, nt.shape)
        z = torch.cat((z, nt), 0)
    z = torch.cat((z, nd), 0)
    # print(z.shape)
    # Decode the samples along the path
    vae.eval()
    with torch.no_grad():
        smpls = vae.dcdr(z.to(cfg.device))
    return smpls
And the if block using the function to generate the image grid.
if do_spc_ntrpl:
    # interpolate between 2 dataset image latent space vectors
    mx_ndx = len(imgs)
    sx = cfg.rng.integers(low=0, high=mx_ndx, size=1)
    nx = cfg.rng.integers(low=0, high=mx_ndx, size=1)
    print(f"st: {sx}, nd: {nx}")
    i1 = imgs[sx[0]][0].unsqueeze(0).to(cfg.device)
    i2 = imgs[nx[0]][0].unsqueeze(0).to(cfg.device)
    with torch.no_grad():
        _, _, st = vae.ncdr(i1)
        _, _, nd = vae.ncdr(i2)
    ntrpl_smpl = lnr_ntrpln(st, nd, stps=22)
    # p_img = torch.cat((i1, ntrpl_smpl, i2), 0)
    utl.image_grid(ntrpl_smpl, 6, i_show=True, epoch=cfg.epochs, b_sz=24,
                   img_cl='interpolate between 2 latent space vectors from dataset', norm=False)
And a couple of examples. The top left and bottom right are regenerated versions of the original two images for each interpolation.


Now let’s do that again using randomly generated start and end feature vectors.
if do_rnd_ntrpl:
    # interpolate between 2 random latent space vectors
    st = torch.randn(1, l_dims).to(cfg.device)
    nd = torch.randn(1, l_dims).to(cfg.device)
    ntrpl_smpl = lnr_ntrpln(st, nd, stps=22)
    utl.image_grid(ntrpl_smpl, 6, i_show=True, epoch=cfg.epochs, b_sz=24,
                   img_cl='interpolate between 2 random feature vectors', norm=False)

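One design note while on the subject, though it isn’t something tried in this post: linear interpolation between two independent standard normal vectors passes through points of lower norm than typical prior samples, which can make the mid-path images look blurrier. Spherical linear interpolation (slerp) is a commonly suggested alternative. A minimal sketch, reusing the names above:
# hypothetical alternative: spherical interpolation keeps the interpolated
# vectors at a norm more typical of the Gaussian prior
# (assumes st and nd, each of shape (1, l_dims), are not parallel)
def slerp_ntrpln(st, nd, stps):
    # angle between the two latent vectors
    omega = torch.acos(torch.clamp(
        torch.nn.functional.cosine_similarity(st, nd), -1.0, 1.0))
    mlts = torch.linspace(0, 1, stps).to(cfg.device)
    z = torch.cat([(torch.sin((1 - m) * omega) / torch.sin(omega)) * st
                   + (torch.sin(m * omega) / torch.sin(omega)) * nd
                   for m in mlts], 0)
    with torch.no_grad():
        return vae.dcdr(z)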
Fini for Now
I was planning to cover one more way of playing around with the latent space vectors—simple arithmetic. But I have decided that this post is plenty long enough. A fair bit of code, a fair number of images, etc.
So, the arithmetic will have to wait until next time. For what will likely be a rather short post.
Do enjoy playing around in your latent space—whatever it may be. Or, in your dynamic space if that is more to your liking.