Sorry, a lengthy introduction.
Okay, in my usual way I started looking at another book while still working on a couple of others. But there is some motivation for this one. It is Learn Generative AI with PyTorch, by Mark Liu (Manning). Another MEAP.
When I first started this blog, the goal was to learn Python and then tackle machine learning. I sort of got started on the machine learning. But the project I was working on (predicting Titanic survivors, a Kaggle beginner competition) and the complexities it involved sort of killed my motivation. Then I discovered spirographs and a good deal of time—best part of two years—was consumed by them.
But prior to giving up on my initial look at machine learning I had a PC (a.k.a. ‘beast’) built with an Nvidia GPU. Nvidia provides an interface, CUDA, which lets software (machine learning libraries included) use its GPUs to do a lot of the arithmetic. It had, for the time, a pretty high end AMD CPU as well. More cores and a faster clock rate than the PC I had been using for the previous few years. Though ‘growl’ isn’t what I would call a slouch. But just after bringing beast home, things sort of went south. And so it sat unused for a couple of years (a true waste of money). I thought it was about time to see what I could do with it.
PyTorch has an implementation for CUDA enabled GPUs (Nvidia). This generally speeds model development up in a noticeable way. Though speed up is a truly relative and personal impression.
I know I just started a series of posts on data analysis projects. The first project was completed (more or less) in the last post. And I, in fact, have a series of posts drafted for two more data analysis projects. An additional 16 draft posts. Sadly the fourth project did not particularly interest me. But it sounded like what would be learned completing it would be needed in a subsequent project or three. Left me in a quandary.
A little earlier than the development of that quandary, I discovered the book mentioned above. I started reading it and decided to give it a go. Use the beast for that for which it was intended. I started out by setting up the PC to use PyTorch. A new conda environment. New libraries/packages. Git clones. New Git repo or two. Etc. Most of that is covered in this post. Along with a look at PyTorch’s Tensors.
Then, to get a feel for coding models in PyTorch, a simple classifier (the next post). Then I move on to GANs and DCGANs. In total I have completed 3 or 4 projects. And I have, beyond this post, an additional 10 draft posts.
My apologies for any confusion this change of direction may cause. I sincerely hope to complete the remaining data analysis projects; but, at the moment, generative AI seems a touch more interesting and satisfying.
Installation
Okay, I know my GPUs are CUDA enabled. You may not. But, if you have an Nvidia GPU, start up the terminal and enter nvidia-smi. In my case I got the following. Note, I deleted information regarding the GPU and the section at the bottom listing processes using the GPU.
PS F:\learn\data_analysis> nvidia-smi
Fri Mar 15 08:09:24 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 536.23 Driver Version: 536.23 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
| ... ... | ... ... | ... ... |
+-----------------------------------------+----------------------+----------------------+
If you got an error message, you most likely do not have an Nvidia CUDA enabled GPU (or the Nvidia driver is not installed). Though I recently saw something indicating that AMD may be adding CUDA capabilities to its GPUs. Otherwise, make note of your CUDA version (upper right of the framed output).
The next step is to go to the PyTorch getting started site. There you will find an interactive form something like the following.
I am installing on Windows in a conda Python environment. Since CUDA 12.2 is not available I went with 12.1. As shown I need to use the command conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia to install PyTorch (and a couple other modules I will supposedly be using) in my environment.
Let’s give it a go. First I will create a working directory for this book. Then clone the environment I am using for the data analysis coding. Activate the new environment and install PyTorch. Let’s not forget a git init.
PS F:\learn\data_analysis> Set-Location -Path ..\
PS F:\learn> New-Item -Name mcl_pytorch -ItemType Directory
Directory: F:\learn
Mode LastWriteTime Length Name
---- ------------- ------ ----
d----- 2024-03-15 4:27 PM mcl_pytorch
PS F:\learn> Set-Location -Path .\mcl_pytorch
PS F:\learn\mcl_pytorch>
Okay, now I am going to open a conda terminal (should have done that from the start). And clone a new environment. Then install PyTorch. Always amazes me how much needs to be downloaded and the time it takes, glad I wasn’t doing this on my other dev system (a.k.a. growl). Going to show all the terminal output just cuz.
(base) PS C:\WINDOWS\system32> Set-Location -Path f:\learn\mcl_pytorch
(base) PS F:\learn\mcl_pytorch> conda create --name mclp-3.12 --clone data-3.12
Source: E:\appDev\Miniconda3\envs\data-3.12
Destination: E:\appDev\Miniconda3\envs\mclp-3.12
Packages: 105
Files: 0
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate mclp-3.12
#
# To deactivate an active environment, use
#
# $ conda deactivate
(base) PS F:\learn\mcl_pytorch> conda activate mclp-3.12
(mclp-3.12) PS F:\learn\mcl_pytorch> conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: done
==> WARNING: A newer version of conda exists. <==
current version: 4.11.0
latest version: 24.1.2
Please update conda by running
$ conda update -n base -c defaults conda
## Package Plan ##
environment location: E:\appDev\Miniconda3\envs\mclp-3.12
added / updated specs:
- pytorch
- pytorch-cuda=12.1
- torchaudio
- torchvision
The following packages will be downloaded:
package | build
---------------------------|-----------------
certifi-2024.2.2 | py312haa95532_0 161 KB
cuda-cccl-12.4.99 | 0 1.4 MB nvidia
cuda-cudart-12.1.105 | 0 964 KB nvidia
cuda-cudart-dev-12.1.105 | 0 549 KB nvidia
cuda-cupti-12.1.105 | 0 11.6 MB nvidia
cuda-libraries-12.1.0 | 0 1 KB nvidia
cuda-libraries-dev-12.1.0 | 0 1 KB nvidia
cuda-nvrtc-12.1.105 | 0 73.2 MB nvidia
cuda-nvrtc-dev-12.1.105 | 0 16.5 MB nvidia
cuda-nvtx-12.1.105 | 0 41 KB nvidia
cuda-opencl-12.4.99 | 0 11 KB nvidia
cuda-opencl-dev-12.4.99 | 0 75 KB nvidia
cuda-profiler-api-12.4.99 | 0 19 KB nvidia
cuda-runtime-12.1.0 | 0 1 KB nvidia
filelock-3.13.1 | py312haa95532_0 23 KB
idna-3.4 | py312haa95532_0 94 KB
jinja2-3.1.3 | py312haa95532_0 332 KB
libcublas-12.1.0.26 | 0 39 KB nvidia
libcublas-dev-12.1.0.26 | 0 348.3 MB nvidia
libcufft-11.0.2.4 | 0 6 KB nvidia
libcufft-dev-11.0.2.4 | 0 102.6 MB nvidia
libcurand-10.3.5.119 | 0 4 KB nvidia
libcurand-dev-10.3.5.119 | 0 49.8 MB nvidia
libcusolver-11.4.4.55 | 0 30 KB nvidia
libcusolver-dev-11.4.4.55 | 0 95.7 MB nvidia
libcusparse-12.0.2.55 | 0 12 KB nvidia
libcusparse-dev-12.0.2.55 | 0 162.5 MB nvidia
libjpeg-turbo-2.0.0 | h196d8e1_0 618 KB
libnpp-12.0.2.50 | 0 305 KB nvidia
libnpp-dev-12.0.2.50 | 0 135.6 MB nvidia
libnvjitlink-12.1.105 | 0 67.3 MB nvidia
libnvjitlink-dev-12.1.105 | 0 13.8 MB nvidia
libnvjpeg-12.1.1.14 | 0 5 KB nvidia
libnvjpeg-dev-12.1.1.14 | 0 2.4 MB nvidia
libuv-1.44.2 | h2bbff1b_0 288 KB
markupsafe-2.1.3 | py312h2bbff1b_0 27 KB
mpmath-1.3.0 | py312haa95532_0 989 KB
networkx-3.1 | py312haa95532_0 2.9 MB
pytorch-2.2.1 |py3.12_cuda12.1_cudnn8_0 1.24 GB pytorch
pytorch-cuda-12.1 | hde6ce7c_5 4 KB pytorch
pytorch-mutex-1.0 | cuda 3 KB pytorch
pyyaml-6.0.1 | py312h2bbff1b_0 162 KB
requests-2.31.0 | py312haa95532_1 122 KB
sympy-1.12 | py312haa95532_0 14.0 MB
torchaudio-2.2.1 | py312_cu121 7.2 MB pytorch
torchvision-0.17.1 | py312_cu121 7.8 MB pytorch
typing_extensions-4.9.0 | py312haa95532_1 71 KB
urllib3-2.1.0 | py312haa95532_0 185 KB
------------------------------------------------------------
Total: 2.33 GB
The following NEW packages will be INSTALLED:
certifi pkgs/main/win-64::certifi-2024.2.2-py312haa95532_0
charset-normalizer pkgs/main/noarch::charset-normalizer-2.0.4-pyhd3eb1b0_0
cuda-cccl nvidia/win-64::cuda-cccl-12.4.99-0
cuda-cudart nvidia/win-64::cuda-cudart-12.1.105-0
cuda-cudart-dev nvidia/win-64::cuda-cudart-dev-12.1.105-0
cuda-cupti nvidia/win-64::cuda-cupti-12.1.105-0
cuda-libraries nvidia/win-64::cuda-libraries-12.1.0-0
cuda-libraries-dev nvidia/win-64::cuda-libraries-dev-12.1.0-0
cuda-nvrtc nvidia/win-64::cuda-nvrtc-12.1.105-0
cuda-nvrtc-dev nvidia/win-64::cuda-nvrtc-dev-12.1.105-0
cuda-nvtx nvidia/win-64::cuda-nvtx-12.1.105-0
cuda-opencl nvidia/win-64::cuda-opencl-12.4.99-0
cuda-opencl-dev nvidia/win-64::cuda-opencl-dev-12.4.99-0
cuda-profiler-api nvidia/win-64::cuda-profiler-api-12.4.99-0
cuda-runtime nvidia/win-64::cuda-runtime-12.1.0-0
filelock pkgs/main/win-64::filelock-3.13.1-py312haa95532_0
idna pkgs/main/win-64::idna-3.4-py312haa95532_0
jinja2 pkgs/main/win-64::jinja2-3.1.3-py312haa95532_0
libcublas nvidia/win-64::libcublas-12.1.0.26-0
libcublas-dev nvidia/win-64::libcublas-dev-12.1.0.26-0
libcufft nvidia/win-64::libcufft-11.0.2.4-0
libcufft-dev nvidia/win-64::libcufft-dev-11.0.2.4-0
libcurand nvidia/win-64::libcurand-10.3.5.119-0
libcurand-dev nvidia/win-64::libcurand-dev-10.3.5.119-0
libcusolver nvidia/win-64::libcusolver-11.4.4.55-0
libcusolver-dev nvidia/win-64::libcusolver-dev-11.4.4.55-0
libcusparse nvidia/win-64::libcusparse-12.0.2.55-0
libcusparse-dev nvidia/win-64::libcusparse-dev-12.0.2.55-0
libjpeg-turbo pkgs/main/win-64::libjpeg-turbo-2.0.0-h196d8e1_0
libnpp nvidia/win-64::libnpp-12.0.2.50-0
libnpp-dev nvidia/win-64::libnpp-dev-12.0.2.50-0
libnvjitlink nvidia/win-64::libnvjitlink-12.1.105-0
libnvjitlink-dev nvidia/win-64::libnvjitlink-dev-12.1.105-0
libnvjpeg nvidia/win-64::libnvjpeg-12.1.1.14-0
libnvjpeg-dev nvidia/win-64::libnvjpeg-dev-12.1.1.14-0
libuv pkgs/main/win-64::libuv-1.44.2-h2bbff1b_0
markupsafe pkgs/main/win-64::markupsafe-2.1.3-py312h2bbff1b_0
mpmath pkgs/main/win-64::mpmath-1.3.0-py312haa95532_0
networkx pkgs/main/win-64::networkx-3.1-py312haa95532_0
pytorch pytorch/win-64::pytorch-2.2.1-py3.12_cuda12.1_cudnn8_0
pytorch-cuda pytorch/win-64::pytorch-cuda-12.1-hde6ce7c_5
pytorch-mutex pytorch/noarch::pytorch-mutex-1.0-cuda
pyyaml pkgs/main/win-64::pyyaml-6.0.1-py312h2bbff1b_0
requests pkgs/main/win-64::requests-2.31.0-py312haa95532_1
sympy pkgs/main/win-64::sympy-1.12-py312haa95532_0
torchaudio pytorch/win-64::torchaudio-2.2.1-py312_cu121
torchvision pytorch/win-64::torchvision-0.17.1-py312_cu121
typing_extensions pkgs/main/win-64::typing_extensions-4.9.0-py312haa95532_1
urllib3 pkgs/main/win-64::urllib3-2.1.0-py312haa95532_0
yaml pkgs/main/win-64::yaml-0.2.5-he774522_0
Proceed ([y]/n)? y
Downloading and Extracting Packages
libnvjitlink-12.1.10 | 67.3 MB | ########################################################################### | 100%
libcublas-12.1.0.26 | 39 KB | ########################################################################### | 100%
mpmath-1.3.0 | 989 KB | ########################################################################### | 100%
pyyaml-6.0.1 | 162 KB | ########################################################################### | 100%
libcublas-dev-12.1.0 | 348.3 MB | ########################################################################### | 100%
libnpp-12.0.2.50 | 305 KB | ########################################################################### | 100%
sympy-1.12 | 14.0 MB | ########################################################################### | 100%
cuda-nvrtc-12.1.105 | 73.2 MB | ########################################################################### | 100%
cuda-nvtx-12.1.105 | 41 KB | ########################################################################### | 100%
cuda-cudart-dev-12.1 | 549 KB | ########################################################################### | 100%
libnvjitlink-dev-12. | 13.8 MB | ########################################################################### | 100%
jinja2-3.1.3 | 332 KB | ########################################################################### | 100%
idna-3.4 | 94 KB | ########################################################################### | 100%
markupsafe-2.1.3 | 27 KB | ########################################################################### | 100%
cuda-cudart-12.1.105 | 964 KB | ########################################################################### | 100%
pytorch-2.2.1 | 1.24 GB | ########################################################################### | 100%
libcusparse-12.0.2.5 | 12 KB | ########################################################################### | 100%
urllib3-2.1.0 | 185 KB | ########################################################################### | 100%
libcurand-dev-10.3.5 | 49.8 MB | ########################################################################### | 100%
filelock-3.13.1 | 23 KB | ########################################################################### | 100%
libcusolver-11.4.4.5 | 30 KB | ########################################################################### | 100%
libnpp-dev-12.0.2.50 | 135.6 MB | ########################################################################### | 100%
cuda-opencl-12.4.99 | 11 KB | ########################################################################### | 100%
libcusparse-dev-12.0 | 162.5 MB | ########################################################################### | 100%
libnvjpeg-dev-12.1.1 | 2.4 MB | ########################################################################### | 100%
torchaudio-2.2.1 | 7.2 MB | ########################################################################### | 100%
torchvision-0.17.1 | 7.8 MB | ########################################################################### | 100%
cuda-cupti-12.1.105 | 11.6 MB | ########################################################################### | 100%
requests-2.31.0 | 122 KB | ########################################################################### | 100%
libcufft-11.0.2.4 | 6 KB | ########################################################################### | 100%
cuda-cccl-12.4.99 | 1.4 MB | ########################################################################### | 100%
cuda-opencl-dev-12.4 | 75 KB | ########################################################################### | 100%
cuda-profiler-api-12 | 19 KB | ########################################################################### | 100%
cuda-runtime-12.1.0 | 1 KB | ########################################################################### | 100%
libjpeg-turbo-2.0.0 | 618 KB | ########################################################################### | 100%
networkx-3.1 | 2.9 MB | ########################################################################### | 100%
cuda-nvrtc-dev-12.1. | 16.5 MB | ########################################################################### | 100%
typing_extensions-4. | 71 KB | ########################################################################### | 100%
pytorch-mutex-1.0 | 3 KB | ########################################################################### | 100%
libcusolver-dev-11.4 | 95.7 MB | ########################################################################### | 100%
cuda-libraries-dev-1 | 1 KB | ########################################################################### | 100%
libuv-1.44.2 | 288 KB | ########################################################################### | 100%
libcurand-10.3.5.119 | 4 KB | ########################################################################### | 100%
certifi-2024.2.2 | 161 KB | ########################################################################### | 100%
cuda-libraries-12.1. | 1 KB | ########################################################################### | 100%
pytorch-cuda-12.1 | 4 KB | ########################################################################### | 100%
libnvjpeg-12.1.1.14 | 5 KB | ########################################################################### | 100%
libcufft-dev-11.0.2. | 102.6 MB | ########################################################################### | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Okay, now we need to check we have a working PyTorch module.
import torch, torchvision, torchaudio
print(f"torch ver: {torch.__version__}")
print(f"torchvision ver: {torchvision.__version__}")
print(f"torchaudio ver: {torchaudio.__version__}")
print(f"cuda: {torch.cuda.is_available()}")
(mclp-3.12) PS F:\learn\mcl_pytorch\rek_test> python install_test.py
torch ver: 2.2.1
torchvision ver: 0.17.1
torchaudio ver: 2.2.1
cuda: True
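Knowing CUDA is available is one thing, actually using it is another. The following is a minimal sketch (my assumption of the usual pattern, not something from the book) of picking a device and putting a tensor on it. The names device, x and y are just for illustration. It falls back to the CPU if no GPU is visible, so it should run either way.

```python
import torch

# Pick the GPU when PyTorch can see one, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a small tensor and move it to the chosen device.
x = torch.arange(6, dtype=torch.float32).to(device)

# The arithmetic runs on whichever device holds the tensor.
y = (x * 2).sum()

print(f"device: {device}, result: {y.item()}")
```

Tensors have to be on the same device before they can be combined, so this kind of device variable tends to get passed around a lot.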
Some Playing Around: Tensors
PyTorch uses algebraic structures called tensors as its primary data structure. They are functionally similar to NumPy arrays, but with a few differences. The key one in this case is that tensors can live on the GPU, which is what makes GPU accelerated training possible. Each tensor can only contain elements of a single data type. Which data type to use is something that I will hopefully learn along the way.
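To make the single data type point a little more concrete, here is a small sketch of how torch.tensor settles on a dtype. It assumes PyTorch's default float type has not been changed from float32; the variable names are mine.

```python
import torch

# The dtype is inferred from the Python values supplied.
ints = torch.tensor([1, 2, 3])          # all ints
floats = torch.tensor([1.0, 2.0, 3.0])  # all floats
mixed = torch.tensor([1, 2.5])          # the int is promoted so all elements share one type

print(ints.dtype, floats.dtype, mixed.dtype)

# An explicit dtype overrides the inference.
days = torch.tensor([6929, 1796], dtype=torch.float64)
print(days.dtype)
```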
Now, given tensors are fundamentally arrays, I think we will find that basic indexing, slicing and manipulation will be similar to what we know from elsewhere, e.g. Python lists, NumPy and perhaps pandas.
I am going to load a CSV file of Canadian Prime Ministers’ total time in office (created manually from data on the web). Do some conversion/sorting and then create a tensor using the total days in office series from the pandas dataframe. Given a tensor can only contain elements of one type, I don’t believe I could have done the conversion from a string of information into a number of days using PyTorch tensor methods. So I did that in pandas.
Similarly because the tensor can only contain a single data type there is no way for me to add the PM’s names to the tensor. But again that’s not what tensors are for. That’s what pandas is for.
# tensors.py: rek, 2024.03.16, try some tensor arithmetic
# my usual imports
from pathlib import Path
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import torch
# load CSV File of total time in office by Prime Minister
fl_pm_tio = Path("cdn_pm_time.csv")
pm_tio = pd.read_csv(fl_pm_tio, sep=",", header=0)
# sort into historical order
pm_tio = pm_tio.sort_values(by="First day", axis=0, ignore_index=True)
# need to do some data manipulation before converting to tensor
# i.e. turn string like "18 years, 359 days" into total number of days
def get_days(yr_dy):
    if "," in yr_dy:
        yrs, _, dys, _ = yr_dy.split(" ")
        return int(yrs) * 365 + int(dys)
    else:
        dys, _ = yr_dy.split(" ")
        return int(dys)
pm_tio["days"] = pm_tio["Total time in office"].apply(get_days)
print("\n", pm_tio)
pm_tensor = torch.tensor(pm_tio.days, dtype=torch.float64)
# could also use: pm_tensor = FloatTensor(pm_tio.days)
print("\n", pm_tensor)
(mclp-3.12) PS F:\learn\mcl_pytorch\rek_test> python tensors.py
Prime Minister Total time in office First day days
0 Sir John A. Macdonald 18 years, 359 days 1867-07-01 6929
1 Alexander Mackenzie 4 years, 336 days 1873-11-07 1796
2 Sir John Abbott 1 year, 161 days 1891-06-16 526
3 Sir John Thompson 2 years, 7 days 1892-12-05 737
4 Sir Mackenzie Bowell 1 year, 128 days 1894-12-21 493
5 Sir Charles Tupper 68 days 1896-05-01 68
6 Sir Wilfrid Laurier 15 years, 86 days 1896-07-11 5561
7 Sir Robert Borden 8 years, 274 days 1911-10-10 3194
8 Arthur Meighen 1 year, 260 days 1920-07-10 625
9 William Lyon Mackenzie King 21 years, 154 days 1921-12-29 7819
10 R. B. Bennett 5 years, 77 days 1930-08-07 1902
11 Louis St. Laurent 8 years, 218 days 1948-11-15 3138
12 John Diefenbaker 5 years, 305 days 1957-06-21 2130
13 Lester B. Pearson 4 years, 363 days 1963-04-22 1823
14 Pierre Trudeau 15 years, 164 days 1968-04-20 5639
15 Joe Clark 273 days 1979-06-04 273
16 John Turner 79 days 1984-06-30 79
17 Brian Mulroney 8 years, 281 days 1984-09-17 3201
18 Kim Campbell 132 days 1993-06-25 132
19 Jean Chrétien 10 years, 38 days 1993-11-04 3688
20 Paul Martin 2 years, 56 days 2003-12-12 786
21 Stephen Harper 9 years, 271 days 2006-02-06 3556
22 Justin Trudeau 8 years, 133 days 2015-11-04 3053
tensor([6929., 1796., 526., 737., 493., 68., 5561., 3194., 625., 7819.,
1902., 3138., 2130., 1823., 5639., 273., 79., 3201., 132., 3688.,
786., 3556., 3053.], dtype=torch.float64)
Let’s do some indexing and slicing.
One Dimensional Tensor
Tensors are not limited to one dimension but for now let’s use that one-dimensional tensor we just created.
To get Sir John A. Macdonald’s days in office we would use pm_tensor[0]. To get that value for Jean Chrétien we could use pm_tensor[19] or pm_tensor[-4]. I’m sure we’ve all seen this kind of thing before. Let’s have a look.
print("\n", pm_tensor[0])
print("\n", pm_tensor[19], "=?", pm_tensor[-4])
tensor(6929., dtype=torch.float64)
tensor(3688., dtype=torch.float64) =? tensor(3688., dtype=torch.float64)
And if I use f-strings the output changes somewhat.
print(f"\n{pm_tensor[0]}")
print(f"\n{pm_tensor[19]} =? {pm_tensor[-4]}")
6929.0
3688.0 =? 3688.0
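My understanding of the difference: indexing a single element gives back a zero-dimensional tensor, and formatting one of those in an f-string falls through to the plain Python number inside it. A small sketch, using a cut-down stand-in for pm_tensor (the names t, elem and value are mine), with .item() as the explicit way to get that number out.

```python
import torch

t = torch.tensor([6929.0, 1796.0], dtype=torch.float64)

elem = t[0]            # still a tensor, just one with zero dimensions
print(elem.dim())
print(repr(elem))      # the tensor(...) form seen with a plain print
print(f"{elem}")       # f-string formatting uses the underlying number

value = elem.item()    # explicit conversion to a Python float
print(type(value), value)
```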
Let’s get the value for the 5th-8th Prime Ministers inclusive (the end index of a slice is exclusive, so that is [4:8]). And for the last 5.
print(f"\n{pm_tensor[4:8]}")
print(f"\n{pm_tensor[-5:]}")
print(f"\n{pm_tensor.shape}")
pm_len = pm_tensor.shape[0]
print(f"\n{pm_tensor[pm_len-5:]}")
tensor([ 493., 68., 5561., 3194.], dtype=torch.float64)
tensor([ 132., 3688., 786., 3556., 3053.], dtype=torch.float64)
torch.Size([23])
tensor([ 132., 3688., 786., 3556., 3053.], dtype=torch.float64)
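Because the end of a slice is exclusive, it is easy to be off by one. A quick sketch with a made-up tensor whose values match their ordinal positions (so the 5th element is 5), which makes it easy to see what each slice picks up.

```python
import torch

# Values 1..10, so each value equals its ordinal position.
t = torch.arange(1, 11)

fifth_to_eighth = t[4:8]   # indices 4, 5, 6, 7
fifth_to_ninth = t[4:9]    # one more index to include the 9th
last_five = t[-5:]         # negative start counts back from the end

print(fifth_to_eighth)
print(fifth_to_ninth)
print(last_five)
```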
Two Dimensional Tensor
Let’s add another dimension to our tensor. We’ll convert days to weeks and add that as our second dimension.
A bit of fooling around; take note of the comments. Note: I should likely have used stack to join the two tensors. Might have eliminated the reshape.
# let's add another dimension
# total weeks in office
pm_wks = pm_tensor / 7
print(f"\n{pm_wks.shape}\n{pm_wks}")
# then concatenate the two tensors, similar to pandas concat
# we can't at this point concat along a second dimension as it doesn't exist
# so we concat on the existing dimension and reshape the result
pm_2_tms = torch.cat((pm_tensor, pm_wks), dim=0)
print(f"\nconcatenated tensor: {pm_2_tms.shape}")
pm_2_reshp = pm_2_tms.reshape(2, pm_len)
print(f"\n{pm_2_reshp.shape}\n{pm_2_reshp}")
torch.Size([23])
tensor([ 989.8571, 256.5714, 75.1429, 105.2857, 70.4286, 9.7143,
794.4286, 456.2857, 89.2857, 1117.0000, 271.7143, 448.2857,
304.2857, 260.4286, 805.5714, 39.0000, 11.2857, 457.2857,
18.8571, 526.8571, 112.2857, 508.0000, 436.1429],
dtype=torch.float64)
concatenated tensor: torch.Size([46])
torch.Size([2, 23])
tensor([[6929.0000, 1796.0000, 526.0000, 737.0000, 493.0000, 68.0000,
5561.0000, 3194.0000, 625.0000, 7819.0000, 1902.0000, 3138.0000,
2130.0000, 1823.0000, 5639.0000, 273.0000, 79.0000, 3201.0000,
132.0000, 3688.0000, 786.0000, 3556.0000, 3053.0000],
[ 989.8571, 256.5714, 75.1429, 105.2857, 70.4286, 9.7143,
794.4286, 456.2857, 89.2857, 1117.0000, 271.7143, 448.2857,
304.2857, 260.4286, 805.5714, 39.0000, 11.2857, 457.2857,
18.8571, 526.8571, 112.2857, 508.0000, 436.1429]],
dtype=torch.float64)
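For comparison, a sketch of the stack approach mentioned in the note above, using a three element stand-in for the days tensor (names are mine). With dim=0, stack creates the new leading dimension itself, so no reshape is needed.

```python
import torch

days = torch.tensor([6929.0, 1796.0, 526.0], dtype=torch.float64)
weeks = days / 7

# What the post does: join end to end, then reshape into two rows.
via_cat = torch.cat((days, weeks), dim=0).reshape(2, days.shape[0])

# stack adds a new dimension and joins along it in one step.
via_stack = torch.stack((days, weeks), dim=0)

print(via_stack.shape)
print(torch.equal(via_cat, via_stack))
```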
And now let’s look at indexing and slicing a 2-D tensor. Do note the order of the dimensions.
Let’s get:
- the weekly value for Wilfrid Laurier (index 6)
- the daily value for Stephen Harper (penultimate value)
- number of days for most recent 5 PMs
- days and weeks for first 5 PMs
# weekly value for Wilfrid Laurier (index 6) + daily value for Stephen Harper (penultimate value)
# + days value for last 5 + both values first 5
print(f"\nLaurier weeks: {pm_2_reshp[1, 6]}")
print(f"\nHarper days: {pm_2_reshp[0, -2]}")
print(f"\nDays most recent 5: {pm_2_reshp[0, -5:]}")
print(f"\nFirst 5 (days and weeks):\n{pm_2_reshp[:, :5]}")
Laurier weeks: 794.4285714285714
Harper days: 3556.0
Days most recent 5: tensor([ 132., 3688., 786., 3556., 3053.], dtype=torch.float64)
First 5 (days and weeks):
tensor([[6929.0000, 1796.0000, 526.0000, 737.0000, 493.0000],
[ 989.8571, 256.5714, 75.1429, 105.2857, 70.4286]],
dtype=torch.float64)
Join Along Other Dimension
Let’s try getting our concatenated tensor to have the shape [23, 2] instead of [2, 23] as above.
I couldn’t get cat(..., dim=1) to work, presumably because the 1-D tensors have no second dimension to join along. But stack(..., dim=1) worked just fine. I eventually found a way to swap the dimensions for pm_2_reshp using movedim.
# let's try reshaping the other way
pm_2_reshp2 = torch.stack((pm_tensor, pm_wks), dim=1)
print(f"\n{pm_2_reshp2.shape}\n{pm_2_reshp2}")
pm_2_reshp3 = pm_2_reshp.movedim(0, -1)
print(f"\n{pm_2_reshp3.shape}\n{pm_2_reshp3}")
# weekly value for Wilfrid Laurier (index 6) + daily value for Stephen Harper (penultimate value)
# + days value for last 5 + both values first 5
print(f"\nLaurier weeks: {pm_2_reshp2[6, 1]}, {pm_2_reshp3[6, 1]}")
print(f"\nHarper days: {pm_2_reshp2[-2, 0]}, {pm_2_reshp3[-2, 0]}")
print(f"\nDays most recent 5: {pm_2_reshp2[-5:, 0]},\n{" "*20}{pm_2_reshp3[-5:, 0]}")
print(f"\nFirst 5 (days and weeks):\n{pm_2_reshp2[:5, :]}\n{pm_2_reshp3[:5, :]}")
torch.Size([23, 2])
tensor([[6929.0000, 989.8571],
[1796.0000, 256.5714],
[ 526.0000, 75.1429],
[ 737.0000, 105.2857],
[ 493.0000, 70.4286],
[ 68.0000, 9.7143],
[5561.0000, 794.4286],
[3194.0000, 456.2857],
[ 625.0000, 89.2857],
[7819.0000, 1117.0000],
[1902.0000, 271.7143],
[3138.0000, 448.2857],
[2130.0000, 304.2857],
[1823.0000, 260.4286],
[5639.0000, 805.5714],
[ 273.0000, 39.0000],
[ 79.0000, 11.2857],
[3201.0000, 457.2857],
[ 132.0000, 18.8571],
[3688.0000, 526.8571],
[ 786.0000, 112.2857],
[3556.0000, 508.0000],
[3053.0000, 436.1429]], dtype=torch.float64)
torch.Size([23, 2])
tensor([[6929.0000, 989.8571],
[1796.0000, 256.5714],
[ 526.0000, 75.1429],
[ 737.0000, 105.2857],
[ 493.0000, 70.4286],
[ 68.0000, 9.7143],
[5561.0000, 794.4286],
[3194.0000, 456.2857],
[ 625.0000, 89.2857],
[7819.0000, 1117.0000],
[1902.0000, 271.7143],
[3138.0000, 448.2857],
[2130.0000, 304.2857],
[1823.0000, 260.4286],
[5639.0000, 805.5714],
[ 273.0000, 39.0000],
[ 79.0000, 11.2857],
[3201.0000, 457.2857],
[ 132.0000, 18.8571],
[3688.0000, 526.8571],
[ 786.0000, 112.2857],
[3556.0000, 508.0000],
[3053.0000, 436.1429]], dtype=torch.float64)
Laurier weeks: 794.4285714285714, 794.4285714285714
Harper days: 3556.0, 3556.0
Days most recent 5: tensor([ 132., 3688., 786., 3556., 3053.], dtype=torch.float64),
tensor([ 132., 3688., 786., 3556., 3053.], dtype=torch.float64)
First 5 (days and weeks):
tensor([[6929.0000, 989.8571],
[1796.0000, 256.5714],
[ 526.0000, 75.1429],
[ 737.0000, 105.2857],
[ 493.0000, 70.4286]], dtype=torch.float64)
tensor([[6929.0000, 989.8571],
[1796.0000, 256.5714],
[ 526.0000, 75.1429],
[ 737.0000, 105.2857],
[ 493.0000, 70.4286]], dtype=torch.float64)
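A sketch of why, as far as I can tell, cat(..., dim=1) would not cooperate, and one way to make it work. It uses a three element stand-in for the tensors above; unsqueeze and .T are my additions and not something used earlier in the post.

```python
import torch

days = torch.tensor([6929.0, 1796.0, 526.0], dtype=torch.float64)
weeks = days / 7

# cat joins along an existing dimension; 1-D tensors only have dim 0.
try:
    torch.cat((days, weeks), dim=1)
    cat_failed = False
except (IndexError, RuntimeError) as err:
    cat_failed = True
    print(f"cat on dim=1 failed: {err}")

# Giving each tensor a second dimension first makes cat happy.
cols = torch.cat((days.unsqueeze(1), weeks.unsqueeze(1)), dim=1)

# stack creates the new dimension on its own.
stacked = torch.stack((days, weeks), dim=1)

# Or build the [2, n] version and swap the dimensions afterwards.
rows = torch.stack((days, weeks), dim=0)
swapped = rows.movedim(0, -1)
transposed = rows.T

print(cols.shape, stacked.shape, swapped.shape, transposed.shape)
```

For a 2-D tensor movedim(0, -1) and .T end up giving the same result; movedim is the more general tool once there are more than two dimensions.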
A bit more before I call this post done.
Mathematical Operations on Tensors
Pretty sure PyTorch must provide common numerical operators for use on tensors. Things like median, min, max, etc. Let’s give it a go.
# let's apply some numerical functions to a couple of these tensors
# will use 2-D tensors in both shapes for each calculation
val1, ind1 = torch.median(pm_2_reshp, dim=1)
val2, ind2 = torch.median(pm_2_reshp3, dim=0)
print(f"\nmedian: {val1}")
print(f"{" "*8}{val2}")
val1, ind1 = torch.max(pm_2_reshp, dim=1)
val2, ind2 = torch.max(pm_2_reshp3, dim=0)
print(f"\nmax: {val1}")
print(f"{" "*5}{val2}")
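# these gave the same value for both shapes: unlike median and max, mean returns
# a single tensor (no indices), so the unpacking just splits it into its two elements
# val1, ind1 = torch.mean(pm_2_reshp, dim=1)
# val2, ind2 = torch.mean(pm_2_reshp3, dim=0)
# with keepdim=True the reduced dimension is kept, so the two results differ in shape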
val1 = torch.mean(pm_2_reshp, dim=1, keepdim=True)
val2 = torch.mean(pm_2_reshp3, dim=0, keepdim=True)
print(f"\nmean: {val1}")
print(f"{" "*6}{val2}")
# val1, ind1 = torch.std(pm_2_reshp, dim=1)
# val2, ind2 = torch.std(pm_2_reshp3, dim=0)
val1 = torch.std(pm_2_reshp, dim=1, keepdim=True)
val2 = torch.std(pm_2_reshp3, dim=0, keepdim=True)
print(f"\nstd: {val1}")
print(f"{" "*5}{val2}")
median: tensor([1902.0000, 271.7143], dtype=torch.float64)
tensor([1902.0000, 271.7143], dtype=torch.float64)
max: tensor([7819., 1117.], dtype=torch.float64)
tensor([7819., 1117.], dtype=torch.float64)
mean: tensor([[2484.6957], [ 354.9565]], dtype=torch.float64)
tensor([[2484.6957, 354.9565]], dtype=torch.float64)
std: tensor([[2262.4947], [ 323.2135]], dtype=torch.float64)
tensor([[2262.4947, 323.2135]], dtype=torch.float64)
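The puzzle in the commented-out lines above comes down to return types, at least as I read it. A small sketch with a made-up 2 x 3 tensor (all names are mine): median and max with dim= hand back a (values, indices) pair, while mean hands back one tensor, which Python will happily unpack element by element.

```python
import torch

data = torch.tensor([[1.0, 2.0, 6.0],
                     [10.0, 20.0, 60.0]], dtype=torch.float64)

# median and max along a dimension return (values, indices).
med_vals, med_idx = torch.median(data, dim=1)
max_vals, max_idx = torch.max(data, dim=1)

# mean along a dimension returns a single tensor, one value per row here.
means = torch.mean(data, dim=1)

# Unpacking that tensor iterates over its elements, it does not give indices.
first, second = means

# keepdim=True keeps the reduced dimension with size 1.
kept = torch.mean(data, dim=1, keepdim=True)
kept_t = torch.mean(data.T, dim=0, keepdim=True)

print(med_vals, med_idx)
print(max_vals, max_idx)
print(means, first.item(), second.item())
print(kept.shape, kept_t.shape)
```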
Done
M’thinks this post is probably longer than a post should be.
So until next time, enjoy your time experimenting with new libraries and frameworks.