Dictionaries

While working on the 2020 Advent of Code, I got a lesson in just how useful Python’s dictionary oject, dict, can be.

I was working on Day 17: Conway Cubes. My initial plan was to use multidimensional Numpy arrays. But it was getting out of hand rather quickly. So I checked the web for inspiration.

That inspiration was: why not treat the points in my 3D space as labels rather than indices to nested arrays. I.E. I really only need to determine what is at ‘x, y, z’. Not what is at [x][y][z]. And, in Python, what’s the best way to combine data with a label. A dictionary of course.

And, it turned out to be more efficient because I didn’t have to create array values for locations I wasn’t even remotely interested in. Wow! I only needed to add coordinates to the dictionary for locations that had, at one time or another, a piece of data I was interested. With arrays or nested lists I would have to include every possible location in the space.

So, all I really needed was something like space[‘x,y,z’] to access all the points in my space. And, that could easily be modified to work with just about any number of dimensions. And, way easier than trying to create largish multidimensional array/list structures — with or without Numpy. Using arrays was proving difficult because with each iteration of the simulation the space grew. That would entail a lot of strange co-ordinate conversions or a lot of copying arrays to to new arrays. With a dictionary key, no problem.

Oh yes, and rather than use a string, I used a tuple for the coordinates. The keys to a dictionary must be immutable and hashable. And, a tuple, like a string, is both. Also, personally, I just found it easier to create tuples on the fly than strings. So, that’s what I decided to use.

Comprehensions

Have I mentioned how much I like list and dictionary comprehensions. I know on the inside Python is creating a loop on my behalf. But, it just often looks so much tidier than a loop or nested loops.

Now in that Advent of Code problem, I had to read a file and initialize my space. The initial space was a 2d dimensional slice in a 3D space. Now I could have used nested loops but, to me at least, a comprehension looks so much tidier. Compare the two options. The contents of the file are in the list lines, one line less the line break character ("\n") per list element (row, if you like). And, in this case z = 0 for all elements in the initial space, so only two loops needed.

for i in range(len(lines)):
  for j in range(len(lines[0])):
    cubes[(i, j, 0)] = lines[i, j]

Versus

cubes = [(i, j, 0): lines[i, j] for i in range(len(lines)) for j in range(len(lines[0])]

Yup, just as much typing, but oh so tidy. And, in fact, most Python programmers would likely have written it as:

cubes = {(i, j, 0): lines[i][j] 
         for i in range(len(lines)) 
         for j in range(len(lines[0]))}

Which pretty much looks like the nest loop version. But, I still like the look and feel of it way better. And, personally I think it explains what is happening in a nicer way.

Let’s look at a couple of list comprehensions. During each simulation run, I needed to check the value in each location’s immediate neighbours — in all 3 dimensions. So, I wrote a function to get all the neighbours for a given coordinate. Notice the use of the conditional. Recall that c is a tuple, essentially an immutable list.

def findNeighbors(c):
  neighbors = [(c[0]+i, c[1]+j, c[2]+k) 
               for i in range(-1, 2) 
               for j in range(-1, 2) 
               for k in range(-1, 2) 
               if not (i == 0 and j == 0 and k == 0)]
  return neighbors

Yes, the 3 nested loops are there, but doesn’t this just look better. And, I think once you get used to comprehensions just a bit more informative as well.

And another list comprehension. In one of the CS50 exercises we had to validate charge card numbers. Generating the checksum involved going over every second digit of the card (from different start points), multiplying each digit by a number (again different for each start point) and then summing the digits (with a caveat, multi-digit results had to be summed by their individual digits as well). So, I wrote a function to do this. Note that the input card number is a string — for a variety of reasons.

# sum every 2nd digit of n multiplied by mult, starting at position bgn (from the right)
# I am also assuming that mult will be 1 or 2 as used in the checksum algorithm.
def get_sum_2nd(n, bgn, mult):
    tmp_num = [int(n[i]) * mult for i in range(len(n) - bgn, -1, -2)]
    sum_digits = 0
    for nbr in tmp_num:
        if nbr < 10:
            sum_digits += n
        else:
            sum_digits += (nbr % 10) + (nbr // 10)
    return sum_digits

Again, I could have used a loop, but look how neat and tidy the comprehension is. And, it might have been possible to use reduce() instead of the for loop. But, I was too lazy to check the manual. Also, with Python 3.0 reduce() was moved from a built-in to functools and now needs to be imported. So…

I could probably have converted the result of the multiplications back to strings, then joined them together into one long string and iterated over each character (digit) converting to an int and adding to the sum. Would have saved the modulus and division operations for multiple digit numbers. Something like the following, which I have confirmed does work. Wish I had thought of it sooner. Though maybe should look at which is actually faster.

def get_sum_2nd(n, bgn, mult):
    tmp_num = [str(int(n[i]) * mult) for i in range(len(n) - bgn, -1, -2)]
    sum_digits = 0
    all_dig = ''.join(tmp_num)
    for d in all_dig:
      sum_digits += int(d)
    return sum_digits

Order and Iterating Over Dictionaries

Order

Values in a list are ordered by their position within the list. And, the order in which they are added to the list is preserved. No such ordering applies to the values in a dictionary, since the underlying storage mechanism is different. That is, something like items in consecutive memory locations versus a hash table. If you care about the order of the items in your dictionary, have a look at ordered dictionaries in the collections module.

Also, in Python 3.5 and earlier there was no guarantee regarding the order in which the keys would be returned. Since Python 3.6 dictionaries preserve the order in which keys are created (i.e. added to the dict). And, by default, return them in that order.

Iterating Over Dictionaries

We can in fact iterate over the dictionary’s keys, its values or both using specific object methods. But, beware these objects return an iterable, not a list. If you need a list you will have create it using list(). So, let’s look at what these functions provide using my cubes dictionary from above.

Okay the basic setup code is as follows. The Conway Cubes challenge had two parts to it. Hence the way I wrote the code. Also, in order to allow putting function definitions after the “main program” code, CS50 suggested setting up Python modules in a similar fashion to C. I.E. a main() function at the top. Any functions called in main() defined after that. Then a call to the main() function at the bottom of the module. This allows the actually code, main(), to call functions declared after it. If you tried to do that without putting the main code in a function, Python would exit with numerous exceptions.

fl_cc = '../advent_code_2020/d17_init_array.txt'

DEBUG = False
do_dev = True

def part1():
  with open(fl_cc) as fp:
    lines = [fline.rstrip() for fline in fp.readlines()]

  cubes = {(i, j, 0):lines[i][j] 
           for i in range(len(lines)) 
           for j in range(len(lines[0]))}

if __name__ == "__main__":
  part1()

Let’s first try a for loop on the list. So after the cubes comprehension, add:

for coord in cubes:
  print(coord, end=' ')
print()

And, the output I get:

0, 0, 0) (0, 1, 0) (0, 2, 0) (0, 3, 0) (0, 4, 0) (0, 5, 0) (0, 6, 0) (0, 7, 0) (1, 0, 0) (1, 1, 0) (1, 2, 0) (1, 3, 0) (1, 4, 0) (1, 5, 0) (1, 6, 0) (1, 7, 0) (2, 0, 0) (2, 1, 0) (2, 2, 0) (2, 3, 0) (2, 4, 0) (2, 5, 0) (2, 6, 0) (2, 7, 0) (3, 0, 0) (3, 1, 0) (3, 2, 0) (3, 3, 0) (3, 4, 0) (3, 5, 0) (3, 6, 0) (3, 7, 0) (4, 0, 0) (4, 1, 0) (4, 2, 0) (4, 3, 0) (4, 4, 0) (4, 5, 0) (4, 6, 0) (4, 7, 0) (5, 0, 0) (5, 1, 0) (5, 2, 0) (5, 3, 0) (5, 4, 0) (5, 5, 0) (5, 6, 0) (5, 7, 0) (6, 0, 0) (6, 1, 0) (6, 2, 0) (6, 3, 0) (6, 4, 0) (6, 5, 0) (6, 6, 0) (6, 7, 0) (7, 0, 0) (7, 1, 0) (7, 2, 0) (7, 3, 0) (7, 4, 0) (7, 5, 0) (7, 6, 0) (7, 7, 0)

So, in this situation Python generates an iterable of the dictionaries keys. Let’s change the loop to read as follows, and run the script again. Note the specific request for the dict’s keys.

  for coord in cubes.keys():
    print(coord, end=' ')
  print()

Same output (trust me). Now let’s have a look at the values() method.

  for value in cubes.values():
    print(value, end=' ')
  print()

And, I get:

. . # # . # . # # # . . . . # . . . . . # # # # # . . # # . . . # . . # . # # . . # . . # . . . # # . . . # # . . # . . . # . . 

In this case likely not of much use. And finally, let’s have a look at items(), probably the one of the three I have used most. And, I’ve done something more reasonable with the output.

  r_ndx = -1
  for coord, value in cubes.items():
    if coord[0] != r_ndx:
      r_ndx = coord[0]
      if r_ndx > 0:
        print()
    print(f"{coord}: '{value}'", end=' ')
  print()

And:

(0, 0, 0): '.' (0, 1, 0): '.' (0, 2, 0): '#' (0, 3, 0): '#' (0, 4, 0): '.' (0, 5, 0): '#' (0, 6, 0): '.' (0, 7, 0): '#' 
(1, 0, 0): '#' (1, 1, 0): '#' (1, 2, 0): '.' (1, 3, 0): '.' (1, 4, 0): '.' (1, 5, 0): '.' (1, 6, 0): '#' (1, 7, 0): '.'
(2, 0, 0): '.' (2, 1, 0): '.' (2, 2, 0): '.' (2, 3, 0): '.' (2, 4, 0): '#' (2, 5, 0): '#' (2, 6, 0): '#' (2, 7, 0): '#'
(3, 0, 0): '#' (3, 1, 0): '.' (3, 2, 0): '.' (3, 3, 0): '#' (3, 4, 0): '#' (3, 5, 0): '.' (3, 6, 0): '.' (3, 7, 0): '.'
(4, 0, 0): '#' (4, 1, 0): '.' (4, 2, 0): '.' (4, 3, 0): '#' (4, 4, 0): '.' (4, 5, 0): '#' (4, 6, 0): '#' (4, 7, 0): '.'
(5, 0, 0): '.' (5, 1, 0): '#' (5, 2, 0): '.' (5, 3, 0): '.' (5, 4, 0): '#' (5, 5, 0): '.' (5, 6, 0): '.' (5, 7, 0): '.'
(6, 0, 0): '#' (6, 1, 0): '#' (6, 2, 0): '.' (6, 3, 0): '.' (6, 4, 0): '.' (6, 5, 0): '#' (6, 6, 0): '#' (6, 7, 0): '.'
(7, 0, 0): '.' (7, 1, 0): '#' (7, 2, 0): '.' (7, 3, 0): '.' (7, 4, 0): '.' (7, 5, 0): '#' (7, 6, 0): '.' (7, 7, 0): '.'

All of the above being fairly self-explanatory.

Done for This One, M’thinks

I was planning on discussing sorting dictionaries, but I am running a little short of time. Still bumbling my way through CS50. But, now, slowly though it may be, closer to the finish than to the start. Just finished Week 7 — covering SQL. And, it seems to take me more time for each exercise than I typically anticpate. My goal is to get it finished as quickly as I can, so the blog is suffering. And, expect it will be suffering for another month or so. Maybe more as it is time I was getting outside and working on the flower beds and such.

So, I will look at a short post next week looking at sorting of dictionaries. See you then.

Resources