Merge and New Branch

I have decided to leave any further enhancements of the name search facility for the future, if at all. Instead I am going to work on providing percentage based population plots as well as absolute numerical population plots. So, I merged the search-names branch into the master branch and removed the branch locally and remotely. Then I created a new branch, percent-pop, for the next set of code development.

Percentage Population

Well, seems to me producing a percentage plot should be simple enough for our first two plot types. We just need to sum the age-group populations to get the total population for that year. Then generate and use percentages in the plots. For the many country, many year, one age group plot that is not possible, as for any given year we only have the value for a single age group. We’ll work on the easy ones first. Then tackle the apparently more difficult situation.

Determining What to Plot: Numeric Count or Percentage

But first, how and where to determine if we are doing plots by age group count or percentage? I thought about adding another query to each plot’s user interface to ask what type of plot the user wanted. But that just struck me as too messy. So, I have decided to add an item to the menu that toggles a default value indicating the type of plot. Then use that global value in the functions to determine what values to plot.

So let’s add the new menu item and the related code. You will likely want to let the user know the current state in some fashion. I will include it in the menu item’s text. See you back here when you’re done.

Let’s start by adding the variable and a new menu item. I was originally going to use ‘P’ (i.e. Percent) for the menu item. But then, keeping things alphabetical would have put it after the Plot chart selection. Didn’t think that was the best choice. So, I went with ‘%’ which I figured could safely ignore alphabetical positioning. Thought having it above the Plot chart selection was a logical place for it.

# default state is numeric count for population values in plotted charts
do_percent = False

MENU_M = {
  'A': 'About',
  '%': 'Toggle percentage plots',
  'C': 'Plot chart',
  'S': 'Search country/region names',
  'X': 'Exit the application'
}

Then we need to add the code to print the menu item. Which has to account for the current state of do_percent. So we had to add an if control block to cover printing the menu option for ‘%’. Note the use of the ternary operator to determine the text based on the current state of the do_percent variable.

    for key, value in MENU_M.items():
      if key == '%':
        print(f"\t\t{CMITEM}{key}{TRESET}: {value} {TRED+'Off'+TRESET if do_percent else TGREEN+'On'+TRESET}")
      else:
        print(f"\t\t{CMITEM}{key}{TRESET}: {value}")

And finally we need to toggle the do_percent variable’s state whenever the user selects the ‘%’ menu item. The code I used is a pretty efficient way to toggle a boolean value. And, I believe it is a well understood programming idiom. You may have chosen to be much more explicit in your code. You could of course also use something like do_percent ^= True, but that might not be as obvious to everyone reading your code.

    elif u_choice.upper() == '%':
      do_percent = not do_percent
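
For anyone who wants to compare the two idioms side by side, here is a minimal sketch. The function names toggle_not and toggle_xor are made up for illustration and are not part of the project code.

```python
def toggle_not(flag: bool) -> bool:
    """Flip a boolean with `not` (the form used in the menu code)."""
    return not flag


def toggle_xor(flag: bool) -> bool:
    """Flip a boolean with XOR; equivalent to `flag ^= True`."""
    flag ^= True
    return flag


# Both forms flip the value, and flipping twice restores the original.
for start in (False, True):
    print(start, toggle_not(start), toggle_xor(start), toggle_not(toggle_not(start)))
```

Both give the same result for a bool. The not version reads closer to plain English, which is why I stuck with it.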

And, it seems to work! <img src="img/pcent_toggle_1.jpg" alt="view of terminal window illustrating toggling of percent plot global variable" loading="lazy" style="width:800px;height:407px">

Modify Code to Plot Percentages for First Two Chart Types

Been looking at the code and thinking about how to tackle this. I am fairly certain we should be looking after this in chart.plot_population(). That is, after all, where the calls to the database.population module are made. We don’t currently import database.population in our main module, population_by_age.py. But, that means the functions in chart.chart.py will need access to our do_percent global variable, which lives in the main module. And, that isn’t as straightforward as it sounds.

Duplicate do_percent

Done a bit of googling. For now, I am going to duplicate the variable in both modules, population_by_age.py and chart.chart.py. I may eventually just have the variable in chart.chart and alter it when needed from the main module menu code, as it will already be imported in the main module.

So, I will start by doing just that. I originally planned to update the value in the chart.chart module whenever the menu updated the value in the main module. But, I have since decided to update chart.do_percent only when Plot chart is selected from the menu. Seems like the most logical time to do so. I temporarily added some print statements to check the values of both variables before and after the code to sync the values. That is not shown below.

PS R:\learn\py_play> git diff HEAD
diff --git a/population/chart/chart.py b/population/chart/chart.py
index 36c74b1..5673c66 100644
--- a/population/chart/chart.py
+++ b/population/chart/chart.py
@@ -20,6 +20,8 @@ from database import population as pdb

 # pylint: disable=unused-variable

+do_percent = False
+
 def plot_bar_chart(country, year, pop_data):
   # define the x-labels for the chart
   x_labels = pop_data.keys()

diff --git a/population/population_by_age.py b/population/population_by_age.py
index 3305282..d9def57 100644
--- a/population/population_by_age.py
+++ b/population/population_by_age.py
@@ -456,6 +456,9 @@ if __name__ == '__main__':
       do_percent = not do_percent

     elif u_choice.upper() == 'C':
+      chart.do_percent = do_percent
       while True:
         c_choice = do_chart_menu()
         if c_choice.upper() == 'Q':
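
To illustrate why the value has to be assigned through the module (chart.do_percent = ...) rather than just changed in the main module, here is a small self-contained sketch. It builds a stand-in chart module in memory with types.ModuleType, and the y_label() function is made up for the demo. It is not the project's actual chart module.

```python
import types

# Stand-in for chart/chart.py, built in memory for this demo only.
chart = types.ModuleType("chart")
exec(
    "do_percent = False\n"
    "def y_label():\n"
    "    return 'Population (% Annual Total)' if do_percent else 'Population (1000s)'\n",
    chart.__dict__,
)

# The main module has its own, separate name with the same spelling.
do_percent = chart.do_percent
do_percent = True                # rebinding our name...
label_before = chart.y_label()   # ...leaves chart.do_percent untouched

# Assigning through the module object changes what chart's functions see.
chart.do_percent = do_percent
label_after = chart.y_label()

print(label_before)
print(label_after)
```

Functions look up globals in the module where they were defined. So only the assignment through the module object is visible to the plotting code.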

Percentage Type 1 Plots

Okay on to the real, working code. During testing I am going to set the default value of do_percent to True in the main module. Save the extra menu selection step every time I restart the program after code updates/fixes.

I believe all the necessary code changes can be made in four functions, chart.plot_population() and three plotting functions it calls. Let’s start with the plot for one country/region with possibly multiple years of data for all age groups. In this case we know that for each year in the plot we have data for every age group, so we also, with a simple sum, have the total population for each year. So, it should not be difficult to get the percentages for each age group. Which we can then pass to the appropriate plotting function.

The data we will need to modify is that returned by the call to get_1cr_years_all() in the database.population module. That looks something like the following:

{
  '2015': [14627.451, 14960.24, 16256.559, 17253.819, 17072.244, 17277.848, 17497.092, 15860.259, 14045.635, 13156.121, 12031.018, 9991.214, 8057.85, 5983.552, 4179.927, 3058.838, 1788.973, 946.406, 335.614, 79.815, 11.284],
  '2016': [14700.497, 14779.413, 15978.02, 17133.065, 17082.075, 17175.157, 17551.664, 16214.343, 14292.544, 13254.696, 12246.542, 10308.514, 8331.458, 6256.742, 4346.959, 3135.721, 1890.192, 998.754, 382.348, 91.514, 12.838],
  '2017': [14704.398, 14667.991, 15693.903, 16955.659, 17132.634, 17088.109, 17507.225, 16577.206, 14597.442, 13347.404, 12424.777, 10653.693, 8598.181, 6544.496, 4554.001, 3196.301, 2006.585, 1039.174, 422.218, 107.84, 14.588]
}

So a loop on the keys of the dictionary, converting each list element into a percentage, should do the trick. And, alas, still pretty puny numbers for the eldest age groups. Ah well. It may still be of interest to some.

{
  '2015': [7.153775695742902, 7.316531179251997, 7.950515552614775, 8.438240608083193, 8.349438613671827, 8.449992353222727, 8.557216940653404, 7.756699055931729, 6.869229799113725, 6.434199551244629, 5.883950946986278, 4.886354012340648, 3.9408131662818042, 2.9263464202897573, 2.0442563904387403, 1.4959708934670044, 0.8749242480962861, 0.462854139187016, 0.1641370923991513, 0.03903473046368228, 0.005518610518726942],
  '2016': [7.130519543714953, 7.168797982893698, 7.750185852891123, 8.310443845962395, 8.285710995669373, 8.33086069504131, 8.513486528837642, 7.8648150229205, 6.932640734623183, 6.429229492989279, 5.940221413869613, 5.0001752011281795, 4.041198341569016, 3.0348512101993688, 2.1085053182370372, 1.5209907443358817, 0.9168432194757534, 0.484448581320991, 0.18545902811995574, 0.04438913633488243, 0.0062271098658917825],
  '2017': [7.075074521676152, 7.057557161352344, 7.551178447492847, 8.158276931101085, 8.243429095336142, 8.222005729818042, 8.423664915949075, 7.976182895156743, 7.023612253683923, 6.4221519283494874, 5.978226595213748, 5.126063093916498, 4.137046027036264, 3.1489080278438797, 2.1911741267332205, 1.5379118389415196, 0.9654756630688002, 0.5000023456239618, 0.20315172470121262, 0.051887607803975125, 0.007019069201079275]
}
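
As a quick sanity check on the conversion, the percentages for any one year should add up to 100 (give or take floating point rounding). A minimal sketch using the 2015 counts shown earlier; the variable names are just for this check and are not in the project code.

```python
import math

# 2015 age-group counts (1000s), copied from the sample data above.
pop_2015 = [
    14627.451, 14960.24, 16256.559, 17253.819, 17072.244, 17277.848,
    17497.092, 15860.259, 14045.635, 13156.121, 12031.018, 9991.214,
    8057.85, 5983.552, 4179.927, 3058.838, 1788.973, 946.406,
    335.614, 79.815, 11.284,
]

total_2015 = sum(pop_2015)
pct_2015 = [agp / total_2015 * 100 for agp in pop_2015]

# The percentages should sum to 100, within floating point tolerance.
print(math.isclose(sum(pct_2015), 100.0))
```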

The affected area of code for the loop looks like (showing the existing line before and after):

    dbg_data = pdb.get_1cr_years_all(p_nms[0], yr_list)
    if do_percent:
      for p_yr, y_data in dbg_data.items():
        yr_tot_pop = sum(y_data)
        dbg_data[p_yr] = [agp/yr_tot_pop*100 for agp in y_data]
    plot_data[1].append(dbg_data)

We also need to change the y-axis text for the percentage case. Other than that the chart looks fine.

  if do_percent:
    ax.set_ylabel('Population (% Annual Total)')
  else:
    ax.set_ylabel('Population (1000s)')

Percentage Type 2 Plots

Well, looks to me like we need to do pretty much the same thing to get a Type 2 percentage chart. Only real difference is that the key for the dictionary of population data is a country/region name, rather than a year. So, in the section of plot_population() for this chart type, I now have:

    dbg_data = pdb.get_crs_1yr_all(p_nms, p_yrs[0])
    if do_percent:
      for p_cr, cr_data in dbg_data.items():
        yr_tot_pop = sum(cr_data)
        dbg_data[p_cr] = [agp/yr_tot_pop*100 for agp in cr_data]
    plot_data[0].append(dbg_data)

And, in plot_m1a(), I now have:

  # Add some text for labels, title and custom x-axis tick labels, etc.
  if do_percent:
    ax.set_ylabel('Population (% Annual Total)')
  else:
    ax.set_ylabel('Population (1000s)')
  ax.set_xlabel('Age Groups')

Done for Now

Getting to be a lengthy post, so think I will call it quits for today. And, the last chart type is going to get somewhat complicated I think. Also, thinking that since the code for the first two chart types is duplicated, it should likely go into a separate function. And, likely should also write a separate function to deal with the Type 3 Chart percentage case rather than including that code in plot_population(). Something to think about for next time.
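
A rough sketch of what that separate function might look like, just to get the idea down. The name to_percentages is hypothetical, and unlike the loops above it returns a new dictionary rather than modifying the one passed in. The zero-total guard is my own addition, as I don't know whether the database could ever return an all-zero list.

```python
def to_percentages(pop_by_key):
    """Return a new dict with each list rescaled to percentages of its own sum.

    Works for either chart type, since it doesn't care whether the keys
    are years or country/region names.
    """
    pct = {}
    for key, counts in pop_by_key.items():
        total = sum(counts)
        if total:
            pct[key] = [agp / total * 100 for agp in counts]
        else:
            # Assumption: treat an empty or all-zero list as all zeros
            # rather than raising ZeroDivisionError.
            pct[key] = [0.0 for _ in counts]
    return pct


demo = to_percentages({'2015': [1.0, 3.0], 'Nowhere': [0.0, 0.0]})
print(demo)
```

In plot_population() the two loops could then each become a single line, something like dbg_data = to_percentages(dbg_data) inside the if do_percent: block.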

Note, I don’t think it would make sense to move the duplicated y-label code into a function.

Also, while working on this I was reminded that the charting sub-menu doesn’t exit after each plot. So, if you wanted to switch between population counts and percentages you would need to exit the charting sub-menu to do so. Doesn’t sound very user friendly. So, I will look at moving that menu item from the main menu to the charting sub-menu. Which will mean changing how we sync the variable in the chart module with the one in the main module.

Anyway, until next time. (Still posting on the semi-weekly schedule.)

Resources

More Pylint Corrections/Exceptions

I also decided to get rid of the Pylint warnings regarding the two occurrences of a bare-except. Modified code to check for a ValueError on conversion to int and committed the changes. Warnings gone. And, was getting wrong-import-position and unused-variable complaints in the chart.chart.py module. So used Pylint pragmas to disable those messages for now.
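
For reference, the general shape of the bare-except fix is shown below. This is a generic sketch, not the actual project code. The function name parse_int and its default parameter are made up for illustration.

```python
def parse_int(text, default=None):
    """Convert user input to an int, returning `default` if it isn't one."""
    try:
        return int(text)
    except ValueError:  # catch only the expected failure, not everything
        return default


print(parse_int("2015"))
print(parse_int("20x5"))
```

Catching only ValueError means something like a KeyboardInterrupt or a genuine bug elsewhere no longer gets silently swallowed.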

I will at some point re-enable the unused-variable message and do something about any remaining unused variables.

PS R:\learn\py_play> git diff percent-pop~1 percent-pop
diff --git a/population/chart/chart.py b/population/chart/chart.py
index 81e33cc..36c74b1 100644
--- a/population/chart/chart.py
+++ b/population/chart/chart.py
@@ -14,8 +14,12 @@ if __name__ == '__main__':
   except ValueError: # Already removed
     pass

+# import here so above code can run when testing module
+# pylint: disable=wrong-import-position
 from database import population as pdb

+# pylint: disable=unused-variable
+
 def plot_bar_chart(country, year, pop_data):
   # define the x-labels for the chart
   x_labels = pop_data.keys()