Another Step Sideways

I have been doing a little reading about project layout best practices and have decided to modify my project layout to better reflect what I have been reading. So, sorry, we will be taking a short, tangential step in our project development. I will also have to figure out how to make git happy with the changes, and then how to get the application code to work nicely with the various new folders and/or packages.

Current Layout

The project folder currently looks like this:

py_play
│
├── .gitignore
├── base-3.8.yml
├── conda_requirements.txt
├── cr_list.txt
├── file_access.test.py
├── fixes.test.py
├── LICENSE
├── play_pop_csv.test.py
├── population_by_age.py
├── pyPlay.code-workspace
├── rc_name_db.py
├── README.md
├── requirements.txt
├── WPP2019_PopulationByAgeSex_Medium.csv
└── WPP2019_Region_Blocks.csv

Proposed Structure

I am going to aim for something like the following. It may end up being adjusted as I go, but figured I should have a decent starting point.

py_play
│
├── population/
│   ├── __init__.py
│   ├── population_by_age.py
│   │
│   ├── database/
│   │   ├── __init__.py
│   │   ├── population.py
│   │   └── rc_names.py
│   │
│   ├── chart/
│   │   ├── __init__.py
│   │   └── chart.py
│   │
│   └── play/
│       ├── file_access.test.py
│       ├── fixes.test.py
│       ├── play_pop_csv.test.py
│       └── WPP2019_Region_Blocks.csv
│
├── data/
│   ├── WPP2019_PopulationByAgeSex_Medium.csv
│   └── cr_list.txt
│
├── tests/
│
├── .gitignore
├── LICENSE
├── pyPlay.code-workspace
├── README.md
├── requirements.txt
└── setup.py

Given the database/ and chart/ folders, and the presence of __init__.py files, you have hopefully realized that I am going to be modifying the current code modules and moving various functions and variables into individual packages. In fact, the bulk of the application, population/, will itself be a package. That way, if I do start doing standalone testing, I can import the population package in the test files.
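To make the package idea a little more concrete, here is a minimal sketch of how a test file might import from the population package once the __init__.py files are in place. It builds a throwaway copy of the skeleton in a temporary directory so it can run anywhere. The helper names (build_demo_layout, import_rc_names) and the get_names() function are hypothetical, made up purely for illustration; the real rc_names.py will contain whatever ends up being moved into it.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_layout(root: Path) -> None:
    """Create a throwaway copy of the proposed package skeleton under root."""
    for pkg in ("population", "population/database", "population/chart"):
        pkg_dir = root / pkg
        pkg_dir.mkdir(parents=True, exist_ok=True)
        # The (empty) __init__.py marks each directory as a package.
        (pkg_dir / "__init__.py").touch()
    # Hypothetical placeholder content, only so there is something to import.
    (root / "population" / "database" / "rc_names.py").write_text(
        "def get_names():\n    return ['Canada', 'Europe']\n"
    )


def import_rc_names(root: Path):
    """Import population.database.rc_names the way a test file might."""
    sys.path.insert(0, str(root))  # make the project root importable
    try:
        importlib.invalidate_caches()
        return importlib.import_module("population.database.rc_names")
    finally:
        sys.path.remove(str(root))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_layout(Path(tmp))
        rc_names = import_rc_names(Path(tmp))
        print(rc_names.get_names())
```

In the real project the sys.path juggling should not be needed if the tests are run from the project root, or if the package is installed via setup.py.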

I have also decided to put all application data in a folder of its own. It’s a central location for any files that the application will use or produce, e.g. the CSV of population data or the file listing country and region names. I do not currently track these files in version control as they are either downloaded or generated by the application. So, no need to have git or GitHub track them and waste space.
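With the data in its own folder, the code needs a reliable way to find it regardless of the directory the script is launched from. One possible approach, sketched below, is to build the path relative to the module's own location. The data_path helper is hypothetical (it is not in the project yet), and it assumes the calling module sits directly in population/, one level below the project root.

```python
from pathlib import Path


def data_path(filename: str, module_file: str) -> Path:
    """Return the path of filename inside the project's data/ folder.

    Assumes module_file lives directly in population/, one level below
    the project root, so the root is two parents up from the file.
    """
    project_root = Path(module_file).resolve().parent.parent
    return project_root / "data" / filename


if __name__ == "__main__":
    # Pretend this code sits in py_play/population/population_by_age.py.
    fake_module = Path("py_play") / "population" / "population_by_age.py"
    print(data_path("cr_list.txt", str(fake_module)))
```

Inside a real module, the second argument would simply be __file__. Modules deeper in the tree, such as those in population/database/, would need one more .parent.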

The play/ folder is for the bits and pieces of code I play with (i.e. test) prior to including them in the project. Or, maybe, never including them. I have not been tracking those files via git, and I will not track the new directory either.

The tests/ folder is currently optional. I haven’t started doing proper testing—unit tests, execution tests, integration tests, and so on. Down the road perhaps.

If I had any documentation files, I would also put them in a separate directory, e.g. docs/, off the root.

I should likely rename the root directory to something more meaningful as well, but chose not to do so.

And, it should be clear there will be new files, new imports, and numerous other code changes to complete the process. That will likely be covered in the next post.

Is this all really necessary? Don’t know. But experience tells me it always pays to be properly organized. That very likely also includes one’s project layouts. So, I am going to make the changes, for better or worse. It will take some time, but time well spent I am sure.

Making the Changes

First I will move the data and play files into their new directories and see how git sees the changes.

Well, looks like everything is ok. The files I moved weren’t being tracked (they are listed in .gitignore) and git doesn’t see that anything has changed. Lovely. At some point I will modify .gitignore to include the directories and remove the individual files. Now the hard stuff.

Ok, I have created py_play/population and copied population_by_age.py to that directory. A git status gave me the following:

PS R:\learn\py_play> git status
On branch master
Your branch is up to date with 'origin/master'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        population/

nothing added to commit but untracked files present (use "git add" to track)

So, I ran a git add population/, followed by a git rm population_by_age.py. Now status shows:

PS R:\learn\py_play> git status
On branch master
Your branch is up to date with 'origin/master'.

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        renamed:    population_by_age.py -> population/population_by_age.py

I have no real idea of how one should go about these changes with respect to version control (in my case, git). However, the mantra of commit small and commit often likely still applies. It is going to be messy, but such is life and programming. So I have committed this single visible change.

git commit -m "applicaton layout restructure, step 1" -m "created new directory 'py_play/population', moved 'population_by_age.py' to that directory from its original location in 'py_play'"

Now let’s move and rename rc_name_db.py.

PS R:\learn\py_play> md population/database
PS R:\learn\py_play> git mv rc_name_db.py population/database/rc_names.py
PS R:\learn\py_play> git status
On branch master
Your branch is up to date with 'origin/master'.

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        renamed:    rc_name_db.py -> population/database/rc_names.py

And another commit.

git commit -m "layout restructure, step 2" -m "moved rc_name_db.py to population/database/rc_names.py"

Now I am going to add the new chart folder and all the “empty” files (3 x __init__.py, database/population.py, chart/chart.py). And commit that change. I will leave the details to you.
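If you would rather not create those files by hand, here is one way it could be scripted. This is only a sketch: the create_placeholders function and the EMPTY_FILES list are my own hypothetical names, and the file list is simply the five files mentioned above. Existing files are left untouched.

```python
from pathlib import Path
from typing import List

# Relative paths of the placeholder files named in the post.
EMPTY_FILES = [
    "population/__init__.py",
    "population/database/__init__.py",
    "population/database/population.py",
    "population/chart/__init__.py",
    "population/chart/chart.py",
]


def create_placeholders(project_root: Path) -> List[Path]:
    """Create any missing placeholder files and return the ones created."""
    created = []
    for rel in EMPTY_FILES:
        target = project_root / rel
        # Create population/chart/ (and any other missing folder) as needed.
        target.parent.mkdir(parents=True, exist_ok=True)
        if not target.exists():
            target.touch()  # create an empty file, never overwrite
            created.append(target)
    return created


if __name__ == "__main__":
    # Run from the project root, e.g. R:\learn\py_play.
    for path in create_placeholders(Path.cwd()):
        print("created", path)
```

After that, a git add population/ followed by a commit records the new files.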

And, once that is done, I think that will be it for this post. The hard work of splitting the code and fixing all the errors comes next.

So as not to keep you waiting too long, that post will be published this coming Thursday, a little ahead of schedule.