As I mentioned last post, I wanted to code an utility module to allow me to add individual temperature/weather data to the database as it became available. Similar to the one I wrote for adding rainfall data. Thought about including the temperature option in the rainfall utility module. But figured that would get a little messy. So going with a separate module.
After that I am going to play around with extracting data I may be interested in. Think there are going to be some curve balls when it comes to querying the table. E.G. monthly average, minimum, and maximum temperatures. Last few years don’t have anything in the low and high columns. And in earlier years not every data row had one or both.
Adding Temperature Data on Daily Basis
Well, in fact, could be more often than daily as I sometimes record additional data as the day progresses. Depends on what I am doing and my mood. But I do try to record the temperature/weather in my daily note file at least once per day.
For now I have coded the insert query in the module. Last time I created a new method in the Weather_db class. Maybe later for the temperature table data. Not much else to say. Again data passed via command line parameters.
# get_daily_tw.py: utility module to obtain daily temperature/weather data
# and write it to the database
# ver: 0.1.0, 2025.09.15, rek, init version
# will initially use argparse to get data as command line parameters
# may eventually modify to also allow ui like interface
# Had thought about modifying get_daily_rf.py to handle this data as well,
# but decided not to at this time. Though down the road who knows.
import argparse, time
from pathlib import Path
import pandas as pd
if __name__ == "__main__":
from weather_db import Weather_db
def get_tw_parser():
"""Create and return parser for rainfall data.
"""
# determine default date for optional parameter
def_dt = time.strftime("%Y.%m.%d", time.gmtime())
# instantiate and set up command paramter parser
parser = argparse.ArgumentParser()
parser.add_argument("-d", "--date", help="Supply date rainfall recorded, yyyy.mm.dd",
default=def_dt)
parser.add_argument("-t", "--time", help="Supply time rainfall recorded (24 hour format), hh:mm",
default="06:00")
# parser.add_argument("-u", "--history", help="Update history table as well",
# action="store_true")
parser.add_argument("-lt", "--low", help="Supply the low temperature for the day", type=float)
parser.add_argument("-ht", "--high", help="Supply the high temperature for the day", type=float)
parser.add_argument("-hu", "--humidity", help="Supply the humidity at time temperature recorded, float", type=int)
parser.add_argument("-w", "--weather", help="Supply the weather condition at the time the temperature was recorded")
parser.add_argument("temperature", help="Supply amount of rainfall recorded (mm), 2 decimals maximum",
type=float)
return(parser)
if __name__ == "__main__":
# instantiate database class
cwd = Path(__file__).cwd()
fl_pth = cwd/"../data"
fl_nm = "weather.db"
db_pth = fl_pth/fl_nm
wdb = Weather_db(db_pth)
tw_dt, tw_tm, tw_t, tw_l, tw_h, tw_u, tw_w = "", "", 0.0, "", "", "", ""
# get parser and provided command line args
tw_parse = get_tw_parser()
tw_args = tw_parse.parse_args()
print(f"args: {tw_args}")
# get data from parser
tw_t = round(tw_args.temperature, 2)
tw_dt = tw_args.date
tw_tm = tw_args.time
tw_l = tw_args.low
tw_h = tw_args.high
tw_u = tw_args.humidity
tw_w = tw_args.weather
d_data = (f"{tw_dt} {tw_tm}", tw_t, tw_l, tw_h, tw_w, tw_u)
print(d_data)
in_tmpr_tbl = f"""INSERT INTO {wdb.tnms["tp_tnm"]} VALUES (NULL, ?, ?, ?, ?, ?, ?);"""
wdb.qry_nsrt(in_tmpr_tbl, d_data)
# quick check
d_qry = f"""SELECT * FROM {wdb.tnms["tp_tnm"]} WHERE datetime LIKE '{tw_dt[:7]}%';"""
d_rslt = wdb.qry_pd(d_qry)
print("\n", d_rslt)
I then entered all the data for the current month. Up to the morning of 2025.09.16. Will do an afternoon update later today (this is a draft post being written long before it is set to be published).
I typically only have the temperature data for any additional daily entries, not weather condition.
Here’s the output for the last entry; somewhat truncated.
(dbd-3.13) PS R:\learn\dashboard\utils> python get_daily_tw.py 13 -d 2025.09.16 -t 07:31 -hu 85 -w "mainly clear"
args: Namespace(date='2025.09.16', time='07:31', low=None, high=None, humidity=85, weather='mainly clear', temperature=13.0)
('2025.09.16 07:31', 13.0, None, None, 'mainly clear', 85)
row_id datetime temperature dmin dmax condition humidity
0 5200 2025.09.01 06:20 17.0 None None mostly cloudy 82.0
1 5201 2025.09.01 14:40 22.0 None None None 70.0
2 5202 2025.09.02 07:30 16.0 None None mostly cloudy 85.0
3 5203 2025.09.02 17:04 27.0 None None None 50.0
4 5204 2025.09.03 06:47 17.0 None None mainly clear 86.0
5 5205 2025.09.03 15:08 27.0 None None None 64.0
... ...
22 5222 2025.09.12 06:31 15.0 None None mostly cloudy 82.0
23 5223 2025.09.12 14:25 21.0 None None None 71.0
24 5224 2025.09.13 08:04 15.0 None None partly cloudy 86.0
25 5225 2025.09.13 16:58 22.0 None None None 60.0
26 5226 2025.09.14 06:42 16.0 None None mostly cloudy 82.0
27 5227 2025.09.15 08:06 14.0 None None mostly cloudy 86.0
28 5228 2025.09.15 15:01 20.0 None None None 62.0
29 5229 2025.09.16 07:31 13.0 None None mainly clear 85.0
Big Fix
When I started working on the queries to investigate the temperature table, I started getting counts I didn’t expect. Turns out I was getting strings, blank or otherwise, in places I wasn’t expecting them. Should have read the SQLite docs. Python SQLite3 is more than happy to put an empty string into a float column. I assumed an empty string would be treated as a NULL in that situation. No such luck. So, I am going to rework all my previous code and rebuild the temperature table.
Here’s some example output of a bit of code I wrote to check things out. And, I didn’t expect the % in the humidity value for 2025.01.03.
(dbd-3.13) PS R:\learn\dashboard\utils> python weather_db.py
'4699': <class 'int'>
'2025.01.03 06:00': <class 'str'>
'2.0': <class 'float'>
'': <class 'str'>
'': <class 'str'>
'cloudy': <class 'str'>
'92%': <class 'str'>
(dbd-3.13) PS R:\learn\dashboard\utils> python weather_db.py
'3': <class 'int'>
'2015.01.03 07:00': <class 'str'>
'1.3': <class 'float'>
'': <class 'str'>
'2.7': <class 'float'>
'snow': <class 'str'>
'': <class 'str'>
Live and learn! Or, so some say. I expect the refactor will get some what messy. And I wil likely only show a small amount of the refactored code. But, that’s a bit of time away. (Calling it a day.)
Okay, I am back.
In the end I decided to modify the array of values just before they are appended to the monthly list of data. As that happens in a few places, I wrote a function to get the job done.
def mk_null(a_dt):
"""Go through array of temperature/weather data, convert empty strings
in numeric columns to None. So that when inserted into database,
column value will be NULL.
Since argument is array, changes will be made to current variable.
No need to return.
"""
num_cols = [2, 3, 5]
for i in num_cols:
if a_dt[i] == "":
a_dt[i] = None
And, an example of one of the refactored append code blocks.
if p_mn and mon_end:
if pt_data[1] and pt_data[0][5:7] == p_mn:
mk_null(pt_data)
mn_data.append(tuple(pt_data))
if ct_data[1] and ct_data[0][5:7] == p_mn:
mk_null(ct_data)
mn_data.append(tuple(ct_data))
print(f"Writing month to database: '{p_mn}'")
in_tmpr_tbl = f"""INSERT INTO {rfall.tnms["tp_tnm"]} VALUES (NULL, ?, ?, ?, ?, ?, ?);"""
rfall.qry_nsrt(in_tmpr_tbl, mn_data, is_many=True)
I also refactored the note parsing function. It was returning a string ending in a % sign for the humidity value. That was making that column a string value. I want it to be of type float. Won’t bother showing the change (just moved the % sign outside the grouping brackets in the regex).
And the terminal output after processing one year’s notes file.
(dbd-3.13) PS R:\learn\dashboard> python data2db.py
7 -> F:\BaRKqgs\gdn\bark_gdn_2022.php
Writing month to database: '01'
Writing month to database: '02'
Writing month to database: '03'
Writing month to database: '04'
Writing month to database: '05'
Writing month to database: '06'
Writing month to database: '07'
Writing month to database: '08'
Writing month to database: '09'
Writing month to database: '10'
Writing month to database: '11'
Writing month to database: '12'
row_id datetime temperature dmin dmax condition humidity
0 3487 2022.01.01 06:00 -5.0 None None cloudy NaN
1 3488 2022.01.01 16:21 -1.5 None None NaN
2 3489 2022.01.02 06:00 2.1 None None cloudy NaN
3 3490 2022.01.02 14:06 2.5 None None NaN
4 3491 2022.01.03 06:00 3.3 None None cloudy NaN
.. ... ... ... ... ... ... ...
373 3860 2022.12.27 06:00 7.0 None None cloudy 99.0
374 3861 2022.12.28 06:00 6.0 None None cloudy 99.0
375 3862 2022.12.29 06:00 4.0 None None cloudy 87.0
376 3863 2022.12.30 06:00 4.0 None None light rain 99.0
377 3864 2022.12.31 06:00 6.0 None None mostly cloudy 99.0
[378 rows x 7 columns]
I have reprocessed all the note files from 2015 to 2024 in total. And, 2025 to the end of August. I will process the available September data once I refactor the utility module code to reflect what I’ve done above.
It turns out that I don’t seem to need to do so. argparse returns None if a value is not present. And, that’s what is being put in the tuple for insertion to the database table. Well, if the particular value is missing.
No Rainfall?~!#
When I tried running the current dashboard code, I got an error. I eventually paid attention to the terminal output in my earlier work and realized the rainfall table was empty. I apparently copied over the wrong backup database file before adding the temperature data (as described above).
I didn’t feel like starting over with a different database file for the temperature data. Nor did I want to rerun the code to add all that rainfall data. So, instead I instatiated the Weather_db class for both the two database files I needed to merge. Got the rainfall data from one and inserted it in the empty table in the other. Like so. Note, I just stuck it in the module I had started when I found all my errors. That if False: was if True: earlier.
if False:
# going to copy rainfall data into most recent database
# which somehow is missing it
# instantiate database class using approp databases
cwd = Path(__file__).cwd()
fl_pth = cwd/"data"
fl_nm = "weather.db"
db_pth = fl_pth/fl_nm
rfall = Weather_db(db_pth)
fl_pth = cwd/"data"
fl_nm = "weather_2025_09_18_1.db"
db_pth = fl_pth/fl_nm
rfall2 = Weather_db(db_pth)
# copy rainfall data in rfall into rfall2
rf_q = f"""SELECT * FROM {rfall.tnms["rf_tnm"]};"""
rf = rfall.qry_exec(rf_q)
print(f"len(rf): {len(rf)}\n{rf[:5]}")
# for td in rf:
# td[0] = None
# print(f"len(rf): {len(rf)}\n{rf[:5]}")
updt_q = f"""INSERT INTO {rfall2.tnms["rf_tnm"]}(row_id, datetime, daily, monthly)
VALUES (?, ?, ?, ?);"""
n_cnt = rfall2.qry_nsrt(updt_q, rf, is_many=True)
rf_q = f"""SELECT * FROM {rfall2.tnms["rf_tnm"]};"""
rf2 = rfall2.qry_exec(rf_q)
print(f"len(rf2): {len(rf2)}\n{rf2[:5]}")
rfall2.updt_hist_tbl()
print("rainfall history table updated")
exit(0)
And in the terminal I got the following.
(dbd-3.13) PS R:\learn\dashboard> python chk_db_rek.py
len(rf): 1774
[(1, '2014.03.09 13:30', 37.5, 37.5), (2, '2014.03.17 11:11', 54.0, 91.5), (3, '2014.03.20 16:00', 21.0, 112.5), (4, '2014.03.23 16:00', 3.0, 115.5), (5, '2014.03.26 16:00', 19.75, 135.25)]
len(rf2): 1774
[(1, '2014.03.09 13:30', 37.5, 37.5), (2, '2014.03.17 11:11', 54.0, 91.5), (3, '2014.03.20 16:00', 21.0, 112.5), (4, '2014.03.23 16:00', 3.0, 115.5), (5, '2014.03.26 16:00', 19.75, 135.25)]
rainfall history table updated
I added the two days of rainfall for this month using my utility module. The dashboard now appears to work correctly.
Text in Numeric Colunns
While continuing to work on queries to extract data I would be interested in, I had some issues with bad/inaccurate data in the temperature column of the temperature table.
For example, I had a temperature of 195°C for one date (2021.08.11 08:01 195.0 None NaN None). I used a query to get all the rows with a temperature greater than 30°C.
nx_q = f"""SELECT * FROM {rfall.tnms["tp_tnm"]}
WHERE substr(datetime, 1, 7)='{tst_yr}.{q_mn}' AND (dmax>30.0 OR temperature>30.0);
I manually used update or delete queries to fix all the entries I found. I did something similar for temperatures less than -30°C.
But I figured I should check to see how many rows had text in the numeric columns. So, new bit of code in a play around module.
for t_col in ["temperature", "dmin", "dmax", "humidity"]:
vt_q = f"""SELECT * FROM {rfall.tnms["tp_tnm"]}
WHERE TYPEOF({t_col})='text';"""
print("\n", vt_q)
mn_data = rfall.qry_pd(vt_q)
print(mn_data)
And in the terminal, I got the following.
(dbd-3.13) PS R:\learn\dashboard\utils> python chk_db_rek.py
SELECT * FROM temperature
WHERE TYPEOF(temperature)='text';
Empty DataFrame
Columns: [row_id, datetime, temperature, dmin, dmax, condition, humidity]
Index: []
SELECT * FROM temperature
WHERE TYPEOF(dmin)='text';
row_id datetime temperature dmin dmax condition humidity
0 2273 2020.05.04 08:33 8.6 6.9.0 16.1 None
SELECT * FROM temperature
WHERE TYPEOF(dmax)='text';
Empty DataFrame
Columns: [row_id, datetime, temperature, dmin, dmax, condition, humidity]
Index: []
SELECT * FROM temperature
WHERE TYPEOF(humidity)='text';
row_id datetime temperature dmin dmax condition humidity
0 3295 2021.08.30 05:10 15.6 NaN 24.5 mostly cloudy 30%
1 3336 2021.09.17 07:10 12.0 17.3 NaN 77%
2 3337 2021.09.18 19:39 16.0 NaN NaN 62%
3 3338 2021.09.19 07:29 12.0 NaN NaN 83%
4 3339 2021.09.19 11:39 14.0 NaN NaN 82%
5 3340 2021.09.19 18:24 16.0 NaN NaN 73%
6 3341 2021.09.20 06:14 12.0 NaN NaN 90%
7 3354 2021.09.29 07:53 11.1 NaN NaN 83%
8 3355 2021.09.29 19:11 11.0 NaN NaN 84%
So I am going to manually fix those entries as appropriate. Will take me a bit of time, but, may as well have data as good as possible.
q_up = f"""UPDATE {rfall.tnms["tp_tnm"]}
SET dmin=6.9
WHERE datetime='2020.05.04 08:33';"""
up_t = rfall.qry_dlt(q_up)
hu_fix = {"2021.08.30 05:10": 30,
"2021.08.30 05:10": 77,
"2021.09.18 19:39": 62,
"2021.09.19 07:29": 83,
"2021.09.19 11:39": 82,
"2021.09.19 18:24": 73,
"2021.09.20 06:14": 90,
"2021.09.29 07:53": 83,
"2021.09.29 19:11": 84
}
for dttm, hmd in hu_fix.items():
q_up = f"""UPDATE {rfall.tnms["tp_tnm"]}
SET humidity={hmd}
WHERE datetime='{dttm}';"""
up_t = rfall.qry_dlt(q_up)
for t_col in ["temperature", "dmin", "dmax", "humidity"]:
vt_q = f"""SELECT * FROM {rfall.tnms["tp_tnm"]}
WHERE TYPEOF({t_col})='text';"""
print("\n", vt_q)
mn_data = rfall.qry_pd(vt_q)
print(mn_data)
And in the terminal, I got the following.
(dbd-3.13) PS R:\learn\dashboard\utils> python chk_db_rek.py
SELECT * FROM temperature
WHERE TYPEOF(temperature)='text';
Empty DataFrame
Columns: [row_id, datetime, temperature, dmin, dmax, condition, humidity]
Index: []
SELECT * FROM temperature
WHERE TYPEOF(dmin)='text';
Empty DataFrame
Columns: [row_id, datetime, temperature, dmin, dmax, condition, humidity]
Index: []
SELECT * FROM temperature
WHERE TYPEOF(dmax)='text';
Empty DataFrame
Columns: [row_id, datetime, temperature, dmin, dmax, condition, humidity]
Index: []
SELECT * FROM temperature
WHERE TYPEOF(humidity)='text';
Empty DataFrame
Columns: [row_id, datetime, temperature, dmin, dmax, condition, humidity]
Index: []
So, other than blatantly wrong values, things in the temperature table now look reasonable. I also did a check of the rainfall table and none of the numeric columns had any string values.
C’est Fini
I think this post has gotten plenty long enough for me. But some questionable database values corrected. Though perhaps many more I have not yet identified. Not to mention a bunch of code I never expected I’d have to write. But writing code is where all the fun is, you think?
Until next time, may your projects be better thought out and researched that mine tend to be. That said, I am really only playing at this. You might in fact be working at it. If so, I expect there is a design and development plan in place. Not so much here.