Okay, based on the temperature/weather data I worked on in the previous post, I need to modify, or recreate, the temperature table I previously added to the Weather_db class.
We effectively had only two columns, datetime and temperature. But in the previous posts we were also parsing the daily low and high, the weather condition and the humidity (assuming they were available; blank string or NULL otherwise). Since I already have the table in the database, I am going to see if I can use the SQL ALTER command to modify it as desired. I will also update the original creation method to reflect the changes. I will just use some test code in the __main__ block to do the ALTER; no new function or the like.
Add Columns to Temperature Table
For reference, the relevant portion of the original table creation method now looks like the following, with the four new, optional columns.
tp_tbl = f"""CREATE TABLE IF NOT EXISTS {c_tnm} (
    row_id INTEGER PRIMARY KEY,
    datetime TEXT NOT NULL,
    temperature REAL NOT NULL,
    dmin REAL,
    dmax REAL,
    condition TEXT,
    humidity REAL
  );"""
Okay, SQLite only allows one column to be added at a time. So, a wee loop. I use the qry_dlt method to get a commit after each query execution. (Will likely have to rethink that method’s name.) And I modified the if chk_i_rf block to cover the temperature table case.
... ...
alter_t_tbl = True
chk_i_rf = True
... ...
if alter_t_tbl:
    new_cols = ["dmin", "dmax", "condition", "humidity"]
    col_typ = ["REAL", "REAL", "TEXT", "REAL"]
    t_nm = rfall.tnms["tp_tnm"]
    for ndx, col in enumerate(new_cols):
        ncol_qry = f"ALTER TABLE {t_nm} ADD COLUMN {col} {col_typ[ndx]}"
        print("\n", ncol_qry)
        rtnd = rfall.qry_dlt(ncol_qry)
        print(rtnd)
... ...
elif c_tnm == "temperature":
    nx_q = f"SELECT * FROM {c_tnm}"
    mn_data = rfall.qry_pd(nx_q)
    print("\n", mn_data)
    nx_q = f"PRAGMA table_info('{c_tnm}')"
And in the terminal I got the following.
(dbd-3.13) PS R:\learn\dashboard\utils> python weather_db.py
rows in rainfall table: 1774
c_month minMonth maxMonth
0 2014.03 3.0 54.0
1 2014.04 1.5 53.0
2 2014.05 5.0 54.0
3 2014.06 2.0 14.5
4 2014.07 3.0 23.0
.. ... ... ...
133 2025.04 0.0 28.0
134 2025.05 0.5 19.0
135 2025.06 0.0 32.0
136 2025.07 7.0 11.5
137 2025.08 1.5 51.5
[138 rows x 3 columns]
rows in rf_mon_history table: 12
month avg
0 01 244.351818
1 02 139.068182
2 03 167.779167
3 04 109.266667
4 05 79.979231
5 06 53.941667
6 07 24.241667
7 08 41.146154
8 09 103.209091
9 10 194.161818
10 11 275.767273
11 12 254.759091
rows in temperature table: 0
Empty DataFrame
Columns: [row_id, datetime, temperature, dmin, dmax, condition, humidity]
Index: []
cid name type notnull dflt_value pk
0 0 row_id INTEGER 0 None 1
1 1 datetime TEXT 1 None 0
2 2 temperature REAL 1 None 0
3 3 dmin REAL 0 None 0
4 4 dmax REAL 0 None 0
5 5 condition TEXT 0 None 0
6 6 humidity REAL 0 None 0
So, I think we are, pending revisions, good to go.
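One caveat worth noting: SQLite raises a “duplicate column name” error if an ALTER is re-run against a column that already exists. A minimal sketch of guarding the ALTER with PRAGMA table_info, using plain sqlite3 rather than my Weather_db class (the function name and table layout here are just for illustration):

```python
import sqlite3

def add_missing_columns(con, t_nm, new_cols):
    """Add only the columns not already present in the table."""
    # row[1] of each PRAGMA table_info row is the column name
    have = {row[1] for row in con.execute(f"PRAGMA table_info('{t_nm}')")}
    for col, col_typ in new_cols:
        if col not in have:
            con.execute(f"ALTER TABLE {t_nm} ADD COLUMN {col} {col_typ}")
    con.commit()

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE temperature (
    row_id INTEGER PRIMARY KEY,
    datetime TEXT NOT NULL,
    temperature REAL NOT NULL)""")
add_missing_columns(con, "temperature",
                    [("dmin", "REAL"), ("dmax", "REAL"),
                     ("condition", "TEXT"), ("humidity", "REAL")])
# safe to run again; existing columns are skipped
add_missing_columns(con, "temperature", [("dmin", "REAL")])
print([row[1] for row in con.execute("PRAGMA table_info('temperature')")])
# ['row_id', 'datetime', 'temperature', 'dmin', 'dmax', 'condition', 'humidity']
```

Not strictly needed for a one-off test block like mine, but it makes the script re-runnable.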
Add Temperature Data to the Database
Okay, I am going to refactor the latest CSV code to write the data to the database rather than a CSV file. I expect that, in the long run, I will write one month’s data at a time, as for the rainfall case, rather than doing so for every set of data parsed from the current file. But, to start, I am going to test with single sets of data at a time, looking for any issues that may arise (e.g. no data for one or more columns).
And, I had to rework the code to account for the fact that the table has a single datetime column, not separate date and time columns.
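The rework is mostly string glue: the parsed date and clock time get joined into the table’s single datetime column, and the date or month can be recovered by slicing. A trivial sketch (the sample values are made up):

```python
c_dt, n_tm = "2015.01.04", "09:20"   # parsed date and time of a note entry
dt = f"{c_dt} {n_tm}"                # single column value, "YYYY.MM.DD HH:MM"
print(dt[:10], dt[5:7])              # date part and month: 2015.01.04 01
```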
In the end, I just tested writing full months to the database. It took a bit of debugging and reworking of my logic, but I got it to work in a reasonable amount of time. Though, at first, I was missing dates: my first complete run had 10 fewer rows than the CSV file. It turned out that, at a month’s end, I wasn’t correctly handling the two remaining arrays of daily data. A change in code order sorted that.
I added code to query the database for the current year’s data. That’s how I got the row count.
I am only going to show the relevant finished code. There are a few other refactorings I will let you sort out as necessary.
... ...
if do_tmpr:
    tst_csv = False
    tst_wdb = True
    ... ...
    if not tst_wdb:
        fh_csv = open(csv_pth, "w", encoding='utf8')
    if tst_rgx or tst_csv or tst_wdb:
        t_gnr = wd.get_temp_data(tmpr_srcs[src_ndx])
    ... ...
    if tst_wdb:
        pt_data, ct_data = [''] * 7, [''] * 7
        p_dt, p_mn, c_dt, c_mn = "", "", "", ""
        mn_data = []
        mon_end = False
        for rw in t_gnr:
            d_dt, d_nt = rw.split(": ", 1)
            t_mn = d_dt[5:7]
            if d_dt >= s_dt:
                p_rslt = parse_note(d_nt)
                if p_rslt[1] != "Error":
                    n_tm, c_tmp, c_low, c_hgh, w_cnd, c_hum = p_rslt
                    if c_mn != t_mn:
                        p_mn, c_mn = c_mn, t_mn
                        mon_end = True
                    if d_dt != c_dt:
                        p_dt, c_dt = c_dt, d_dt
                        if p_mn and mon_end:
                            if pt_data[1] and pt_data[0][5:7] == p_mn:
                                mn_data.append(tuple(pt_data))
                            if ct_data[1] and ct_data[0][5:7] == p_mn:
                                mn_data.append(tuple(ct_data))
                            in_tmpr_tbl = f"""INSERT INTO {rfall.tnms["tp_tnm"]} VALUES (NULL, ?, ?, ?, ?, ?, ?);"""
                            rfall.qry_nsrt(in_tmpr_tbl, mn_data, is_many=True)
                            mn_data = []
                            mon_end = False
                        if pt_data[1] and pt_data[0][5:7] == c_mn:
                            mn_data.append(tuple(pt_data))
                        pt_data, ct_data = ct_data, [''] * 7
                    if c_tmp != "":
                        ct_data = [f"{c_dt} {n_tm}", c_tmp, c_low, "", w_cnd, c_hum]
                        if c_hgh != c_tmp:
                            pt_data[3] = c_hgh
                        else:
                            ct_data[3] = c_hgh
                        if pt_data[0][:10] == ct_data[0][:10]:
                            # if two entries for same date, make sure 1st entry has correct low and high
                            # and second has no low or high
                            if ct_data[3] > pt_data[3]:
                                pt_data[3] = ct_data[3]
                                ct_data[3] = ""
                            if ct_data[2] < pt_data[2]:
                                pt_data[2] = ct_data[2]
                                ct_data[2] = ""
                            ct_data[2], ct_data[3] = "", ""
                else:
                    print(p_rslt)
        # take care of final month of data
        if c_hgh != c_tmp:
            pt_data[3] = c_hgh
        else:
            ct_data[3] = c_hgh
        p_mn = mn_data[0][0][5:7]
        if pt_data[1] and pt_data[0][5:7] == p_mn:
            mn_data.append(tuple(pt_data))
        if ct_data[1] and ct_data[0][5:7] == p_mn:
            mn_data.append(tuple(ct_data))
        in_tmpr_tbl = f"""INSERT INTO {rfall.tnms["tp_tnm"]} VALUES (NULL, ?, ?, ?, ?, ?, ?);"""
        rfall.qry_nsrt(in_tmpr_tbl, mn_data, is_many=True)
        nx_q = f"SELECT * FROM {rfall.tnms["tp_tnm"]} WHERE substr(datetime, 1, 4)='{tst_yr}'"
        mn_data = rfall.qry_pd(nx_q)
        print("\n", mn_data)
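For anyone not following the whole series: qry_nsrt is my own wrapper method, so here is roughly what I expect its is_many=True path amounts to in plain sqlite3. A sketch, not the actual method; the sample rows are made up to match the table layout:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE temperature (
    row_id INTEGER PRIMARY KEY,
    datetime TEXT NOT NULL,
    temperature REAL NOT NULL,
    dmin REAL, dmax REAL, condition TEXT, humidity REAL);""")

# NULL lets SQLite assign row_id; empty strings stand in for missing values,
# matching the parsed data (None would store true NULLs instead)
mn_data = [
    ("2015.01.01 07:30", -3.6, -3.6, 2.4, "clear", ""),
    ("2015.01.02 08:15", 0.6, -3.6, "", "rain", ""),
]
in_tmpr_tbl = "INSERT INTO temperature VALUES (NULL, ?, ?, ?, ?, ?, ?);"
con.executemany(in_tmpr_tbl, mn_data)   # the is_many=True case
con.commit()
print(con.execute("SELECT COUNT(*) FROM temperature").fetchone()[0])  # 2
```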
And for the 2015 note file, I got the following in the terminal.
(dbd-3.13) PS R:\learn\dashboard> python data2db.py
0 -> F:\BaRKqgs\gdn\bark_gdn_2015.php
row_id datetime temperature dmin dmax condition humidity
0 1 2015.01.01 07:30 -3.6 -3.6 2.4 clear
1 2 2015.01.02 08:15 0.6 -3.6 rain
2 3 2015.01.03 07:00 1.3 2.7 snow
3 4 2015.01.04 08:40 1.1
4 5 2015.01.04 09:20 0.9 snow
.. ... ... ... ... ... ... ...
285 286 2015.12.25 10:14 0.2 -0.4 3.0 rain
286 287 2015.12.26 10:33 1.0 -0.2 2.7 rain
287 288 2015.12.27 10:30 2.1 0.8 rain
288 289 2015.12.28 10:21 2.7 1.0 2.7
289 290 2015.12.31 10:02 -1.2 -3.8
[290 rows x 7 columns]
And the 2015 CSV file did indeed have 290 rows of data.
For the 2016 file I got the following.
(dbd-3.13) PS R:\learn\dashboard> python data2db.py
1 -> F:\BaRKqgs\gdn\bark_gdn_2016.php
row_id datetime temperature dmin dmax condition humidity
0 291 2016.01.01 08:21 -2.7 -3.3 1.0
1 292 2016.01.02 09:36 -2.5 -3.3 fog
2 293 2016.01.04 10:42 0.4 -2.5 0.4 rain
3 294 2016.01.06 12:53 6.4 -0.4 7.0 rain
4 295 2016.01.09 09:50 1.1 -0.8 rain
.. ... ... ... ... ... ... ...
318 609 2016.12.27 08:54 2.4 -0.4 5.5 cloudy
319 610 2016.12.28 09:57 2.4 0.2 3.3 rain
320 611 2016.12.29 09:37 2.1 1.7 2.7
321 612 2016.12.30 09:33 2.0 0.7 4.2
322 613 2016.12.31 09:54 1.1 0.6 fog
[323 rows x 7 columns]
And, the 2016 CSV file, once I accounted for comments, did indeed have 323 rows of data.
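Rather than eyeballing the row counts against the CSV files, the check can be scripted. A rough sketch, not the verification I actually ran, assuming a single header line and '#'-prefixed comment lines in the CSV:

```python
import os
import sqlite3
import tempfile

def csv_data_rows(csv_pth):
    """Count data rows in the CSV: skip the header line and '#' comment lines."""
    with open(csv_pth, encoding="utf8") as fh:
        rows = [ln for ln in fh if ln.strip() and not ln.startswith("#")]
    return len(rows) - 1  # drop the header row

def db_rows_for_year(con, t_nm, yr):
    """Count table rows whose datetime starts with the given year."""
    q = f"SELECT COUNT(*) FROM {t_nm} WHERE substr(datetime, 1, 4) = ?"
    return con.execute(q, (yr,)).fetchone()[0]

# tiny demo with made-up data; the real table has all seven columns
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as fh:
    fh.write("datetime,temperature\n# a comment\n"
             "2016.01.01 08:21,-2.7\n2016.01.02 09:36,-2.5\n")
    csv_pth = fh.name
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE temperature (datetime TEXT, temperature REAL)")
con.executemany("INSERT INTO temperature VALUES (?, ?)",
                [("2016.01.01 08:21", -2.7), ("2016.01.02 09:36", -2.5)])
print(csv_data_rows(csv_pth), db_rows_for_year(con, "temperature", "2016"))  # 2 2
os.unlink(csv_pth)
```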
Done
I think that’s it for this post. I will continue processing files until I have all the available data, to the end of August 2025, added to the database. I may need to refactor to ensure the code doesn’t attempt to process September data. As I did for the rainfall data, I will be adding another utility module to add a single instance of temperature/weather data to the database. And I will need some data for testing purposes.
Once that is done, I will start looking at ways to present the data in the dashboard.
Until next time, may things go smoothly with your coding efforts.
Resources
- SQLite ALTER TABLE
Addendum
I added the temperature/weather data for all the remaining years to the temperature table. That includes 2025 to the end of August. At this point there are 5,199 rows in the table. I am thinking about starting on the utility to add daily temperature/weather data to the table. As mentioned above, I will use September data to test the utility.