Interactive Weathergami#
Motivation#
Scorigami is an interesting concept: a final score that has never occurred before in a sport's history. For example, in the National Football League, a 20-17 score has happened over 285 times, but a 70-20 score has happened only once. When the Miami Dolphins beat the Denver Broncos 70-20 on September 24th, 2023, that was a scorigami!
Inspired by this, Jonathan Kahl wrote an article in the Bulletin of the American Meteorological Society describing the concept of "WeatherGami", which uses a day's maximum and minimum temperature at a given location. Since NOAA NCEI archives weather data from around the world, it makes sense to see how WeatherGami works on their station database. This code reproduces the results!
Jared Rennie then wrote a notebook reproducing similar figures, and I modified that notebook to add some plotting tools that make the plots interactive and make it easy to configure a dashboard! In this example, I am interested in how common the following combination is at Chicago O'Hare airport:
high temperature of 46 degrees Fahrenheit
low temperature of 33 degrees Fahrenheit
What You Need#
First off, the entire codebase works in Python 3. In addition to base Python, you will need the following packages installed:
requests (to access the API)
pandas (to slice and dice the data)
matplotlib (to plot!)
cmweather (for neat weather colorbars)
holoviews + hvplot (for interactive plotting)
The "easiest" way to install these is to install Anaconda and add the conda-forge channel. After that, you can install the above packages:
conda install -c conda-forge requests pandas matplotlib cmweather holoviews hvplot
Importing Packages#
Assuming you did the above, it should (in theory) import everything no problem:
# Import packages
import sys
import requests
import pandas as pd
import hvplot.pandas  # adds the .hvplot accessor to pandas objects
import numpy as np
import cmweather  # makes the weather colormaps (e.g. HomeyerRainbow) available
import holoviews as hv
import panel as pn
hv.extension('bokeh')
Accessing the Data#
Set Your Desired Location and Test Temperatures#
To access the data, we will use the Applied Climate Information System (ACIS) API, which is a quick and easy way to stream our station data without downloading it locally. Now we need to know which station to get data for. The ACIS API accepts several kinds of IDs, including:
FAA (e.g. AVL)
GHCN (e.g. USW00003812)
ThreadEx (e.g. AVLthr)
If you're not sure, you can refer to the API documentation (https://www.rcc-acis.org/docs_webservices.html). We also need a maximum/minimum combo to check against the database. And finally, we need to give you credit for the image that is created at the end.
Change the arguments below to your liking:
# Insert Arguments Here
stationID = 'ORD'
inTmax= 46.
inTmin= 33.
author='Max Grover'
The rest of the code should work without making any changes to it, but if you're interested, keep on reading to see how the sausage is made.
This next block of code will attempt to access the data we want from the ACIS API. The API is publicly available, but sometimes there are hiccups when getting the data. We tried to account for this with a try/except in this code block: the request times out after 3 seconds, and the code will let you know if it fails. If this happens, wait a minute, then try again.
def get_data(stationID, inTmax=inTmax, inTmin=inTmin):
    # Build JSON to access ACIS API (from https://www.rcc-acis.org/docs_webservices.html)
    acis_url = 'http://data.rcc-acis.org/StnData'
    payload = {
        "output": "json",
        "params": {
            "elems": [{"name": "maxt", "interval": "dly", "prec": 1},
                      {"name": "mint", "interval": "dly", "prec": 1}],
            "sid": stationID,
            "sdate": "por",
            "edate": "por"
        }
    }
    # Make Request (gives up after a 3-second timeout)
    try:
        r = requests.post(acis_url, json=payload, timeout=3)
        acisData = r.json()
        print("SUCCESS!")
    except Exception as e:
        sys.exit('\nSomething Went Wrong With Accessing API after 3 seconds, Try Again')
    # Get Station Info
    stationName = acisData['meta']['name'].title()
    stationState = acisData['meta']['state']
    # Convert data into Pandas DataFrame
    df = pd.DataFrame(acisData['data'],
                      columns=['Date', 'Tmax', 'Tmin'],
                      )
    # Convert Tmax/Tmin to floats; non-numeric flags such as 'M' (missing) become NaN
    df["Tmax"] = pd.to_numeric(df.Tmax, errors='coerce')
    df["Tmin"] = pd.to_numeric(df.Tmin, errors='coerce')
    # Return the data along with the station metadata
    return df, stationName, stationState
If it says "SUCCESS!" then congrats, you got the data!
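The advice to wait and try again can also be automated. The following is a minimal sketch, not part of the original notebook: `fetch_with_retry` is a hypothetical helper that calls any zero-argument function and retries with a growing pause whenever it raises. The `sleep` argument is only there so the helper can be tried out without actually waiting.

```python
import time

def fetch_with_retry(fetch, attempts=3, wait_seconds=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds or the attempts run out.

    fetch is any zero-argument callable (hypothetical here); sleep is
    injectable so the helper can be exercised without real waiting.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as err:  # broad on purpose, mirroring the notebook
            last_error = err
            if attempt < attempts:
                # Wait a little longer after each failure
                sleep(wait_seconds * attempt)
    raise RuntimeError(f"Failed after {attempts} attempts") from last_error
```

In this notebook, the callable could wrap the request shown above, for example `lambda: requests.post(acis_url, json=payload, timeout=3).json()`.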
Let's check the data!#
How does it look? Well, the data comes back as JSON, which can be a little confusing to look at, so let's extract the information we need and reorganize it a bit.
First, the JSON has a "meta" key and a "data" key. The "meta" key gets us info like station name, latitude, longitude, etc. And "data" is the actual data we requested. So let's get some station info and convert the data into a pandas DataFrame, which makes it easier to see.
df, stationName, stationState = get_data(stationID)
print("\nSuccessfully Organized Data for: ",stationName,',',stationState)
print(df)
SUCCESS!
Successfully Organized Data for: Chicago Ohare Intl Ap , IL
Date Tmax Tmin
0 1958-11-01 54.0 40.0
1 1958-11-02 53.0 37.0
2 1958-11-03 60.0 34.0
3 1958-11-04 68.0 41.0
4 1958-11-05 58.0 38.0
... ... ... ...
23757 2023-11-17 60.0 34.0
23758 2023-11-18 53.0 31.0
23759 2023-11-19 54.0 35.0
23760 2023-11-20 50.0 42.0
23761 2023-11-21 44.0 36.0
[23762 rows x 3 columns]
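To make the shape of the response concrete, here is a small sketch using a hand-written stand-in for the ACIS JSON. The station name and values in `sample` are made up for illustration; only the "meta"/"data" layout and the "M" flag for missing values reflect what the API returns. Coercing with `pd.to_numeric(..., errors="coerce")` is one way to turn "M" into `NaN`.

```python
import pandas as pd

# Hand-written stand-in for an ACIS response (values are illustrative only)
sample = {
    "meta": {"name": "EXAMPLE STATION AP", "state": "IL"},
    "data": [
        ["2023-11-19", "54", "35"],
        ["2023-11-20", "M", "42"],   # "M" marks a missing value
        ["2023-11-21", "44", "36"],
    ],
}

def to_frame(response):
    """Turn the 'data' rows into a DataFrame with numeric temperatures."""
    frame = pd.DataFrame(response["data"], columns=["Date", "Tmax", "Tmin"])
    for col in ("Tmax", "Tmin"):
        # errors="coerce" converts non-numeric flags such as "M" to NaN
        frame[col] = pd.to_numeric(frame[col], errors="coerce")
    return frame

sample_df = to_frame(sample)
print(sample["meta"]["name"].title(), sample["meta"]["state"])
print(sample_df)
```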
Sometimes people want to know what the station's period of record is, so let's get that info.
# The first four characters of the first and last dates give the start and end years
stationStart = df['Date'].iloc[0][:4]
stationEnd = df['Date'].iloc[-1][:4]
print("Period of Record: ",stationStart,"-",stationEnd)
Period of Record: 1958 - 2023
Cool, but is it a WeatherGami?#
Let's find out! We can use some pandas calls to see if our max/min input has happened in the record before. This code block will tell you if it's a WeatherGami or not. If not, it will tell you the other times it has happened in the record. Here, we define a WeatherGami as a combination that has happened either once before or not at all.
# Now find if the Tmax/Tmin combo has happened in the record before (i.e. WeatherGami).
wgTest = df.loc[(df['Tmax'] == inTmax) & (df['Tmin'] == inTmin)].sort_values('Date', ascending=False)
if len(wgTest) == 0:
    wgResult = "It's a WeatherGami!"
    print(inTmax, '/', inTmin, ': ', wgResult)
    print("It has never happened before!")
elif len(wgTest) == 1:
    wgResult = "It's a WeatherGami!"
    print(inTmax, '/', inTmin, ': ', wgResult)
    print("It has happened ", len(wgTest), " time before")
    print(wgTest)
else:
    wgResult = "It's NOT a WeatherGami!"
    print(inTmax, '/', inTmin, ': ', wgResult)
    print("It has happened ", len(wgTest), " times before")
    print(wgTest)
46.0 / 33.0 : It's NOT a WeatherGami!
It has happened 16 times before
Date Tmax Tmin
22696 2020-12-21 46.0 33.0
22668 2020-11-23 46.0 33.0
21688 2018-03-19 46.0 33.0
21547 2017-10-29 46.0 33.0
16099 2002-11-29 46.0 33.0
15845 2002-03-20 46.0 33.0
15752 2001-12-17 46.0 33.0
14998 1999-11-24 46.0 33.0
14707 1999-02-06 46.0 33.0
12817 1993-12-04 46.0 33.0
11115 1989-04-07 46.0 33.0
7427 1979-03-03 46.0 33.0
6956 1977-11-17 46.0 33.0
5995 1975-04-01 46.0 33.0
5634 1974-04-05 46.0 33.0
4800 1971-12-23 46.0 33.0
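The same check can be wrapped in a small function so it is easy to reuse for other temperature pairs. This is a sketch rather than the notebook's own code: `classify_weathergami` is a hypothetical name, and the toy DataFrame stands in for the station record. It follows the definition used above, where zero or one prior occurrence counts as a WeatherGami.

```python
import pandas as pd

def classify_weathergami(df, tmax, tmin):
    """Return (is_weathergami, matches) for a Tmax/Tmin pair."""
    matches = df.loc[(df["Tmax"] == tmax) & (df["Tmin"] == tmin)]
    matches = matches.sort_values("Date", ascending=False)
    # Zero or one occurrence counts as a WeatherGami
    return len(matches) <= 1, matches

# Toy record with made-up values
toy = pd.DataFrame({
    "Date": ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04"],
    "Tmax": [46.0, 50.0, 46.0, 30.0],
    "Tmin": [33.0, 40.0, 33.0, 20.0],
})

is_gami, hits = classify_weathergami(toy, 46.0, 33.0)
print(is_gami, len(hits))
```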
The other thing we might want to know is the frequency, or the percentage of days on which a max/min combo occurs. The following code block does this for all combinations and sorts them so that the most common comes last. We also need to weed out missing data at this point, which is recognized as "M" by the API.
frequency_counts = df.groupby(['Tmax', 'Tmin']).size().reset_index(name='Frequency')
frequency_counts['Percentage'] = (frequency_counts['Frequency'] / len(df)) * 100
frequency_counts=frequency_counts.loc[(frequency_counts['Tmax']!='M') & (frequency_counts['Tmin']!='M')].sort_values('Percentage', ascending=True)
frequency_counts
| | Tmax | Tmin | Frequency | Percentage |
|---|---|---|---|---|
| 0 | -11.0 | -25.0 | 1 | 0.004208 |
| 1104 | 46.0 | 44.0 | 1 | 0.004208 |
| 1106 | 47.0 | 14.0 | 1 | 0.004208 |
| 1107 | 47.0 | 15.0 | 1 | 0.004208 |
| 1111 | 47.0 | 20.0 | 1 | 0.004208 |
| ... | ... | ... | ... | ... |
| 2252 | 80.0 | 62.0 | 36 | 0.151502 |
| 2417 | 85.0 | 65.0 | 37 | 0.155711 |
| 2287 | 81.0 | 63.0 | 37 | 0.155711 |
| 2415 | 85.0 | 63.0 | 37 | 0.155711 |
| 2449 | 86.0 | 68.0 | 38 | 0.159919 |

2750 rows × 4 columns
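To see what the `groupby(...).size()` step does on a small scale, here is a toy example with made-up temperatures. One detail worth noting: `groupby` drops rows whose keys are `NaN` by default, so a missing value does not form a group of its own, while the percentages still use the full row count as the denominator, as in the code above.

```python
import numpy as np
import pandas as pd

# Five made-up days, one of which has a missing Tmax
toy = pd.DataFrame({
    "Tmax": [80.0, 80.0, 85.0, np.nan, 80.0],
    "Tmin": [62.0, 62.0, 65.0, 60.0, 62.0],
})

# Count how often each (Tmax, Tmin) pair occurs
counts = toy.groupby(["Tmax", "Tmin"]).size().reset_index(name="Frequency")
# Express each count as a percentage of all days in the record
counts["Percentage"] = counts["Frequency"] / len(toy) * 100
# Sort ascending so the most common pair ends up in the last row
counts = counts.sort_values("Percentage", ascending=True)
print(counts)
```

The last row of `counts` holds the most frequent pair, which is why the notebook reads it with `iloc[-1]`.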
# Get Frequency and Percentage Info needed for Plotting
frequency_counts = df.groupby(['Tmax', 'Tmin']).size().reset_index(name='Frequency')
frequency_counts['Percentage'] = (frequency_counts['Frequency'] / len(df)) * 100
# Remove Missing Data
frequency_counts=frequency_counts.loc[(frequency_counts['Tmax']!='M') & (frequency_counts['Tmin']!='M')].sort_values('Percentage', ascending=True)
# Get Frequency of input tmax/tmin and most frequent
currFreq=frequency_counts.loc[(frequency_counts['Tmax'] == inTmax) & (frequency_counts['Tmin']==inTmin)]
if len(currFreq)==0:
    currFreq=str(inTmax)+'/'+str(inTmin)+': '+wgResult+' (It has never happened before!)'
elif currFreq['Frequency'].values[0]==1:
    currFreq=str(inTmax)+'/'+str(inTmin)+': '+wgResult+' ('+str(currFreq.iloc[-1]['Frequency'])+' Occurrence)'
else:
    currFreq=str(inTmax)+'/'+str(inTmin)+': '+wgResult+' ('+str(currFreq.iloc[-1]['Frequency'])+' Occurrences)'
mostFreq=str(frequency_counts.iloc[-1]['Tmax'])+'/'+str(frequency_counts.iloc[-1]['Tmin'])+' ('+str(frequency_counts.iloc[-1]['Frequency'])+' Occurrences)'
print('Most Frequent: ',mostFreq)
Most Frequent: 86.0/68.0 (38.0 Occurrences)
Now for the fun part…
Plotting the data!#
This block of code will take the max/min combos, plot them, and color them by frequency. A red dot marks the max/min combo given as input.
# Determine maximum/minimum ranges
ymin=int(5 * round(float((min(frequency_counts['Tmin'].values) - 10))/5))
ymax=int(5 * round(float((max(frequency_counts['Tmin'].values) + 10))/5))
xmin=int(5 * round(float((min(frequency_counts['Tmax'].values) - 10))/5))
xmax=int(5 * round(float((max(frequency_counts['Tmax']) + 10))/5))
# Create a heatmap with adjustable parameters
heatmap = frequency_counts.hvplot.heatmap(
    'Tmax',
    'Tmin',
    C='Percentage',
    height=300,
    width=400,
    reduce_function=np.mean,
    ylabel=r'Minimum Temperature (°F)',
    xlabel=r'Maximum Temperature (°F)',
    xlim=(xmin, xmax),
    ylim=(ymin, ymax),
    cmap='HomeyerRainbow',
    clabel='Frequency (%)',
    alpha=.6,
    title='WeatherGami For \n'+stationName+', '+stationState
).opts(show_grid=True)
# Add the red dot for the maximum/minimum
feature = hv.Points([(float(inTmax), float(inTmin))]).opts(color='red')
# Add different labels to the plot
attribution = hv.Text(xmax-30, ymin+10, f"Source: ACIS \n Generated by {author} \n Inspired By Kahl (2023) \n and Jared Rennie", fontsize=4)
status = hv.Text(xmin+40, ymax-30, currFreq+'\nMost Common: '+mostFreq+'\nPeriod of Record= '+str(stationStart)+'-'+str(stationEnd), fontsize=4)
# Combine the first part of the plot by *
final_plot = heatmap * attribution * status * feature
# Add a table for the "hits"
table_results = wgTest.hvplot.table(
    title=f'Weathergami "Hits" \n (High: {inTmax} °F, Low: {inTmin} °F)',
    columns='Date',
    sortable=True,
    selectable=True,
    fontsize=8,
    height=300,
    width=200
)
(final_plot + table_results).cols(2)
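The axis limits in the plotting code above pad the data range by 10 degrees and then snap the result to a multiple of 5. A minimal sketch of that arithmetic, with `padded_limit` as a hypothetical helper name and illustrative input values:

```python
def padded_limit(value, pad, step=5):
    """Shift value by pad, then round to the nearest multiple of step."""
    return int(step * round(float(value + pad) / step))

# A lower bound: pad downward by 10, then snap to a multiple of 5
print(padded_limit(-25, -10))
# An upper bound: pad upward by 10, then snap to a multiple of 5
print(padded_limit(68, 10))
```

Because the result is rounded to the nearest multiple rather than always outward, the limit can land slightly inside the padded value (for example, -11 padded by -10 gives -20 rather than -25), but it still sits beyond the data itself.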
Conclusions#
This was a really fun blog post/notebook to put together! Weathergami plots can be interesting to create and visualize, and I hope this helps you get started using the hvPlot stack with weather/climate data! In future iterations of this dashboard, I hope to add widgets to select the temperatures of interest, as well as a drop-down menu for the various sites available.