Yesterday was hot. Like really hot. According to reports, a new hottest June 12th in Seattle was recorded at 95 degrees (35 C). This is roughly how it felt walking outside briefly yesterday for me:
Many in other parts of the US where it regularly gets that hot in the summer will laugh from their climate controlled superfortresses. Many don’t realize that Seattle’s generally mild climate, combined with the fact that most houses here were built in the 1960s, means that home air conditioning is not as common as in Chicago.
Sweltering yesterday made me wonder: at what point should I consider getting an AC unit? I feel like my personal threshold for heat is about 90 degrees. You can’t really get a good cycling workout in when it’s that hot, and my house acts like a giant oven at that temperature as well.
Weather Data from Google BigQuery
Using data from NOAA’s Global Surface Summary of the Day (GSOD), we can see how many 90+ degree weather days there have been in the past and see how long it’ll take for a sizeable chunk of the summer to be unbearably hot.
First, let’s find the weather stations local to Seattle:
select * from `bigquery-public-data.noaa_gsod.stations` where lower(name) like '%seattle%'
This gives us some data that looks like:
Some of these have more data than others. In this case, I’ll be picking row number 6 (station 727930, WBAN 24233, the identifiers used in the query further down), which has the most data for what I need.
One of the cool features of Google BigQuery is a thing called table sharding. The astute observer will notice that the BigQuery noaa_gsod dataset has a ton of tables in it, one for each year since the 1930s:
In BigQuery you can select from all of these tables at once by using the table wildcard syntax, denoted by an asterisk *. This says “give me all of the tables unioned together in one big table.” Let’s see it in action below:
with temp1 as (
  select
    year
    ,mo
    ,da
    ,temp as mean_temp
    ,max as max_temp
    ,min as min_temp
    -- GSOD encodes a missing max as 9999.9, so exclude it from the count
    ,if(max >= 90 and max < 9999, 1, 0) as max_90
  from `bigquery-public-data.noaa_gsod.*`
  where stn = '727930'
    and wban = '24233'
)
select
  year
  ,sum(max_90) as high_90_degree_days
from temp1
group by 1
order by 1 desc
with the output looking like this (only top rows, but goes back to 1949):
And the data output in a nice graph over the past 20 years:
Now the R-squared value here isn’t super great, but even a 2nd degree polynomial fit doesn’t add much accuracy (an R-squared of 0.633 vs 0.594 for linear). So we’ll stick with linear for simplicity, since I’m mostly concerned with how many 90 degree days to expect in the next couple of years.
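For anyone who wants to reproduce the fit outside of a charting tool, here is a rough sketch of a least-squares line and its R-squared in plain Python. The function names are my own, and the sample numbers are made-up placeholders, not the Seattle counts from the query, so swap in the real yearly totals before reading anything into the output.

```python
from statistics import mean


def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    y_bar = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Placeholder data only (NOT real weather data): x is "years since the
# start of the window", y is a pretend count of 90+ degree days.
sample_years = [0, 1, 2, 3, 4]
sample_counts = [1, 3, 2, 5, 4]

slope, intercept = linear_fit(sample_years, sample_counts)
r2 = r_squared(sample_years, sample_counts, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r_squared={r2:.3f}")
```

To project forward, plug a future year offset into slope * x + intercept. With only a couple of decades of noisy counts, treat the result as a ballpark figure.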
Given the equation fit from the model, we should expect about 12-14 days of 90+ degree weather in the next couple of years. Given that there are about 90 days of summer here, I would expect to get an AC unit if that predicted value was closer to 25-30 instead of half that. If 20-25% of all the summer days were unbearably hot, then it might make more sense for AC.
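The percentages are easy to sanity check. Here is a small sketch that converts between hot-day counts and share of summer, assuming a flat 90-day summer as above (the helper names are just for illustration):

```python
SUMMER_DAYS = 90  # "about 90 days of summer", an approximation from the post


def share_of_summer(hot_days, summer_days=SUMMER_DAYS):
    """Fraction of summer days that hit 90+ degrees."""
    return hot_days / summer_days


def days_for_share(share, summer_days=SUMMER_DAYS):
    """Number of hot days needed to reach a given share of summer."""
    return share * summer_days


# Projected range from the linear fit
for hot_days in (12, 14):
    print(f"{hot_days} days -> {share_of_summer(hot_days):.1%} of summer")

# The "might make sense for AC" range
for share in (0.20, 0.25):
    print(f"{share:.0%} of summer -> {days_for_share(share):.1f} days")
```

So 12-14 days works out to roughly 13-16% of the summer, while 20-25% of the summer would be about 18 to 23 days.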
So for the meantime, if it’s hot out here this might be a better solution than air conditioning:
(fun note: I wrote the title of this blog first before analyzing data, but was surprised to see that it does follow Betteridge’s law of headlines)