Solving a years-old office debate in minutes

Ever feel like the same thing always happens on certain days of the week? An Alipian thought so, and it created a stir at the office.

Hypothesis: It rains more in Boston on Tuesday than on any other day of the week.

A number of people at the office agreed; others were not so sure. So we started keeping track on a whiteboard: whenever it rained at all on a Tuesday, the date was written down. A few years later, people had stopped questioning the claim and either accepted it or kept their objections to themselves. But is it really true? Let's do a quick technology build to find out.

Steps to Solution: Pull data from an API, only save relevant data, upload to a visualization tool, shape the data, display results.

Pull data from API

The National Weather Service Forecast Office publishes precipitation data for Boston, and the data is also available through an API. To pull the past ten years (2011-06-23 through 2021-06-23):

curl 'https://data.rcc-acis.org/StnData' -H 'content-type: application/x-www-form-urlencoded; charset=UTF-8' --data-raw 'params=%7B%22elems%22%3A%5B%7B%22name%22%3A%22maxt%22%2C%22add%22%3A%22t%22%7D%2C%7B%22name%22%3A%22mint%22%2C%22add%22%3A%22t%22%7D%2C%7B%22name%22%3A%22avgt%22%2C%22add%22%3A%22t%22%7D%2C%7B%22name%22%3A%22avgt%22%2C%22normal%22%3A%22departure91%22%2C%22add%22%3A%22t%22%7D%2C%7B%22name%22%3A%22hdd%22%2C%22add%22%3A%22t%22%7D%2C%7B%22name%22%3A%22cdd%22%2C%22add%22%3A%22t%22%7D%2C%7B%22name%22%3A%22pcpn%22%2C%22add%22%3A%22t%22%7D%2C%7B%22name%22%3A%22snow%22%2C%22add%22%3A%22t%22%7D%2C%7B%22name%22%3A%22snwd%22%2C%22add%22%3A%22t%22%7D%5D%2C%22sid%22%3A%22BOSthr+9%22%2C%22sDate%22%3A%222011-06-23%22%2C%22eDate%22%3A%222021-06-23%22%7D&output=json'   --compressed
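The form body in that command is a URL-encoded JSON document, which makes it hard to read. As a rough sketch, the same body can be built in Python. The element names, station ID, and dates are taken from the curl command above; the function name build_body is only illustrative, and nothing here sends a request.

```python
import json
from urllib.parse import urlencode

def build_body(sid="BOSthr 9", start="2011-06-23", end="2021-06-23"):
    """Build the form-encoded body used in the curl command."""
    # Same elements, in the same order, as the original request.
    elems = [
        {"name": "maxt", "add": "t"},
        {"name": "mint", "add": "t"},
        {"name": "avgt", "add": "t"},
        {"name": "avgt", "normal": "departure91", "add": "t"},
        {"name": "hdd", "add": "t"},
        {"name": "cdd", "add": "t"},
        {"name": "pcpn", "add": "t"},   # precipitation: the field we care about
        {"name": "snow", "add": "t"},
        {"name": "snwd", "add": "t"},
    ]
    params = {"elems": elems, "sid": sid, "sDate": start, "eDate": end}
    # Compact separators keep the JSON free of spaces, as in the curl body.
    compact = json.dumps(params, separators=(",", ":"))
    return urlencode({"params": compact, "output": "json"})

body = build_body()
```

Written this way, it is easier to see that precipitation (pcpn) is the seventh element requested, which matters when picking it out of the response later.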

Only save relevant data

Each daily record in the response, which we saved as tenyears.json, was in the following format:

["2021-01-01",["36",24],["29",24],["32.5",24],["0.7",24],["32",24],["0",24],["0.10",24],["0.0",0],["M",-1]]

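We only care about the date and the precipitation value. Counting from zero, the date is element 0 and precipitation is element 7, matching the position of pcpn in the request. We use jq to join those two fields into a CSV. The filter assumes tenyears.json contains just the array of daily records.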

cat tenyears.json | jq -r '.[] | .[0] + "," + .[7][0]' > tenyears.csv
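For readers who do not have jq installed, the same extraction can be sketched in Python. This is only an illustration of what the filter does, run against the sample record shown earlier; the function name rows_to_csv is hypothetical.

```python
import json

def rows_to_csv(rows):
    """Return 'date,precipitation' lines, one per daily record."""
    lines = []
    for row in rows:
        iso_date = row[0]   # element 0: the date string
        pcpn = row[7][0]    # element 7: [precipitation, second number]; keep the first
        lines.append(f"{iso_date},{pcpn}")
    return "\n".join(lines)

# The sample record from above, wrapped in a list like the full data set.
sample = json.loads(
    '[["2021-01-01",["36",24],["29",24],["32.5",24],["0.7",24],'
    '["32",24],["0",24],["0.10",24],["0.0",0],["M",-1]]]'
)
print(rows_to_csv(sample))  # prints 2021-01-01,0.10
```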

Upload to a visualization tool

Google Data Studio accepts a CSV upload, and a date range control can be added to narrow the results to a specific period. Once the data is ingested, you can also shape it to make it easier to analyze.

Shape the data

Import the data into Google Data Studio and add two custom fields. The first converts the date to a day-of-week number (2 for Tuesday) and then to a human-readable name (Tuesday). The second checks whether the precipitation is greater than 0.00 inches; if it is, we count that date as a day on which it rained.
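The logic of the two custom fields can be sketched outside Data Studio as well. The Python below is not the Data Studio formula syntax, just an equivalent under two assumptions: days are numbered with Sunday as 0 (which is what "2 for Tuesday" implies), and non-numeric precipitation codes such as the "M" seen in the sample record count as no measurable rain. The function names are illustrative.

```python
from datetime import date

def day_of_week(iso_date):
    """Return (number, name) with Sunday = 0, so Tuesday = 2."""
    d = date.fromisoformat(iso_date)
    number = (d.weekday() + 1) % 7   # Python's weekday() uses Monday = 0
    return number, d.strftime("%A")

def rained(pcpn):
    """True when more than 0.00 inches of precipitation was recorded."""
    try:
        return float(pcpn) > 0.0
    except ValueError:
        # Assumption: codes that are not numbers are treated as no rain.
        return False

print(day_of_week("2021-06-22"))  # prints (2, 'Tuesday')
print(rained("0.10"))             # prints True
```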

Display results

We add two bar charts to the page. One shows the number of days it rained in Boston for each day of the week, while the other shows how much precipitation fell on each day of the week.
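The two numbers behind those charts are a count of rainy days and a sum of inches, both grouped by day of the week. A minimal Python sketch of that aggregation follows, reading the same date,precipitation CSV layout. The function name summarize is hypothetical, and the example rows are made up purely to show the shape of the output; they are not real observations.

```python
import csv
import io
from collections import defaultdict
from datetime import date

def summarize(csv_text):
    """Return {weekday name: [rainy_day_count, total_inches]}."""
    totals = defaultdict(lambda: [0, 0.0])
    for iso_date, pcpn in csv.reader(io.StringIO(csv_text)):
        try:
            inches = float(pcpn)
        except ValueError:
            continue  # assumption: skip non-numeric codes such as "M"
        if inches > 0.0:
            day = date.fromisoformat(iso_date).strftime("%A")
            totals[day][0] += 1        # chart 1: number of rainy days
            totals[day][1] += inches   # chart 2: total precipitation
    return dict(totals)

# Made-up rows for illustration only.
example = "2021-06-21,0.00\n2021-06-22,0.25\n2021-06-23,M\n2021-06-29,0.50\n"
print(summarize(example))  # prints {'Tuesday': [2, 0.75]}
```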

So what are the final results?

Over the past 10 years, Tuesday has been the rainiest day, both in the number of days with rain and in total inches of precipitation.

It took about an hour from start to finish to settle the great debate. I wonder what we will come up with next?
