Ever feel like the same thing happens on certain days of the week? An Alipian thought so, and it created a stir at the office.
Hypothesis: It rains more in Boston on Tuesday than on any other day of the week.
A number of people at the office agreed; others were not so sure. So we started keeping track on a whiteboard: whenever it rained at all on a Tuesday, the date was written down. A few years later, people had stopped questioning it and either accepted it or kept their objections to themselves. But is it really true? Let's do a quick technology build to find out.
Steps to Solution: Pull data from an API, only save relevant data, upload to a visualization tool, shape the data, display results.
Pull data from API
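The National Weather Service Forecast Office has precipitation data for Boston. They also have an API. To pull the daily data for the past ten years (2011-06-23 through 2021-06-23), run the following and redirect the output to tenyears.json for the next step: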
curl 'https://data.rcc-acis.org/StnData' -H 'content-type: application/x-www-form-urlencoded; charset=UTF-8' --data-raw 'params=%7B%22elems%22%3A%5B%7B%22name%22%3A%22maxt%22%2C%22add%22%3A%22t%22%7D%2C%7B%22name%22%3A%22mint%22%2C%22add%22%3A%22t%22%7D%2C%7B%22name%22%3A%22avgt%22%2C%22add%22%3A%22t%22%7D%2C%7B%22name%22%3A%22avgt%22%2C%22normal%22%3A%22departure91%22%2C%22add%22%3A%22t%22%7D%2C%7B%22name%22%3A%22hdd%22%2C%22add%22%3A%22t%22%7D%2C%7B%22name%22%3A%22cdd%22%2C%22add%22%3A%22t%22%7D%2C%7B%22name%22%3A%22pcpn%22%2C%22add%22%3A%22t%22%7D%2C%7B%22name%22%3A%22snow%22%2C%22add%22%3A%22t%22%7D%2C%7B%22name%22%3A%22snwd%22%2C%22add%22%3A%22t%22%7D%5D%2C%22sid%22%3A%22BOSthr+9%22%2C%22sDate%22%3A%222011-06-23%22%2C%22eDate%22%3A%222021-06-23%22%7D&output=json' --compressed
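The URL-encoded payload in the curl command is hard to read. As a rough Python sketch (standard library only), the same request body could be built from a plain dictionary. The element list, station ID, and dates are decoded from the command above; the function names `build_request_body` and `fetch_station_data` are hypothetical helpers introduced here for illustration, and the fetch has not been verified against the live service.

```python
import json
import urllib.parse
import urllib.request

ACIS_URL = "https://data.rcc-acis.org/StnData"  # endpoint from the curl command


def build_request_body(start_date, end_date):
    """Return a form-encoded body equivalent to the curl --data-raw payload."""
    # One entry per requested element; "add": "t" is copied from the
    # original request as-is.
    names = ["maxt", "mint", "avgt", "avgt", "hdd", "cdd", "pcpn", "snow", "snwd"]
    elems = [{"name": name, "add": "t"} for name in names]
    # The second avgt entry carries the extra "normal" option from the original.
    elems[3] = {"name": "avgt", "normal": "departure91", "add": "t"}
    params = {
        "elems": elems,
        "sid": "BOSthr 9",  # "BOSthr+9" in the curl command; "+" encodes a space
        "sDate": start_date,
        "eDate": end_date,
    }
    compact = json.dumps(params, separators=(",", ":"))
    return urllib.parse.urlencode({"params": compact, "output": "json"})


def fetch_station_data(start_date, end_date, path):
    """POST the request and save the raw JSON response to `path`."""
    body = build_request_body(start_date, end_date).encode("utf-8")
    request = urllib.request.Request(
        ACIS_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"},
    )
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

Under these assumptions, calling `fetch_station_data("2011-06-23", "2021-06-23", "tenyears.json")` would stand in for the curl command with its output saved to tenyears.json.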
Only save relevant data
The data was in the following format:
["2021-01-01",["36",24],["29",24],["32.5",24],["0.7",24],["32",24],["0",24],["0.10",24],["0.0",0],["M",-1]]
We only care about the date (index 0) and the precipitation value (index 7), so we use jq to join those two fields into a CSV.
cat tenyears.json | jq -r '.[] | .[0] + "," + .[7][0]' > tenyears.csv
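For readers without jq, a minimal Python sketch of the same extraction follows. It assumes each row has the shape shown above, with the date at index 0 and the precipitation pair at index 7. The helper name `rows_to_csv_lines` is hypothetical, and the handling of a top-level `data` key is an assumption about how the response was saved, not something the original steps confirm.

```python
import json


def rows_to_csv_lines(payload):
    """Yield 'date,precipitation' strings, mirroring the jq filter."""
    # Assumption: the saved JSON is either a bare list of rows, or an object
    # that keeps the rows under a "data" key.
    rows = payload["data"] if isinstance(payload, dict) else payload
    for row in rows:
        date = row[0]
        precipitation = row[7][0]  # first item of the precipitation pair
        yield f"{date},{precipitation}"


# One row copied from the sample format above.
sample = json.loads(
    '[["2021-01-01",["36",24],["29",24],["32.5",24],["0.7",24],'
    '["32",24],["0",24],["0.10",24],["0.0",0],["M",-1]]]'
)
csv_lines = list(rows_to_csv_lines(sample))
print(csv_lines[0])  # prints 2021-01-01,0.10
```

Writing the yielded lines to tenyears.csv, one per line, would give the same two-column file the jq command produces.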
Upload to a visualization tool
Google Data Studio accepts CSV uploads, and a date range control can be added to narrow the results to a specific period. Once the data is ingested, you can also shape it to make it easier to interpret.
Shape the data
Import the data into Google Data Studio and add two custom fields. The first converts the date to a day-of-week number (2 for Tuesday) and then to a human-readable name (Tuesday). The second checks whether the precipitation is greater than 0.00 inches; if it is, we count that date as a rainy day.
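The same two fields can also be computed outside Data Studio. The following Python sketch is one way to do it, not the formulas used in the report. It numbers days with Sunday as 0 so that Tuesday comes out as 2, matching the text; treating non-numeric readings such as "M" as no rain is an assumption, and both function names are hypothetical.

```python
from datetime import date

# Sunday-first so that index 2 is Tuesday, as in the text.
DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]


def day_of_week(iso_date):
    """Return (day number, day name) for a YYYY-MM-DD string."""
    d = date.fromisoformat(iso_date)
    number = d.isoweekday() % 7  # isoweekday is Mon=1..Sun=7, so Sunday becomes 0
    return number, DAY_NAMES[number]


def rained(precipitation):
    """True when the reading is a number greater than 0.00 inches."""
    try:
        return float(precipitation) > 0.0
    except ValueError:
        return False  # assumption: non-numeric readings do not count as rain


print(day_of_week("2021-01-01"))  # (5, 'Friday')
print(rained("0.10"))             # True
print(rained("M"))                # False
```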
Display results
We add two bar charts to the page. One shows the number of days it rained in Boston for each day of the week, while the other shows how much precipitation fell on each day of the week.
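The two bar charts correspond to two per-weekday aggregations. A minimal Python sketch of those aggregations follows; it is an illustration rather than what Data Studio does internally. The rows in `sample_rows` are invented placeholders, not values from the Boston data set, and `summarize_by_weekday` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Monday-first, matching date.weekday() (Monday is 0).
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def summarize_by_weekday(csv_text):
    """Return (rainy_day_counts, total_inches), each keyed by weekday name."""
    rainy_days = defaultdict(int)
    total_inches = defaultdict(float)
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 2:
            continue  # ignore blank or malformed lines
        iso_date, precipitation = row
        try:
            inches = float(precipitation)
        except ValueError:
            continue  # assumption: skip non-numeric readings such as "M"
        if inches > 0.0:
            weekday = DAY_NAMES[date.fromisoformat(iso_date).weekday()]
            rainy_days[weekday] += 1
            total_inches[weekday] += inches
    return dict(rainy_days), dict(total_inches)


# Invented placeholder rows in the same date,precipitation layout.
sample_rows = "2021-06-21,0.00\n2021-06-22,0.25\n2021-06-23,M\n2021-06-29,0.50\n"
counts, inches = summarize_by_weekday(sample_rows)
print(counts)  # {'Tuesday': 2}
print(inches)  # {'Tuesday': 0.75}
```

The first dictionary feeds the "number of rainy days" chart and the second feeds the "inches fallen" chart.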
So what are the final results?
Over the past 10 years, Tuesday has been the rainiest day, both in the number of rainy days and in the total inches of precipitation.
It took about an hour from start to finish to solve the great debate. I wonder what we will come up with next?