Most Recent:

Pitchfork Music Reviews (python BeautifulSoup)

Pitchfork 50 Best Albums of the Year

February 6, 2019

All code can be found here


Pitchfork.com is a music review website. Each album reviewed on the site is also rated on a scale from 0 to 10. Pitchfork began in the mid-90s as a review site for new independent music. Its cultural footprint has increased each year, and Pitchfork now hosts over 200,000 visitors on its site each day.

I love new music. I read Pitchfork daily to see if any new albums have been released that I would be interested in. Recently, while looking through some of the available data sets on…keep reading

2018 Staten Island Election Results (R ggplot2)

2018 Staten Island Election Results

January 21, 2019

All code can be found here

Staten Island

Staten Island is the most conservative of the five New York City boroughs. In 2016, 56% of Staten Islanders voted for Donald Trump (compared to just 37% for all of NYC). Trump received 82% of possible Staten Island votes during the 2016 Republican Primary. If Staten Island were its own city, it would be the 2nd largest conservative city in America.

In 2018, Dan Donovan (R) was seeking reelection for New York’s 11th Congressional District. The 11th district combines…keep reading

Automating Reports (R Markdown)

Using R Markdown to Automate Report Writing

January 2, 2019

All code can be found here

R Markdown

Markdown is a markup language used to create different types of documents (PDF, HTML, etc.). R Markdown allows a user to embed R code within the document to produce figures, create tables, perform calculations, and do anything else that can be done with R. R Markdown can even be used to create Microsoft Word documents…keep reading

New York Times Movie Reviews (R rvest)

Scraping New York Times Movie Reviews

December 20, 2018

All code can be found here

New York Times Movie Reviews

I love reading movie reviews. I avidly look forward to the Friday New York Times because that is when most of the movie reviews get printed (yes - printed, I still get the physical copy sent to my house on weekends). Manohla Dargis, A.O. Scott, …keep reading

The Wu-Tang Clan Network (python graphlab networkx)

The Wu-Tang Clan Network

December 10, 2018

All code can be found here

In honor of Wu-Tang Clan day, I dug out this old post I created for one of my grad school classes. We were learning about graph analysis and this is what I put together. All code is in Python, using the networkx and graphlab packages. Graphlab is pretty great.

For this project, we were tasked with exploring and analyzing a bi-modal network. It was important to me to choose a data set that I was familiar with. Familiarity with the data helped me make sure the calculations produce coherent results. …keep reading

538 NBA Predictions (R dplyr)

Posting Up 538 NBA Predictions Using R

November 25, 2018

All code can be found here


FiveThirtyEight is the best. From politics to pop culture, Nate Silver and his team do a great job creating interesting articles and visuals using various data science techniques.

A big part of what FiveThirtyEight does revolves around sports. A major focus of mine is also sports, specifically the NBA. FiveThirtyEight assigns win probabilities to every NBA game during the regular season and playoffs. I have been using the NBA regular season to test my modelling skills.

Predicting wins for every NBA game is difficult…keep reading