Most Recent:

2018 Staten Island Election Results (R ggplot2)

2018 Staten Island Election Results

January 21, 2019

All code can be found here

Staten Island

Staten Island is the most conservative of the 5 New York City boroughs. In 2016, 56% of Staten Islanders voted for Donald Trump (compared to just 37% across all of NYC). Trump received 82% of possible Staten Island votes during the 2016 Republican Primary. If Staten Island were its own city, it would be the second-largest conservative city in America.

In 2018, Dan Donovan (R) sought reelection in New York’s 11th Congressional District. The 11th district combines…keep reading

Automating Reports (R Markdown)

Using R Markdown to Automate Report Writing

January 2, 2019

All code can be found here

R Markdown

Markdown is a markup language used to create different types of documents (PDF, HTML, etc.). R Markdown lets a user embed R code within the document to produce figures, create tables, perform calculations, and do anything else that can be done with R. R Markdown can even be used to create Microsoft Word documents…keep reading

New York Times Movie Reviews (R rvest)

Scraping New York Times Movie Reviews

December 20, 2018

All code can be found here

New York Times Movie Reviews

I love reading movie reviews. I avidly look forward to the Friday New York Times because that is when most of the movie reviews get printed (yes, printed; I still get the physical copy delivered to my house on weekends). Manohla Dargis, A.O. Scott, …keep reading

The Wu-Tang Clan Network (python graphlab networkx)

The Wu-Tang Clan Network

December 10, 2018

All code can be found here

In honor of Wu-Tang Clan day, I dug out this old post I created for one of my grad school classes. We were learning about graph analysis and this is what I put together. All code is in Python, using the networkx and graphlab packages. Graphlab is pretty great.

For this project, we were tasked with exploring and analyzing a bi-modal network. It was important to me to choose a data set that I was familiar with. Familiarity with the data helped me make sure the calculations produced coherent results. …keep reading

538 NBA Predictions (R dplyr)

Posting Up 538 NBA Predictions Using R

November 25, 2018

All code can be found here


FiveThirtyEight is the best. From politics to pop culture, Nate Silver and his team do a great job creating interesting articles and visuals using various data science techniques.

A big part of what FiveThirtyEight does revolves around sports. A major focus of mine is also sports, specifically the NBA. FiveThirtyEight assigns win probabilities to every NBA game during the regular season and playoffs. I have been using the NBA regular season to test my modelling skills.

Predicting wins for every NBA game is difficult…keep reading