-
Bike Racing and Clustering
This cycling off-season, I’ve been experimenting with races on the game platform Zwift. These are cute online races where your digital avatar races against other real people at the same time; you connect your bike’s power output through a stationary trainer to your computer. These races are put on by various communities like World […]
-
The Data Science of Board Games @ PAX West
I had the great privilege of giving a talk at PAX West this year on applying data science to data from the board gaming world. The source data and code used in the talk can be found on my GitHub account: https://github.com/scottburger/pax17. The presentation slides can be found below: Over 300 people showed up to the talk, […]
-
Steam Games and Recommendation Systems
Every so often I’m at a loss for which game to recommend that my friends and I get together and play. Sometimes I’m at a loss for what game even I should be playing. After ruminating and playing around with some modelling scenarios, I think I may have designed a pretty decent game recommendation system […]
-
Sentiment Mining of Steam User Reviews
It’s often hard to figure out ways to review a product. How do we determine if it’s worth our time and focus? How do we assess what other people think about that product? In the case of the gaming sector: is it good enough to warrant me playing it? In 2015 I gave a talk […]
-
The Data Science of Board Games – Exploring the BoardGameGeek Database
Board games have changed. Much of the time when a friend suggests breaking out a board game to play, people will sigh. Monopoly, Life, dare I go so far as to say Checkers too? These are games that have been around forever (and by forever I mean the 1950s), take way too long to play, […]
-
Hierarchical Regression Modelling
Data comes in many forms, but a lot of the time people are focused on how that data is evolving over time. There have been numerous projects I’ve done in the past that have had some kind of sales data that looked like: Where we have a date/time value across all geographic locations, with multiple SKUs […]
-
Pythagorean Rank Optimization
Min/Maxing Optimization is the family of problems in which we want to find the maximum or minimum of something. Pretty simple when taken at face value. A lot of the time, optimization problems focus on finding a part of a curve or a plot that has a global maximum or minimum. This can be done with […]
-
Correlation, Causation, and Congress
The subreddit /r/dataisbeautiful often has fun visualizations, but too often improper meaning is derived from them. Some time ago, there was a post showing political donations to Congress from the telecom lobby: Clearly the intent is to show that there’s a lot of money being dumped into Congress in order to affect their thinking on votes […]
-
Somewhat Dissatisfied – How to reformulate a KPI to be more statistically balanced
KPI Genealogy: There was a project I worked on a long time ago where I was brought in to provide some data science magic on why this group’s KPI wasn’t moving. A KPI is a Key Performance Indicator: something you use to tell you the heartbeat of your business. Are the things we are doing having […]