Halo: Data Evolved

When Halo 2 came out in 2004, it was a watershed moment in videogaming for me. A sequel to one of my favorite games, but with online play, you say??? It was an electrifying time to be playing games, and checking my game stats on halo.bungie.net was an almost daily habit.

So it was sad to read in a recent Bungie blog post that they plan to take the site down on 2021-02-09. I guess some data doesn’t last forever. That sent me into a coding fever to try to build a gentle web scraper and preserve the sacred texts.

So if you want to preserve your Halo 2 and Halo 3 game data from this ancient relic, see my Python script on GitHub here: https://github.com/ScottBurger/halo_preserver

There’s lots of great, ancient data here that is due to be lost to time. Unless the Internet Archive project crawls everyone’s data, it’ll be gone for good. I had in mind to answer some questions about my old Halo data, but now a fire’s been lit to grab it all before it’s gone.

The intent here is to just scrape the HTML files for the raw data since it’s a more lightweight approach for data science purposes than downloading all the media for the page as well. An upcoming post will detail how to mine the thousands of HTML files to build some interesting stats views.
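To give a rough idea of what "HTML only, and gentle" can look like, here is a minimal sketch. It is not the code from the halo_preserver repo: the function names, the User-Agent string, the file naming, and the two-second delay are all assumptions made up for illustration, and the list of page URLs is left to the caller since the site's URL layout isn't covered here.

```python
import time
import urllib.request
from pathlib import Path


def fetch_html(url, timeout=30):
    """Download one page and return its raw bytes (HTML only, no images or scripts)."""
    # The User-Agent value is a made-up placeholder, not something the site requires.
    request = urllib.request.Request(
        url, headers={"User-Agent": "halo-stats-archiver (personal use)"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def save_pages(urls, out_dir, fetch=fetch_html, delay_seconds=2.0):
    """Save each URL's HTML into out_dir, pausing between downloads to stay gentle."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, url in enumerate(urls):
        # Hypothetical naming scheme: page_00000.html, page_00001.html, ...
        target = out_path / f"page_{index:05d}.html"
        if target.exists():
            # Already downloaded on an earlier run, so skip without hitting the server.
            saved.append(target)
            continue
        target.write_bytes(fetch(url))
        saved.append(target)
        # Wait between requests so the old server isn't hammered.
        time.sleep(delay_seconds)
    return saved
```

Calling `save_pages(my_urls, "halo_html")` with your own list of game-page URLs would leave one `.html` file per page in the `halo_html` folder. Because existing files are skipped, the run can be stopped and restarted without re-downloading anything.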

But time is short! Go get your data!




