When Halo 2 came out in 2004, it was a watershed moment in videogaming for me. A sequel to one of my favorite games, but with online play you say??? It was just an electrifying time to be playing games, and checking my game stats on halo.bungie.net was an almost daily habit.
So it was sad to read in a recent Bungie blog post that they plan to take the site down on 2021-02-09. I guess some data doesn’t last forever. That sent me into a coding fever to build a gentle web scraper and preserve the sacred texts.
So if you want to preserve your Halo 2 and Halo 3 game data from this ancient relic, see my Python script on GitHub here: https://github.com/ScottBurger/halo_preserver
There’s lots of great, ancient data here that’s due to be lost to time. Unless the Internet Archive project does a full crawl of everyone’s data, it’ll be gone for good. I had in mind to answer some questions about my old Halo data, but now a fire’s been lit to grab it all before it’s gone.
The intent here is to scrape just the HTML files for the raw data, since that’s a much more lightweight approach for data science purposes than downloading all of each page’s media as well. An upcoming post will detail how to mine the thousands of HTML files to build some interesting stats views.
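For the curious, here’s a minimal sketch of what a gentle, HTML-only scraper can look like. To be clear, this is not the code from the repo, just an illustration of the general idea: fetch only the page HTML, skip anything already saved, and pause between requests so the server isn’t hammered. The function names (`fetch_html`, `save_pages`), the User-Agent string, and the two-second delay are all my own assumptions for the example, and the caller supplies the list of page names and URLs.

```python
import time
import urllib.request
from pathlib import Path


def fetch_html(url, timeout=30):
    """Download one page and return its raw bytes (just the HTML, no images/CSS/JS)."""
    request = urllib.request.Request(
        url,
        # Hypothetical User-Agent; identify yourself however you see fit.
        headers={"User-Agent": "halo-stats-archiver (personal backup)"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def save_pages(pages, out_dir, fetch=fetch_html, delay_seconds=2.0):
    """Save each (name, url) pair as out_dir/<name>.html, pausing between requests.

    Pages already on disk are skipped, so an interrupted run can be resumed.
    Returns the list of names that were newly downloaded.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    downloaded = []
    for name, url in pages:
        target = out_path / f"{name}.html"
        if target.exists():
            continue  # already archived, no need to ask the server again
        target.write_bytes(fetch(url))
        downloaded.append(name)
        time.sleep(delay_seconds)  # the "gentle" part: wait before the next request
    return downloaded
```

You’d call it with something like `save_pages([("game_1", some_url)], "halo_html")`, where the URLs are whatever stats pages you want to keep. Skipping files that already exist matters more than it looks: with thousands of pages and a deadline, being able to restart a run without re-downloading everything saves a lot of time.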
But time is short! Go get your data!