Foursquare, the popular check-in based social network, recently surpassed 7 million registered users. With that many people checking in, earning badges, and interacting via the service, I wanted to take a look at what information could be gleaned and see if any interesting stories could be told. Using the popular Twitter4J library, I wrote a simple Java program that listens to the Twitter Streaming API for tweets announcing Foursquare badges. At the same time, I’m able to gather information about the Twitter user’s account, like their local time and general location. This will let me generate some pretty interesting stats.
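To give a feel for the matching step, here is a minimal sketch of how a badge name might be pulled out of a tweet's text. It is not the actual collector: the class name `BadgeTweetParser`, the sample URL, and especially the assumed tweet wording (`I just unlocked the "..." badge on @foursquare`) are illustrative assumptions, and the Twitter4J stream handling is left out so the example stays self-contained.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BadgeTweetParser {
    // Assumed wording of a badge announcement; adjust if real tweets differ.
    private static final Pattern BADGE_PATTERN = Pattern.compile(
            "I just unlocked the \"([^\"]+)\" badge on @foursquare",
            Pattern.CASE_INSENSITIVE);

    /** Returns the badge name if the text looks like a badge announcement. */
    public static Optional<String> extractBadge(String tweetText) {
        if (tweetText == null) {
            return Optional.empty();
        }
        Matcher m = BADGE_PATTERN.matcher(tweetText);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical sample tweet, not taken from the collected data.
        String sample =
                "I just unlocked the \"Newbie\" badge on @foursquare! http://4sq.com/example";
        System.out.println(extractBadge(sample).orElse("(not a badge tweet)"));
    }
}
```

In a real collector, a method like this would be called on the text of each status delivered by the stream listener, and anything that returns an empty result would simply be skipped.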
I’ve been collecting sample data for about 36 hours and can already present some general numbers. I should note the limitations of these numbers. First, I’m only showing badges that were reported to the public Twitter feed; if it was a private status, I can’t see it. Second, tweets can slip through the cracks: if the application is being rate limited, I can miss tweets. Third, the system can be gamed: someone could falsely tweet about a badge being earned, or tweet out a fake badge (I’m working on filtering that now).
In the 36 or so hours that I’ve been collecting data, I’ve seen over 27,000 tweets about badges. Here are some charts showcasing a few aspects from my current dataset.
This is a breakdown of the top badges earned. As you can see, no single badge dominates the top of the ranking; the counts actually fall off fairly evenly. All of these badges have fairly simple requirements, such as checking into the same place three times, checking into 10 different locations, etc.
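For anyone curious how a ranking like this can be produced, here is a rough sketch of tallying badge names and sorting them by count. The class name `BadgeTally` and the sample badge names are made up for illustration and are not figures from my dataset.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class BadgeTally {
    /** Counts each badge name and returns the most frequent ones first. */
    public static List<Map.Entry<String, Long>> topBadges(List<String> badgeNames, int limit) {
        Map<String, Long> counts = badgeNames.stream()
                .collect(Collectors.groupingBy(name -> name, Collectors.counting()));
        return counts.entrySet().stream()
                // Highest count first; ties broken alphabetically for stable output.
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(limit)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Illustrative input only.
        List<String> seen = Arrays.asList("Newbie", "Local", "Newbie", "Adventurer", "Local", "Newbie");
        for (Map.Entry<String, Long> entry : topBadges(seen, 3)) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
    }
}
```

The sorted list of name and count pairs is the kind of input a bar chart needs.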
Another fairly interesting trend is the level of activity during the average day. I haven’t adjusted this data to account for local timezones, so we’re seeing a fairly high level of badge-earning activity around the world. Once I account for the timezone differences, I’ll be able to better see at what local time of day people earn the most badges (I’m assuming there will be peaks at lunch, in the evening, etc.).
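The timezone adjustment itself is simple arithmetic. Here is a sketch, assuming each tweet comes with a UTC timestamp and the account's UTC offset in seconds. The class name `LocalHourHistogram`, the method names, and the sample values are hypothetical.

```java
import java.time.Instant;
import java.time.ZoneOffset;

public class LocalHourHistogram {
    /** Converts a UTC timestamp to the hour of day (0-23) at the given offset. */
    public static int localHour(Instant createdAtUtc, int utcOffsetSeconds) {
        return createdAtUtc.atOffset(ZoneOffset.ofTotalSeconds(utcOffsetSeconds)).getHour();
    }

    /** Buckets tweets by local hour; both arrays must have the same length. */
    public static int[] countByLocalHour(long[] epochSeconds, int[] utcOffsetSeconds) {
        int[] buckets = new int[24];
        for (int i = 0; i < epochSeconds.length; i++) {
            Instant createdAt = Instant.ofEpochSecond(epochSeconds[i]);
            buckets[localHour(createdAt, utcOffsetSeconds[i])]++;
        }
        return buckets;
    }

    public static void main(String[] args) {
        // Illustrative timestamp: 17:30 UTC seen from UTC-5 is 12:30 local time.
        Instant tweetTime = Instant.parse("2011-03-01T17:30:00Z");
        System.out.println(localHour(tweetTime, -5 * 3600)); // prints 12
    }
}
```

One caveat with this approach: a fixed offset taken from the account settings says nothing about where the user actually was when they checked in, so the resulting histogram is only an approximation.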
These charts represent the tip of the iceberg with regard to what I’ll be able to do with the data. Now that the badges are flowing in, I can start to focus on processing this data. I know I’m going to become closely acquainted with the Google Charts API in the next few days!

