I love data. I’ve known this for quite a long time. Back when I first made the conscious decision to start collecting video games, for example, my first thought was not about where to get the best deals but how to keep track of everything. After fussing around with various collection software, I settled on the most fully functional database-style option I could find. Would it ever matter that I had the barcode of every game I owned? No, but keeping track of every single data point was important to me. The same has followed through in many other ways, leading me down the path of learning more about data gathering and data analysis, though this process of learning is always ongoing.
It is with all this in mind that I decided in early 2015 to start posting Kickstarter video game analysis. We at Cliqist had already grabbed a huge amount of data off Kickstarter in a handful of ways (from Kickspy, and by our own methods), but the results were imperfect. As is common with me, one option was clear: gather our own data without using scrapers. Gathering by hand would take far longer, but it would also yield better control over how and what we saved. And so I got to work, quickly snapping up information about every video game (and gaming-related) campaign I could.
How did I start this initiative? All I did was open up an Excel spreadsheet, add some column headers that the Cliqist staff had decided upon, and get to work adding information as new projects launched. Each day I would go to Kickspy and update my spreadsheet with the newest projects in the Video Game section, and sometimes from other sections if I discovered gaming projects that were misplaced or were mixed media campaigns. For example, a gaming documentary might be in the Film section even though it dealt directly with games. Mobile titles and the like also sometimes ended up in weird places. Tracking things this closely led to the potential for more data than scrapers alone could provide, but certainly there were issues as well.
One of the biggest issues early on was the matter of canceled campaigns. Once canceled, they would basically disappear from the main search on Kickspy, and even Kickstarter has been very unwilling to show canceled campaigns easily. Luckily, my daily gathering got almost all of these titles onto the list before they were canceled, but not always. That’s where RSS feeds came in handy. By loading up the proper section in FeedDemon (my RSS reader of choice, despite its being discontinued), I would continue to receive updates for every project, even if it was canceled within hours of launch (and yes, that has definitely happened before). What still isn’t perfect about this? Projects from other sections may occasionally slip by my notice if they disappear quickly, since my feed reader would destroy itself if it had to keep track of every category on Kickstarter. My current RSS gathering includes feeds from both Kicktraq and Crowdfunding RSS in case something slips through the cracks of one.
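The idea of watching two feeds so that nothing slips through the cracks of one can be sketched in a few lines of Python. This is only an illustration and not how the gathering is actually done here (that happens in a feed reader and by hand): the sample feed contents, the example.com links, and the function names `parse_feed` and `merge_feeds` are all made up for the example, and the feeds are assumed to be plain RSS 2.0.

```python
import xml.etree.ElementTree as ET


def parse_feed(xml_text):
    """Return a list of (title, link) pairs from an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if link:  # skip entries with no link, since the link is our key
            items.append((title, link))
    return items


def merge_feeds(feeds, already_tracked=()):
    """Combine several feeds, keeping one entry per link not yet tracked."""
    seen = set(already_tracked)
    new_projects = []
    for xml_text in feeds:
        for title, link in parse_feed(xml_text):
            if link not in seen:
                seen.add(link)
                new_projects.append((title, link))
    return new_projects


# Hypothetical sample data standing in for two different feed sources.
FEED_A = """<rss version="2.0"><channel>
<item><title>Game One</title><link>https://example.com/projects/game-one</link></item>
<item><title>Game Two</title><link>https://example.com/projects/game-two</link></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel>
<item><title>Game Two</title><link>https://example.com/projects/game-two</link></item>
<item><title>Game Three</title><link>https://example.com/projects/game-three</link></item>
</channel></rss>"""

if __name__ == "__main__":
    for title, link in merge_feeds([FEED_A, FEED_B]):
        print(title, link)
```

A project that shows up in only one feed still makes the list, and one that shows up in both is only listed once. Passing links that are already in the spreadsheet as `already_tracked` would leave just the new launches.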
Of course, earlier in the year Kickspy shut down. Why had I not just used Kickstarter’s own listing of campaigns? It just isn’t nearly as conducive to quick and orderly searching, I’ve found. So, after Kickspy was gone, I had to switch over to Kicktraq. This site is certainly quite handy as well, but it simply isn’t quite as good. With that said, it is light-years ahead of other sites attempting to provide the same campaign stats. In June I also expanded the amount of data gathered to create an even fuller picture of what was going on with projects. Sure, it was hard to say what all this data might be used for, but we at Cliqist figured it was much better to secure as much data as possible up front rather than retroactively add items later on. Some data still doesn’t have a direct purpose, but we may find a new way to utilize it in the future.
Other unexpected challenges have appeared along this path. For one, the number of Kickstarter campaigns has increased tremendously over 2015. Earlier in the year, it was not that tough to keep track of every new project launch and add its data right in. As the months wore on, however, I felt a perceptible increase in the number of titles that I must add to the spreadsheet on a daily basis. Right now (during the holiday season) the campaigns have slowed a bit, but if trends continue then 2016 will require even more focus on adding data daily. What could once be accomplished in 20 minutes or so on a “popular” day might now start taking over an hour. And that time does not include the work that comes after projects are over, when I must once again add in a variety of data about each project’s closing stats.
That’s not to say any of this is particularly unenjoyable. In fact, it is tremendously exciting to see this collection of data grow larger and larger because it means at some point this information will be even more relevant from a statistical standpoint. Kickstarter provides its own informational statistics post at the end of every year, but it is tremendously bare-bones. Thus far, they’ve also been unwilling to share the specific data used to create those posts with others. By gathering heaps of data on my own, however, I can see with a pretty clear view what exactly is going on with video game projects on Kickstarter. There’s no need to depend on data from others, which may not be an exact fit with my aims. Sure, it requires quite a bit of babysitting and is imperfect, but it still feels quite great to have on hand.
The other issue with all this data gathering is that, at times, it causes me to lose sight of the bigger picture. I’ve hand-selected the kind of data I want to transform into informational graphs and charts in my monthly campaign analysis posts, sure. But is that what people actually want to hear about? This is the main reason why I have begun to ask others what exactly they want to know about Kickstarter data. Unfortunately, sometimes these requests are not possible to fulfill with the current implementation of data gathering. In other words, gathering everything purely by hand in Excel leads to a variety of restrictions. This is one reason a goal of Cliqist is to transform the data into a database and potentially add some other form of data grabbing to lessen the strain on myself.
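To give a rough idea of what moving from a spreadsheet to a database could look like, here is a minimal sketch using Python’s built-in `csv` and `sqlite3` modules. It assumes the spreadsheet has been exported to CSV, and everything specific in it is hypothetical: the column names (`name`, `category`, `goal_usd`, `pledged_usd`, `status`), the sample rows, and the helper names are invented for illustration and are not the actual Cliqist layout.

```python
import csv
import io
import sqlite3

# Made-up sample rows standing in for a CSV export of the spreadsheet.
SAMPLE_CSV = """name,category,goal_usd,pledged_usd,status
Game One,Video Games,10000,12500,successful
Game Two,Video Games,50000,4200,failed
Game Doc,Film,8000,0,canceled
"""


def load_projects(csv_text, conn):
    """Create the projects table if needed and load CSV rows into it."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS projects (
               name TEXT PRIMARY KEY,
               category TEXT,
               goal_usd REAL,
               pledged_usd REAL,
               status TEXT
           )"""
    )
    rows = csv.DictReader(io.StringIO(csv_text))
    # Named placeholders map each CSV column onto the matching table column.
    conn.executemany(
        "INSERT OR REPLACE INTO projects "
        "VALUES (:name, :category, :goal_usd, :pledged_usd, :status)",
        rows,
    )
    conn.commit()


def count_by_status(conn):
    """Return a dict mapping each campaign status to its project count."""
    cur = conn.execute(
        "SELECT status, COUNT(*) FROM projects GROUP BY status ORDER BY status"
    )
    return dict(cur.fetchall())


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database for the demo
    load_projects(SAMPLE_CSV, conn)
    print(count_by_status(conn))
```

The appeal of a setup like this is that a reader request such as “how many canceled projects were there?” becomes a one-line query instead of a round of manual filtering in Excel.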
With over 1,000 gaming projects launched on Kickstarter this year, we have a ton of information available and have been doing our best to provide it to people on a monthly basis. There’s little chance this will change in the future. If things change, they will be behind the scenes. No matter what, the goal is to make these posts even more useful to people with even more information packed within. If you have ever read one of my Kickstarter Videogame Campaign and Analysis posts, thank you. I’m glad to provide these for all of us out there who love looking at the numbers behind Kickstarter successes and failures.