Using statistical regression methods, I constructed an algorithm that predicts the probability of success for almost every project on DonorsChoose.org (DC) to within 2-7%. My report, Predicting Project Completion on DonorsChoose.org (pdf), explains how I constructed the prediction models and evaluated the results, and describes a couple of ways the algorithm could be implemented in the DC system. The video below is my introduction to the algorithm and what it could do for DC.

I was hoping to develop an app as part of this analysis so that people could at least see predictions generated in real time, but that is still a little above my skill level, and I needed to focus on improving the algorithm. If you want to calculate your own probabilities, see the appendices in the report for the necessary values and variables, along with an example.

I also have a Python script that will compute probabilities for live projects on DonorsChoose.org, if you’re savvy like that. I have a zipped copy of it, along with an unusual dependency package (nltk_contrib) that is used to compute the readability scores.
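For anyone calculating a probability by hand from the appendix values, here is a minimal sketch of the arithmetic. It assumes the model has the usual logistic regression form (an intercept plus one coefficient per variable, passed through the logistic function); check the report for the actual form. The function name, the variable names, and every number below are made up for illustration and are not the values from the report.

```python
import math


def predict_probability(intercept, coefficients, features):
    """Return a probability from a logistic-style model.

    Assumes p = 1 / (1 + exp(-z)), where z = intercept + sum(coef * value).
    `coefficients` and `features` are dicts keyed by variable name.
    """
    missing = set(coefficients) - set(features)
    if missing:
        raise KeyError(f"missing feature values: {sorted(missing)}")

    # Linear predictor: intercept plus the weighted sum of the feature values.
    z = intercept + sum(coefficients[name] * features[name] for name in coefficients)

    # Numerically stable logistic function (avoids overflow for large |z|).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical coefficients and project values, NOT taken from the report.
example_coefficients = {"total_price": -0.002, "essay_readability": 0.05}
example_project = {"total_price": 450.0, "essay_readability": 8.0}

probability = predict_probability(1.2, example_coefficients, example_project)
print(f"Estimated probability of completion: {probability:.2f}")
```

To use it with real numbers, swap in the intercept, coefficients, and variable values listed in the report's appendices.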
Really great work, Jason!
Thanks so much Clay!