The team and I from Debug Politics are still working on updating the VoterKarma app.
For my part, I tested three classifiers: Logistic Regression, Random Forest and XGBoost. That code is here.
I hadn’t used XGBoost before, but it’s easy to incorporate into the cross-validation (CV) process in sklearn. XGBoost definitely outperformed Random Forest and Logistic Regression, so we’ll be using it for our scoring going forward.
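For anyone curious what that looks like, here is a minimal sketch of comparing the three classifiers with sklearn's cross-validation. It is not the project code: the data is synthetic (`make_classification` stands in for the voter file features), and the ROC AUC metric, fold count and hyperparameters are assumptions picked for illustration. XGBoost is only included if the `xgboost` package is installed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (assumption); the real features come from the voter file.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# XGBClassifier follows the sklearn estimator interface, so it can be
# handed to cross_val_score like any other model.
try:
    from xgboost import XGBClassifier
    models["XGBoost"] = XGBClassifier(n_estimators=50, random_state=0)
except ImportError:
    pass  # xgboost not installed; compare the sklearn models only

# Mean 5-fold CV score per model. ROC AUC is an assumed metric.
cv_scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    for name, model in models.items()
}

for name, score in cv_scores.items():
    print(f"{name}: {score:.3f}")
```

Because all three models share the same `fit`/`predict_proba` interface, swapping one for another is a one-line change, which is what makes the comparison cheap to run.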
One interesting thing I learned from looking at the NYC voter file is how stark the age difference is between those who vote and those who don’t. Check it out:
For the presidential election, the gap is not so stark, but for local elections, the age distribution of voters skews much older than that of non-voters. It’s not a new finding, but I found the difference between local and national elections pretty surprising.