The scoring method for the 2016 prediction contest didn't properly reward well-calibrated predictions, which we are going to fix this year by using log scoring! From a correspondent:
Log scoring is a "proper scoring rule," which means your expected score is maximized by reporting your true probability distribution. Do keep in mind that for this year's contest you need to be careful with very low or very high probabilities.
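As a quick illustration of what "proper" means, here is a minimal sketch for a single yes/no question. The function name `expected_log_score`, the example true probability of 0.7, and the grid of candidate reports are all assumptions made for the example, not part of the contest rules.

```python
import math

def expected_log_score(true_p: float, reported_q: float) -> float:
    """Expected log score when the event happens with probability true_p
    but the forecaster reports reported_q (hypothetical helper)."""
    return (true_p * math.log(reported_q)
            + (1.0 - true_p) * math.log(1.0 - reported_q))

# Assumed example: the event really happens 70% of the time.
true_p = 0.7

# Try every report from 1% to 99% and keep the one with the best expected score.
candidates = [i / 100 for i in range(1, 100)]
best_q = max(candidates, key=lambda q: expected_log_score(true_p, q))

print(best_q)
```

On this grid the best report coincides with the true probability, so shading your answer in either direction only lowers your expected score.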
As a note, you can shift the scores around (linear transformations) to have certain properties pretty easily. Some examples:
Scoring LN(P(correct)/0.5) or LN(P(correct)) - LN(0.5) gives you scores that would be 0 if someone assigned a 50% probability to everything.
Doing that and then dividing each score by LN(2) gives you a final score that will just be equal to the total number of items scored if someone chooses everything correctly at 100% confidence.
In every case, being confident and wrong sends you toward a score of negative infinity.
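The shifted and rescaled score described above can be sketched in a few lines. This is only one possible reading of the email: the names `log_score` and `total_score` are hypothetical, and the input is assumed to be the probability each forecaster assigned to the outcome that actually occurred.

```python
import math

def log_score(p_correct: float) -> float:
    """(LN(p) - LN(0.5)) / LN(2): 0 at 50%, 1 at 100%, -inf at 0%."""
    if p_correct <= 0.0:
        # math.log(0) raises ValueError, so return the limit explicitly.
        return float("-inf")
    return (math.log(p_correct) - math.log(0.5)) / math.log(2)

def total_score(probs_on_truth):
    """Sum the per-item scores over all scored items."""
    return sum(log_score(p) for p in probs_on_truth)

print(total_score([0.5, 0.5, 0.5]))  # everything at 50%
print(total_score([1.0, 1.0, 1.0]))  # everything right at 100%
print(total_score([0.9, 0.0]))       # one confident miss
```

Assigning 50% to everything totals zero, getting everything right at 100% totals the number of items, and a single probability of zero on the true outcome drags the total to negative infinity.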
I took a decision analysis class in grad school where the tests were probabilistic multiple choice. The professor made it very clear that assigning a probability of zero to something meant you were willing to risk failing the class if you were wrong (and apocryphally had failed a former student). It ended up being pretty typical to assign 0.94/0.02/0.02/0.02 most of the time and move the high probability downward when you were less sure. You didn't want to expend too much mental energy on calibration.
Another fun activity was having everyone in the class write down what probability range they had in mind when they used phrases like: probably, most likely, as likely as not, maybe, certainly, etc. The lesson was that even if you personally had a well-defined notion of what you meant by those terms, it didn't matter when communicating, because other people have wildly varying notions.
I re-scored the 2016 contest with log scoring, first converting all 0% to 0.001 and all 100% to 0.999.
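For anyone who wants to try the same thing, here is a rough sketch of that kind of re-scoring, not the exact procedure used. The names `clamp` and `score_entry` are made up for the example, and each entry is assumed to be a list of (stated probability, did it happen) pairs. Note that clamping also nudges any value between 0 and 0.001, or between 0.999 and 1, which goes slightly beyond converting exact 0% and 100% answers.

```python
import math

def clamp(p: float, lo: float = 0.001, hi: float = 0.999) -> float:
    """Pull extreme probabilities into [lo, hi] so the log stays finite."""
    return min(max(p, lo), hi)

def score_entry(predictions) -> float:
    """Total natural-log score for one contestant.

    predictions: iterable of (p, happened) pairs, where p is the stated
    probability that the event happens and happened is True or False.
    """
    total = 0.0
    for p, happened in predictions:
        p = clamp(p)
        # Probability the contestant put on what actually occurred.
        p_correct = p if happened else 1.0 - p
        total += math.log(p_correct)
    return total

# Assumed toy data: two contestants, three questions each.
cautious = [(0.8, True), (0.3, False), (0.6, False)]
extreme = [(1.0, True), (0.0, False), (1.0, False)]

print(score_entry(cautious))
print(score_entry(extreme))
```

In this toy data the extreme contestant gets two questions right at essentially no cost, but the single wrong 100% answer costs about ln(0.001), which is more than enough to put them behind the cautious one.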
Interestingly, the winner of the contest remains the same. However, the bottom of the ranking changed: the new losers are two guys who made a lot of wrong, extreme 0/100 bets.