Intuiting Randomness - Bayesian Calibration Game
Interactive demo that tests how well people can intuit randomness and mimic probability distributions. Uses Bayesian inference to score calibration.
Introduction
- This demo tests how well users can intuit randomness and mimic probability distributions. Users are shown a probability p and must generate binary predictions that collectively match that distribution.
- The key challenge: when shown p=0.7, can you generate a sequence where approximately 70% are 1s, while maintaining appropriate randomness? Most people are poorly calibrated - they might guess 1s too often when p is high or show predictable patterns.
- The system uses Bayesian inference to learn and score your calibration function q(p), which maps the shown probability p to your actual guessing probability.
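As a concrete sketch of what a calibration function q(p) can look like (the two-parameter logistic form and default values here are illustrative, not necessarily the demo's exact model), an under-confident user compresses probabilities toward 0.5:

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

# Hypothetical calibration model: logit(q) = a + b * logit(p).
# Perfect calibration is a=0, b=1; b<1 compresses guesses toward 0.5.
def q(p, a=0.0, b=0.6):
    return sigmoid(a + b * logit(p))

print(q(0.7))  # an under-confident user guesses 1 less often than 70% of the time
```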
What I Implemented
- Bayesian posterior computation for q(p): After each guess, the system updates a posterior distribution over your calibration parameters using Bayesian logistic regression. This captures how your implicit probability q(p) relates to the true probability p.
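One way to realize this update (a minimal sketch, assuming the two-parameter logistic calibration model logit(q) = a + b * logit(p); logistic regression has no exact conjugate posterior, so a grid approximation stands in here):

```python
import numpy as np

def logit(p):
    return np.log(p / (1 - p))

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Grid over calibration parameters (a, b) in logit(q) = a + b * logit(p).
A, B = np.meshgrid(np.linspace(-2, 2, 81), np.linspace(0, 2, 81))
log_post = -0.5 * (A**2 + (B - 1) ** 2)  # Gaussian prior centered on perfect calibration

def update(log_post, p_shown, guess):
    """Multiply in the Bernoulli likelihood of one binary guess."""
    q = sigmoid(A + B * logit(p_shown))
    like = q if guess == 1 else 1 - q
    log_post = log_post + np.log(like)
    return log_post - log_post.max()  # rescale for numerical stability

for p_shown, guess in [(0.7, 1), (0.7, 1), (0.7, 0), (0.3, 0)]:
    log_post = update(log_post, p_shown, guess)

post = np.exp(log_post)
post /= post.sum()
print(float((post * B).sum()))  # posterior mean of the slope b
```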
- Integration over posterior for scoring: Rather than using MAP estimates, the scoring integrates over the full posterior distribution when computing calibration error. This provides robust scoring even with limited data.
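A sketch of posterior-averaged scoring (the Gaussian "posterior samples" here are a stand-in; in practice they would be drawn from the fitted posterior over the calibration parameters):

```python
import numpy as np

def logit(p):
    return np.log(p / (1 - p))

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Hypothetical posterior samples over (a, b) in logit(q) = a + b * logit(p).
rng = np.random.default_rng(0)
samples = rng.normal([0.1, 0.8], [0.2, 0.1], size=(2000, 2))

def calibration_error(samples, grid=np.linspace(0.05, 0.95, 19)):
    """Mean squared gap between q(p) and p, averaged over the posterior AND over p,
    rather than evaluated at a single MAP point estimate."""
    a = samples[:, :1]                      # shape (n_samples, 1)
    b = samples[:, 1:]                      # shape (n_samples, 1)
    q = sigmoid(a + b * logit(grid))        # broadcasts to (n_samples, n_grid)
    return float(np.mean((q - grid) ** 2))

print(calibration_error(samples))
```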
- Proper uncertainty quantification: Error bars are computed as credible intervals by sampling from the posterior distribution, reflecting genuine uncertainty rather than simple standard errors.
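The credible-interval step can be sketched like this (the Beta draws below are a stand-in for actual posterior samples of q at one shown probability):

```python
import numpy as np

# Hypothetical posterior samples of q(0.7); in the demo these would come
# from the fitted posterior over the calibration parameters.
rng = np.random.default_rng(1)
q_samples = rng.beta(7, 3, size=5000)

# 90% credible interval: central quantiles of the posterior samples,
# not a symmetric +/- standard error around a point estimate.
lo, hi = np.percentile(q_samples, [5, 95])
print(round(lo, 3), round(hi, 3))
```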
- Real-time calibration visualization: The interface shows your calibration curve q(p) against perfect calibration (the identity line), with shaded regions indicating posterior uncertainty.
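A matplotlib sketch of that view (model, parameter values, and filename are illustrative): each posterior sample of (a, b) traces one candidate curve q(p), and the pointwise quantiles of those curves form the shaded band around the posterior mean.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend for scripted rendering
import matplotlib.pyplot as plt

def logit(p):
    return np.log(p / (1 - p))

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Hypothetical posterior samples over (a, b) in logit(q) = a + b * logit(p).
rng = np.random.default_rng(2)
samples = rng.normal([0.1, 0.8], [0.2, 0.1], size=(1000, 2))

p = np.linspace(0.05, 0.95, 50)
curves = sigmoid(samples[:, :1] + samples[:, 1:] * logit(p))  # (1000, 50)
mean = curves.mean(axis=0)
lo, hi = np.percentile(curves, [5, 95], axis=0)  # pointwise 90% band

fig, ax = plt.subplots()
ax.plot(p, p, "--", label="perfect calibration")  # identity line
ax.plot(p, mean, label="posterior mean q(p)")
ax.fill_between(p, lo, hi, alpha=0.3, label="90% credible band")
ax.set_xlabel("shown probability p")
ax.set_ylabel("guessing probability q(p)")
ax.legend()
fig.savefig("calibration.png")
```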
- Efficient incremental updates: Implemented conjugate prior approximations and incremental update rules to avoid recomputing from scratch after each guess.
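One simple conjugate scheme in this spirit (a sketch, not necessarily the demo's exact rule): bucket the shown probabilities into bins and keep a Beta posterior per bin, so each guess is an O(1) counter increment and nothing is recomputed from scratch.

```python
from collections import defaultdict

class BinnedBetaCalibration:
    """Per-bin Beta(alpha, beta) posteriors over the user's guessing probability."""

    def __init__(self, n_bins=10):
        self.n_bins = n_bins
        self.counts = defaultdict(lambda: [1.0, 1.0])  # Beta(1,1) uniform prior

    def _bin(self, p):
        return min(int(p * self.n_bins), self.n_bins - 1)

    def update(self, p_shown, guess):
        # Conjugate Beta-Bernoulli update: one increment per guess.
        self.counts[self._bin(p_shown)][0 if guess == 1 else 1] += 1.0

    def q_mean(self, p_shown):
        alpha, beta = self.counts[self._bin(p_shown)]
        return alpha / (alpha + beta)

cal = BinnedBetaCalibration()
for guess in [1, 1, 1, 0]:
    cal.update(0.7, guess)
print(cal.q_mean(0.7))  # Beta(4, 2) posterior mean = 4/6
```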