To help students get a better feel for three of the most popular “multi-armed bandit” strategies for balancing exploration and exploitation (Epsilon Greedy, Thompson Sampling, and Upper Confidence Bound), I combined my R package “contextual” with the versatile …
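As a rough sketch of the kind of comparison involved (not the animated code from the post itself), the three strategies can be run side by side on a simple three-armed Bernoulli bandit using contextual’s `Agent`/`Simulator` abstractions. The class names (`EpsilonGreedyPolicy`, `ThompsonSamplingPolicy`, `UCB1Policy`, `BasicBernoulliBandit`) and constructor arguments below follow the package’s documented API as I recall it, so treat them as assumptions to check against the current documentation:

```r
# Minimal sketch: compare Epsilon Greedy, Thompson Sampling, and UCB1
# on a three-armed Bernoulli bandit with the "contextual" package.
library(contextual)

horizon     <- 400L              # time steps per simulation run
simulations <- 300L              # number of repeated runs to average over
weights     <- c(0.9, 0.1, 0.1)  # true success probability of each arm

bandit <- BasicBernoulliBandit$new(weights = weights)

# One agent per policy, all playing the same bandit.
agents <- list(
  Agent$new(EpsilonGreedyPolicy$new(epsilon = 0.1), bandit, "EpsilonGreedy"),
  Agent$new(ThompsonSamplingPolicy$new(alpha = 1, beta = 1), bandit, "ThompsonSampling"),
  Agent$new(UCB1Policy$new(), bandit, "UCB1")
)

# Run all agents and plot average cumulative regret per policy.
history <- Simulator$new(agents, horizon, simulations)$run()
plot(history, type = "cumulative", regret = TRUE)
```

Lower cumulative regret in the resulting plot indicates a policy that finds and keeps exploiting the best arm more quickly.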