Computing an Index Policy for Bandits with Switching Penalties