Measuring the Mixing of Contextual Information in the Transformer