Exploiting Fine-Grained Data Parallelism with Chip Multiprocessors and Fast Barriers