Pushing memory bandwidth limitations through efficient implementations of Block-Krylov space solvers on GPUs