Benchmarking deep learning models for surface defect detection: a reproducible and statistically-rigorous approach