Towards zero-shot object counting via deep spatial prior cross-modality fusion