Multimodal speaker diarization for meetings using volume-evaluated SRP-PHAT and video analysis | Publication