2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information



Paper Detail

Paper ID: IVMSP-27.6
Paper Title: SEEHEAR: SIGNER DIARISATION AND A NEW DATASET
Authors: Samuel Albanie, University of Oxford, United Kingdom; Gül Varol, Ecole des Ponts, Univ Gustave Eiffel, France; Liliane Momeni, Triantafyllos Afouras, Andrew Brown, Chuhan Zhang, Ernesto Coto, University of Oxford, United Kingdom; Necati Cihan Camgöz, Ben Saunders, University of Surrey, United Kingdom; Abhishek Dutta, University of Oxford, United Kingdom; Neil Fox, University College London, United Kingdom; Richard Bowden, University of Surrey, United Kingdom; Bencie Woll, University College London, United Kingdom; Andrew Zisserman, University of Oxford, United Kingdom
Session: IVMSP-27: Multi-modal Signal Processing
Location: Gather.Town
Session Time: Friday, 11 June, 11:30 - 12:15
Presentation Time: Friday, 11 June, 11:30 - 12:15
Presentation: Poster
Topic: Image, Video, and Multidimensional Signal Processing: [IVARS] Image & Video Analysis, Synthesis, and Retrieval
IEEE Xplore: Open Preview available in IEEE Xplore
Abstract: In this work, we propose a framework to collect a large-scale, diverse sign language dataset that can be used to train automatic sign language recognition models. The first contribution of this work is SDTRACK, a generic method for signer tracking and diarisation in the wild. Our second contribution is SeeHear, a dataset of 90 hours of British Sign Language (BSL) content featuring more than 1000 signers, and including interviews, monologues and debates. Using SDTRACK, the SeeHear dataset is annotated with 35K active signing tracks, with corresponding signer identities and subtitles, and 40K automatically localised sign labels. As a third contribution, we provide benchmarks for signer diarisation and sign recognition on SeeHear.