As a Last Christmas present in the WHAM! series, we have decided to Make It Big and release WHAM!48kHz, a high-fidelity version of the ambient background noise recordings originally used for the WSJ0 Hipster Ambient Mixtures (WHAM!) dataset. It consists of 78 hours of raw binaural recordings (48 kHz/24-bit), collected by our definitely not Careless Whisper.ai collaborators at various urban locations (restaurants, cafes, bars, parks) throughout the Bay Area.
Our paper "Hierarchical Musical Source Separation" won the Best Poster Award and the Best Video Award at the 2020 International Society for Music Information Retrieval Conference (ISMIR 2020), held October 11-14 in virtual Montreal, Canada. Check out the paper page on the ISMIR website.
SANE 2017 was held on October 19, 2017, at Google, NY, with a new record audience of 180 people. Videos and slides of the SANE 2017 talks (as well as those from previous years) are now available. There is even a convenient Monday morning binge-watching YouTube playlist.
SANE 2016 was held on October 21, 2016, at MIT. Slides of all talks are now available from the SANE website.
Emmanuel Vincent (Inria, France), Hakan Erdogan (Sabanci University, MSR), and I presented a tutorial on "Learning-based Approaches to Speech Enhancement And Separation" at Interspeech 2016 in San Francisco, CA. The room was full, with over 100 attendees. The slides are available here.
SANE 2015 was held on October 22, 2015, at Google, NYC, with a record audience of 130 people. Thanks to Hank Liao's hard work, the videos and slides of the SANE 2015 talks are now available. Here is a convenient Saturday night binge-watching YouTube playlist.
The slides for my ICASSP 2015 talk on MICbots are now available for download. I put together a project page that explains the concept and the actual construction of our MICbots. It features a YouTube video showing them moving to the tune of dark suits and greasy wash water. From there, you can also download PyRobot 2, a Python wrapper for the iRobot Create 2 Open Interface; it is derived from Damon Kohler's PyRobot, which handled the first version of the interface.
Thanks to the many participants and the speakers of SANE 2014 for a very exciting day. The slides for all talks as well as some photos are now available through the sane-news group (slides for previous SANE workshops are also available). More info on SANE 2014.
Emmanuel Vincent (INRIA) and I gathered a comprehensive list of datasets for robust speech processing research, with detailed attributes and links to software baselines and evaluation results. It is available as a Technical Report as well as a wiki page under the ROSP wiki.
The code for our IEEE Signal Processing Letters article "Consistent Wiener Filtering for Audio Source Separation" is now available (research-only license).
I added a page with non-research software where I share pieces of software that I modified to suit my needs and that others may find useful. In particular, I have taken over development of IguanaTex, a free add-in to include LaTeX displays in PowerPoint. It is a good alternative to TexPoint.
We created a Google group, "Speech and Audio in the Northeast", to gather researchers and students in speech and audio from the northeast of the American continent. Anyone is welcome to join.