{"id":37,"date":"2015-04-18T09:37:17","date_gmt":"2015-04-18T13:37:17","guid":{"rendered":"http:\/\/www.jonathanleroux.org\/wordpress\/?p=37"},"modified":"2015-04-18T21:03:06","modified_gmt":"2015-04-19T01:03:06","slug":"icassp-2015-in-brisbane","status":"publish","type":"post","link":"https:\/\/www.jonathanleroux.org\/wordpress\/2015\/04\/18\/icassp-2015-in-brisbane\/","title":{"rendered":"ICASSP 2015 in Brisbane"},"content":{"rendered":"<p><a href=\"https:\/\/www.jonathanleroux.org\/wordpress\/wp-content\/uploads\/2015\/04\/MICbots_large.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\" size-large wp-image-39 aligncenter\" src=\"https:\/\/www.jonathanleroux.org\/wordpress\/wp-content\/uploads\/2015\/04\/MICbots_large-1024x768.jpg\" alt=\"MICbots\" width=\"660\" height=\"495\" srcset=\"https:\/\/www.jonathanleroux.org\/wordpress\/wp-content\/uploads\/2015\/04\/MICbots_large-1024x768.jpg 1024w, https:\/\/www.jonathanleroux.org\/wordpress\/wp-content\/uploads\/2015\/04\/MICbots_large-300x225.jpg 300w, https:\/\/www.jonathanleroux.org\/wordpress\/wp-content\/uploads\/2015\/04\/MICbots_large.jpg 1440w\" sizes=\"(max-width: 660px) 100vw, 660px\" \/><\/a><\/p>\n<p>I&#8217;m flying tomorrow from Tokyo to Brisbane to attend the ICASSP 2015 conference. Who would have guessed I&#8217;d be back to Brisbane and its conference center 7 years after Interspeech 2008&#8230; if I&#8217;d had to choose a conference location to be repeated, I&#8217;d probably have gone with Honolulu, but anyway.<\/p>\n<p>I&#8217;ll be chairing a special session Wednesday morning on &#8220;<a class=\"locationlink\" href=\"https:\/\/www2.securecms.com\/ICASSP2015\/Papers\/PublicSessionIndex3.asp?Sessionid=2140\">Audio for Robots &#8211; Robots for Audio<\/a>&#8221;, that I am co-organizing with Emmanuel Vincent (INRIA) and Walter Kellermann (Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg). 
I will also present the following two papers:<\/p>\n<ul>\n<li>&#8220;<b>MICbots: collecting large realistic datasets for speech and audio research using mobile robots<\/b>,&#8221; with Emmanuel Vincent, John R. Hershey, and Daniel P. W. Ellis. [<a href=\"http:\/\/www.jonathanleroux.org\/pdf\/LeRoux2015ICASSP04MICbots.pdf\">.pdf<\/a>] [<a href=\"http:\/\/www.jonathanleroux.org\/bib\/LeRoux2015ICASSP04MICbots.bib\">.bib<\/a>]<br \/>\n<span style=\"text-decoration: underline;\">Abstract<\/span>: Speech and audio signal processing research is a tale of data collection efforts and evaluation campaigns. Large benchmark datasets for automatic speech recognition (ASR) have been instrumental in the advancement of speech recognition technologies.\u00a0 However, when it comes to robust ASR, source separation, and localization, especially using microphone arrays, the perfect dataset is out of reach, and many different data collection efforts have each made different compromises between the conflicting factors in terms of realism, ground truth, and costs. Our goal here is to escape some of the most difficult trade-offs by proposing MICbots, a\u00a0low-cost method of collecting large amounts of realistic data where annotations and ground truth are readily available. Our key idea is to use freely moving robots equipped with microphones and loudspeakers, playing recorded utterances from existing (already annotated) speech datasets.\u00a0 We give an overview of previous data collection efforts and the trade-offs they make, and describe the benefits of using our robot-based approach. We finally explain the use of this method to collect room impulse response measurement.<\/li>\n<li>&#8220;<b>Deep NMF for Speech Separation<\/b>,&#8221; with John R. Hershey and Felix Weninger. 
[<a href=\"http:\/\/www.jonathanleroux.org\/pdf\/LeRoux2015ICASSP04DeepNMF.pdf\">.pdf<\/a>] [<a href=\"http:\/\/www.jonathanleroux.org\/bib\/LeRoux2015ICASSP04DeepNMF.bib\">.bib<\/a>]<br \/>\n<span style=\"text-decoration: underline;\">Abstract<\/span>: Non-negative matrix factorization (NMF) has been widely used for challenging single-channel audio source separation tasks. However, inference in NMF-based models relies on iterative inference methods, typically formulated as multiplicative updates.\u00a0 We propose &#8220;deep NMF&#8221;, a novel non-negative deep network architecture which results from unfolding the NMF iterations and untying its parameters. This\u00a0 architecture can be discriminatively trained for optimal separation performance. To optimize its non-negative parameters, we show how a new form of back-propagation, based on multiplicative updates, can be used to preserve non-negativity, without the need for constrained optimization. We show on a challenging speech separation task that deep NMF improves in terms of accuracy upon NMF and is competitive with conventional sigmoid deep neural networks, while requiring a tenth of the number of parameters.<\/li>\n<\/ul>\n<p>If you are attending the conference, don&#8217;t hesitate to come by and ask the hard questions&#8230;<\/p>\n<p>(The photo above is myself happily posing with Dot, Hot, and Lot, our first three MICbots.)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I&#8217;m flying tomorrow from Tokyo to Brisbane to attend the ICASSP 2015 conference. Who would have guessed I&#8217;d be back to Brisbane and its conference center 7 years after Interspeech 2008&#8230; if I&#8217;d had to choose a conference location to be repeated, I&#8217;d probably have gone with Honolulu, but anyway. 
I&#8217;ll be chairing a special session Wednesday morning on &#8220;Audio for Robots &#8211; Robots for Audio&#8221;, that I am co-organizing with Emmanuel Vincent (INRIA) and Walter Kellermann (Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg). I will also present the following two papers: &#8220;MICbots: collecting large realistic datasets for speech and audio research using mobile robots,&#8221; with Emmanuel Vincent, John R. Hershey, and Daniel P. W. &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/www.jonathanleroux.org\/wordpress\/2015\/04\/18\/icassp-2015-in-brisbane\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;ICASSP 2015 in Brisbane&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[],"class_list":["post-37","post","type-post","status-publish","format-standard","hentry","category-research"],"_links":{"self":[{"href":"https:\/\/www.jonathanleroux.org\/wordpress\/wp-json\/wp\/v2\/posts\/37"}],"collection":[{"href":"https:\/\/www.jonathanleroux.org\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.jonathanleroux.org\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.jonathanleroux.org\/wordpress\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.jonathanleroux.org\/wordpress\/wp-json\/wp\/v2\/comments?post=37"}],"version-history":[{"count":5,"href":"https:\/\/www.jonathanleroux.org\/wordpress\/wp-json\/wp\/v2\/posts\/37\/revisions"}],"predecessor-version":[{"id":44,"href":"https:\/\/www.jonathanleroux.org\/wordpress\/wp-json\/wp\/v2\/posts\/37\/revisions\/44"}],"wp:attachment":[{"href":"https:\/\/www.jonathanleroux.org\/wordpress\/wp-json\/wp\/v2\/media?parent=37"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.jona
thanleroux.org\/wordpress\/wp-json\/wp\/v2\/categories?post=37"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.jonathanleroux.org\/wordpress\/wp-json\/wp\/v2\/tags?post=37"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}