Title/theme: Can advances in speech recognition make spoken language as convenient and as accessible as online text?

00-05 (5) PJP Intro to theme
  01-03 Overview: history of information access and sources of information (speech, text, computers, speech recognition)
  03-06 Overview of initial applications (speech as access to information or input device)
    command and control (manufacturing apps? consumer toys?)
    database query (resource management?, ATIS? Schwab, Sears, etc.)
    dictation (Dragon, IBM products)

06-10 (5) PJP and LH Demos of current applications
  06-07 PJP demos Dragon or IBM dictation
  08 1 minute to change, just in case
  09-10 Larry demos telephone query

11-15 (5) PJP Future trends
  11 transition from recognition to info extraction
    note from Joe: probably need a transition foil here on how the vision for speech changed from recognition to information extraction
    note from pjp: what did you have in mind?
      DARPA funding pressure to do something different?
      Analogous to move from perfect transcription to something "useful"
      Outgrowth of both of above inspired by DARPA arranged marriage between speech and NL?
  12-13 beginnings of speech as source of information, depending on what I can find: CMU spinout, Virage, ???
    Pjp note to JP: I think the CMU spinout is called ISLIP, Virage is a competitor
  14-15 Vision: merge speech as access means and as source.
    But that vision will require technology development
    Now Joe will talk about what's going on in the research labs

16-35 (20) JP Overview of research trends (leading to possibility of speech as source of information)
  History of best practice, and lessons learned (we've come a long way)
  00 Title: Who am I?
    Joe: I have 10 minutes to introduce the group and say why I chose this group of people, so we should talk about what goes on there. I'm thinking that it should take less than 10 minutes but that we are cramped for time in this section. So let's aim for 40 min, but I'll try to be brief in the intro to buy us some time.
    Pjp
  01 Intro: The evolution of high performance in speech recognition from the 1970's to the 1990's
     Intro: how our view of performance has changed over time from "perfect recognition" to "usable performance"
  02 Intro: Human performance
  03 Intro: DARPA/DoD eval performance over time
  04 Intro: Beyond WER
  05 Intro: Named Entity / IE performance as a function of speech recognition performance
  Statistics
  06 Stats: Bayesian statistics
  07 Stats: LM, AM, and linguistics
  08 Stats: The acoustic modeling problem
  09 Stats: training, estimation, etc.
  10 Stats: The LM problem
  11 Stats: Ngram language models
  12 Stats: entropy, perplexity, examples of ngram LM generation: does this make sense as an LM
  13 Stats: more powerful LMs
  Demo
  14 Demo #1: overview
  15 Demo #2: actual demo
    pjp: tight for demos!
  technology: current research directions and state of the art
  16 Tech: Conversational speech transcription
  17 Tech: Broadcast News
  18 Tech: Named Entity / Information Extraction
  19 Tech: Dialog Systems. transcription of telephone interactions
  Summary, references
  20 summary
  21 references
    Pjp: references here? In a talk? what about research challenges (what limits performance now?) and mathematical frontiers (what mathematics will help us take the technology to the next level)

36-40 (5) PJP how does the "research push" help enable the vision? How does the application vision lead to more research? (with plugs for more basic research)
  collaboration: multiple talkers
  translation among modalities
  browsing and composition tools for speech
  proactive information
  intro next speakers: speaker ID, Larry Heck and Doug Reynolds