Robert J. Moorhead
Image Processing
ERC

Introducing the
INSTITUTE FOR SIGNAL AND INFORMATION PROCESSING
located at
Mississippi State University
Department of Electrical and Computer Engineering
Box 9571, Mississippi State, Mississippi 39762
Tel: 601-325-3149  Fax: 601-325-2298
email: picone@isip.msstate.edu

MISSION STATEMENT

For over 100 years, Mississippi State University has had a mission of being a center of excellence in the State of Mississippi for:

·  Learning - to enhance the intellectual development of its students
·  Research - to extend the present limits of knowledge
·  Service - to apply its research to improve the lives of people

The Institute for Signal and Information Processing (ISIP) offers a multidisciplinary program focused on the development of next-generation information processing techniques. Research at ISIP is centered on intelligent information processing, perhaps the most important technology of the next century. ISIP draws upon a wide range of research experience in areas such as signal processing, communications, natural language, database query, intelligent systems, and discrete controls. Its present vision is to develop systems capable of intelligent interactions with users by integrating multiple interface technologies, including speech, natural language, database query, and imaging.


isip00 (fileserver, router, and domain server):
·  Sun SPARC 5
·  70 MHz MicroSPARC II
·  32 Mbytes RAM, 1 Gbyte local disk
·  2 ethernets (for routing)
·  60 Gbytes magnetic disk (Seagate Elite)
Exabyte 10h Tape Library:
·  8 mm tapes
·  70 Gbyte capacity
·  140 Gbytes compressed
Outside World (hub #0):
·  Allied Telesyn MR 820T
·  10BaseT 8 port hub (10 Mbits/sec)
·  Cat-5 Unshielded Twisted Pair
·  155 Mbits/sec ATM (campus)
isip01 (compute server):
·  Sun SPARC 20-512
·  Two 50 MHz SuperSPARC Processors
·  192 Mbytes RAM, 1 Gbyte local disk
isip02 (demo machine):
·  Sun SPARC 5
·  70 MHz MicroSPARC II
·  32 Mbytes RAM, 1 Gbyte local disk
·  T1 Telecom Interface
BASIC TECHNOLOGY:
A PATTERN RECOGNITION PARADIGM
BASED ON HIDDEN MARKOV MODELS
datlink 0 and datlink 1 (audio):
·  Townshend DAT-Link+
·  16-bit digital audio
·  AES/EBU and SP-DIF
Sharp JX-325 Color Scanner:
·  one-pass 24-bit color scan
·  300 dpi native mode
·  Detailed performance analysis in a common framework 
Algorithm   FFT Order
            16      64      256     1024    4096    16384
RAD2        20      60      280     1960    10900   97100
RAD4        20      60      250     1800    9720    58220
SRFFT       20      40      160     1060    6140    38100
FHT         20      40      140     640     3800    38100
QFT         20      40      160     880     6560    44020
DITF        20      60      360     2500    12320   104080

(Table entries are computation times in microseconds)
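
The RAD2 row above presumably refers to a radix-2 FFT. As a rough illustration of that algorithm family, here is a minimal recursive radix-2 decimation-in-time sketch in Python. It is not ISIP's C++ implementation, which the source does not show; the function names `fft_radix2` and `dft_naive` are hypothetical, and the direct DFT is included only as a correctness reference.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT (illustrative sketch).

    The input length must be a power of two.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a nonzero power of two")
    if n == 1:
        return [complex(x[0])]
    # Transform the even-indexed and odd-indexed samples separately.
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd half (butterfly step).
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def dft_naive(x):
    """Direct O(N^2) DFT, used here only to check the FFT result."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]
```

Running both functions on the same 16-point input should give results that agree to within floating-point rounding. Timings from interpreted Python are not comparable to the figures in the table.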

PARALLEL IMPLEMENTATIONS
OF FAST FOURIER TRANSFORMS

·  Object-oriented software implemented in C++
A T1-BASED DATA COLLECTION SYSTEM
FOR SUN/UNIX WORKSTATIONS
The JEIDA Japanese Common Speech Data Corpus
Domain: isip.msstate.edu
Automatic Generation of 
N-Best Proper Noun Pronunciations

What Differentiates ISIP Research?

·  Public Domain Software

·  Extensive Web Archive

·  Object-Oriented Signal Processing Software

·  State-of-the-Art Performance Tasks

·  Close Industrial Ties

·  Next-Generation Statistical Models Based on Chaotic Systems
   ·  Applicable to acoustic and language modeling
   ·  Addresses a fundamental barrier in speech understanding
Anthony Skjellum
High Performance Computing
Computer Science / ERC
Joe Picone
Signal Processing
Inst. for Signal and Info. Proc.
Stephen E. Saddow
Semiconductor Technology
EMRL
Signal Processing Research
at Mississippi State University Is Multidisciplinary
isip03 and isip05 (compute servers):
·  dual Pentium Pro
·  200 MHz Processor
·  256 Mbytes RAM, 1 Gbyte local disk
isip04 and isip06 (laptops):
·  Samsung Sens 810, Toshiba Tecra 500 CDT
·  133 MHz Pentium Processor
·  40 Mbytes RAM, 2 Gbyte local disk
ncd20c00 (clients):
·  NCD Xterms
·  16-bit audio
SYLLABLE-BASED SPEECH RECOGNITION
FOR CONVERSATIONAL TELEPHONE SPEECH
ISIP's Focal Project

·  An Integrated Services Transactions Processor That Supports Advanced Telecommunications Interfaces such as an Asynchronous Transfer Mode (ATM) Digital Communications Link

Example:	Telephone-Based Natural Language Query of Entertainment Archives

Customer: "Give me all movies, uh, make that only the recent movies, directed by Martin Scorsese and starring Robert DeNiro, and oh, by the way, make that movies about gangsters only."

Computer: We have three titles available (the titles of the movies are shown on the television screen with real-time video of promo clips from each movie below the title). Please select a movie.

Customer: "That one with the three guys looks good, I'll take that one. I want it to start at 8:00 PM tomorrow."

Computer: (The promo clip for the selected movie starts playing on the television.) The movie titled GoodFellas starring Robert DeNiro and directed by Martin Scorsese will be delivered for viewing on your television on Thursday, September 25 starting at 8:00 PM. Thank you for using ISIP's Entertainment Server. Good-bye.
Local Central Office
ATM (160 Mbps)
· Voice
· Video
· Data (X Windows)
Unix Multiprocessor (Sparcstation 2000): 
· 8 Processors
· 512 Mbytes of memory
· videotape jukebox
[Figure: block diagram with stages labeled Search Algorithms, Pattern Matching, Signal Model, Recognized Symbols, and Language Model]
Institute for Signal and Information Processing (ISIP)
Director: Dr. Joseph Picone

Algorithms:
·  Aravind Ganapathiraju (Ph.D. - 1)
·  Jule Baca (Ph.D. - 4)
·  Neeraj Deshmukh (Ph.D. - 3)
·  Julie Ngan (M.S. - 1)

Software:
·  Jonathan Hamaker (M.S. - 1)
·  Audrey Le (M.S. - 1)
·  Janna Shaffer (U.G. - 4)

Information Technology:
·  Richard Duncan (U.G. - 3)
·  Nirmala Kalidindi (M.S. - 2)
·  Suresh Balakrishnam (M.S. - 1)
·  New Hires (U.G. - 3)

Joseph Picone
Associate Professor
Department of Electrical and Computer Engineering

	Mississippi State University	Phone: (601) 325-3149
	Box 9571	Fax: (601) 325-2298
	Mississippi State, MS 39762	Email: picone@isip.msstate.edu

Education
Ph.D. in Electrical Engineering, Illinois Institute of Technology, December 1983
M.S. in Electrical Engineering, Illinois Institute of Technology, May 1980
B.S. in Electrical Engineering, Illinois Institute of Technology, May 1979

Areas of Research
Speech Understanding, Digital Signal Processing, and Pattern Recognition.

Experience Summary
Dr. Picone's primary interests are in the area of new statistical approaches to speech understanding. He has founded a speech research laboratory at Mississippi State University that conducts research into a number of related areas (for more information, see http://www.isip.msstate.edu). Research support has included projects with Texas Instruments, the Linguistic Data Consortium, ARPA's Spoken Language Systems program, and DoD.

Dr. Picone recently served as Data and Systems Coordinator for the 1997 Summer Workshop on Large Vocabulary Speech Recognition hosted by the Center for Language and Speech Processing at Johns Hopkins University. During this workshop, he also served as a senior member of a team dedicated to syllable-based speech processing. Under his guidance, the workshop was extremely successful as all four teams participating in the workshop posted statistically significant improvements on the state of the art.

Dr. Picone is currently a Senior Member of the IEEE and a Professional Engineer registered in the State of Texas. He is also an Associate Editor for the IEEE Signal Processing Magazine and the IEEE Transactions on Speech and Audio Processing, and has served as a reviewer for numerous organizations including NSF. He was previously employed at Texas Instruments as a Senior Member of Technical Staff and at AT&T Bell Laboratories. He is also a former Adjunct Professor at the University of Texas at Dallas and the Illinois Institute of Technology. He has previously conducted research in medium and low data rate speech compression. Dr. Picone has published more than 85 papers in the area of speech processing and has been awarded 8 patents.
Recent Significant Publications
Journal Articles:
1.	N. Deshmukh and J. Picone, "Automatic Generation of N-Best Pronunciations of Proper Nouns," submitted to the IEEE Transactions on Speech and Audio Processing, November 1996.
2.	J. Picone, T. Staples, K. Kondo and N. Arai, "Kanji to Hiragana Conversion Based on a Length Constrained N-Gram Analysis," accepted for publication in the IEEE Transactions on Speech and Audio Processing, Fall 1996.
3.	J. Picone, W.J. Ebel, and N. Deshmukh, "Automated Speech Understanding: The Next Generation," in Digital Signal Processing Technology, Vol. CR57, pp. 101-114, 1995.
Conferences:
4.	N. Deshmukh, J. Ngan, J. Hamaker, and J. Picone, "An Advanced System to Generate Multiple Pronunciations of Proper Nouns," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Munich, Germany, vol. 2, pp. 1467-1470, April 1997.
5.	J.J. Godfrey, A. Ganapathiraju, and J. Picone, "Microsegment Modeling for Speech Recognition," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Munich, Germany, vol. 3, pp. 1755-1758, April 1997.
6.	A. Ganapathiraju and J. Picone, "Echo Cancellation For Evaluating Speaker Identification Technology," Proceedings of IEEE Southeastcon, pp. 100-102, Blacksburg, Virginia, U.S.A., April 1997.
7.	N. Deshmukh, R. Duncan, and J. Picone, "Human Listening Benchmarks on ARPA's CSR Performance Tasks," Proceedings of the Fourth International Conference on Spoken Language Processing, Philadelphia, Pennsylvania, U.S.A., pp. SuP1P1.10, October 1996.
8.	N. Deshmukh, M. Weber, and J. Picone, "Automated Generation of N-Best Pronunciations of Proper Nouns," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Atlanta, Georgia, vol. 1, pp. 283-286, May 1996.
9.	N. Deshmukh and J. Picone, "Human Performance on ARPA's CSR'95 Hub," presented at the ARPA Spoken Language Systems Technology Workshop, Harriman, New York, January 1996.
10.	W.J. Ebel and J. Picone, "Human Speech Recognition Performance on the 1994 CSR Spoke 10 Corpus," Proceedings of the Spoken Language Systems Technology Workshop, pp. 53-59, Austin, Texas, January 1995.
11.	Y. Muthusamy, E. Holliman, B. Wheatley, J. Picone, and J. Godfrey, "Voice Across Hispanic America: A Telephone Speech Corpus of American Spanish," IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 85-88, Detroit, Michigan, May 1995.

Number of speakers                  150 speakers (75 male, 75 female)
Number of items per speaker         323 items (monosyllables, 178 isolated words, 35 4-digit sequences)
Number of repetitions per item      4 repetitions of each item
Range of speaker age                20 yrs. to 60 yrs.
Amount of data                      120 hours
Number of Digital Audio Tapes       76 (120-minute tapes)
Total number of utterances          193,800 utterances
Number of channels/mic. type        2 (dynamic and condenser mics.)
Anticipated size of final corpus    6.5 Gbytes (13 CD-ROMs uncompressed)
  (16-bit 16 kHz samples @ 1.0 secs per utterance)
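
The corpus figures above can be cross-checked with a little arithmetic. The sketch below is an illustration only: the constant and function names are hypothetical, and it assumes one 16-bit sample stream per utterance at exactly 1.0 second, as the stated parameters suggest.

```python
# Figures taken from the corpus description above.
SPEAKERS = 150
ITEMS_PER_SPEAKER = 323
REPETITIONS_PER_ITEM = 4
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2            # 16-bit samples
SECONDS_PER_UTTERANCE = 1.0     # nominal duration quoted in the description


def total_utterances():
    """Speakers x items x repetitions."""
    return SPEAKERS * ITEMS_PER_SPEAKER * REPETITIONS_PER_ITEM


def corpus_bytes():
    """Raw sample bytes, assuming a single stream per utterance (assumption)."""
    samples = int(total_utterances() * SECONDS_PER_UTTERANCE * SAMPLE_RATE_HZ)
    return samples * BYTES_PER_SAMPLE


if __name__ == "__main__":
    print(total_utterances())      # 193800
    print(corpus_bytes() / 1e9)    # 6.2016
```

The utterance count reproduces the stated 193,800 exactly. The raw sample size under these assumptions is about 6.2 x 10^9 bytes, close to but slightly below the quoted 6.5 Gbytes; the source does not say how the remainder arises, so the sketch does not try to account for it.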


NEURAL NETWORK SOLUTION
JAVA APPLETS
http://isip.msstate.edu/software/java_system_response

Other ISIP Java Applets include:

· Convolution
· Frequency Response
· Nyquist Criterion
· Analog and Digital Filter Design
· Compilers and Assembly Code
· Hidden Markov Model Toolkit
· Speech Recognition Primer
ECHO CANCELLATION FOR SPEECH RECOGNITION
[Figure: block diagram. Component labels: Speech Recognition, Language Model, Text, Natural Language Understanding, Natural Language Processing, Semi-Parser, Language Model, Tagged Text, Flat Parsed Structures, Knowledge Extraction, Knowledge Extractor, Filled Templates, Request Generator, Netscape Requests, Netscape]
"Show me all the reports from the White House on Healthcare."
Victor A. Rudis
Forestry Imaging
USFS
Communications Laboratory
Elect. and Comp. Eng.
Bud Rizer
Assistive Technologies
T.K. Martin Center