Contact:
nemlar@hum.ku.dk


BLARK Definition and BLARK Content
- for written language


The degree to which the modules are needed is marked by plus signs:
+++ means essential
++ means very important
+ means important

Speech modules and corresponding spoken language resources, marked with importance:

Resources (table columns):
 1  BNSC
 2  Desktop/Microphone & High quality microphone data or phone data
 3  Telephony
 4  Audio data with prosodic markers and other emotional features
 5  Annotated written corpus
 6  Unannotated written corpus
 7  Vowelised corpus
 8  Non-vowelised corpus
 9  Phonetic lexicon, general vocabulary
10  Onomastica (proper names)
11  Visual data (faces, lips, etc.)

A dot (.) marks a cell with no rating.

Speech modules                         1   2   3   4   5   6   7   8   9   10  11
Acoustic models                        +++ +++ +++ +++ .   .   .   .   .   .   .
Language models                        .   .   .   .   ++  +++ ++  +++ .   .   .
Pronunciation lexicon                  .   .   .   .   +   .   ++  .   +++ +++ .
Lexicon adaptation                     .   .   .   .   +   +   ++  +   +++ +++ .
Phoneme alignment                      ++  ++  ++  ++  ++  .   +   .   +++ +++ .
Prosody recognition                    .   .   .   ++  ++  .   ++  .   ++  ++  .
Speech Units Selection                 .   +   +   +++ ++  .   .   .   +   +   .
Prosody prediction                     .   .   .   ++  ++  .   ++  .   ++  ++  .
Segmentation Speech / Silence          ++  ++  ++  ++  .   .   .   .   .   .   .
Sentence boundary detection            ++  ++  ++  ++  .   .   .   .   +   +   .
Dialect / language identification      ++  ++  ++  +   .   .   .   .   +   +   .
Word boundary identification           +   +   +   +   .   .   .   .   +   +   .
Speech / Non-speech (music) detection  ++  +   +   ++  .   .   .   .   .   .   .
Speaker recognition / identification   +   +   +   +   .   .   .   .   .   .   .
Emotion identification                 +   +   +   +   .   .   .   .   +   +   .
Speaker adaptation                     ++  ++  ++  +   .   .   .   .   .   .   .
Lips movement reading                  .   .   .   .   .   .   .   .   .   .   +++
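
The plus-sign scale can be treated as an ordered rating when the table is used programmatically, for example to list the resources a module needs at or above a given importance. The following is only a minimal sketch in Python: the names IMPORTANCE, MODULE_RESOURCES and resources_at_least are hypothetical and not part of BLARK, and only two rows of the table (Language models and Pronunciation lexicon) are encoded.

```python
# Plus-sign scale from the legend, mapped to (label, numeric rank).
IMPORTANCE = {
    "+++": ("essential", 3),
    "++": ("very important", 2),
    "+": ("important", 1),
}

# Two rows copied from the resources table; all other rows are omitted.
# The dictionary layout is an illustrative assumption, not a BLARK format.
MODULE_RESOURCES = {
    "Language models": {
        "Annotated written corpus": "++",
        "Unannotated written corpus": "+++",
        "Vowelised corpus": "++",
        "Non-vowelised corpus": "+++",
    },
    "Pronunciation lexicon": {
        "Annotated written corpus": "+",
        "Vowelised corpus": "++",
        "Phonetic lexicon, general vocabulary": "+++",
        "Onomastica (proper names)": "+++",
    },
}


def resources_at_least(module, minimum="++"):
    """Return, sorted by name, the resources rated `minimum` or higher for `module`."""
    threshold = IMPORTANCE[minimum][1]
    row = MODULE_RESOURCES[module]
    return sorted(
        name for name, mark in row.items() if IMPORTANCE[mark][1] >= threshold
    )


if __name__ == "__main__":
    # List the resources rated essential (+++) for language models.
    for resource in resources_at_least("Language models", "+++"):
        print(resource)
```

Keeping the marks as strings and ranking them through one lookup table leaves the data readable next to the original table while still allowing threshold queries.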

Speech applications and corresponding speech modules, marked with importance:

Applications (table columns):
 1  Dictation
 2  Telephony speech applications
 3  Embedded speech recognition
 4  Transcription of broadcast news
 5  Transcription of conversational speech
 6  Speaker recognition
 7  Dialect / language identification
 8  Emotion identification
 9  Speaker adaptation
10  Lips movement reading
11  Topic detection, segmentation, topic boundaries
12  Speaker 2 speaker mapping
13  Emotion / Prosody output
14  Text to Speech (incl. formatted data, e.g. databases)
15  Customization to different voices
16  Generation Lips Movement

A dot (.) marks a cell with no rating.

Speech modules                         1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16
Acoustic models                        +++ +++ +++ +++ +++ ++  +++ +++ +++ +++ +++ ++  +++ +++ +++ +++
Language models                        +++ ++  ++  +++ +++ .   ++  .   .   .   .   .   ++  +++ .   .
Pronunciation lexicon                  +++ +++ +++ +++ +++ .   .   .   .   .   .   ++  .   +++ .   .
Lexicon adaptation                     +   +   +   +   +   .   .   .   .   .   .   ++  .   +++ .   .
Phoneme alignment                      +   +   +   +   +   +   ++  .   .   .   .   ++  .   +++ .   .
Prosody recognition                    +   +   +   +   +   +   +   +++ +   .   .   ++  .   .   .   .
Speech Units Selection                 .   .   .   .   .   .   .   .   .   .   .   .   +++ +++ .   .
Prosody prediction                     .   .   .   .   .   .   .   .   .   .   .   .   +++ +++ .   .
Segmentation Speech / Silence          ++  +   ++  ++  ++  +   ++  ++  +   +   +   .   +   .   .   .
Sentence boundary detection            +   +   +   +   +   +   +   ++  +   +   +   .   ++  +++ .   .
Dialect / language identification      +   +   +   +   +   +   +   +   +   +   +   .   .   +   .   .
Word boundary identification           +   +   +   +   +   +   +   +   +   +   +   .   ++  .   .   .
Speech / Non-speech (music) detection  +   +   +   +   +   +   +   ++  +   +   +   .   .   .   .   .
Speaker recognition / identification   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .
Emotion identification                 +   +   +   +   +   +   +   .   +   +   +   ++  ++  .   .   .
Speaker adaptation                     ++  +   ++  +   ++  +   +   +   +   +   +   ++  .   +   .   .
Lips movement reading                  .   .   .   .   .   .   .   .   .   +++ .   .   .   .   .   .


MEDAR is supported by the European Commission's ICT programme and is running from February 1st 2008 until July 31st 2010.
