1.1 Models of speech production
1.2 Physiology and neurophysiology of speech production
1.3 Coarticulation
1.4 Models of speech perception
1.5 Physiology and neurophysiology of speech perception
1.6 Acoustic and articulatory cues in speech perception
1.7 Interaction between speech production and speech perception
1.8 Multimodal speech perception
1.9 Cognition and brain studies on speech
1.10 Code switching and multilingual studies
1.11 L1 acquisition
1.12 Bilingual and L2 acquisition and processing
1.13 Speech and voice disorders
1.14 Hearing disorders
1.15 Singing voice: production and perception
1.16 Speech and other biosignals
1.17 Adverse listening conditions
1.18 Other topics in Speech Perception, Production and Acquisition

2.1 Phonetics and phonology
2.2 Language descriptions
2.3 Linguistic systems
2.4 Acoustic phonetics
2.5 Phonation, voice quality
2.6 Articulatory and acoustic features of prosody
2.7 Perception of prosody
2.8 Laboratory phonology
2.9 Phonetic universals
2.10 Sound changes
2.11 Sociophonetics
2.12 Phonetics of L1-L2 interaction
2.13 Forensic phonetics
2.14 Acoustic Manifestations of Social Characteristics
2.15 Allophonic variation across languages
2.16 Other topics in Phonetics, Phonology, and Prosody

3.1 Analysis of speaker states
3.2 Analysis of speaker traits
3.3 Automatic analysis of speaker states and traits
3.4 Pathological speech and language
3.5 Social signal processing
3.6 Sentiment analysis and opinion mining
3.7 Paralinguistics in singing
3.8 Perception of paralinguistic phenomena
3.9 Multimodal paralinguistics
3.10 Phonetic and linguistic aspects of paralinguistics
3.11 Other topics in Analysis of Paralinguistics in Speech and Language

4.1 Language identification and verification, language diarization and code switching
4.2 Dialect and accent recognition
4.3 Speaker verification and identification
4.4 Features for speaker and language recognition
4.5 Robustness to variable and degraded channels
4.6 Speaker confidence estimation
4.7 Speaker diarization
4.8 Higher-level knowledge in speaker and language recognition
4.9 Evaluation of speaker and language identification systems
4.10 Multimodal/multimedia speaker recognition and diarization
4.11 Multilingual speaker recognition
4.12 Other topics in Speaker and Language Identification

5.1 Speech acoustics
5.2 Speech analysis and representation
5.3 Audio signal analysis and representation
5.4 Speech and audio segmentation and classification
5.5 Voice activity detection
5.6 Pitch and harmonic analysis
5.7 Source separation and computational auditory scene analysis
5.8 Speaker spatial localization
5.9 Music signal processing and understanding
5.10 Singing analysis
5.11 Other topics in Analysis of Speech and Audio Signals

6.1 Speech coding and transmission
6.2 Low-bit-rate speech coding
6.3 Perceptual audio coding of speech signals
6.4 Noise reduction for speech signals
6.5 Speech enhancement: single-channel
6.6 Speech enhancement: multi-channel
6.7 Speech intelligibility
6.8 Active noise control
6.9 Speech enhancement in hearing aids
6.10 Adaptive beamforming for speech enhancement
6.11 Dereverberation for speech signals
6.12 Echo cancelation for speech signals
6.13 Evaluation of speech transmission, coding and enhancement
6.14 Other topics in Speech Coding and Enhancement

7.1 Grapheme-to-phoneme conversion for synthesis
7.2 Text processing for speech synthesis
7.3 Signal processing/statistical models for synthesis
7.4 Speech synthesis paradigms and methods
7.5 Articulatory speech synthesis
7.6 Segment-level and/or concatenative synthesis
7.7 Unit selection speech synthesis
7.8 Statistical parametric speech synthesis
7.9 Prosody modeling and generation
7.10 Expression, emotion and personality generation
7.11 Synthesis of singing voices
7.12 Voice modification, conversion and morphing
7.13 Concept-to-speech conversion
7.14 Cross-lingual and multilingual aspects in speech synthesis, code switching
7.15 Multimodal synthesis for avatars and talking heads
7.16 Tools and data for speech synthesis
7.17 Evaluation of speech synthesis
7.18 Other topics in Speech Synthesis and Spoken Language Generation

8.1 Feature extraction and low-level feature modeling for ASR
8.2 Prosodic features and models
8.3 Robustness against noise, reverberation
8.4 Far field and microphone array speech recognition
8.5 Speaker normalization (e.g., VTLN)
8.6 New types of neural network models and learning (e.g., network topologies, objective functions, etc.)
8.7 Discriminative acoustic training methods for ASR
8.8 Acoustic model adaptation (speaker, bandwidth, emotion, accent)
8.9 Speaker adaptation, speaker adapted training methods
8.10 Pronunciation variants and modeling for speech recognition
8.11 Acoustic confidence measures
8.12 Cross-lingual and multilingual aspects, non-native accents
8.13 Acoustic modeling for conversational speech (dialog, interaction)
8.14 Other topics in Speech Recognition: Signal Processing, Acoustic Modeling, Robustness, Adaptation

9.1 Lexical modeling and access: units and models
9.2 Automatic lexicon learning
9.3 Supervised/unsupervised morphological models
9.4 Prosodic features and models for language modeling
9.5 Discriminative training methods for language modeling
9.6 Language model adaptation (domain, diachronic adaptation)
9.7 Language modeling for conversational speech (dialog, interaction)
9.8 Neural networks for language modeling
9.9 Search methods, decoding algorithms, lattices, multipass strategies
9.10 New computational strategies, data-structures for ASR
9.11 Computational resource constrained speech recognition
9.12 Confidence measures
9.13 Cross-lingual and multilingual components for speech recognition, code switching
9.14 Structured classification approaches
9.15 Other topics in Speech Recognition - Architecture, Search, and Linguistic Components

10.1 Multimodal systems
10.2 Applications in education and learning (incl. CALL, assessment of fluency)
10.3 Applications in medical practice (CIS, voice assessment, etc.)
10.4 Speech science in end-user applications
10.5 Rich transcription
10.6 Innovative products and services based on speech technologies
10.7 Sparse, template-based representations
10.8 New paradigms (e.g. articulatory models, silent speech interfaces, topic models)
10.9 Zero-resource speech recognition
10.10 Code-switched speech recognition
10.11 Other topics in Speech Recognition - Technologies and Systems for New Applications

11.1 Spoken dialog systems
11.2 Discourse and dialog structures
11.3 Multimodal interaction and interfaces
11.4 Conversation, communication and interaction
11.5 Analysis of verbal, co-verbal and nonverbal behavior
11.6 Interactive systems for speech/language training, therapy, communication aids
11.7 Stochastic modeling for dialog
11.8 Question-answering from speech
11.9 Spoken interaction with social robots
11.10 Systems for spoken language understanding
11.11 Evaluation of speech and multimodal dialog systems
11.12 Dialog system in a multilingual setting
11.13 Other topics in Spoken dialog systems and conversational analysis

12.1 Spoken machine translation
12.2 Speech-to-speech translation systems
12.3 Transliteration
12.4 Voice search
12.5 Spoken term detection
12.6 Audio indexing
12.7 Spoken document retrieval
12.8 Systems for mining spoken data, search or retrieval of speech documents
12.9 Speech and multimodal resources and annotation, code switching
12.10 Evaluation of speech recognition
12.11 Metadata descriptions of speech, audio and text resources
12.12 Metadata for semantic or content markup
12.13 Metadata for ling./discourse structure (disfluencies, boundaries, speech acts)
12.14 Methodologies and tools for language resource construction and annotation
12.15 Automatic segmentation and labeling of resources
12.16 Multilingual resources
12.17 Evaluation and quality assurance of language resources
12.18 Evaluation of translation and information retrieval systems
12.19 Spoken document summarization
12.20 Semantic analysis and classification
12.21 Entity extraction from speech
12.22 Evaluation of summarization and understanding
12.23 Topic spotting and classification
12.24 Other topics in Spoken Language Processing: Translation, Information Retrieval, Summarization, Resources and Evaluation

13.1 The Independence of Source and Filter in Vowel Production and Perception in Cross-Cultural Speech Communication
13.2 The INTERSPEECH 2018 Computational Paralinguistics ChallengE (ComParE): Atypical & Self-Assessed Affect, Crying & Heart Beats
13.3 The First DIHARD Speech Diarization Challenge
13.4 Novel Paradigms for Direct Synthesis based on Speech-Related Biosignals
13.5 Speech Recognition for Indian Languages
13.6 Deep Neural Networks: How Can We Interpret What They Learned?
13.7 Low Resource Speech Recognition Challenge for Indian Languages
13.8 Integrating Speech Science and Technology for Clinical Applications
13.9 Speech Technologies for Code-Switching in Multilingual Communities
13.10 Spoken CALL Shared Task, Second Edition

Interspeech 2018

September 2-6 | HYDERABAD, India

Hyderabad International Convention Centre

Program | Areas and Topics

1. Speech Perception, Production and Acquisition

2. Phonetics, Phonology, and Prosody

3. Analysis of Paralinguistics in Speech and Language

4. Speaker and Language Identification

5. Analysis of Speech and Audio Signals

6. Speech Coding and Enhancement

7. Speech Synthesis and Spoken Language Generation

8. Speech Recognition: Signal Processing, Acoustic Modeling, Robustness, Adaptation

9. Speech Recognition - Architecture, Search, and Linguistic Components

10. Speech Recognition - Technologies and Systems for New Applications

11. Spoken dialog systems and conversational analysis

12. Spoken Language Processing: Translation, Information Retrieval, Summarization, Resources and Evaluation

13. Special Sessions