LEADER 08317nam 22007215 450 001 9910766883503321 005 20231121205504.0 010 $a3-031-48309-X 024 7 $a10.1007/978-3-031-48309-7 035 $a(MiAaPQ)EBC30963210 035 $a(Au-PeEL)EBL30963210 035 $a(DE-He213)978-3-031-48309-7 035 $a(CKB)29021569300041 035 $a(EXLCZ)9929021569300041 100 $a20231121d2023 u| 0 101 0 $aeng 135 $aurcnu|||||||| 181 $ctxt$2rdacontent 182 $cc$2rdamedia 183 $acr$2rdacarrier 200 10$aSpeech and Computer $e25th International Conference, SPECOM 2023, Dharwad, India, November 29 – December 2, 2023, Proceedings, Part I /$fedited by Alexey Karpov, K. Samudravijaya, K. T. Deepak, Rajesh M. Hegde, Shyam S. Agrawal, S. R. Mahadeva Prasanna 205 $a1st ed. 2023. 210 1$aCham :$cSpringer Nature Switzerland :$cImprint: Springer,$d2023. 215 $a1 online resource (657 pages) 225 1 $aLecture Notes in Artificial Intelligence,$x2945-9141 ;$v14338 311 08$aPrint version: Karpov, Alexey Speech and Computer Cham : Springer,c2023 327 $aAutomatic Speech Recognition -- Extreme Learning Layer: A Boost for Spoken Digit Recognition with Spiking Neural Networks -- EMO-AVSR: Two-Level Approach for Audio-Visual Emotional Speech Recognition -- Significance of Audio Quality in Speech-to-Text Translation Systems -- Everyday Conversations: a Comparative Study of Expert Transcriptions and ASR Outputs at a Lexical Level -- Improving Automatic Speech Recognition with Dialect-Specific Language Models -- Emotional speech recognition of Holocaust survivors with deep neural network models for Russian language -- Computational Paralinguistics -- Aggregation Strategies of Wav2vec 2.0 Embeddings for Computational Paralinguistic Tasks -- Rhythm Formant Analysis for Automatic Depression Classification -- Determining Alcohol Intoxication Based on Speech and Neural Networks -- Linear Frequency Residual Cepstral Coefficients for Speech Emotion Recognition -- Enhancing Stutter Detection in Speech using Zero Time Windowing Cepstral Coefficients and Phase Information -- Source and System-based Modulation Approach 
for Fake Speech Detection -- Digital Signal Processing -- Investigation of Different Calibration Methods for Deep Speaker Embedding based Verification Systems -- Learning to Predict Speech Intelligibility from Speech Distortions -- Sparse Representation Frameworks for Acoustic Scene Classification -- Driver Speech Detection in Real Driving Scenario -- Regularization based Incremental Learning in TCNN for Robust Speech Enhancement Targeting Effective Human Machine Interaction -- Candidate Speech Extraction from Multi-Speaker Single-Channel Audio Interviews -- Post-Processing of Translated Speech by Pole Modification and Residual Enhancement to Improve Perceptual Quality -- Region Normalized Capsule Network based Generative Adversarial Network for Non-Parallel Voice Conversion -- Speech Enhancement using LinkNet Architecture -- ATT:Adversarial Trained Transformer for Speech Enhancement -- Human Identification by Dynamics of Changes in Brain Frequencies Using Artificial Neural Networks -- Speech Prosody -- Analysis of Formant Trajectories of a Speech Signal for the Purpose of Forensic Identification of a Foreign Speaker -- Gestures vs. 
Prosodic Structure in Laboratory Ironic Speech -- Sounds of < sil > ence: Acoustics of Inhalation in Read Speech -- Prolongations as Hesitation Phenomena in Spoken Speech in First and Second Language -- Study of Indian English Pronunciation Variabilities Relative to Received Pronunciation -- Multimodal Collaboration in Expository Discourse: Verbal and Nonverbal Moves Alignment -- Association of Time Domain Features with Oral Cavity Configuration during Vowel Production and its Application in Vowel Recognition -- Prosodic Interaction Models in a Conversation -- Natural Language Processing -- Development and Research of Dialogue Agents with Long-Term Memory and Web Search -- Pre- and Post-Textual Contexts in Assessment of a Message as Offensive or Defensive Aggression Verbalization -- Boosting Rule-based Grapheme-to-Phoneme Conversion with Morphological Segmentation and Syllabification in Bengali -- Revisiting Assessment of Text Complexity: Lexical and Syntactic Parameters Fluctuations -- Analysis of Natural Language Understanding Systems with L2 Learner Specific Synthetic Grammatical Errors based on Parts-of-Speech -- On the Most Frequent Sequences of Words in Russian Spoken Everyday Language (Bigrams and Trigrams): An Experience of Classification -- Child Speech Processing -- Recognition of the Emotional State of Children by Video and Audio Modalities by Indian and Russian Experts -- Effect of Linear Prediction Order to Modify Formant Locations for Children Speech Recognition -- Gammatone-Filterbank based Pitch-Normalized Cepstral Coefficients for Zero-Resource Children's ASR -- System Assisted Vocal Response Analysis and Assessment of Autism in Children: A Machine Learning Based Approach -- Addressing Effects of Formant Dispersion and Pitch Sensitivity for the Development of Children's KWS System -- Development of Children's KWS System Perceptual Experiment and Automatic Recognition by Video, Audio and Text Modalities -- Linear Frequency Residual Features for 
Infant Cry Classification -- Speech Processing for Medicine -- Identification of Voice Disorders: A Comparative Study of Machine Learning Algorithms -- Transfer Learning using Whisper for Dysarthric Automatic Speech Recognition -- Significance of Duration Modification in Reducing Listening Effort of Slurred Speech from Patients with Traumatic Brain Injury -- Respiratory Sickness Detection from Audio Recordings using CLIP Models -- Investigating the Effect of Data Impurity on the Detection Performances of Mental Disorders through Spoken Dialogues. 330 $aThe two-volume proceedings set LNAI 14338 and 14339 constitutes the refereed proceedings of the 25th International Conference on Speech and Computer, SPECOM 2023, held in Dharwad, India, during November 29–December 2, 2023. The 94 papers included in these proceedings were carefully reviewed and selected from 174 submissions. They focus on all aspects of speech science and technology: automatic speech recognition; computational paralinguistics; digital signal processing; speech prosody; natural language processing; child speech processing; speech processing for medicine; industrial speech and language technology; speech technology for under-resourced languages; speech analysis and synthesis; speaker and language identification, verification and diarization. 410 0$aLecture Notes in Artificial Intelligence,$x2945-9141 ;$v14338 606 $aArtificial intelligence 606 $aComputer engineering 606 $aComputer networks 606 $aApplication software 606 $aImage processing$xDigital techniques 606 $aComputer vision 606 $aArtificial Intelligence 606 $aComputer Engineering and Networks 606 $aComputer and Information Systems Applications 606 $aComputer Imaging, Vision, Pattern Recognition and Graphics 615 0$aArtificial intelligence. 615 0$aComputer engineering. 615 0$aComputer networks. 
615 0$aApplication software. 615 0$aImage processing$xDigital techniques. 615 0$aComputer vision. 615 14$aArtificial Intelligence. 615 24$aComputer Engineering and Networks. 615 24$aComputer and Information Systems Applications. 615 24$aComputer Imaging, Vision, Pattern Recognition and Graphics. 676 $a006.3 700 $aKarpov$b Alexey$01448682 701 $aSamudravijaya$b K$01448683 701 $aDeepak$b K. T$01448684 701 $aHegde$b Rajesh M$01448685 701 $aAgrawal$b Shyam S$01448686 701 $aPrasanna$b S. R. Mahadeva$01448687 801 0$bMiAaPQ 801 1$bMiAaPQ 801 2$bMiAaPQ 906 $aBOOK 912 $a9910766883503321 996 $aSpeech and Computer$93644367 997 $aUNINA