Speech and Computer [electronic resource] : 25th International Conference, SPECOM 2023, Dharwad, India, November 29 – December 2, 2023, Proceedings, Part I / edited by Alexey Karpov, K. Samudravijaya, K. T. Deepak, Rajesh M. Hegde, Shyam S. Agrawal, S. R. Mahadeva Prasanna
Author Karpov, Alexey
Edition [1st ed. 2023.]
Publication/distribution/printing Cham : Springer Nature Switzerland : Imprint: Springer, 2023
Physical description 1 online resource (657 pages)
Discipline 006.3
Other authors (Persons) Samudravijaya, K.
Deepak, K. T.
Hegde, Rajesh M.
Agrawal, Shyam S.
Prasanna, S. R. Mahadeva
Series Lecture Notes in Artificial Intelligence
Topical subject Artificial intelligence
Computer engineering
Computer networks
Application software
Image processing - Digital techniques
Computer vision
Artificial Intelligence
Computer Engineering and Networks
Computer and Information Systems Applications
Computer Imaging, Vision, Pattern Recognition and Graphics
ISBN 3-031-48309-X
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Automatic Speech Recognition -- Extreme Learning Layer: A Boost for Spoken Digit Recognition with Spiking Neural Networks -- EMO-AVSR: Two-Level Approach for Audio-Visual Emotional Speech Recognition -- Significance of Audio Quality in Speech-to-Text Translation Systems -- Everyday Conversations: a Comparative Study of Expert Transcriptions and ASR Outputs at a Lexical Level -- Improving Automatic Speech Recognition with Dialect-Specific Language Models -- Emotional speech recognition of Holocaust survivors with deep neural network models for Russian language -- Computational Paralinguistics -- Aggregation Strategies of Wav2vec 2.0 Embeddings for Computational Paralinguistic Tasks -- Rhythm Formant Analysis for Automatic Depression Classification -- Determining Alcohol Intoxication Based on Speech and Neural Networks -- Linear Frequency Residual Cepstral Coefficients for Speech Emotion Recognition -- Enhancing Stutter Detection in Speech using Zero Time Windowing Cepstral Coefficients and Phase Information -- Source and System-based Modulation Approach for Fake Speech Detection -- Digital Signal Processing -- Investigation of Different Calibration Methods for Deep Speaker Embedding based Verification Systems -- Learning to Predict Speech Intelligibility from Speech Distortions -- Sparse Representation Frameworks for Acoustic Scene Classification -- Driver Speech Detection in Real Driving Scenario -- Regularization based Incremental Learning in TCNN for Robust Speech Enhancement Targeting Effective Human Machine Interaction -- Candidate Speech Extraction from Multi-Speaker Single-Channel Audio Interviews -- Post-Processing of Translated Speech by Pole Modification and Residual Enhancement to Improve Perceptual Quality -- Region Normalized Capsule Network based Generative Adversarial Network for Non-Parallel Voice Conversion -- Speech Enhancement using LinkNet Architecture -- ATT: Adversarial Trained Transformer for Speech Enhancement -- Human Identification by Dynamics of Changes in Brain Frequencies Using Artificial Neural Networks -- Speech Prosody -- Analysis of Formant Trajectories of a Speech Signal for the Purpose of Forensic Identification of a Foreign Speaker -- Gestures vs. Prosodic Structure in Laboratory Ironic Speech -- Sounds of <sil>ence: Acoustics of Inhalation in Read Speech -- Prolongations as Hesitation Phenomena in Spoken Speech in First and Second Language -- Study of Indian English Pronunciation Variabilities Relative to Received Pronunciation -- Multimodal Collaboration in Expository Discourse: Verbal and Nonverbal Moves Alignment -- Association of Time Domain Features with Oral Cavity Configuration during Vowel Production and its Application in Vowel Recognition -- Prosodic Interaction Models in a Conversation -- Natural Language Processing -- Development and Research of Dialogue Agents with Long-Term Memory and Web Search -- Pre- and Post-Textual Contexts in Assessment of a Message as Offensive or Defensive Aggression Verbalization -- Boosting Rule-based Grapheme-to-Phoneme Conversion with Morphological Segmentation and Syllabification in Bengali -- Revisiting Assessment of Text Complexity: Lexical and Syntactic Parameters Fluctuations -- Analysis of Natural Language Understanding Systems with L2 Learner Specific Synthetic Grammatical Errors based on Parts-of-Speech -- On the Most Frequent Sequences of Words in Russian Spoken Everyday Language (Bigrams and Trigrams): An Experience of Classification -- Child Speech Processing -- Recognition of the Emotional State of Children by Video and Audio Modalities by Indian and Russian Experts -- Effect of Linear Prediction Order to Modify Formant Locations for Children Speech Recognition -- Gammatone-Filterbank based Pitch-Normalized Cepstral Coefficients for Zero-Resource Children’s ASR -- System Assisted Vocal Response Analysis and Assessment of Autism in Children: A Machine Learning Based Approach -- Addressing Effects of Formant Dispersion and Pitch Sensitivity for the Development of Children’s KWS System -- Development of Children’s KWS System Perceptual Experiment and Automatic Recognition by Video, Audio and Text Modalities -- Linear Frequency Residual Features for Infant Cry Classification -- Speech Processing for Medicine -- Identification of Voice Disorders: A Comparative Study of Machine Learning Algorithms -- Transfer Learning using Whisper for Dysarthric Automatic Speech Recognition -- Significance of Duration Modification in Reducing Listening Effort of Slurred Speech from Patients with Traumatic Brain Injury -- Respiratory Sickness Detection from Audio Recordings using CLIP Models -- Investigating the Effect of Data Impurity on the Detection Performances of Mental Disorders through Spoken Dialogues.
Record Nr. UNISA-996565866403316
Find it here: Univ. di Salerno
Speech and Computer [electronic resource] : 25th International Conference, SPECOM 2023, Dharwad, India, November 29 – December 2, 2023, Proceedings, Part II / edited by Alexey Karpov, K. Samudravijaya, K. T. Deepak, Rajesh M. Hegde, Shyam S. Agrawal, S. R. Mahadeva Prasanna
Author Karpov, Alexey
Edition [1st ed. 2023.]
Publication/distribution/printing Cham : Springer Nature Switzerland : Imprint: Springer, 2023
Physical description 1 online resource (587 pages)
Discipline 006.3
Other authors (Persons) Samudravijaya, K.
Deepak, K. T.
Hegde, Rajesh M.
Agrawal, Shyam S.
Prasanna, S. R. Mahadeva
Series Lecture Notes in Artificial Intelligence
Topical subject Artificial intelligence
Image processing - Digital techniques
Computer vision
Computer engineering
Computer networks
Application software
Artificial Intelligence
Computer Imaging, Vision, Pattern Recognition and Graphics
Computer Engineering and Networks
Computer and Information Systems Applications
ISBN 3-031-48312-X
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Industrial Speech and Language Technology -- Analysing Breathing Patterns in Reading and Spontaneous Speech -- Audio-Visual Speaker Verification via Joint Cross Attention -- A Novel Scheme to Classify Read and Spontaneous Speech -- Analysis of a Hinglish ASR System’s Performance for Fraud Detection -- Anomaly Detection in Speech: A Comprehensive Approach for Enhanced Speech Analysis -- CAPTuring Accents: An Approach to Personalize Pronunciation Training for Learners with Different L1 Backgrounds -- Speech Technology for Under-Resourced Languages -- Improvements in Language Modeling, Voice Activity Detection, and Lexicon in OpenASR21 Low Resource Languages -- Phone Durations Modeling for Livvi-Karelian ASR -- Significance of Indic Self-Supervised Speech Representations for Indic Under-Resourced ASR -- Study of Various End-to-End Keyword Spotting Systems on the Bengali language under Low-Resource Condition -- Bridging the Gap: Towards Linguistic Resource Development for the Low-Resource Lambani Language -- Studying the Effect of Frame-Level Concatenation of GFCC and TS-MFCC Features on Zero-Shot Children’s ASR -- Code-Mixed Text-to-Speech Synthesis under Low-Resource Constraints -- An End-to-End TTS Model in Chhattisgarhi, a Low-Resource Indian Language -- An ASR Corpus in Chhattisgarhi, a Low Resource Indian Language -- Cross Lingual Style Transfer using Multiscale Loss Function for Soliga: A Low Resource Tribal Language -- Preliminary Analysis of Lambani Vowels and Vowel Classification using Acoustic Feature -- Curriculum Learning based Approach for Faster Convergence of TTS Model -- Rhythm Measures and Language Endangerment: the Case of Deori -- Konkani Phonetic Transcription System 1.0 -- Speech Analysis and Synthesis -- E-TTS: Expressive Text-to-Speech Synthesis for Hindi using Data Augmentation -- Direct vs Cascaded Speech-to-Speech Translation using Transformer -- Deep Learning based Speech Quality Assessment Focusing on Noise Effects -- Quantifying the Emotional Landscape of Music with Three Dimensions -- Analysis of Mandarin vs. English Language for Emotional Voice Conversion -- Audio DeepFake Detection Employing Multiple Parametric Exponential Linear Units -- A Comparison of Learned Representations with Jointly Optimized VAE and DNN for Syllable Stress Detection -- On the Asymptotic Behaviour of the Speech Signal -- Improvement of Audio-Visual Keyword Spotting System Accuracy using Excitation Source Feature -- Developing a Question Answering System on the material of Holocaust survivors’ testimonies in Russian -- Enhancing Children’s Short Utterance based ASV using Data Augmentation Techniques and Feature Concatenation Approach -- Studying the Effectiveness of Data Augmentation and Frequency-Domain Linear Prediction Coefficients in Children’s Speaker Verification under Low-Resource Conditions -- Constant-Q based Harmonic and Pitch Features for Normal vs Pathological Infant Cry Classification -- Robustness of Whisper Features for Infant Cry Classification -- Speaker and Language Identification, Verification, and Diarization -- I-MSV 2022: Indic-Multilingual and Multi-Sensor Speaker Verification Challenge -- Multi-Task Learning over Mixup Variants for the Speaker Verification Task -- Exploring the Impact of Different Approaches for Spoken Dialect Identification of Konkani Language -- Adversarially Trained Hierarchical Attention Network for Domain-Invariant Spoken Language Identification -- Ensemble of Incremental System Enhancements for Robust Speaker Diarization in Code-Switched Real-Life Audios -- Enhancing Language Identification in Indian Context through Exploiting Learned Features with Wav2Vec2.0 -- Design and Development of Voice OTP Authentication System -- End-to-End Native Language Identification using a Modified Vision Transformer (ViT) from L2 English Speech -- Dialect Identification in Ao using Modulation-based Representation -- Self-Supervised Speaker Verification Employing Augmentation Mix and Self-Augmented Training-based Clustering.
Record Nr. UNISA-996565868103316
Find it here: Univ. di Salerno
Speech and Computer : 25th International Conference, SPECOM 2023, Dharwad, India, November 29 – December 2, 2023, Proceedings, Part II / edited by Alexey Karpov, K. Samudravijaya, K. T. Deepak, Rajesh M. Hegde, Shyam S. Agrawal, S. R. Mahadeva Prasanna
Author Karpov, Alexey
Edition [1st ed. 2023.]
Publication/distribution/printing Cham : Springer Nature Switzerland : Imprint: Springer, 2023
Physical description 1 online resource (587 pages)
Discipline 006.3
Other authors (Persons) Samudravijaya, K.
Deepak, K. T.
Hegde, Rajesh M.
Agrawal, Shyam S.
Prasanna, S. R. Mahadeva
Series Lecture Notes in Artificial Intelligence
Topical subject Artificial intelligence
Image processing - Digital techniques
Computer vision
Computer engineering
Computer networks
Application software
Artificial Intelligence
Computer Imaging, Vision, Pattern Recognition and Graphics
Computer Engineering and Networks
Computer and Information Systems Applications
ISBN 3-031-48312-X
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Industrial Speech and Language Technology -- Analysing Breathing Patterns in Reading and Spontaneous Speech -- Audio-Visual Speaker Verification via Joint Cross Attention -- A Novel Scheme to Classify Read and Spontaneous Speech -- Analysis of a Hinglish ASR System’s Performance for Fraud Detection -- Anomaly Detection in Speech: A Comprehensive Approach for Enhanced Speech Analysis -- CAPTuring Accents: An Approach to Personalize Pronunciation Training for Learners with Different L1 Backgrounds -- Speech Technology for Under-Resourced Languages -- Improvements in Language Modeling, Voice Activity Detection, and Lexicon in OpenASR21 Low Resource Languages -- Phone Durations Modeling for Livvi-Karelian ASR -- Significance of Indic Self-Supervised Speech Representations for Indic Under-Resourced ASR -- Study of Various End-to-End Keyword Spotting Systems on the Bengali language under Low-Resource Condition -- Bridging the Gap: Towards Linguistic Resource Development for the Low-Resource Lambani Language -- Studying the Effect of Frame-Level Concatenation of GFCC and TS-MFCC Features on Zero-Shot Children’s ASR -- Code-Mixed Text-to-Speech Synthesis under Low-Resource Constraints -- An End-to-End TTS Model in Chhattisgarhi, a Low-Resource Indian Language -- An ASR Corpus in Chhattisgarhi, a Low Resource Indian Language -- Cross Lingual Style Transfer using Multiscale Loss Function for Soliga: A Low Resource Tribal Language -- Preliminary Analysis of Lambani Vowels and Vowel Classification using Acoustic Feature -- Curriculum Learning based Approach for Faster Convergence of TTS Model -- Rhythm Measures and Language Endangerment: the Case of Deori -- Konkani Phonetic Transcription System 1.0 -- Speech Analysis and Synthesis -- E-TTS: Expressive Text-to-Speech Synthesis for Hindi using Data Augmentation -- Direct vs Cascaded Speech-to-Speech Translation using Transformer -- Deep Learning based Speech Quality Assessment Focusing on Noise Effects -- Quantifying the Emotional Landscape of Music with Three Dimensions -- Analysis of Mandarin vs. English Language for Emotional Voice Conversion -- Audio DeepFake Detection Employing Multiple Parametric Exponential Linear Units -- A Comparison of Learned Representations with Jointly Optimized VAE and DNN for Syllable Stress Detection -- On the Asymptotic Behaviour of the Speech Signal -- Improvement of Audio-Visual Keyword Spotting System Accuracy using Excitation Source Feature -- Developing a Question Answering System on the material of Holocaust survivors’ testimonies in Russian -- Enhancing Children’s Short Utterance based ASV using Data Augmentation Techniques and Feature Concatenation Approach -- Studying the Effectiveness of Data Augmentation and Frequency-Domain Linear Prediction Coefficients in Children’s Speaker Verification under Low-Resource Conditions -- Constant-Q based Harmonic and Pitch Features for Normal vs Pathological Infant Cry Classification -- Robustness of Whisper Features for Infant Cry Classification -- Speaker and Language Identification, Verification, and Diarization -- I-MSV 2022: Indic-Multilingual and Multi-Sensor Speaker Verification Challenge -- Multi-Task Learning over Mixup Variants for the Speaker Verification Task -- Exploring the Impact of Different Approaches for Spoken Dialect Identification of Konkani Language -- Adversarially Trained Hierarchical Attention Network for Domain-Invariant Spoken Language Identification -- Ensemble of Incremental System Enhancements for Robust Speaker Diarization in Code-Switched Real-Life Audios -- Enhancing Language Identification in Indian Context through Exploiting Learned Features with Wav2Vec2.0 -- Design and Development of Voice OTP Authentication System -- End-to-End Native Language Identification using a Modified Vision Transformer (ViT) from L2 English Speech -- Dialect Identification in Ao using Modulation-based Representation -- Self-Supervised Speaker Verification Employing Augmentation Mix and Self-Augmented Training-based Clustering.
Record Nr. UNINA-9910768462003321
Find it here: Univ. Federico II
Speech and Computer : 25th International Conference, SPECOM 2023, Dharwad, India, November 29 – December 2, 2023, Proceedings, Part I / edited by Alexey Karpov, K. Samudravijaya, K. T. Deepak, Rajesh M. Hegde, Shyam S. Agrawal, S. R. Mahadeva Prasanna
Author Karpov, Alexey
Edition [1st ed. 2023.]
Publication/distribution/printing Cham : Springer Nature Switzerland : Imprint: Springer, 2023
Physical description 1 online resource (657 pages)
Discipline 006.3
Other authors (Persons) Samudravijaya, K.
Deepak, K. T.
Hegde, Rajesh M.
Agrawal, Shyam S.
Prasanna, S. R. Mahadeva
Series Lecture Notes in Artificial Intelligence
Topical subject Artificial intelligence
Computer engineering
Computer networks
Application software
Image processing - Digital techniques
Computer vision
Artificial Intelligence
Computer Engineering and Networks
Computer and Information Systems Applications
Computer Imaging, Vision, Pattern Recognition and Graphics
ISBN 3-031-48309-X
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Automatic Speech Recognition -- Extreme Learning Layer: A Boost for Spoken Digit Recognition with Spiking Neural Networks -- EMO-AVSR: Two-Level Approach for Audio-Visual Emotional Speech Recognition -- Significance of Audio Quality in Speech-to-Text Translation Systems -- Everyday Conversations: a Comparative Study of Expert Transcriptions and ASR Outputs at a Lexical Level -- Improving Automatic Speech Recognition with Dialect-Specific Language Models -- Emotional speech recognition of Holocaust survivors with deep neural network models for Russian language -- Computational Paralinguistics -- Aggregation Strategies of Wav2vec 2.0 Embeddings for Computational Paralinguistic Tasks -- Rhythm Formant Analysis for Automatic Depression Classification -- Determining Alcohol Intoxication Based on Speech and Neural Networks -- Linear Frequency Residual Cepstral Coefficients for Speech Emotion Recognition -- Enhancing Stutter Detection in Speech using Zero Time Windowing Cepstral Coefficients and Phase Information -- Source and System-based Modulation Approach for Fake Speech Detection -- Digital Signal Processing -- Investigation of Different Calibration Methods for Deep Speaker Embedding based Verification Systems -- Learning to Predict Speech Intelligibility from Speech Distortions -- Sparse Representation Frameworks for Acoustic Scene Classification -- Driver Speech Detection in Real Driving Scenario -- Regularization based Incremental Learning in TCNN for Robust Speech Enhancement Targeting Effective Human Machine Interaction -- Candidate Speech Extraction from Multi-Speaker Single-Channel Audio Interviews -- Post-Processing of Translated Speech by Pole Modification and Residual Enhancement to Improve Perceptual Quality -- Region Normalized Capsule Network based Generative Adversarial Network for Non-Parallel Voice Conversion -- Speech Enhancement using LinkNet Architecture -- ATT: Adversarial Trained Transformer for Speech Enhancement -- Human Identification by Dynamics of Changes in Brain Frequencies Using Artificial Neural Networks -- Speech Prosody -- Analysis of Formant Trajectories of a Speech Signal for the Purpose of Forensic Identification of a Foreign Speaker -- Gestures vs. Prosodic Structure in Laboratory Ironic Speech -- Sounds of <sil>ence: Acoustics of Inhalation in Read Speech -- Prolongations as Hesitation Phenomena in Spoken Speech in First and Second Language -- Study of Indian English Pronunciation Variabilities Relative to Received Pronunciation -- Multimodal Collaboration in Expository Discourse: Verbal and Nonverbal Moves Alignment -- Association of Time Domain Features with Oral Cavity Configuration during Vowel Production and its Application in Vowel Recognition -- Prosodic Interaction Models in a Conversation -- Natural Language Processing -- Development and Research of Dialogue Agents with Long-Term Memory and Web Search -- Pre- and Post-Textual Contexts in Assessment of a Message as Offensive or Defensive Aggression Verbalization -- Boosting Rule-based Grapheme-to-Phoneme Conversion with Morphological Segmentation and Syllabification in Bengali -- Revisiting Assessment of Text Complexity: Lexical and Syntactic Parameters Fluctuations -- Analysis of Natural Language Understanding Systems with L2 Learner Specific Synthetic Grammatical Errors based on Parts-of-Speech -- On the Most Frequent Sequences of Words in Russian Spoken Everyday Language (Bigrams and Trigrams): An Experience of Classification -- Child Speech Processing -- Recognition of the Emotional State of Children by Video and Audio Modalities by Indian and Russian Experts -- Effect of Linear Prediction Order to Modify Formant Locations for Children Speech Recognition -- Gammatone-Filterbank based Pitch-Normalized Cepstral Coefficients for Zero-Resource Children’s ASR -- System Assisted Vocal Response Analysis and Assessment of Autism in Children: A Machine Learning Based Approach -- Addressing Effects of Formant Dispersion and Pitch Sensitivity for the Development of Children’s KWS System -- Development of Children’s KWS System Perceptual Experiment and Automatic Recognition by Video, Audio and Text Modalities -- Linear Frequency Residual Features for Infant Cry Classification -- Speech Processing for Medicine -- Identification of Voice Disorders: A Comparative Study of Machine Learning Algorithms -- Transfer Learning using Whisper for Dysarthric Automatic Speech Recognition -- Significance of Duration Modification in Reducing Listening Effort of Slurred Speech from Patients with Traumatic Brain Injury -- Respiratory Sickness Detection from Audio Recordings using CLIP Models -- Investigating the Effect of Data Impurity on the Detection Performances of Mental Disorders through Spoken Dialogues.
Record Nr. UNINA-9910766883503321
Find it here: Univ. Federico II