Distant speech recognition / Matthias Wölfel and John McDonough
Author Wölfel, Matthias
Publication Chichester, U.K. : Wiley, 2009
Physical description 1 online resource (595 p.)
Classification (Dewey) 006.4/54
Other authors (persons) McDonough, John (John W.)
Topical subject Automatic speech recognition
Pattern perception
ISBN 1-282-12304-1
9786612123047
0-470-71408-5
0-470-71407-7
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Foreword -- Preface -- 1 Introduction -- 1.1 Research and Applications in Academia and Industry -- 1.2 Challenges in Distant Speech Recognition -- 1.3 System Evaluation -- 1.4 Fields of Speech Recognition -- 1.5 Robust Perception -- 1.6 Organizations, Conferences and Journals -- 1.7 Useful Tools, Data Resources and Evaluation Campaigns -- 1.8 Organization of this Book -- 1.9 Principal Symbols used Throughout the Book -- 1.10 Units used Throughout the Book -- 2 Acoustics -- 2.1 Physical Aspect of Sound -- 2.2 Speech Signals -- 2.3 Human Perception of Sound -- 2.4 The Acoustic Environment -- 2.5 Recording Techniques and Sensor Configuration -- 2.6 Summary and Further Reading -- 2.7 Principal Symbols -- 3 Signal Processing and Filtering Techniques -- 3.1 Linear Time-Invariant Systems -- 3.2 The Discrete Fourier Transform -- 3.3 Short-Time Fourier Transform -- 3.4 Summary and Further Reading -- 3.5 Principal Symbols -- 4 Bayesian Filters -- 4.1 Sequential Bayesian Estimation -- 4.2 Wiener Filter -- 4.3 Kalman Filter and Variations -- 4.4 Particle Filters -- 4.5 Summary and Further Reading -- 4.6 Principal Symbols -- 5 Speech Feature Extraction -- 5.1 Short-Time Spectral Analysis -- 5.2 Perceptually Motivated Representation -- 5.3 Spectral Estimation and Analysis -- 5.4 Cepstral Processing -- 5.5 Comparison between Mel Frequency, Perceptual LP and warped MVDR Cepstral Coefficient Frontends -- 5.6 Feature Augmentation -- 5.7 Feature Reduction -- 5.8 Feature-Space Minimum Phone Error -- 5.9 Summary and Further Reading -- 5.10 Principal Symbols -- 6 Speech Feature Enhancement -- 6.1 Noise and Reverberation in Various Domains -- 6.2 Two Principal Approaches -- 6.3 Direct Speech Feature Enhancement -- 6.4 Schematics of Indirect Speech Feature Enhancement -- 6.5 Estimating Additive Distortion -- 6.6 Estimating Convolutional Distortion -- 6.7 Distortion Evolution -- 6.8 Distortion Evaluation -- 6.9 Distortion Compensation -- 6.10 Joint Estimation of Additive and Convolutional Distortions -- 6.11 Observation Uncertainty -- 6.12 Summary and Further Reading -- 6.13 Principal Symbols -- 7 Search: Finding the Best Word Hypothesis -- 7.1 Fundamentals of Search -- 7.2 Weighted Finite-State Transducers -- 7.3 Knowledge Sources -- 7.4 Fast On-the-Fly Composition -- 7.5 Word and Lattice Combination -- 7.6 Summary and Further Reading -- 7.7 Principal Symbols -- 8 Hidden Markov Model Parameter Estimation -- 8.1 Maximum Likelihood Parameter Estimation -- 8.2 Discriminative Parameter Estimation -- 8.3 Summary and Further Reading -- 8.4 Principal Symbols -- 9 Feature and Model Transformation -- 9.1 Feature Transformation Techniques -- 9.2 Model Transformation Techniques -- 9.3 Acoustic Model Combination -- 9.4 Summary and Further Reading -- 9.5 Principal Symbols -- 10 Speaker Localization and Tracking -- 10.1 Conventional Techniques -- 10.2 Speaker Tracking with the Kalman Filter -- 10.3 Tracking Multiple Simultaneous Speakers -- 10.4 Audio-Visual Speaker Tracking -- 10.5 Speaker Tracking with the Particle Filter -- 10.6 Summary and Further Reading -- 10.7 Principal Symbols -- 11 Digital Filter Banks -- 11.1 Uniform Discrete Fourier Transform Filter Banks -- 11.2 Polyphase Implementation -- 11.3 Decimation and Expansion -- 11.4 Noble Identities -- 11.5 Nyquist(M) Filters -- 11.6 Filter Bank Design of De Haan et al. -- 11.7 Filter Bank Design with the Nyquist(M) Criterion -- 11.8 Quality Assessment of Filter Bank Prototypes -- 11.9 Summary and Further Reading -- 11.10 Principal Symbols -- 12 Blind Source Separation -- 12.1 Channel Quality and Selection -- 12.2 Independent Component Analysis -- 12.3 BSS Algorithms based on Second-Order Statistics -- 12.4 Summary and Further Reading -- 12.5 Principal Symbols -- 13 Beamforming -- 13.1 Beamforming Fundamentals -- 13.2 Beamforming Performance Measures -- 13.3 Conventional Beamforming Algorithms -- 13.4 Recursive Algorithms -- 13.5 Nonconventional Beamforming Algorithms -- 13.6 Array Shape Calibration -- 13.7 Summary and Further Reading -- 13.8 Principal Symbols -- 14 Hands On -- 14.1 Example Room Configurations -- 14.2 Automatic Speech Recognition Engines -- 14.3 Word Error Rate -- 14.4 Single-Channel Feature Enhancement Experiments -- 14.5 Acoustic Speaker-Tracking Experiments -- 14.6 Audio-Video Speaker-Tracking Experiments -- 14.7 Speaker-Tracking Performance vs Word Error Rate -- 14.8 Single-Speaker Beamforming Experiments -- 14.9 Speech Separation Experiments -- 14.10 Filter Bank Experiments -- 14.11 Summary and Further Reading -- Appendices -- A List of Abbreviations -- B Useful Background -- B.1 Discrete Cosine Transform -- B.2 Matrix Inversion Lemma -- B.3 Cholesky Decomposition -- B.4 Distance Measures -- B.5 Super-Gaussian Probability Density Functions -- B.6 Entropy -- B.7 Relative Entropy -- B.8 Transformation Law of Probabilities -- B.9 Cascade of Warping Stages -- B.10 Taylor Series -- B.11 Correlation and Covariance -- B.12 Bessel Functions -- B.13 Proof of the Nyquist / Shannon Sampling Theorem -- B.14 Proof of Equations (11.31 / 11.32) -- B.15 Givens Rotations -- B.16 Derivatives with Respect to Complex Vectors -- B.17 Perpendicular Projection Operators -- Bibliography -- Index.
Record no. UNINA-9910143139903321
Held at: Univ. Federico II
Record no. UNINA-9910808716103321 (second catalog record for the same title, with identical bibliographic data)