MPEG-7 audio and beyond [electronic resource] : audio content indexing and retrieval / Hyoung-Gook Kim, Nicolas Moreau, Thomas Sikora |
Author | Kim Hyoung-Gook |
Publication/distribution/printing | Chichester, West Sussex, England ; Hoboken, NJ, USA : J. Wiley, c2005 |
Physical description | 1 online resource (305 p.) |
Classification |
006.6/96
006.696 |
Other authors (Persons) |
Moreau, Nicolas
Sikora, Thomas |
Topical subject |
MPEG (Video coding standard)
Multimedia systems
Sound - Recording and reproducing - Digital techniques - Standards |
Genre/form | Electronic books. |
ISBN |
1-280-33982-9
9786610339822
0-470-09336-6
0-470-09335-8 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note |
MPEG-7 Audio and Beyond; Contents; List of Acronyms; List of Symbols; 1 Introduction; 1.1 Audio Content Description; 1.2 MPEG-7 Audio Content Description - An Overview; 1.2.1 MPEG-7 Low-Level Descriptors; 1.2.2 MPEG-7 Description Schemes; 1.2.3 MPEG-7 Description Definition Language (DDL); 1.2.4 BiM (Binary Format for MPEG-7); 1.3 Organization of the Book; 2 Low-Level Descriptors; 2.1 Introduction; 2.2 Basic Parameters and Notations; 2.2.1 Time Domain; 2.2.2 Frequency Domain; 2.3 Scalable Series; 2.3.1 Series of Scalars; 2.3.2 Series of Vectors; 2.3.3 Binary Series; 2.4 Basic Descriptors; 2.4.1 Audio Waveform; 2.4.2 Audio Power; 2.5 Basic Spectral Descriptors; 2.5.1 Audio Spectrum Envelope; 2.5.2 Audio Spectrum Centroid; 2.5.3 Audio Spectrum Spread; 2.5.4 Audio Spectrum Flatness; 2.6 Basic Signal Parameters; 2.6.1 Audio Harmonicity; 2.6.2 Audio Fundamental Frequency; 2.7 Timbral Descriptors; 2.7.1 Temporal Timbral: Requirements; 2.7.2 Log Attack Time; 2.7.3 Temporal Centroid; 2.7.4 Spectral Timbral: Requirements; 2.7.5 Harmonic Spectral Centroid; 2.7.6 Harmonic Spectral Deviation; 2.7.7 Harmonic Spectral Spread; 2.7.8 Harmonic Spectral Variation; 2.7.9 Spectral Centroid; 2.8 Spectral Basis Representations; 2.9 Silence Segment; 2.10 Beyond the Scope of MPEG-7; 2.10.1 Other Low-Level Descriptors; 2.10.2 Mel-Frequency Cepstrum Coefficients; References; 3 Sound Classification and Similarity; 3.1 Introduction; 3.2 Dimensionality Reduction; 3.2.1 Singular Value Decomposition (SVD); 3.2.2 Principal Component Analysis (PCA); 3.2.3 Independent Component Analysis (ICA); 3.2.4 Non-Negative Factorization (NMF); 3.3 Classification Methods; 3.3.1 Gaussian Mixture Model (GMM); 3.3.2 Hidden Markov Model (HMM); 3.3.3 Neural Network (NN); 3.3.4 Support Vector Machine (SVM); 3.4 MPEG-7 Sound Classification; 3.4.1 MPEG-7 Audio Spectrum Projection (ASP) Feature Extraction; 3.4.2 Training Hidden Markov Models (HMMs); 3.4.3 Classification of Sounds; 3.5 Comparison of MPEG-7 Audio Spectrum Projection vs. MFCC Features; 3.6 Indexing and Similarity; 3.6.1 Audio Retrieval Using Histogram Sum of Squared Differences; 3.7 Simulation Results and Discussion; 3.7.1 Plots of MPEG-7 Audio Descriptors; 3.7.2 Parameter Selection; 3.7.3 Results for Distinguishing Between Speech, Music and Environmental Sound; 3.7.4 Results of Sound Classification Using Three Audio Taxonomy Methods; 3.7.5 Results for Speaker Recognition; 3.7.6 Results of Musical Instrument Classification; 3.7.7 Audio Retrieval Results; 3.8 Conclusions; References; 4 Spoken Content; 4.1 Introduction; 4.2 Automatic Speech Recognition; 4.2.1 Basic Principles; 4.2.2 Types of Speech Recognition Systems; 4.2.3 Recognition Results; 4.3 MPEG-7 SpokenContent Description; 4.3.1 General Structure; 4.3.2 SpokenContentHeader; 4.3.3 SpokenContentLattice; 4.4 Application: Spoken Document Retrieval; 4.4.1 Basic Principles of IR and SDR; 4.4.2 Vector Space Models; 4.4.3 Word-Based SDR; 4.4.4 Sub-Word-Based Vector Space Models |
Record no. | UNINA-9910143709903321 |
Available at: Univ. Federico II |
|
MPEG-7 audio and beyond [electronic resource] : audio content indexing and retrieval / Hyoung-Gook Kim, Nicolas Moreau, Thomas Sikora |
Author | Kim Hyoung-Gook |
Publication/distribution/printing | Chichester, West Sussex, England ; Hoboken, NJ, USA : J. Wiley, c2005 |
Physical description | 1 online resource (305 p.) |
Classification |
006.6/96
006.696 |
Other authors (Persons) |
Moreau, Nicolas
Sikora, Thomas |
Topical subject |
MPEG (Video coding standard)
Multimedia systems
Sound - Recording and reproducing - Digital techniques - Standards |
ISBN |
1-280-33982-9
9786610339822
0-470-09336-6
0-470-09335-8 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note |
MPEG-7 Audio and Beyond; Contents; List of Acronyms; List of Symbols; 1 Introduction; 1.1 Audio Content Description; 1.2 MPEG-7 Audio Content Description - An Overview; 1.2.1 MPEG-7 Low-Level Descriptors; 1.2.2 MPEG-7 Description Schemes; 1.2.3 MPEG-7 Description Definition Language (DDL); 1.2.4 BiM (Binary Format for MPEG-7); 1.3 Organization of the Book; 2 Low-Level Descriptors; 2.1 Introduction; 2.2 Basic Parameters and Notations; 2.2.1 Time Domain; 2.2.2 Frequency Domain; 2.3 Scalable Series; 2.3.1 Series of Scalars; 2.3.2 Series of Vectors; 2.3.3 Binary Series; 2.4 Basic Descriptors; 2.4.1 Audio Waveform; 2.4.2 Audio Power; 2.5 Basic Spectral Descriptors; 2.5.1 Audio Spectrum Envelope; 2.5.2 Audio Spectrum Centroid; 2.5.3 Audio Spectrum Spread; 2.5.4 Audio Spectrum Flatness; 2.6 Basic Signal Parameters; 2.6.1 Audio Harmonicity; 2.6.2 Audio Fundamental Frequency; 2.7 Timbral Descriptors; 2.7.1 Temporal Timbral: Requirements; 2.7.2 Log Attack Time; 2.7.3 Temporal Centroid; 2.7.4 Spectral Timbral: Requirements; 2.7.5 Harmonic Spectral Centroid; 2.7.6 Harmonic Spectral Deviation; 2.7.7 Harmonic Spectral Spread; 2.7.8 Harmonic Spectral Variation; 2.7.9 Spectral Centroid; 2.8 Spectral Basis Representations; 2.9 Silence Segment; 2.10 Beyond the Scope of MPEG-7; 2.10.1 Other Low-Level Descriptors; 2.10.2 Mel-Frequency Cepstrum Coefficients; References; 3 Sound Classification and Similarity; 3.1 Introduction; 3.2 Dimensionality Reduction; 3.2.1 Singular Value Decomposition (SVD); 3.2.2 Principal Component Analysis (PCA); 3.2.3 Independent Component Analysis (ICA); 3.2.4 Non-Negative Factorization (NMF); 3.3 Classification Methods; 3.3.1 Gaussian Mixture Model (GMM); 3.3.2 Hidden Markov Model (HMM); 3.3.3 Neural Network (NN); 3.3.4 Support Vector Machine (SVM); 3.4 MPEG-7 Sound Classification; 3.4.1 MPEG-7 Audio Spectrum Projection (ASP) Feature Extraction; 3.4.2 Training Hidden Markov Models (HMMs); 3.4.3 Classification of Sounds; 3.5 Comparison of MPEG-7 Audio Spectrum Projection vs. MFCC Features; 3.6 Indexing and Similarity; 3.6.1 Audio Retrieval Using Histogram Sum of Squared Differences; 3.7 Simulation Results and Discussion; 3.7.1 Plots of MPEG-7 Audio Descriptors; 3.7.2 Parameter Selection; 3.7.3 Results for Distinguishing Between Speech, Music and Environmental Sound; 3.7.4 Results of Sound Classification Using Three Audio Taxonomy Methods; 3.7.5 Results for Speaker Recognition; 3.7.6 Results of Musical Instrument Classification; 3.7.7 Audio Retrieval Results; 3.8 Conclusions; References; 4 Spoken Content; 4.1 Introduction; 4.2 Automatic Speech Recognition; 4.2.1 Basic Principles; 4.2.2 Types of Speech Recognition Systems; 4.2.3 Recognition Results; 4.3 MPEG-7 SpokenContent Description; 4.3.1 General Structure; 4.3.2 SpokenContentHeader; 4.3.3 SpokenContentLattice; 4.4 Application: Spoken Document Retrieval; 4.4.1 Basic Principles of IR and SDR; 4.4.2 Vector Space Models; 4.4.3 Word-Based SDR; 4.4.4 Sub-Word-Based Vector Space Models |
Record no. | UNINA-9910830304103321 |
Available at: Univ. Federico II |
|
MPEG-7 audio and beyond : audio content indexing and retrieval / Hyoung-Gook Kim, Nicolas Moreau, Thomas Sikora |
Author | Kim Hyoung-Gook |
Publication/distribution/printing | Chichester, West Sussex, England ; Hoboken, NJ, USA : J. Wiley, c2005 |
Physical description | 1 online resource (305 pages) |
Classification | 006.6/96 |
Other authors (Persons) |
Moreau, Nicolas
Sikora, Thomas |
Topical subject |
MPEG (Video coding standard)
Multimedia systems
Sound - Recording and reproducing - Digital techniques - Standards
Estàndard MPEG
Sistemes multimèdia |
ISBN |
9780470093368
1-280-33982-9
9786610339822
0-470-09336-6
0-470-09335-8 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note |
MPEG-7 Audio and Beyond; Contents; List of Acronyms; List of Symbols; 1 Introduction; 1.1 Audio Content Description; 1.2 MPEG-7 Audio Content Description - An Overview; 1.2.1 MPEG-7 Low-Level Descriptors; 1.2.2 MPEG-7 Description Schemes; 1.2.3 MPEG-7 Description Definition Language (DDL); 1.2.4 BiM (Binary Format for MPEG-7); 1.3 Organization of the Book; 2 Low-Level Descriptors; 2.1 Introduction; 2.2 Basic Parameters and Notations; 2.2.1 Time Domain; 2.2.2 Frequency Domain; 2.3 Scalable Series; 2.3.1 Series of Scalars; 2.3.2 Series of Vectors; 2.3.3 Binary Series; 2.4 Basic Descriptors; 2.4.1 Audio Waveform; 2.4.2 Audio Power; 2.5 Basic Spectral Descriptors; 2.5.1 Audio Spectrum Envelope; 2.5.2 Audio Spectrum Centroid; 2.5.3 Audio Spectrum Spread; 2.5.4 Audio Spectrum Flatness; 2.6 Basic Signal Parameters; 2.6.1 Audio Harmonicity; 2.6.2 Audio Fundamental Frequency; 2.7 Timbral Descriptors; 2.7.1 Temporal Timbral: Requirements; 2.7.2 Log Attack Time; 2.7.3 Temporal Centroid; 2.7.4 Spectral Timbral: Requirements; 2.7.5 Harmonic Spectral Centroid; 2.7.6 Harmonic Spectral Deviation; 2.7.7 Harmonic Spectral Spread; 2.7.8 Harmonic Spectral Variation; 2.7.9 Spectral Centroid; 2.8 Spectral Basis Representations; 2.9 Silence Segment; 2.10 Beyond the Scope of MPEG-7; 2.10.1 Other Low-Level Descriptors; 2.10.2 Mel-Frequency Cepstrum Coefficients; References; 3 Sound Classification and Similarity; 3.1 Introduction; 3.2 Dimensionality Reduction; 3.2.1 Singular Value Decomposition (SVD); 3.2.2 Principal Component Analysis (PCA); 3.2.3 Independent Component Analysis (ICA); 3.2.4 Non-Negative Factorization (NMF); 3.3 Classification Methods; 3.3.1 Gaussian Mixture Model (GMM); 3.3.2 Hidden Markov Model (HMM); 3.3.3 Neural Network (NN); 3.3.4 Support Vector Machine (SVM); 3.4 MPEG-7 Sound Classification; 3.4.1 MPEG-7 Audio Spectrum Projection (ASP) Feature Extraction; 3.4.2 Training Hidden Markov Models (HMMs); 3.4.3 Classification of Sounds; 3.5 Comparison of MPEG-7 Audio Spectrum Projection vs. MFCC Features; 3.6 Indexing and Similarity; 3.6.1 Audio Retrieval Using Histogram Sum of Squared Differences; 3.7 Simulation Results and Discussion; 3.7.1 Plots of MPEG-7 Audio Descriptors; 3.7.2 Parameter Selection; 3.7.3 Results for Distinguishing Between Speech, Music and Environmental Sound; 3.7.4 Results of Sound Classification Using Three Audio Taxonomy Methods; 3.7.5 Results for Speaker Recognition; 3.7.6 Results of Musical Instrument Classification; 3.7.7 Audio Retrieval Results; 3.8 Conclusions; References; 4 Spoken Content; 4.1 Introduction; 4.2 Automatic Speech Recognition; 4.2.1 Basic Principles; 4.2.2 Types of Speech Recognition Systems; 4.2.3 Recognition Results; 4.3 MPEG-7 SpokenContent Description; 4.3.1 General Structure; 4.3.2 SpokenContentHeader; 4.3.3 SpokenContentLattice; 4.4 Application: Spoken Document Retrieval; 4.4.1 Basic Principles of IR and SDR; 4.4.2 Vector Space Models; 4.4.3 Word-Based SDR; 4.4.4 Sub-Word-Based Vector Space Models |
Record no. | UNINA-9910877297003321 |
Available at: Univ. Federico II |
|