1.

Record No.

UNISA996464513503316

Title

Multimedia for accessible human computer interfaces / Troy McDaniel, Xueliang Liu, editors

Publication/distribution/printing

Cham, Switzerland : Springer, [2021]

©2021

ISBN

3-030-70716-4

Physical description

1 online resource (310 pages)

Discipline

006.7

Subjects

Multimedia systems

Language of publication

English

Format

Printed material

Bibliographic level

Monograph

Bibliography note

Includes bibliographical references.

Contents note

Intro -- Foreword -- Preface -- Contents -- About the Editors -- Part I Vision-Based Technologies for Accessible Human Computer Interfaces -- A Framework for Gaze-Contingent Interfaces -- 1 Introduction -- 1.1 Gaze-Contingent Interface -- 1.2 Eye Tracking and Gaze Detection -- 1.3 Gaze-Contingent Interface Based on Near Infrared Camera of Mobile Device -- 2 Methods -- 2.1 Framework -- 2.2 Calibration and Standard Acquisition -- 2.3 Determination of Sagittal Plane -- 2.4 Calculation of POG with Head Adjustment -- 2.5 Gaze Prediction by LSTM -- 2.6 Measurement of Head and Eye Movements -- 3 Use Cases -- 3.1 Use Case #1: Select an Option on Screen -- 3.2 Use Case #2: Auto Screen Scrolling -- 4 Future Work -- References -- Sign Language Recognition -- 1 Online Early-Late Fusion Based on Adaptive HMM for Sign Language Recognition -- 1.1 Introduction -- 1.2 Adaptive HMMs -- 1.3 Early-Late Fusion -- 1.3.1 Feature Selection -- 1.3.2 Query-Adaptive Weighting -- 1.3.3 Score Fusion -- 1.4 Experiments -- 1.4.1 Experiments Setup -- 1.4.2 Experiment with HMM-States Adaptation -- 1.4.3 Comparison on Different Fusion Steps -- 1.4.4 Comparison on Different Dataset Sizes -- 1.4.5 Comparison on Different SLR Models -- 2 Hierarchical LSTM for Sign Language Translation -- 2.1 Introduction -- 2.2 Online Key Clip Mining -- 2.3 Hierarchical LSTM Encoder -- 2.3.1 Hierarchical Encoder -- 2.3.2 Pooling Strategy -- 2.3.3 Attention-Based Weighting -- 2.4 Sentence Generation -- 2.5 Experiment -- 2.5.1 Experiment Setup -- 2.5.2 Model Validation --

2.5.3 Comparison to Existing Methods -- 3 Dense Temporal Convolution Network for Sign Language Translation -- 3.1 Introduction -- 3.2 DenseTCN -- 3.3 Sentence Learning -- 3.3.1 CTC Decoder -- 3.3.2 Score Fusion and Translation -- 3.4 Experiments -- 3.4.1 Datasets -- 3.4.2 Evaluation Metrics -- 3.4.3 Implementation Details.

3.4.4 Depth Discussion -- 3.4.5 Comparison -- 4 Joint Optimization for Translation and Sign Labeling -- 4.1 Introduction -- 4.2 Clip Feature Learning in Videos -- 4.3 Joint Loss Optimization -- 4.3.1 CTC Loss for CTTR Module -- 4.3.2 Cross-Entropy Loss for FCLS Module -- 4.3.3 Triplet Loss for FCOR Module -- 4.4 Experiment -- 4.4.1 Experiment Setup -- 4.4.2 Model Validation -- 4.4.3 Main Comparison -- References -- Fusion-Based Image Enhancement and Its Applications in Mobile Devices -- 1 Introduction -- 2 Related Works -- 3 Fusion-Based Enhancement Models -- 3.1 A General Framework of Linear Fusion -- 3.2 Naturalness-Preserving Low-Light Enhancement -- 3.3 Mixed Pencil Drawing Generation -- 4 Applications in Mobile Devices -- 4.1 FFT Acceleration -- 4.2 Interactive Segmentation -- 4.3 Experimental Results -- 5 Conclusion and Discussion -- References -- Open-Domain Textual Question Answering Systems -- 1 Introduction -- 2 Overview of Open-Domain Question Answering Systems -- 3 Paragraph Ranking -- 3.1 Multi-Level Fused Sequence Matching Model -- 3.1.1 Multi-Level Fused Encoding -- 3.1.2 Attention Model with Alignment and Comparison -- 3.1.3 Aggregation and Prediction -- 3.2 Evaluation -- 4 Candidate Answer Extraction -- 4.1 Dynamic Semantic Discard Reader -- 4.1.1 Feature Encoding -- 4.1.2 Attention Matching -- 4.1.3 Dynamic Discard -- 4.1.4 Information Aggregation -- 4.1.5 Prediction -- 4.2 Reinforced Mnemonic Reader -- 4.2.1 RC with Reattention -- 4.2.2 Dynamic-Critical Reinforcement Learning -- 4.2.3 End-to-End Architecture -- 4.3 Read and Verify System -- 4.3.1 Reader with Auxiliary Losses -- 4.3.2 Answer Verifier -- 4.4 Evaluations -- 5 Answer Selection Module -- 5.1 RE3QA: Retrieve, Read and Rerank -- 5.1.1 Answer Reranker -- 5.1.2 End-to-End Training -- 5.2 Multi-Type Multi-Span Network for DROP -- 5.2.1 Multi-Type Answer Predictor.

5.2.2 Multi-Span Extraction -- 5.2.3 Arithmetic Expression Reranking -- 5.3 Evaluations -- 6 Conclusion -- References -- Part II Auditory Technologies for Accessible Human Computer Interfaces -- Speech Recognition for Individuals with Voice Disorders -- 1 Motivation and Introduction -- 1.1 Voice Interaction Is Here to Stay -- 1.2 Accessibility Considerations in Voice Interaction -- 2 Definitions and Concepts -- 3 A Brief Introduction to Phonetics and Acoustics -- 3.1 Speech Production -- 3.1.1 Production of Disordered Speech -- 3.2 Speech Perception -- 3.3 Speech Parameterization Methods -- 3.4 Markers of Disordered Speech -- 4 Automatic Speech Recognition Overview -- 4.1 Characterization of ASR Systems -- 4.1.1 Speaker Dependence -- 4.1.2 Continuity -- 4.1.3 Vocabulary Size -- 4.2 Nomenclature of Disordered Speech Recognition -- 4.3 The Ideal System -- 4.4 Levels of Difficulty in ASR Tasks -- 4.4.1 Level 1 ASR -- 4.4.2 Level 2 ASR -- 4.4.3 Level 3 ASR -- 4.4.4 Level 4 ASR -- 5 A Level by Level Guide of ASR Modeling Approaches -- 5.1 Level 1 ASR: Clear and Clean Speech Recognition -- 5.1.1 Multimodels -- 5.1.2 End-to-End Models -- 5.2 Level 2 ASR: Noisy but Clear Speech Recognition -- 5.2.1 Data Augmentation -- 5.2.2 Transfer Learning -- 5.2.3 Multimodal ASR -- 5.3 Level 3 ASR: Clean but Unclear Speech Recognition -- 5.3.1 Data Augmentation -- 5.3.2 Multimodal Techniques -- 5.3.3 Voice Conversion and Speaker Normalization -- 5.4 Level 4 ASR -- 6 Disordered Speech Datasets -- 6.1 Acoustic Datasets -- 6.1.1 Dysarthric Speech Dataset for Universal Access (UASPEECH) -- 6.1.2 The TORGO Database -- 6.1.3 The Nemours Database of Dysarthric Speech -- 6.1.4 The HomeService Corpus -- 6.1.5 UncommonVoice -- 6.1.6 Parkinson's Disorder Speech Dataset -- 7 Utility and Applications of Disorder-Robust ASR -- 7.1 Clinical Metrics -- 7.2 Voice Assistive Technologies.

7.3 Improvement of Everyday Voice Interactions -- 8 Conclusions -- References -- Socially Assistive Robots for Storytelling and Other Activities to Support Aging in Place -- 1 Introduction -- 2 Technology to Assist with Aging in Place -- 2.1 Smart Homes and Safety -- 2.2 Technologies to Encourage Fitness -- 3 Technology for Communication -- 3.1 Robotic Pets -- 4 Socially Assistive Robots -- 4.1 Existing Technologies -- 4.2 How Robots Can Address Isolation -- References -- Part III Haptic Technologies for Accessible Human Computer Interfaces -- Accessible Smart Coaching Technologies Inspired by Elderly Requisites -- 1 Introduction -- 2 A Review of Accessible Technology in Healthcare -- 3 Novel Wearable Healthcare Technologies Using Pneumatic Gel Muscle (PGM) -- 3.1 Pneumatic Gel Muscle (PGM) -- 3.1.1 Overview -- 3.1.2 Force Characteristics -- 3.2 A Soft Exoskeleton Jacket for Remote Human Interaction -- 3.2.1 Motivation -- 3.2.2 System Description -- 3.2.3 Measurement of Force During Shoulder Abduction and Elbow Flexion -- 3.2.4 Latency Measurement -- 3.3 A Soft Wearable Balance Exercise Device -- 3.3.1 Motivation -- 3.3.2 System Description -- 3.3.3 Evaluation of the Prototype Through a Single-Leg Stance Test -- 3.4 Swing Support System Using Wireless Actuation of PGMs -- 3.4.1 Motivation -- 3.4.2 System Description -- 3.4.3 Evaluation of the Prototype Through Measurement of Various Lower Limb Parameters -- 4 Stealth Adaptive Exergame Design Framework -- 4.1 Fruit Slicing Exergame Design -- 4.2 Ski Squat Exergame Design -- 4.2.1 System Description of Ski Exergame -- 4.2.2 sEMG Measurement to Detect the Effect of PGM Based Muscle Loading -- 5 An IMU-Based Assessment of Brushed Body Area -- 5.1 Motivation -- 5.2 System Description -- 5.3 Calculation of the Contact Area Between Brush and Body Based on Distance Metrics.

5.4 Comparison of Predicted and Actual Contact Area -- 6 Conclusion -- References -- Haptic Mediators for Remote Interpersonal Communication -- 1 Introduction -- 2 Social Touch -- 2.1 Hugging -- 2.2 Handshaking -- 2.3 Patting, Tapping, and Stroking -- 2.4 Massaging -- 3 Non-verbal Communication -- 3.1 Facial Features and Emotions -- 3.2 Body Movements and Gestures -- 4 Verbal Communication -- 4.1 Emphasis, Attention and Turn Taking -- 4.2 Haptic Messaging -- 5 Conclusions and Future Directions -- References -- Part IV Multimodal Technologies for Accessible Human Computer Interfaces -- Human-Machine Interfaces for Socially Connected Devices: From Smart Households to Smart Cities -- 1 Introduction -- 1.1 Smart Community -- 1.2 Smart Home -- 1.3 Socially Connected Products -- 1.4 Gamification -- 1.4.1 Energy Adapted Octalysis Framework -- 2 Multisystem: Data Fusion -- 2.1 ANFIS: Adaptive Neuro-Fuzzy Inference Systems -- 2.2 Topology Proposed: Detection of Gamified Motivation at Home for Saving Energy -- 2.3 Input 1: Level of Energy Consumption -- 2.4 Input 2: Type of Environmental Home -- 2.5 Output: Gamified Motivation (Local Point of View) -- 2.5.1 Community Gamified motivation's Detection (Global Point of View) -- 3 Proposal -- 3.1 Input 1: Level of Energy Consumption -- 3.2 Input 2: Type of Environmental Home -- 3.3 Output: Gamified Motivation (Local Point of View) -- 4 Results -- 5 HMI to Improve the Quality of Life of Older People Using the Proposed Structure -- 6 From Citizen to Smart City: A Future Vision -- 6.1 Smart City Vision in a COVID-19 Context -- 7 Discussion -- 8 Conclusion -- References -- Enhancing Situational Awareness and Kinesthetic Assistance for Clinicians via Augmented-Reality and Haptic Shared-Control Technologies -- 1 Intraoperative Situational Awareness -- 1.1 Visual Guidance -- 1.2 Haptic Guidance.

1.3 Applications of Visual and Haptic Guidance in Surgery.