Camera-Based Document Analysis and Recognition [electronic resource] : 5th International Workshop, CBDAR 2013, Washington, DC, USA, August 23, 2013, Revised Selected Papers / edited by Masakazu Iwamura, Faisal Shafait |
Edition | [1st ed. 2014.] |
Publication/distribution | Cham : Springer International Publishing : Imprint: Springer, 2014 |
Physical description | 1 online resource (VIII, 187 p. 117 illus.) |
Discipline | 621.367 |
Series | Image Processing, Computer Vision, Pattern Recognition, and Graphics |
Topical subject | Optical data processing ; Pattern recognition ; Data mining ; Natural language processing (Computer science) ; Application software ; User interfaces (Computer systems) ; Image Processing and Computer Vision ; Pattern Recognition ; Data Mining and Knowledge Discovery ; Natural Language Processing (NLP) ; Information Systems Applications (incl. Internet) ; User Interfaces and Human Computer Interaction |
ISBN | 3-319-05167-9 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note | Spatially Prioritized and Persistent Text Detection and Decoding -- A Hierarchical Visual Saliency Model for Character Detection in Natural Scenes -- A Robust Approach to Extraction of Texts from Camera Captured Images -- Scene Text Detection via Integrated Discrimination of Component Appearance and Consensus -- Accuracy Improvement of Viewpoint-Free Scene Character Recognition by Rotation Angle Estimation -- Sign Detection Based Text Localization in Mobile Device Captured Scene Images -- Font Distribution Observation by Network-Based Analysis -- Book Page Spreads Captured with a Mobile Phone Camera -- A Dataset for Quality Assessment of Camera Captured Document Images. |
Record no. | UNISA-996203275403316 |
Available at: Univ. di Salerno | ||
|
Camera-Based Document Analysis and Recognition : 5th International Workshop, CBDAR 2013, Washington, DC, USA, August 23, 2013, Revised Selected Papers / edited by Masakazu Iwamura, Faisal Shafait |
Edition | [1st ed. 2014.] |
Publication/distribution | Cham : Springer International Publishing : Imprint: Springer, 2014 |
Physical description | 1 online resource (VIII, 187 p. 117 illus.) |
Discipline | 621.367 |
Series | Image Processing, Computer Vision, Pattern Recognition, and Graphics |
Topical subject | Optical data processing ; Pattern recognition ; Data mining ; Natural language processing (Computer science) ; Application software ; User interfaces (Computer systems) ; Image Processing and Computer Vision ; Pattern Recognition ; Data Mining and Knowledge Discovery ; Natural Language Processing (NLP) ; Information Systems Applications (incl. Internet) ; User Interfaces and Human Computer Interaction |
ISBN | 3-319-05167-9 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note | Spatially Prioritized and Persistent Text Detection and Decoding -- A Hierarchical Visual Saliency Model for Character Detection in Natural Scenes -- A Robust Approach to Extraction of Texts from Camera Captured Images -- Scene Text Detection via Integrated Discrimination of Component Appearance and Consensus -- Accuracy Improvement of Viewpoint-Free Scene Character Recognition by Rotation Angle Estimation -- Sign Detection Based Text Localization in Mobile Device Captured Scene Images -- Font Distribution Observation by Network-Based Analysis -- Book Page Spreads Captured with a Mobile Phone Camera -- A Dataset for Quality Assessment of Camera Captured Document Images. |
Record no. | UNINA-9910484698103321 |
Available at: Univ. Federico II | ||
|
Camera-Based Document Analysis and Recognition [electronic resource] : 4th International Workshop, CBDAR 2011, Beijing, China, September 22, 2011, Revised Selected Papers / edited by Masakazu Iwamura, Faisal Shafait |
Edition | [1st ed. 2012.] |
Publication/distribution | Berlin, Heidelberg : Springer Berlin Heidelberg : Imprint: Springer, 2012 |
Physical description | 1 online resource (VIII, 173 p. 93 illus.) |
Discipline | 006.6 ; 006.37 |
Series | Image Processing, Computer Vision, Pattern Recognition, and Graphics |
Topical subject | Optical data processing ; Pattern recognition ; Data mining ; Natural language processing (Computer science) ; Application software ; User interfaces (Computer systems) ; Image Processing and Computer Vision ; Pattern Recognition ; Data Mining and Knowledge Discovery ; Natural Language Processing (NLP) ; Information Systems Applications (incl. Internet) ; User Interfaces and Human Computer Interaction |
ISBN | 3-642-29364-6 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note | Multi-script and multi-oriented text localization from scene images / Thotreingam Kasar, Angarai G. Ramakrishnan -- Assistive text reading from complex background for blind persons / Chucai Yi, Yingli Tian. |
Record no. | UNISA-996465975603316 |
Available at: Univ. di Salerno | ||
|
Computational Forensics [electronic resource] : 5th International Workshop, IWCF 2012, Tsukuba, Japan, November 11, 2012 and 6th International Workshop, IWCF 2014, Stockholm, Sweden, August 24, 2014, Revised Selected Papers / edited by Utpal Garain, Faisal Shafait |
Edition | [1st ed. 2015.] |
Publication/distribution | Cham : Springer International Publishing : Imprint: Springer, 2015 |
Physical description | 1 online resource (X, 213 p. 104 illus.) |
Discipline | 363.250285 |
Series | Image Processing, Computer Vision, Pattern Recognition, and Graphics |
Topical subject | Pattern recognition ; Optical data processing ; Artificial intelligence ; Natural language processing (Computer science) ; Pattern Recognition ; Image Processing and Computer Vision ; Artificial Intelligence ; Natural Language Processing (NLP) |
ISBN | 3-319-20125-5 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note | Biometrics -- Document image inspection -- Applications. |
Record no. | UNISA-996198518503316 |
Available at: Univ. di Salerno | ||
|
Computational Forensics : 5th International Workshop, IWCF 2012, Tsukuba, Japan, November 11, 2012 and 6th International Workshop, IWCF 2014, Stockholm, Sweden, August 24, 2014, Revised Selected Papers / edited by Utpal Garain, Faisal Shafait |
Edition | [1st ed. 2015.] |
Publication/distribution | Cham : Springer International Publishing : Imprint: Springer, 2015 |
Physical description | 1 online resource (X, 213 p. 104 illus.) |
Discipline | 363.250285 |
Series | Image Processing, Computer Vision, Pattern Recognition, and Graphics |
Topical subject | Pattern recognition ; Optical data processing ; Artificial intelligence ; Natural language processing (Computer science) ; Pattern Recognition ; Image Processing and Computer Vision ; Artificial Intelligence ; Natural Language Processing (NLP) |
ISBN | 3-319-20125-5 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note | Biometrics -- Document image inspection -- Applications. |
Record no. | UNINA-9910483374003321 |
Available at: Univ. Federico II | ||
|
Frontiers in handwriting recognition : 18th international conference, ICFHR 2022, Hyderabad, India, December 4-7, 2022, proceedings / edited by Utkarsh Porwal, Alicia Fornés, and Faisal Shafait |
Publication/distribution | Cham, Switzerland : Springer, [2022] |
Physical description | 1 online resource (567 pages) |
Discipline | 006.424 |
Series | Lecture Notes in Computer Science |
Topical subject | Optical character recognition |
ISBN | 3-031-21648-2 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note |
Intro -- Preface -- Organization -- Contents -- Historical Document Processing -- A Few Shot Multi-representation Approach for N-Gram Spotting in Historical Manuscripts -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 The Base Architecture -- 3.2 The Multi-modal Architecture -- 3.3 Multi-modal Architecture with Early Fusion -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Evaluation Metrics -- 4.3 Results and Discussion -- 5 Conclusion -- References -- Text Edges Guided Network for Historical Document Super Resolution -- 1 Introduction -- 2 Related Work -- 3 Dataset -- 4 Method -- 4.1 Model Framework -- 4.2 Objective Function -- 5 Experiment -- 5.1 Data Preparation -- 5.2 Hyperparameters Tuning Using Grid Search -- 5.3 Super-Resolution Evaluation -- 6 Conclusion -- References -- CurT: End-to-End Text Line Detection in Historical Documents with Transformers -- 1 Introduction -- 2 Related Work -- 2.1 Transformers for Computer Vision -- 2.2 DETR and Variants -- 2.3 Text Baseline Detection -- 3 Contribution -- 4 The CurT Model -- 4.1 Text Line Data Model -- 4.2 Curve Detection Set Prediction Loss -- 4.3 CurT Architecture -- 5 Experiments -- 5.1 Dataset and Evaluation Protocol -- 5.2 Implementation Details -- 5.3 Overall Performance -- 5.4 Ordered Prediction -- 5.5 Further Extensions -- 6 Conclusion -- References -- Date Recognition in Historical Parish Records -- 1 Introduction -- 2 Data -- 3 Date Recognition -- 4 Experiments -- 4.1 Data Splits -- 4.2 Segmentation -- 4.3 Models -- 4.4 Evaluation Metrics -- 5 Results and Analysis -- 6 Related Work -- 7 Future Work -- 8 Conclusion -- References -- Improving Isolated Glyph Classification Task for Palm Leaf Manuscripts -- 1 Introduction -- 2 Palm Leaf Manuscripts from Southeast Asia -- 2.1 Corpus and Languages -- 2.2 Challenges of Isolated Glyph Datasets -- 3 Overall Frameworks.
3.1 Data Pattern Generations -- 3.2 Image Enhancement for Palm Leaf Manuscripts (IEPalm) -- 3.3 Training CNNs and ViTs -- 4 Experimental Setups and Results -- 4.1 Implementation Settings -- 4.2 Results -- 5 Conclusion -- References -- Signature Verification and Writer Identification -- Impact of Type of Convolution Operation on Performance of Convolutional Neural Networks for Online Signature Verification -- 1 Introduction -- 2 Related Work -- 3 Proposed OSV Framework -- 3.1 Input Representation, Type of Convolution and Order of Convolution -- 3.2 Analyzing the Impact of Signature Length -- 3.3 Further Improvement of Input Representation -- 4 Comparison with SOTA Methods -- 5 Conclusion and Future Work -- References -- COMPOSV++: Light Weight Online Signature Verification Framework Through Compound Feature Extraction and Few-Shot Learning -- 1 Introduction -- 2 Literature Survey -- 3 Proposed Online Signature Verification Framework -- 3.1 Proposed Novel Dimensionality Reduction Algorithm -- 3.2 Proposed Separable Convolution Operation Based OSV Framework: -- 4 Experimentation Analysis and Results -- 5 Conclusion and Future Work -- References -- Finger-Touch Direction Feature Using a Frequency Distribution in the Writer Verification Base on Finger-Writing of a Simple Symbol -- 1 Introduction -- 2 Writer Verification Based on Finger-Writing of a Simple Symbol -- 3 Introduction of Finger-Touching Direction Feature -- 3.1 Finger-Touching Direction -- 3.2 Evaluation of Verification Performance -- 3.3 Considerations -- 4 Introduction of Preprocessing -- 5 Frequency Distribution as a New Feature -- 6 Conclusions -- References -- Self-supervised Vision Transformers with Data Augmentation Strategies Using Morphological Operations for Writer Retrieval -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Preprocessing -- 3.2 Vision Transformer. 
3.3 Morphological Operations -- 3.4 Self-supervised Training -- 3.5 Page Descriptor and Retrieval -- 4 Experiments -- 4.1 Historical-WI Dataset -- 4.2 Evaluation -- 4.3 Results -- 5 Conclusion -- References -- EAU-Net: A New Edge-Attention Based U-Net for Nationality Identification -- 1 Introduction -- 2 Related Work -- 3 Proposed Model -- 3.1 Edge-Attention Based U-Net for Edge Detection -- 3.2 Nationality/Ethnicity Identification -- 4 Experimental Results -- 4.1 Ablation Study -- 4.2 Experiments on Edge Detection -- 4.3 Experiments on Classification of Nationality -- 4.4 Gender Classification -- 4.5 Error Analysis -- 5 Conclusion and Future Work -- References -- Progressive Multitask Learning Network for Online Chinese Signature Segmentation and Recognition -- 1 Introduction -- 2 Methodology -- 2.1 Overview -- 2.2 Dual Channel Stroke Feature Extraction Block (DSF-Block) -- 2.3 Stacked Transformer Encoder Block (STE-Block) -- 2.4 Progressive Multitask Interaction Block (PMI-Block) -- 2.5 Training Objective -- 3 Experiments -- 3.1 Database -- 3.2 Evaluation Metrics -- 3.3 Implementation Details -- 3.4 Qualitative Results -- 3.5 Quantitative Results -- 3.6 Ablation Studies -- 4 Conclusion -- References -- Symbol and Graphics Recognition -- Musigraph: Optical Music Recognition Through Object Detection and Graph Neural Network -- 1 Introduction -- 2 Related Work -- 2.1 Optical Music Recognition (OMR) -- 2.2 Graph Neural Network (GNN) -- 3 The Musigraph Model -- 3.1 Object Detector -- 3.2 Graph Neural Network -- 4 Dataset -- 5 Experimental Validation -- 5.1 Object Detection Results -- 5.2 Graph Neural Network Results -- 6 Conclusions and Future Work -- References -- Combining CNN and Transformer as Encoder to Improve End-to-End Handwritten Mathematical Expression Recognition Accuracy -- 1 Introduction -- 2 Methodology -- 2.1 Baseline System. 
2.2 Tandem Approach -- 2.3 Parallel Approach -- 2.4 Mixing Approach -- 3 Experimental Result -- 3.1 Experimental Setup -- 3.2 Overall Results -- 3.3 Effects of Number of Transformer Encoder Layers to Tandem Approach -- 3.4 Effects of Number of Transformer Encoder Layers to Parallel Approach -- 3.5 Effects of Number of Attention Heads to Mixing Approach -- 4 Conclusion -- References -- A Vision Transformer Based Scene Text Recognizer with Multi-grained Encoding and Decoding -- 1 Introduction -- 2 Related Works -- 2.1 Scene Text Recognition -- 2.2 Vision Transformer -- 2.3 Self-supervised Learning -- 3 Method -- 3.1 Pipeline -- 3.2 Two-Stage Encoder -- 3.3 Joint Decoder -- 3.4 MAE with Focusing Mechanism -- 3.5 Objective Functions and Training Strategies -- 4 Experiments -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Comparisons with State-of-the-Arts -- 4.4 Ablation Studies -- 4.5 Experiments on Occlusion Scene Text -- 5 Conclusions -- References -- Spatial Attention and Syntax Rule Enhanced Tree Decoder for Offline Handwritten Mathematical Expression Recognition -- 1 Introduction -- 2 Related Works -- 3 Proposed Method -- 3.1 Child Node Prediction Module -- 3.2 Spatial Attention-Based Parent Node Prediction Module -- 3.3 Syntax Rule-Based Relation Prediction Module -- 3.4 Total Loss -- 4 Experiments -- 4.1 Dataset -- 4.2 Implementation Details -- 4.3 Ablation Experiment -- 4.4 Performance Comparison -- 5 Conclusion -- References -- Handwriting Recognition and Understanding -- FPRNet: End-to-End Full-Page Recognition Model for Handwritten Chinese Essay -- 1 Introduction -- 2 Related Works -- 2.1 Segmentation-Based Approaches -- 2.2 Segmentation-Free Approaches -- 3 Architecture -- 3.1 Encoder -- 3.2 Decoder -- 3.3 Order-Align Strategy -- 4 Experiments and Results -- 4.1 Dataset -- 4.2 Experimental Setup -- 4.3 Experimental Results. 
5 Conclusion -- References -- Active Transfer Learning for Handwriting Recognition -- 1 Introduction -- 2 Related Work -- 2.1 Transfer Learning -- 2.2 Active Learning -- 2.3 Active Transfer Learning -- 3 Methodology -- 3.1 Model Weights Initialization -- 3.2 Active Learning Sample Selection -- 3.3 Supervised Training -- 3.4 Model Evaluation -- 4 Results -- 4.1 Methods Comparison -- 4.2 Incremental Iterative Training -- 4.3 Selection of Pre-trained Model Weights -- 5 Conclusion -- References -- Recognition-Free Question Answering on Handwritten Document Collections -- 1 Introduction -- 2 Related Work -- 2.1 Document Retrieval -- 2.2 Question Answering -- 3 Method -- 3.1 Query and Document Representation -- 3.2 Retrieval -- 3.3 Question Answering -- 4 Experiments -- 4.1 Dataset -- 4.2 Implementation Details -- 4.3 Results -- 5 Conclusions -- References -- Handwriting Recognition and Automatic Scoring for Descriptive Answers in Japanese Language Tests -- 1 Introduction -- 2 Related Works -- 3 Handwritten Japanese Answer Dataset -- 3.1 Handwritten Text-Line Segmentation -- 3.2 Splitting and Labeling Samples -- 3.3 Statistics -- 4 Handwritten Answer Recognition and Automatic Scoring -- 4.1 Handwritten Answer Recognition -- 4.2 Automatic Scoring -- 5 Experiment Results -- 5.1 Performance of Recognition Model -- 5.2 Performance of Automatic Scoring Model -- 6 Conclusions -- References -- A Weighted Combination of Semantic and Syntactic Word Image Representations -- 1 Introduction -- 2 Related Work -- 2.1 Traditional Word Spotting -- 2.2 Semantic Word Spotting -- 2.3 Word Embeddings -- 3 Method -- 3.1 Word Image Representation -- 3.2 Weighted Combination Approaches -- 3.3 Normalization -- 4 Experiments -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Evaluation Protocol -- 4.4 Normalization -- 4.5 Results -- 5 Conclusions -- References -- Combining Self-training and Minimal Annotations for Handwritten Word Recognition. |
Record no. | UNISA-996500061903316 |
Available at: Univ. di Salerno | ||
|
Frontiers in handwriting recognition : 18th international conference, ICFHR 2022, Hyderabad, India, December 4-7, 2022, proceedings / edited by Utkarsh Porwal, Alicia Fornés, and Faisal Shafait |
Publication/distribution | Cham, Switzerland : Springer, [2022] |
Physical description | 1 online resource (567 pages) |
Discipline | 006.424 |
Series | Lecture Notes in Computer Science |
Topical subject | Optical character recognition |
ISBN | 3-031-21648-2 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Record no. | UNINA-9910632468703321 |
Available at: Univ. Federico II | ||
|