Title: Computer Vision – ACCV 2020 [electronic resource] : 15th Asian Conference on Computer Vision, Kyoto, Japan, November 30 – December 4, 2020, Revised Selected Papers, Part V / edited by Hiroshi Ishikawa, Cheng-Lin Liu, Tomas Pajdla, Jianbo Shi
Edition: 1st ed. 2021.
Publication: Cham : Springer International Publishing : Imprint: Springer, 2021.
Physical description: 1 online resource (XVIII, 706 p., 5 illus., 1 illus. in color)
Series: Image Processing, Computer Vision, Pattern Recognition, and Graphics ; 12626
Language: English
ISBN: 3-030-69541-7
Related ISBN: 3-030-69540-9
DOI: 10.1007/978-3-030-69541-5
Other system numbers: (CKB)4100000011781458; (MiAaPQ)EBC6501067; (DE-He213)978-3-030-69541-5; (PPN)25385881X; (EXLCZ)994100000011781458
Dewey classification: 006.37

Contents:

Face, Pose, Action, and Gesture
  Video-Based Crowd Counting Using a Multi-Scale Optical Flow Pyramid Network
  RealSmileNet: A Deep End-To-End Network for Spontaneous and Posed Smile Recognition
  Decoupled Spatial-Temporal Attention Network for Skeleton-Based Action-Gesture Recognition
  Unpaired Multimodal Facial Expression Recognition
  Gaussian Vector: An Efficient Solution for Facial Landmark Detection
  A Global to Local Double Embedding Method for Multi-person Pose Estimation
  Semi-supervised Facial Action Unit Intensity Estimation with Contrastive Learning
  MMD based Discriminative Learning for Face Forgery Detection
  RE-Net: A Relation Embedded Deep Model for AU Occurrence and Intensity Estimation
  Learning 3D Face Reconstruction with a Pose Guidance Network
  Self-Supervised Multi-View Synchronization Learning for 3D Pose Estimation
  Faster, Better and More Detailed: 3D Face Reconstruction with Graph Convolutional Networks
  Localin Reshuffle Net: Toward Naturally and Efficiently Facial Image Blending
  Rotation Axis Focused Attention Network (RAFA-Net) for Estimating Head Pose
  Unified Application of Style Transfer for Face Swapping and Reenactment
  Multiple Exemplars-based Hallucination for Face Super-resolution and Editing
  Imbalance Robust Softmax for Deep Embedding Learning
  Domain Adaptation Gaze Estimation by Embedding with Prediction Consistency
  Speech2Video Synthesis with 3D Skeleton Regularization and Expressive Body Poses
  3D Human Motion Estimation via Motion Compression and Refinement
  Spatial Temporal Attention Graph Convolutional Networks with Mechanics-Stream for Skeleton-based Action Recognition
  DiscFace: Minimum Discrepancy Learning for Deep Face Recognition
  Uncertainty Estimation and Sample Selection for Crowd Counting
  Multi-Task Learning for Simultaneous Video Generation and Remote Photoplethysmography Estimation

Video Analysis and Event Recognition
  Interpreting Video Features: A Comparison of 3D Convolutional Networks and Convolutional LSTM Networks
  Encode the Unseen: Predictive Video Hashing for Scalable Mid-Stream Retrieval
  Active Learning for Video Description With Cluster-Regularized Ensemble Ranking
  Condensed Movies: Story Based Retrieval with Contextual Embeddings
  Play Fair: Frame Contributions in Video Models
  Transforming Multi-Concept Attention into Video Summarization
  Learning to Adapt to Unseen Abnormal Activities under Weak Supervision
  TSI: Temporal Scale Invariant Network for Action Proposal Generation
  Discovering Multi-Label Actor-Action Association in a Weakly Supervised Setting
  Reweighted Non-convex Non-smooth Rank Minimization based Spectral Clustering on Grassmann Manifold

Biomedical Image Analysis
  Descriptor-Free Multi-View Region Matching for Instance-Wise 3D Reconstruction
  Hierarchical X-Ray Report Generation via Pathology tags and Multi Head Attention
  Self-Guided Multiple Instance Learning for Weakly Supervised Thoracic Disease Classification and Localization in Chest Radiographs
  MBNet: A Multi-Task Deep Neural Network for Semantic Segmentation and Lumbar Vertebra Inspection on X-ray Images
  Attention-Based Fine-Grained Classification of Bone Marrow Cells
  Learning Multi-Instance Sub-pixel Point Localization
  Utilizing Transfer Learning and a Customized Loss Function for Optic Disc Segmentation from Retinal Images

Summary: The six-volume set LNCS 12622-12627 constitutes the proceedings of the 15th Asian Conference on Computer Vision, ACCV 2020, held in Kyoto, Japan, in November/December 2020.* A total of 254 contributions were carefully reviewed and selected from 768 submissions during two rounds of reviewing and improvement. The papers focus on the following topics:
  Part I: 3D computer vision; segmentation and grouping
  Part II: low-level vision, image processing; motion and tracking
  Part III: recognition and detection; optimization, statistical methods, and learning; robot vision
  Part IV: deep learning for computer vision, generative models for computer vision
  Part V: face, pose, action, and gesture; video analysis and event recognition; biomedical image analysis
  Part VI: applications of computer vision; vision for X; datasets and performance analysis
*The conference was held virtually.

Subjects:
  Computer vision
  Artificial intelligence
  Computer engineering
  Computer networks
  Pattern recognition systems
  Application software
  Computer Engineering and Networks
  Automated Pattern Recognition
  Computer and Information Systems Applications

Editors (relator: http://id.loc.gov/vocabulary/relators/edt):
  Ishikawa, Hiroshi
  Liu, Cheng-Lin
  Pajdla, Tomas
  Shi, Jianbo

Record type: BOOK
Record ID: 996464410903316
Local control data: Computer vision-ACCV 2020; 1890247; UNISA