
Record no.

UNISA996465757103316

Title

Natural Language Understanding and Intelligent Applications [electronic resource] : 5th CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2016, and 24th International Conference on Computer Processing of Oriental Languages, ICCPOL 2016, Kunming, China, December 2–6, 2016, Proceedings / edited by Chin-Yew Lin, Nianwen Xue, Dongyan Zhao, Xuanjing Huang, Yansong Feng

Publication/distribution

Cham : Springer International Publishing : Imprint: Springer, 2016

ISBN

3-319-50496-7

Edition

[1st ed. 2016.]

Physical description

1 online resource (XXII, 952 p. 377 illus.)

Series

Lecture Notes in Artificial Intelligence ; 10102

Discipline

004

Subjects

Natural language processing (Computer science)

Artificial intelligence

Information storage and retrieval

Application software

Natural Language Processing (NLP)

Artificial Intelligence

Information Storage and Retrieval

Information Systems Applications (incl. Internet)

Computer Appl. in Administrative Data Processing

Language of publication

English

Format

Printed material

Bibliographic level

Monograph

Table of contents

Intro -- Message from the Program Committee Co-chairs -- Organization -- Contents -- Fundamentals on Language Computing -- Integrating Structural Context with Local Context for Disambiguating Word Senses -- Abstract -- 1 Introduction -- 2 Proposed Approach -- 2.1 Generate Permuted-Lexicon-Sequence -- 2.2 Proposed Model -- 3 Evaluation -- 3.1 Data Sets -- 3.2 Experiments -- 4 Related Work -- 5 Conclusion -- Acknowledgements -- References -- Tibetan Multi-word Expressions Identification Framework Based on News Corpora -- Abstract -- 1 Introduction -- 2 Related Work -- 3 Brief Description of
Tibetan MWE Identification Framework -- 4 Tibetan MWE Identification Based on the Combination of Context Analysis and Language Model-Based Analysis -- 4.1 Context Analysis -- 4.2 Two-Word Coupling Degree -- 4.3 Tibetan Syllable Inside Word Probability -- 5 Experiments -- 5.1 Experimental Data -- 5.2 Evaluation -- 5.2.1 Evaluation for Different Strategies in Identifying Framework -- 5.2.2 Evaluation for the Effect of Context Analysis Granularity -- 5.2.3 Evaluation on Large Corpus -- 6 Conclusion -- Acknowledgements -- References -- Building Powerful Dependency Parsers for Resource-Poor Languages -- 1 Introduction -- 2 Our Approach -- 2.1 Data Preprocessing -- 2.2 Projecting Dependencies and POS Tags -- 2.3 CRF-Based POS Tagging Model -- 2.4 Graph-Based Dependency Parsing Model -- 3 Enhancing the Parsers -- 3.1 Subtree Based Features -- 3.2 Word-Cluster Based Features -- 4 Experiments -- 4.1 Data Sets -- 4.2 Results on POS Tagging -- 4.3 Results on Parsing -- 5 Related Work -- 6 Conclusions -- References -- Bidirectional Long Short-Term Memory with Gated Relevance Network for Paraphrase Identification -- 1 Introduction -- 2 Related Works -- 3 Methodology -- 3.1 Embedding Layer -- 3.2 Sentence Modeling with Bi-LSTM -- 3.3 Gated Relevance Network.

3.4 Max-Pooling Layer and MLP -- 3.5 Model Training -- 4 Experiments -- 4.1 Dataset and Evaluation Metrics -- 4.2 Parameter Settings -- 4.3 Baselines -- 4.4 Results of Comparison Experiments -- 5 Conclusion -- References -- Syntactic Categorization and Semantic Interpretation of Chinese Nominal Compounds -- Abstract -- 1 Introduction -- 2 Related Literature -- 3 Syntactic Categorization of Nominal Compounds in Chinese -- 3.1 Basic Rules -- 3.2 Context-Based Rules -- 3.3 Rules of Named Entities -- 3.4 Rules for Syntactic Categorization -- 3.5 Syntactic Categorization Experiments -- 4 Automatic Semantic Interpretation of Head-Modifier Nominal Compounds -- 4.1 Description of the System -- 4.2 Resources and Similarity Computation -- 4.3 Noun Matching -- 4.4 Acquisition of Semantic Interpretation Templates -- 4.5 Experiments of Automatic Semantic Interpretation -- 5 Application in Syntactic Parsing and Machine Translation -- 5.1 Correction in Syntactic Parsing -- 5.2 Application in Machine Translation -- 6 Conclusions -- Acknowledgement -- References -- TDSS: A New Word Sense Representation Framework for Information Retrieval -- 1 Introduction -- 2 Related Work -- 3 A New Word Sense Representation Framework -- 4 TDSS Sense Extraction -- 4.1 Explanation Words and Context Extraction -- 4.2 Sense Graph Construction -- 4.3 Sense Generation and Weighting -- 5 Experiments -- 5.1 Evaluating Explanation Word Extraction -- 5.2 Evaluating Word Sense Generation -- 5.3 Case Study: Query Rewriting -- 6 Conclusions -- References -- A Word Vector Representation Based Method for New Words Discovery in Massive Text -- Abstract -- 1 Related Work -- 2 The New Word Discovery Method Based on Word Vector Pruning -- 2.1 Data Preprocessing -- 2.2 Word Vector Representation and Training -- 2.3 Mining n-Gram Word String -- 2.4 Pruning Based on Word Vector -- 3 Experiment Results.

3.1 Data Sets and Experimental Settings -- 3.2 The Result of New Word Detection -- 3.3 Comparative Analysis -- 3.4 Different Vector Similarity Measure Pruning Comparison -- 4 Conclusions and Future Work -- Acknowledgments -- References -- Machine Translation and Multi-lingual Information Access -- Better Addressing Word Deletion for Statistical Machine Translation -- 1 Introduction -- 2 The Proposed Approach -- 2.1 Undesired WD Classification -- 2.2 Undesired WD Model -- 2.3 Integration into SMT Decoder -- 3 Evaluation Metric - Recall of WD -- 3.1 Unigram Recall -- 4 Evaluation -- 4.1 Experiment
Setup -- 4.2 Corpus -- 4.3 Results -- 4.4 Recall of WD vs Human Evaluation -- 5 Related Work -- 6 Conclusion and Future Work -- References -- A Simple, Straightforward and Effective Model for Joint Bilingual Terms Detection and Word Alignment in SMT -- 1 Introduction -- 2 Related Work -- 3 The Proposed Joint Model -- 3.1 The Framework for Jointly Detecting Bilingual Term Pairs and Aligning Words -- 3.2 The Joint Model -- 3.3 Derivation Details -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Results and Analysis -- 5 Conclusion -- References -- Bilingual Parallel Active Learning Between Chinese and English -- Abstract -- 1 Introduction -- 2 Related Work -- 3 Corpus Annotation -- 3.1 Corpus Selection -- 3.2 Annotation of Chinese Corpus -- 3.3 Mapping to English Corpus -- 3.4 Manual Adjustment -- 3.5 Alignment Statistics -- 4 Bilingual Parallel Active Learning -- 4.1 Problem Definition -- 4.2 BPAL Algorithm -- 5 Experiments -- 5.1 Corpora -- 5.2 Experimental Methods -- 5.3 Features for Relation Classification -- 5.4 Evaluation Metrics -- 5.5 Experimental Results -- 6 Conclusion -- Acknowledgement -- References -- Study on the English Corresponding Unit of Chinese Clause -- Abstract -- 1 Chinese-to-English Clause-Aligned Parallel Corpus.

2 ECUCC Grammatically Annotated Corpus -- 2.1 Grammatical Analytic Principles of ECUCC -- 2.2 Grammatical Analytic System of ECUCC -- 3 Classification and Statistical Analysis of ECUCC -- 3.1 Sentences and Clauses -- 3.2 Major Clauses and Subordinate Clauses -- 3.3 Functions of Subordinate Clauses: Adverbial and Attributive -- 3.4 Structures of Subordinate Clauses: Restrictive Relative and Non-defining -- 3.5 Simple Clauses and Coordinate Clauses -- 3.6 General Analysis -- 4 Conclusion and Further Research -- Acknowledgments -- References -- Research for Uyghur-Chinese Neural Machine Translation -- Abstract -- 1 Introduction -- 2 Related Work -- 3 Model -- 3.1 Pre-process -- 3.2 Pointer-NMT Model -- 3.3 Post-process -- 4 Experiment -- 4.1 Experiment Set -- 4.2 Results of Experiment -- 5 Conclusion -- Acknowledgements -- References -- MaxSD: A Neural Machine Translation Evaluation Metric Optimized by Maximizing Similarity Distance -- 1 Introduction -- 2 Learning Task -- 3 MaxSD Model: Maximizing Similarity Distance Model -- 3.1 MaxSD Model -- 3.2 Bi-LSTM and BiC-LSTM Networks -- 4 Experiments and Results -- 4.1 Datasets -- 4.2 Setups -- 4.3 Results -- 5 Conclusion -- References -- Automatic Long Sentence Segmentation for Neural Machine Translation -- 1 Introduction -- 2 Related Work -- 3 Neural Machine Translation -- 4 The Segmentation Method -- 4.1 The Split Model -- 4.2 The Reordering Model -- 4.3 Joint Model: Combining the Two Submodels -- 5 Experiment -- 5.1 Setup -- 5.2 The Split Model -- 5.3 The Reordering Model -- 5.4 Comparison -- 5.5 Analysis -- 6 Conclusion and Future Work -- References -- Machine Learning for NLP -- Topic Segmentation of Web Documents with Automatic Cue Phrase Identification and BLSTM-CNN -- 1 Introduction -- 2 Related Work -- 3 Models -- 3.1 BLSTM (Bidirectional Long Short Term Memory).

3.2 CNN for Paragraph Representation -- 3.3 Model Learning -- 4 Features -- 4.1 Frequent Subsequence Mining Based Cue Phrase Identification -- 4.2 Other Features -- 5 Experiments -- 5.1 Data and Setup -- 5.2 Results -- 5.3 Error Analysis -- 6 Conclusion and Future Work -- References -- Multi-task Learning for Gender and Age Prediction on Chinese Microblog -- 1 Introduction -- 2 Multi-task Convolutional Neural Network (MTCNN) -- 2.1 Model Description -- 2.2 Model Learning -- 3 Weibo Data -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Baselines -- 4.3 Results -- 4.4 Error Analysis -- 5 Related Work -- 6 Conclusion and Future Work --
References -- Dropout Non-negative Matrix Factorization for Independent Feature Learning -- Abstract -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 NMF as a Linear Neural Network -- 3.2 Dropout and Sequential NMF -- 3.3 Complexity Analysis -- 4 Experimental Results -- 4.1 Datasets -- 4.2 Experimental Settings -- 4.3 Clustering Results -- 4.4 Parameter Selection and Convergence Analysis -- 4.5 Case Study -- 5 Conclusion -- Acknowledgement -- References -- Analysing the Semantic Change Based on Word Embedding -- 1 Introduction -- 2 Related Work -- 3 Approaches -- 3.1 Word Embedding -- 3.2 Random Project Forest -- 3.3 DBSCAN -- 4 Experiments -- 4.1 Preparations -- 4.2 Detecting the Semantic Change Based on Word Embedding -- 4.3 Analysing the Semantic Trend with Word Embedding -- 4.4 Clustering on the Similar Words and Context Words -- 5 Conclusion and Future Work -- References -- Learning Word Sense Embeddings from Word Sense Definitions -- 1 Introduction -- 2 Methodology -- 2.1 Definition Understanding Model -- 2.2 Training Definition Understanding Model with Definitions of Monosemous Words -- 2.3 Word Sense Embedding Learning -- 2.4 Training with Word Sense Embeddings to Represent Words in Definitions.

3 Experiments.

Summary/abstract

This book constitutes the joint refereed proceedings of the 5th CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2016, and the 24th International Conference on Computer Processing of Oriental Languages, ICCPOL 2016, held in Kunming, China, in December 2016. The 48 revised full papers presented together with 41 short papers were carefully reviewed and selected from 216 submissions. The papers cover fundamental research in language computing, multi-lingual access, web mining/text mining, machine learning for NLP, knowledge graph, NLP for social network, as well as applications in language computing.