LEADER 05256nam 2200625 a 450 001 9910143748003321 005 20200520144314.0 010 $a1-280-23872-0 010 $a9786610238729 010 $a0-470-01242-0 010 $a0-470-85685-8 035 $a(CKB)1000000000356087 035 $a(EBL)239019 035 $a(OCoLC)475950055 035 $a(SSID)ssj0000135090 035 $a(PQKBManifestationID)11150166 035 $a(PQKBTitleCode)TC0000135090 035 $a(PQKBWorkID)10056426 035 $a(PQKB)11445964 035 $a(MiAaPQ)EBC239019 035 $a(EXLCZ)991000000000356087 100 $a20041019d2005 uy 0 101 0 $aeng 135 $aur|n|---||||| 181 $ctxt 182 $cc 183 $acr 200 00$aDatabase annotation in molecular biology /$feditor, Arthur M. Lesk 210 $aChichester, West Sussex ;$aHoboken, NJ $cJohn Wiley$dc2005 215 $a1 online resource (267 p.) 300 $aDescription based upon print version of record. 311 $a0-470-85681-5 320 $aIncludes bibliographical references and index. 327 $aDatabase Annotation in Molecular Biology; Contents; Preface; List of Contributors; 1 Annotation and Databases: Status and Prospects; 1.1 Introduction; 1.2 Annotation of Genomic Data; 1.3 Databases: Concepts and Definitions; 1.4 Access to Annotation Databases; Glossary; References; I THE DATABANKS; 2 Survey of Sequence Databases: Archival Projects; 2.1 Introduction; 2.2 Nucleotide Sequence Databases; 2.3 Swiss-Prot; 2.4 TrEMBL; 2.5 PIR; 2.6 UniProt; References; 3 Survey of Sequence Databases: Derived Databases; 3.1 Introduction; 3.2 Protein and Gene Family Databases; 3.3 Discussion; References 327 $a4 Databanks of Macromolecular Structure; 4.1 Introduction; 4.2 Background; 4.3 Archival Structural Databases Now; 4.4 Contextual Databases; 4.5 Derived Structural Data Databases; 4.6 Summary and View of the Future; References; 5 Gene Expression Databases; 5.1 Introduction; 5.2 What Do We Mean by Microarray Gene Expression Data?; 5.3 Data Complexity; 5.4 Minimum Information About a Microarray Experiment (MIAME); 5.5 Journals and MIAME; 5.6 Storage and Exchange Formats: MAGE-OM and MAGE-ML; 5.7 ArrayExpress; 5.8 Annotation Tools; 5.9 Curation; 5.10 Standardization and Semantics
327 $a5.11 Public Microarray Databases; 5.12 ArrayExpress, an Example of a Public Repository; 5.13 Submissions to ArrayExpress; 5.14 MIAMExpress and Other MIAME Compliant Annotation Systems; 5.15 Databases of Protein Expression Patterns; 5.16 The Gene Expression Database (GXD); 5.17 Conclusion; References; II THE BASIS OF ANNOTATION; 6 Taxonomy: a Moving Target for Sequence Data; 6.1 Introduction; 6.2 Nomenclature; 6.3 Operational Definitions; 6.4 Searching for the Taxonomic Gold Standard; 6.5 Conclusions; References; 7 Genomics and Proteomics: Design and Sources of Annotation 327 $a7.1 Beyond the Sequence: the Challenge of Complete Genome Analysis; 7.2 Extracting the Genes; 7.3 Organism Specific Peculiarities; 7.4 Topology of Genomes; 7.5 Gene Extraction Pipelines; 7.6 Added Value and Knowledge; 7.7 Beyond the Parts List; References; 8 Annotation of Protein Sequences; 8.1 Introduction; 8.2 What is Annotation?; 8.3 UniProt: Universal Protein Resource; 8.4 Protein Family Classification; 8.5 InterPro: Integrated Resource of Protein Families, Domains and Sites; 8.6 PIR Protein Families and Superfamilies; 8.7 Ontologies 327 $a8.8 Protein Names, Source Information and Unique Identifiers; 8.9 Common Identification Errors; 8.10 Evidence Attribution; 8.11 Position Specific Annotations; 8.12 Rule-based Annotation; 8.13 Conclusions; Acknowledgements; References; 9 Issues in the Annotation of Protein Structures; 9.1 Data Harvesting; 9.2 Identification of the Biologically Relevant Assembly; 9.3 Taxonomy; 9.4 Sequence Recognition and Cross-reference; 9.5 Recognition of Secondary Structure Elements; 9.6 Validation of Structures; 9.7 Residue Identification; 9.8 Hetgroup Identification; 9.9 Solvent Handling 327 $a9.10 Miscellaneous Annotation Issues 330 $aTwo factors dominate current molecular biology: the amount of raw data is increasing very rapidly and successful applications in biomedical research require carefully curated and annotated databases.
The quality of the experimental data -- especially nucleic acid sequences -- is satisfactory; however, annotations depend on features inferred from the data rather than measured directly, for instance the identification of genes in genome sequences. It is essential that these inferences are as accurate as possible and this requires human intervention. With the recognition of the importance 606 $aBioinformatics 606 $aNucleotide sequence$xData processing 606 $aAmino acid sequence$xData processing 615 0$aBioinformatics. 615 0$aNucleotide sequence$xData processing. 615 0$aAmino acid sequence$xData processing. 676 $a572.8 701 $aLesk$b Arthur M$066237 801 0$bMiAaPQ 801 1$bMiAaPQ 801 2$bMiAaPQ 906 $aBOOK 912 $a9910143748003321 996 $aDatabase annotation in molecular biology$91974496 997 $aUNINA