LEADER 04416nam 22006735 450 001 9910899895203321 005 20241026125731.0 010 $a3-031-69366-3 024 7 $a10.1007/978-3-031-69366-3 035 $a(MiAaPQ)EBC31741729 035 $a(Au-PeEL)EBL31741729 035 $a(CKB)36403438800041 035 $a(DE-He213)978-3-031-69366-3 035 $a(EXLCZ)9936403438800041 100 $a20241026d2024 u| 0 101 0 $aeng 135 $aurcnu|||||||| 181 $ctxt$2rdacontent 182 $cc$2rdamedia 183 $acr$2rdacarrier 200 10$aBig Data Infrastructure Technologies for Data Analytics $eScaling Data Science Applications for Continuous Growth /$fby Yuri Demchenko, Juan J. Cuadrado-Gallego, Oleg Chertov, Marharyta Aleksandrova 205 $a1st ed. 2024. 210 1$aCham :$cSpringer Nature Switzerland :$cImprint: Springer,$d2024. 215 $a1 online resource (553 pages) 311 $a3-031-69365-5 327 $aChapter 1 Introduction -- Chapter 2 Big Data Technologies Foundation: Definition, Reference Architecture, Use Cases -- Chapter 3 Cloud Computing Foundation: Definition, Reference Architecture, Foundational Technologies, Use Cases -- Chapter 4 Cloud and Big Data Service Providers and Platforms -- Chapter 5 Big Data Algorithms, MapReduce and Hadoop ecosystem -- Chapter 6 Streaming Analytics and Spark -- Chapter 7 Data Structures for Big Data, Modern Big Data SQL and NoSQL Databases -- Chapter 8 Enterprise Data Governance and Management -- Chapter 9 Research Data Management -- Chapter 10 Big Data Security and Compliance, Data Privacy Protection -- Chapter 11 Finding Data on the Web, Data sets, Web Scraping, Web API -- Chapter 12 Data Science Projects Management, DataOps, MLOps -- Chapter 13 Data Science Projects Development with Amazon SageMaker -- Chapter 14 Data Validation for Data Science Projects. 330 $aThis book provides a comprehensive overview and introduction to Big Data Infrastructure technologies, existing cloud-based platforms, and tools for Big Data processing and data analytics, combining both a conceptual approach in architecture design and a practical approach in technology selection and project implementation. 
Readers will learn the core functionality of major Big Data Infrastructure components and how they integrate to form a coherent solution with business benefits. Specific attention is given to understanding and using the Apache Hadoop ecosystem, a major Big Data platform, and its main functional components: MapReduce, HBase, Hive, Pig, Spark, and streaming analytics. The book also covers enterprise and research data management and governance, and explains modern approaches to cloud and Big Data security and compliance. It covers two knowledge areas defined in the EDISON Data Science Framework (EDSF), Data Science Engineering and Data Management and Governance. It can be used as a textbook for university courses or as a basis for practitioners' self-study, practical use of Big Data technologies, and competent evaluation and implementation of practical projects in their organizations. 606 $aArtificial intelligence$xData processing 606 $aQuantitative research 606 $aSoftware engineering 606 $aArtificial intelligence 606 $aApplication software 606 $aData Science 606 $aData Analysis and Big Data 606 $aSoftware Engineering 606 $aArtificial Intelligence 606 $aComputer and Information Systems Applications 615 0$aArtificial intelligence$xData processing. 615 0$aQuantitative research. 615 0$aSoftware engineering. 615 0$aArtificial intelligence. 615 0$aApplication software. 615 14$aData Science. 615 24$aData Analysis and Big Data. 615 24$aSoftware Engineering. 615 24$aArtificial Intelligence. 615 24$aComputer and Information Systems Applications. 676 $a005.7 700 $aDemchenko$b Yuri$01439883 701 $aCuadrado-Gallego$b Juan J$01439882 701 $aChertov$b Oleg$01767643 701 $aAleksandrova$b Marharyta$01767644 801 0$bMiAaPQ 801 1$bMiAaPQ 801 2$bMiAaPQ 906 $aBOOK 912 $a9910899895203321 996 $aBig Data Infrastructure Technologies for Data Analytics$94213958 997 $aUNINA