Record ID: 9911006788503321
Title: Clojure data analysis cookbook / Eric Rochester
Author: Rochester, Eric
Edition: 1st edition
Publication: Birmingham, UK : Packt Pub., c2013
Physical description: 1 online resource (342 p.)
Language: English
Format: Book
Note: Includes index.
ISBN: 1-68015-416-8; 1-299-44085-1; 1-78216-265-8
Related ISBN: 1-78216-264-X
System control numbers: (CKB)2550000001018294; (EBL)1126744; (SSID)ssj0000906520; (PQKBManifestationID)11536132; (PQKBTitleCode)TC0000906520; (PQKBWorkID)10855556; (PQKB)10328307; (MiAaPQ)EBC1126744; (CaSebORM)9781782162643; (Au-PeEL)EBL1126744; (CaPaEBR)ebr10682464; (CaONFJC)MIL475335; (OCoLC)840899851; (PPN)228038472; (OCoLC)852522514; (OCoLC)ocn852522514; (EXLCZ)992550000001018294

Contents:
Cover; Copyright; Credits; About the Author; About the Reviewers; www.PacktPub.com; Table of Contents; Preface
Chapter 1: Importing Data for Analysis; Introduction; Creating a new project; Reading CSV data into Incanter datasets; Reading JSON data into Incanter datasets; Reading data from Excel with Incanter; Reading data from JDBC databases; Reading XML data into Incanter datasets; Scraping data from tables in web pages; Scraping textual data from web pages; Reading RDF data; Reading RDF data with SPARQL; Aggregating data from different formats
Chapter 2: Cleaning and Validating Data; Introduction; Cleaning data with regular expressions; Maintaining consistency with synonym maps; Identifying and removing duplicate data; Normalizing numbers; Rescaling values; Normalizing dates and times; Lazily processing very large data sets; Sampling from very large data sets; Fixing spelling errors; Parsing custom data formats; Validating data with Valip
Chapter 3: Managing Complexity with Concurrent Programming; Introduction; Managing program complexity with STM; Managing program complexity with agents; Getting better performance with commute; Combining agents and STM; Maintaining consistency with ensure; Introducing safe side effects into the STM; Maintaining data consistency with validators; Tracking processing with watchers; Debugging concurrent programs with watchers; Recovering from errors in agents; Managing input with sized queues
Chapter 4: Improving Performance with Parallel Programming; Introduction; Parallelizing processing with pmap; Parallelizing processing with Incanter; Partitioning Monte Carlo simulations for better pmap performance; Finding the optimal partition size with simulated annealing; Parallelizing with reducers; Generating online summary statistics with reducers; Harnessing your GPU with OpenCL and Calx; Using type hints; Benchmarking with Criterium
Chapter 5: Distributed Data Processing with Cascalog; Introduction; Distributed processing with Cascalog and Hadoop; Querying data with Cascalog; Distributing data with Apache HDFS; Parsing CSV files with Cascalog; Complex queries with Cascalog; Aggregating data with Cascalog; Defining new Cascalog operators; Composing Cascalog queries; Handling errors in Cascalog workflows; Transforming data with Cascalog; Executing Cascalog queries in the Cloud with Pallet
Chapter 6: Working with Incanter Datasets; Introduction; Loading Incanter's sample datasets; Loading Clojure data structures into datasets; Viewing datasets interactively with view; Converting datasets to matrices; Using infix formulas in Incanter; Selecting columns with ; Selecting rows with ; Filtering datasets with where; Grouping data with group-by; Saving datasets to CSV and JSON; Projecting from multiple datasets with join
Chapter 7: Preparing for and Performing Statistical Data Analysis with Incanter; Introduction; Generating summary statistics with rollup

Summary: Full of practical tips, the "Clojure Data Analysis Cookbook" will help you fully utilize your data through a series of step-by-step, real world recipes covering every aspect of data analysis. Prior experience with Clojure and data analysis techniques and workflows will be beneficial, but not essential.

Subjects: Database searching; Clojure (Computer program language)
Classification: 005.133
Cataloging source: MiAaPQ
Holding institution: UNINA