Record Display for the EPA National Library Catalog
RECORD NUMBER: 13 OF 36

Main Title | Introduction to data science : data analysis and prediction algorithms with R /
---|---
Author | Irizarry, Rafael A.,
Publisher | CRC Press,
Year Published | 2020
OCLC Number | 1104856206
ISBN | 9780367357986; 0367357984; 9780367357993; 0367357992
Subjects | R (Computer program language) ; R (Computer program language)--Problems, exercises, etc ; Data mining--Problems, exercises, etc ; Information visualization ; Statistics--Data processing ; Probabilities--Data processing ; Computer algorithms ; Quantitative research ; Computer algorithms--Problems, exercises, etc ; Datenanalyse ; R--Programm ; Statistik ; Visualisierung
Collation | xxx, 713 pages : color illustrations, charts (some color) ; 26 cm.
Notes | Includes bibliographical references and index.
Contents Notes | "The book begins by going over the basics of R and the tidyverse. You learn R throughout the book, but in the first part we go over the building blocks needed to keep learning during the rest of the book"-- Getting started with R and RStudio -- R Basics -- Programming basics -- The tidyverse -- Importing data -- Data visualization -- Introduction to data visualization -- ggplot2 -- Visualizing data distributions -- Data visualization in practice -- Data visualization principles -- Robust summaries -- Introduction to statistics with R -- Probability -- Random variables -- Statistical inference -- Statistical models -- Regression -- Linear models -- Association is not causation -- Introduction to data wrangling -- Reshaping data -- Joining tables -- Web scraping -- String processing -- Parsing dates and times -- Text mining -- Introduction to machine learning -- Smoothing -- Cross validation -- The caret package -- Examples of algorithms -- Machine learning in practice -- Large datasets -- Clustering -- Introduction to productivity tools -- Organizing with Unix -- Git and GitHub -- Reproducible projects with RStudio and R markdown.