R for Data Science (2e)
Welcome
This is the website for the 2nd edition of “R for Data Science”. This book will teach you how to do data science with R: You’ll learn how to get your data into R, get it into the most useful structure, transform it, and visualize it.
What you’re reading is a lightly modified version of R for Data Science (2nd edition) for Dan Villarreal’s Data Science for Linguists class at the University of Pittsburgh. R4DS (2nd ed.) was originally written by Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund (book site, GitHub repo).
Text in yellow “callout” boxes like this one is from Dan. 99% of the rest is by the original authors.
In this book, you will find a practicum of skills for data science. Just as a chemist learns how to clean test tubes and stock a lab, you’ll learn how to clean data and draw plots—and many other things besides. These are the skills that allow data science to happen, and here you will find the best practices for doing each of these things with R. You’ll learn how to use the grammar of graphics, literate programming, and reproducible research to save time. You’ll also learn how to manage cognitive resources to facilitate discoveries when wrangling, visualizing, and exploring data.
This website is and will always be free, licensed under the CC BY-NC-ND 3.0 License. If you’d like a physical copy of the book, you can order it on Bookshop.org. If you appreciate reading the book for free and would like to give back, please make a donation to Kākāpō Recovery: the kākāpō (which appears on the cover of R4DS) is a critically endangered parrot native to New Zealand; there are only 238 left.
If you speak another language, you might be interested in the freely available translations of the 1st edition:
You can find suggested answers to exercises in the book at https://mine-cetinkaya-rundel.github.io/r4ds-solutions.
Please note that R4DS uses a Contributor Code of Conduct. By contributing to this book, you agree to abide by its terms.