Beyond Spreadsheets with R: A beginner's guide to R and RStudio

by Jonathan Carroll

Paperback (1st Edition)

$49.99

Overview

Summary

Beyond Spreadsheets with R shows you how to take raw data and transform it for use in computations, tables, graphs, and more. You'll build on simple programming techniques like loops and conditionals to create your own custom functions. You'll come away with a toolkit of strategies for analyzing and visualizing data of all sorts using R and RStudio.

Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the Technology

Spreadsheets are powerful tools for many tasks, but if you need to interpret, interrogate, and present data, they can feel like the wrong tool for the job. That's when R programming is the way to go. The R programming language provides a comfortable environment for properly handling all types of data. And within the open source RStudio development suite, you have easy-to-use ways at your fingertips to simplify complex manipulations and create reproducible processes for analysis and reporting.

About the Book

With Beyond Spreadsheets with R, you'll learn how to go from raw data to meaningful insights using R and RStudio. Each carefully crafted chapter covers a unique way to wrangle data, from understanding individual values to interacting with complex collections of data, including data you scrape from the web.

What's inside

  • How to start programming with R and RStudio
  • Understanding and implementing important R structures and operators
  • Installing and working with R packages
  • Tidying, refining, and plotting your data

About the Reader

If you're comfortable writing formulas in Excel, you're ready for this book.

About the Author

Dr Jonathan Carroll is a data science consultant providing R programming services. He holds a PhD in theoretical physics.

Table of Contents

  1. Introducing data and the R language
  2. Getting to know R data types
  3. Making new data values
  4. Understanding the tools you'll use: Functions
  5. Combining data values
  6. Selecting data values
  7. Doing things with lots of data
  8. Doing things conditionally: Control structures
  9. Visualizing data: Plotting
  10. Doing more with your data with extensions

Product Details

ISBN-13: 9781617294594
Publisher: Manning Publications Company
Publication date: 12/17/2018
Edition description: 1st Edition
Pages: 352
Product dimensions: 7.30 in. (w) x 9.10 in. (h) x 0.80 in. (d)

About the Author

Dr Jonathan Carroll holds a PhD in theoretical astrophysics from the University of Adelaide and currently works in statistical modelling. He contributes packages to R, frequently answers questions on Stack Overflow, and is an avid science communicator.

Table of Contents

Preface
Acknowledgments
About this book
About the author
About the cover illustration

1 Introducing data and the R language
1.1 Data: What, where, how?
  What is data?
  Seeing the world as data sources
  Data munging
  What you can do with well-handled data
  Data as an asset
  Reproducible research and version control
1.2 Introducing R
  The origins of R
  What R is and what it isn't
1.3 How R works
1.4 Introducing RStudio
  Working with R within RStudio
  Built-in packages (data and functions)
  Built-in documentation
  Vignettes
1.5 Try it yourself

2 Getting to know R data types
2.1 Types of data
  Numbers
  Text (strings)
  Categories (factors)
  Dates and times
  Logicals
  Missing values
2.2 Storing values (assigning)
  Naming data (variables)
  Unchanging data
  The assignment operators (<- vs. =)
2.3 Specifying the data type
2.4 Telling R to ignore something
2.5 Try it yourself

3 Making new data values
3.1 Basic mathematics
3.2 Operator precedence
3.3 String concatenation (joining)
3.4 Comparisons
3.5 Automatic conversion (coercion)
3.6 Try it yourself

4 Understanding the tools you'll use: Functions
4.1 Functions
  Under the hood
  Function template
  Arguments
  Multiple arguments
  Default arguments
  Argument name matching
  Partial matching
  Scope
4.2 Packages
  Installing packages
  How does R (not) know about this function?
  Namespaces
4.3 Messages, warnings, and errors, oh my!
  Creating messages, warnings, and errors
  Diagnosing messages, warnings, and errors
4.4 Testing
4.5 Project: Generalizing a function
4.6 Try it yourself

5 Combining data values
5.1 Simple collections
  Coercion
  Missing values
  Attributes
  Names
5.2 Sequences
  Vector functions
  Vector math operations
5.3 Matrices
  Naming dimensions
5.4 Lists
5.5 data.frames
5.6 Classes
  The tibble class
  Structures as function arguments
5.7 Try it yourself

6 Selecting data values
6.1 Text processing
  Text matching
  Substrings
  Text substitutions
  Regular expressions
6.2 Selecting components from structures
  Vectors
  Lists
  Matrices
6.3 Replacing values
6.4 data.frames and dplyr
  dplyr verbs
  Non-standard evaluation
  Pipes
  Subsetting data.frame the hard way
6.5 Replacing NA
6.6 Selecting conditionally
6.7 Summarizing values
6.8 A worked example: Excel vs. R
6.9 Try it yourself
  Solutions: no peeking

7 Doing things with lots of data
7.1 Tidy data principles
  The working directory
  Stored data formats
  Reading data into R
  Scraping data
  Inspecting data
  Dealing with odd values in data (sentinel values)
  Converting to tidy data
7.2 Merging data
7.3 Writing data from R
7.4 Try it yourself

8 Doing things conditionally: Control structures
8.1 Looping
  Vectorization
  Tidy repetition: Looping with purrr
  For loops
8.2 Wider and narrower loop scope
  While loops
8.3 Conditional evaluation
  If conditions
  Ifelse conditions
8.4 Try it yourself

9 Visualizing data: Plotting
9.1 Data preparation
  Tidy data, revisited
  Importance of data types
9.2 ggplot2
  General construction
  Adding points
  Style aesthetics
  Adding lines
  Adding bars
  Other types of plots
  Scales
  Facetting
  Additional options
9.3 Plots as objects
9.4 Saving plots
9.5 Try it yourself

10 Doing more with your data with extensions
10.1 Writing your own packages
  Creating a minimal package
  Documentation
10.2 Analyzing your package
  Unit testing
  Profiling
10.3 What to do next?
  Regression
  Clustering
  Working with maps
  Interacting with APIs
  Sharing your package
10.4 More resources

Appendix A Installing R
Appendix B Installing RStudio
Appendix C Graphics in base R

Index
