pull request #1690

Open
wants to merge 3 commits into
base: main
5 changes: 2 additions & 3 deletions _quarto.yml
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
 project:
-  type: book
-  output-dir: _book
-
+  type: website
+  output-dir: docs
 book:
   title: "R for Data Science (2e)"
   reader-mode: true
3 changes: 2 additions & 1 deletion data-tidy.qmd
@@ -209,7 +209,8 @@ billboard |>
The number of rows is now much lower, indicating that many rows with `NA`s were dropped.

You might also wonder what happens if a song is in the top 100 for more than 76 weeks?
-We can't tell from this data, but you might guess that additional columns `wk77`, `wk78`, ... would be added to the dataset.
+We can't tell from this data, but you might guess that additional columns `wk77`, `wk78`, ...
+would be added to the dataset.

This data is now tidy, but we could make future computation a bit easier by converting values of `week` from character strings to numbers using `mutate()` and `readr::parse_number()`.
`parse_number()` is a handy function that will extract the first number from a string, ignoring all other text.
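
As a side illustration of the sentence above (it is not one of the changed lines), here is a minimal sketch of how `readr::parse_number()` treats week labels. The `week_labels` vector is made up for this example and only mimics the style of the `week` values in `billboard`.

```r
library(readr)

# Hypothetical week labels in the same style as the pivoted `week` column.
week_labels <- c("wk1", "wk12", "wk76")

# parse_number() keeps the first number found in each string
# and ignores the surrounding text.
weeks <- parse_number(week_labels)
weeks
```

The result is a numeric vector, so later steps can sort or do arithmetic on the weeks directly.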
11 changes: 7 additions & 4 deletions data-transform.qmd
@@ -15,7 +15,8 @@ You'll learn how to do all that (and more!) in this chapter, which will introduc
The goal of this chapter is to give you an overview of all the key tools for transforming a data frame.
We'll start with functions that operate on rows and then columns of a data frame, then circle back to talk more about the pipe, an important tool that you use to combine verbs.
We will then introduce the ability to work with groups.
-We will end the chapter with a case study that showcases these functions in action. In later chapters, we'll return to the functions in more detail as we start to dig into specific types of data (e.g., numbers, strings, dates).
+We will end the chapter with a case study that showcases these functions in action.
+In later chapters, we'll return to the functions in more detail as we start to dig into specific types of data (e.g., numbers, strings, dates).

### Prerequisites

@@ -86,14 +87,15 @@ flights |>
```

dplyr's verbs are organized into four groups based on what they operate on: **rows**, **columns**, **groups**, or **tables**.
-In the following sections, you'll learn the most important verbs for rows, columns, and groups. Then, we'll return to the join verbs that work on tables in @sec-joins.
+In the following sections, you'll learn the most important verbs for rows, columns, and groups.
+Then, we'll return to the join verbs that work on tables in @sec-joins.
Let's dive in!

## Rows

The most important verbs that operate on rows of a dataset are `filter()`, which changes which rows are present without changing their order, and `arrange()`, which changes the order of the rows without changing which are present.
Both functions only affect the rows, and the columns are left unchanged.
-We'll also discuss `distinct()` which finds rows with unique values.
+We'll also discuss `distinct()` which finds rows with unique values.  
Unlike `arrange()` and `filter()` it can also optionally modify the columns.

### `filter()`
@@ -214,7 +216,8 @@ flights |>

It's not a coincidence that all of these distinct flights are on January 1: `distinct()` will find the first occurrence of a unique row in the dataset and discard the rest.

-If you want to find the number of occurrences instead, you're better off swapping `distinct()` for `count()`. With the `sort = TRUE` argument, you can arrange them in descending order of the number of occurrences.
+If you want to find the number of occurrences instead, you're better off swapping `distinct()` for `count()`.
+With the `sort = TRUE` argument, you can arrange them in descending order of the number of occurrences.
You'll learn more about count in @sec-counts.

```{r}
Empty file added docs/.nojekyll
Empty file.
588 changes: 588 additions & 0 deletions docs/CODE_OF_CONDUCT.html

Large diffs are not rendered by default.

1,062 changes: 1,062 additions & 0 deletions docs/EDA.html

Large diffs are not rendered by default.

827 changes: 827 additions & 0 deletions docs/arrow.html

Large diffs are not rendered by default.

1,028 changes: 1,028 additions & 0 deletions docs/base-R.html

Large diffs are not rendered by default.
