Maturing lifecycle

Calculates summary statistics from outputs of generate() or hypothesize().

Learn more in vignette("infer").

calculate(
  x,
  stat = c("mean", "median", "sum", "sd", "prop", "count", "diff in means",
    "diff in medians", "diff in props", "Chisq", "F", "slope", "correlation", "t", "z"),
  order = NULL,
  ...
)

Arguments

x

The output from generate() for computation-based inference, or the output from hypothesize() piped in for theory-based inference.

stat

A string giving the type of the statistic to calculate. Current options include "mean", "median", "sum", "sd", "prop", "count", "diff in means", "diff in medians", "diff in props", "Chisq", "F", "t", "z", "slope", and "correlation".

order

A character vector specifying the order in which the levels of the explanatory variable should be ordered for subtraction, where order = c("first", "second") means ("first" - "second"). Needed for inference on a difference in means, medians, or proportions, and for t and z statistics.

...

Additional arguments, such as na.rm = TRUE, to pass on to functions like mean(), sd(), etc.
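
To make the order and ... arguments concrete, here is a minimal base R sketch of what a "diff in means" in a given order amounts to. The data frame toy and the helper diff_in_means() are hypothetical and are not part of infer. The sketch only mirrors the subtraction order described above and the forwarding of na.rm to mean(); it is not how calculate() is implemented.

```r
# Hypothetical toy data: a numeric response and a two-level explanatory variable.
toy <- data.frame(
  age     = c(30, 40, NA, 50, 20, 65),
  college = c("degree", "degree", "degree", "no degree", "no degree", "no degree")
)

# Hypothetical helper (an assumption for illustration, not infer's code):
# mean of the first level in `order` minus mean of the second level.
# Anything passed through `...` is forwarded to mean().
diff_in_means <- function(response, explanatory, order, ...) {
  first  <- mean(response[explanatory == order[1]], ...)
  second <- mean(response[explanatory == order[2]], ...)
  first - second
}

# "degree" - "no degree", dropping the missing age
d1 <- diff_in_means(toy$age, toy$college,
                    order = c("degree", "no degree"), na.rm = TRUE)

# Reversing the order flips the sign of the statistic
d2 <- diff_in_means(toy$age, toy$college,
                    order = c("no degree", "degree"), na.rm = TRUE)

# Without na.rm = TRUE the missing value propagates to the result
d_na <- diff_in_means(toy$age, toy$college,
                      order = c("degree", "no degree"))
```

Swapping the two levels in order changes only the sign of the result, which is why the argument matters when interpreting the direction of a difference.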

Value

A tibble containing a stat column of calculated statistics.
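
As a rough illustration of the shape of this result, the base R sketch below builds a bootstrap distribution of a mean by hand, with one row per replicate and a stat column. The hours vector is made-up data, and the sketch is a plain bootstrap that does not recenter the sample for a null hypothesis, so it should be read as a conceptual picture under those assumptions rather than as what generate() and calculate() do internally.

```r
set.seed(1)  # make the resampling reproducible

# Hypothetical sample of weekly hours worked (made-up values).
hours <- c(40, 35, 50, 45, 38, 42, 60, 20, 40, 44)

reps <- 1000

# One statistic per replicate: resample with replacement, then take the mean.
boot_stats <- vapply(
  seq_len(reps),
  function(i) mean(sample(hours, size = length(hours), replace = TRUE)),
  numeric(1)
)

# One row per replicate, with the calculated statistic in a stat column.
result <- data.frame(replicate = seq_len(reps), stat = boot_stats)
head(result)
```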

Examples

# calculate a null distribution of hours worked per week under
# the null hypothesis that the mean is 40
gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40) %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "mean")
#> Warning: Removed 1244 rows containing missing values.
#> # A tibble: 1,000 x 2
#>    replicate  stat
#>        <int> <dbl>
#>  1         1  39.9
#>  2         2  39.9
#>  3         3  39.8
#>  4         4  40.1
#>  5         5  40.4
#>  6         6  40.1
#>  7         7  40.2
#>  8         8  40.4
#>  9         9  39.9
#> 10        10  40.3
#> # … with 990 more rows
# calculate a null distribution assuming independence between age
# of respondent and whether they have a college degree
gss %>%
  specify(age ~ college) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate("diff in means", order = c("degree", "no degree"))
#> Warning: Removed 22 rows containing missing values.
#> # A tibble: 1,000 x 2
#>    replicate   stat
#>        <int>  <dbl>
#>  1         1 0.814
#>  2         2 0.619
#>  3         3 0.146
#>  4         4 0.972
#>  5         5 0.381
#>  6         6 1.15
#>  7         7 0.0394
#>  8         8 0.651
#>  9         9 0.731
#> 10        10 0.111
#> # … with 990 more rows
# More in-depth explanation of how to use the infer package
vignette("infer")
#> Warning: vignette ‘infer’ not found