cross_tab() produces a cross-tabulation of x by y, with optional stratification using a grouping variable (by). It supports weighted frequencies, row or column percentages, and association statistics (Chi-squared test, Cramer's V).

Usage

cross_tab(
  d = parent.frame(),
  x,
  y = NULL,
  by = NULL,
  weights = NULL,
  rescale_weights = FALSE,
  digits = 1,
  rowprct = FALSE,
  row_total = TRUE,
  column_total = TRUE,
  n = TRUE,
  drop = TRUE,
  include_stats = TRUE,
  combine = FALSE,
  ...
)

Arguments

d

A data.frame, or a vector (when using vector input). Must contain all variables used in x, y, by, and weights.

x

Variable for table rows. Can be unquoted (tidy) or quoted (standard). Must match a column name if d is a data frame.

y

Optional variable for table columns. Same rules as x. If NULL, computes a one-way frequency table.

by

Optional grouping variable (or interaction of variables). Used to produce stratified crosstabs. Must refer to columns in d, or be a vector of the same length as x.

weights

Optional numeric vector of weights. Must match length of x.

rescale_weights

Logical. If TRUE, rescales weights so that total weighted count matches unweighted count.

digits

Integer. Number of decimal places shown in percentages. Default is 1.

rowprct

Logical. If TRUE, computes percentages by row; otherwise by column.

row_total

Logical. If TRUE, adds row totals (default TRUE).

column_total

Logical. If TRUE, adds column totals (default TRUE).

n

Logical. If TRUE, displays effective counts N as an extra row or column (default TRUE).

drop

Logical. If TRUE, drops empty rows or columns (default TRUE).

include_stats

Logical. If TRUE, includes Chi-squared test and Cramer's V when possible (default TRUE).

combine

Logical. If TRUE, combines all stratified tables into one tibble with a by column.

...

Additional arguments passed to print.spicy(), such as show_all = TRUE.

Value

A tibble of class spicy, or a list of such tibbles if combine = FALSE and by is used.

Details

The function is flexible:

  • Accepts both standard (quoted) and tidy (unquoted) variable input

  • Performs stratified tabulations using a grouping variable (by)

  • Optionally combines group-level tables into a single tibble with combine = TRUE

  • Works with both the base R pipe (|>) and the magrittr pipe (%>%)

All variables (x, y, by, weights) must be present in the data frame d (unless vector input is used).
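
As a rough illustration of the quantities the printed table reports, the following base R sketch computes column percentages, the Chi-squared test, and Cramer's V for cyl by gear. It is not the package's internal code: the variable names (counts, col_pct, chi, cramers_v) are chosen here for illustration, and the Cramer's V line uses the standard textbook formula, which is assumed (not confirmed by the source) to be what cross_tab() applies.

```r
# Base R sketch only; not how cross_tab() is implemented internally.
counts <- table(cyl = mtcars$cyl, gear = mtcars$gear)

# Column percentages (the rowprct = FALSE case); margin = 1 would give row percentages
col_pct <- round(100 * prop.table(counts, margin = 2), digits = 1)

# Pearson Chi-squared test; suppressWarnings() hides the small-expected-count warning
chi <- suppressWarnings(chisq.test(counts, correct = FALSE))

# Cramer's V (standard formula): sqrt(chi2 / (n * (min(rows, cols) - 1)))
n <- sum(counts)
cramers_v <- sqrt(unname(chi$statistic) / (n * (min(dim(counts)) - 1)))

col_pct
round(cramers_v, 2)
```

On mtcars these values should line up with the percentages and the statistics line printed in the basic usage example.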

Warnings and Errors

  • If weights is non-numeric, an error is thrown.

  • If weights does not match the number of observations, an error is thrown.

  • If rescale_weights = TRUE but no weights are provided, a warning is issued.

  • If all values in by are NA, an error is thrown.

  • If by has only one unique level (or all NA), a warning is issued.
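
The effect of weights and rescale_weights can be sketched in base R. This is an illustration under stated assumptions rather than the package's own code: it assumes weighted counts are sums of weights per cell, and that rescaling multiplies each weight by n / sum(weights) so that the weighted total equals the number of observations, as the rescale_weights description states. The names w, w_rescaled, weighted, and rescaled are hypothetical.

```r
# Base R sketch only; not how cross_tab() is implemented internally.
w <- mtcars$mpg  # mpg serves as a weight here only to mirror the documented example

# Weighted cell counts: sum of weights within each cyl x gear cell
weighted <- xtabs(w ~ cyl + gear, data = mtcars)

# Rescale so that the weights sum to the number of observations
w_rescaled <- w * length(w) / sum(w)
rescaled <- xtabs(w_rescaled ~ cyl + gear, data = mtcars)

round(colSums(weighted), 1)  # weighted N per column
round(colSums(rescaled), 1)  # rescaled N per column; the total equals nrow(mtcars)
```

Because rescaling multiplies every weight by the same constant, the percentages are unchanged; only the N row and the Chi-squared statistic shrink in proportion.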

Examples

data(mtcars)
mtcars$gear <- factor(mtcars$gear)
mtcars$cyl <- factor(mtcars$cyl)
mtcars$vs <- factor(mtcars$vs, labels = c("V", "S"))
mtcars$am <- factor(mtcars$am, labels = c("auto", "manual"))

# Basic usage
cross_tab(mtcars, cyl, gear)
#> Crosstable: cyl x gear (%)
#> ─────────────────────────────────────────
#>  Values           3     4     5 Row_Total
#> ─────────────────────────────────────────
#>  4              6.7  66.7  40.0      34.4
#>  6             13.3  33.3  20.0      21.9
#>  8             80.0   0.0  40.0      43.8
#>  Column_Total 100.0 100.0 100.0     100.0
#>  N             15.0  12.0   5.0      32.0
#> ─────────────────────────────────────────
#> Chi-2 = 18 (df = 4), p = 0.00121, Cramer's V = 0.53

# Using extracted variables
cross_tab(mtcars$cyl, mtcars$gear)
#> Crosstable: cyl x gear (%)
#> ─────────────────────────────────────────
#>  Values           3     4     5 Row_Total
#> ─────────────────────────────────────────
#>  4              6.7  66.7  40.0      34.4
#>  6             13.3  33.3  20.0      21.9
#>  8             80.0   0.0  40.0      43.8
#>  Column_Total 100.0 100.0 100.0     100.0
#>  N             15.0  12.0   5.0      32.0
#> ─────────────────────────────────────────
#> Chi-2 = 18 (df = 4), p = 0.00121, Cramer's V = 0.53

# Pipe-friendly syntax
mtcars |> cross_tab(cyl, gear, by = am)
#> $auto
#> Crosstable: cyl x gear | am = auto (%)
#> ───────────────────────────────────
#>  Values           3     4 Row_Total
#> ───────────────────────────────────
#>  4              6.7  50.0      15.8
#>  6             13.3  50.0      21.1
#>  8             80.0   0.0      63.2
#>  Column_Total 100.0 100.0     100.0
#>  N             15.0   4.0      19.0
#> ───────────────────────────────────
#> Chi-2 = 9 (df = 2), p = 0.0113, Cramer's V = 0.69
#> 
#> $manual
#> Crosstable: cyl x gear | am = manual (%)
#> ───────────────────────────────────
#>  Values           4     5 Row_Total
#> ───────────────────────────────────
#>  4             75.0  40.0      61.5
#>  6             25.0  20.0      23.1
#>  8              0.0  40.0      15.4
#>  Column_Total 100.0 100.0     100.0
#>  N              8.0   5.0      13.0
#> ───────────────────────────────────
#> Chi-2 = 3.8 (df = 2), p = 0.146, Cramer's V = 0.54
#> 

# With row percentages
cross_tab(mtcars, cyl, gear, by = am, rowprct = TRUE)
#> $auto
#> Crosstable: cyl x gear | am = auto (%)
#> ─────────────────────────────────────
#>  Values           3    4 Row_Total  N
#> ─────────────────────────────────────
#>  4             33.3 66.7     100.0  3
#>  6             50.0 50.0     100.0  4
#>  8            100.0  0.0     100.0 12
#>  Column_Total  78.9 21.1     100.0 19
#> ─────────────────────────────────────
#> Chi-2 = 9 (df = 2), p = 0.0113, Cramer's V = 0.69
#> 
#> $manual
#> Crosstable: cyl x gear | am = manual (%)
#> ─────────────────────────────────────
#>  Values          4     5 Row_Total  N
#> ─────────────────────────────────────
#>  4            75.0  25.0     100.0  8
#>  6            66.7  33.3     100.0  3
#>  8             0.0 100.0     100.0  2
#>  Column_Total 61.5  38.5     100.0 13
#> ─────────────────────────────────────
#> Chi-2 = 3.8 (df = 2), p = 0.146, Cramer's V = 0.54
#> 

# Using weights
cross_tab(mtcars, cyl, gear, weights = mpg)
#> Crosstable: cyl x gear (%)
#> ─────────────────────────────────────────
#>  Values           3     4     5 Row_Total
#> ─────────────────────────────────────────
#>  4              8.9  73.2  52.8      45.6
#>  6             16.3  26.8  18.4      21.5
#>  8             74.8   0.0  28.8      32.9
#>  Column_Total 100.0 100.0 100.0     100.0
#>  N            241.6 294.4 106.9     642.9
#> ─────────────────────────────────────────
#> Chi-2 = 355.1 (df = 4), p < 0.001, Cramer's V = 0.53

# With rescaled weights
cross_tab(mtcars, cyl, gear, weights = mpg, rescale_weights = TRUE)
#> Crosstable: cyl x gear (%)
#> ─────────────────────────────────────────
#>  Values           3     4     5 Row_Total
#> ─────────────────────────────────────────
#>  4              8.9  73.2  52.8      45.6
#>  6             16.3  26.8  18.4      21.5
#>  8             74.8   0.0  28.8      32.9
#>  Column_Total 100.0 100.0 100.0     100.0
#>  N             12.0  14.7   5.3      32.0
#> ─────────────────────────────────────────
#> Chi-2 = 17.7 (df = 4), p = 0.00143, Cramer's V = 0.53

# Grouped by a single variable
cross_tab(mtcars, cyl, gear, by = am)
#> $auto
#> Crosstable: cyl x gear | am = auto (%)
#> ───────────────────────────────────
#>  Values           3     4 Row_Total
#> ───────────────────────────────────
#>  4              6.7  50.0      15.8
#>  6             13.3  50.0      21.1
#>  8             80.0   0.0      63.2
#>  Column_Total 100.0 100.0     100.0
#>  N             15.0   4.0      19.0
#> ───────────────────────────────────
#> Chi-2 = 9 (df = 2), p = 0.0113, Cramer's V = 0.69
#> 
#> $manual
#> Crosstable: cyl x gear | am = manual (%)
#> ───────────────────────────────────
#>  Values           4     5 Row_Total
#> ───────────────────────────────────
#>  4             75.0  40.0      61.5
#>  6             25.0  20.0      23.1
#>  8              0.0  40.0      15.4
#>  Column_Total 100.0 100.0     100.0
#>  N              8.0   5.0      13.0
#> ───────────────────────────────────
#> Chi-2 = 3.8 (df = 2), p = 0.146, Cramer's V = 0.54
#> 

# Grouped by interaction of two variables
cross_tab(mtcars, cyl, gear, by = interaction(am, vs), combine = TRUE)
#> Crosstable: cyl x gear by interaction(am, vs)
#> ─────────────────────────────────────────────────────────────
#>  Values           3     4     5 Row_Total interaction(am, vs)
#> ─────────────────────────────────────────────────────────────
#>  8            100.0  <NA>  <NA>     100.0              auto.V
#>  Column_Total 100.0  <NA>  <NA>     100.0              auto.V
#>  N             12.0  <NA>  <NA>      12.0              auto.V
#>  4             <NA>   0.0  25.0      16.7            manual.V
#>  6             <NA> 100.0  25.0      50.0            manual.V
#>  8             <NA>   0.0  50.0      33.3            manual.V
#>  Column_Total  <NA> 100.0 100.0     100.0            manual.V
#>  N             <NA>   2.0   4.0       6.0            manual.V
#>  4             33.3  50.0  <NA>      42.9              auto.S
#>  6             66.7  50.0  <NA>      57.1              auto.S
#>  Column_Total 100.0 100.0  <NA>     100.0              auto.S
#>  N              3.0   4.0  <NA>       7.0              auto.S
#>  4             <NA> 100.0 100.0     100.0            manual.S
#>  Column_Total  <NA> 100.0 100.0     100.0            manual.S
#>  N             <NA>   6.0   1.0       7.0            manual.S
#> ─────────────────────────────────────────────────────────────
#> [interaction(am, vs) = auto.V] Chi-squared test not applicable (table too small).
#> [interaction(am, vs) = manual.V] Chi-2 = 3 (df = 2), p = 0.223, Cramer's V = 0.71
#> [interaction(am, vs) = auto.S] Chi-2 = 0.2 (df = 1), p = 0.659, Cramer's V = 0.17
#> [interaction(am, vs) = manual.S] Chi-squared test not applicable (table too small).

# Combined output for grouped data
cross_tab(mtcars, cyl, gear, by = am, combine = TRUE)
#> Crosstable: cyl x gear by am
#> ────────────────────────────────────────────────
#>  Values           3     4     5 Row_Total     am
#> ────────────────────────────────────────────────
#>  4              6.7  50.0  <NA>      15.8   auto
#>  6             13.3  50.0  <NA>      21.1   auto
#>  8             80.0   0.0  <NA>      63.2   auto
#>  Column_Total 100.0 100.0  <NA>     100.0   auto
#>  N             15.0   4.0  <NA>      19.0   auto
#>  4             <NA>  75.0  40.0      61.5 manual
#>  6             <NA>  25.0  20.0      23.1 manual
#>  8             <NA>   0.0  40.0      15.4 manual
#>  Column_Total  <NA> 100.0 100.0     100.0 manual
#>  N             <NA>   8.0   5.0      13.0 manual
#> ────────────────────────────────────────────────
#> [am = auto] Chi-2 = 9 (df = 2), p = 0.0113, Cramer's V = 0.69
#> [am = manual] Chi-2 = 3.8 (df = 2), p = 0.146, Cramer's V = 0.54

# Without totals or sample size
cross_tab(mtcars, cyl, gear, row_total = FALSE, column_total = FALSE, n = FALSE)
#> Crosstable: cyl x gear (%)
#> ──────────────────────
#>  Values    3    4    5
#> ──────────────────────
#>  4       6.7 66.7 40.0
#>  6      13.3 33.3 20.0
#>  8      80.0  0.0 40.0
#> ──────────────────────
#> Chi-2 = 18 (df = 4), p = 0.00121, Cramer's V = 0.53
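
The stratified examples above can be approximated in base R with split() and lapply(). The sketch below is an assumption about the general approach, not the package's implementation: it builds one column-percentage table per level of am and drops empty rows and columns first, which mirrors what drop = TRUE is described as doing. The names df, groups, and strata are hypothetical.

```r
# Base R sketch only; not how cross_tab() is implemented internally.
df <- mtcars
df$cyl  <- factor(df$cyl)
df$gear <- factor(df$gear)
df$am   <- factor(df$am, labels = c("auto", "manual"))

# One data frame per level of the grouping variable
groups <- split(df, df$am)

# Column percentages per stratum, after dropping empty rows and columns
strata <- lapply(groups, function(g) {
  counts <- table(cyl = g$cyl, gear = g$gear)
  keep_rows <- rowSums(counts) > 0
  keep_cols <- colSums(counts) > 0
  counts <- counts[keep_rows, keep_cols, drop = FALSE]
  round(100 * prop.table(counts, margin = 2), digits = 1)
})

strata$auto
strata$manual
```

Dropping the empty gear column before calling prop.table() matters: dividing by a column total of zero would otherwise produce NaN entries.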