
freq() creates a frequency table for a variable or vector, with options for weighting, sorting, handling missing values, and calculating percentages.

Usage

freq(
  data,
  x = NULL,
  weights = NULL,
  digits = 1,
  cum = FALSE,
  total = TRUE,
  exclude = NULL,
  sort = "",
  valid = TRUE,
  na_val = NULL,
  rescale_weights = FALSE,
  info = TRUE,
  labelled_levels = c("prefixed", "labels", "values"),
  styled = TRUE,
  show_empty_levels = FALSE,
  ...
)

Arguments

data

A data.frame, vector or factor. If a data.frame is provided, the target variable x must be specified. Matrices are not supported; please extract a column or convert to a vector or tibble before use.

x

A variable (column) of data. Required when data is a data.frame.

weights

A numeric vector of weights. Must be the same length as x.

digits

Numeric. Number of digits to be displayed for percentages. Default is 1. For N, 2 digits are displayed if there is a weight variable with non-integer weights or if rescale_weights = TRUE, otherwise 0.

cum

Logical. If FALSE (the default), do not display cumulative percentages. If TRUE, display cumulative percentages.

total

Logical. If TRUE (the default), add a final row of totals. If FALSE, omit the final row of totals.

exclude

Values to exclude (e.g., NA, "Other"). Default is NULL.

sort

Sorting method for values:

  • "" (default): No specific sorting.

  • "+": Sort by increasing frequency.

  • "-": Sort by decreasing frequency.

  • "name+": Sort alphabetically (A-Z).

  • "name-": Sort alphabetically (Z-A).
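The four orderings can be illustrated on a plain count table. This is a minimal base R sketch, not freq()'s internal code; it assumes that "+" and "-" order rows by count and that "name+" and "name-" order them by value name.

```r
# Base R sketch of the four orderings (illustrative only).
counts <- table(mtcars$cyl)   # default order of values: 4, 6, 8

by_freq_inc <- sort(counts)                                      # like sort = "+"
by_freq_dec <- sort(counts, decreasing = TRUE)                   # like sort = "-"
by_name_inc <- counts[order(names(counts))]                      # like sort = "name+"
by_name_dec <- counts[order(names(counts), decreasing = TRUE)]   # like sort = "name-"

names(by_freq_dec)   # value order after sorting by decreasing frequency
```

The decreasing-frequency order here (8, 4, 6) agrees with the row order of the sort = "-" example for mtcars$cyl in the Examples section.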

valid

Logical. If TRUE (the default), display valid percentages (excluding missing values). If FALSE, do not display valid percentages.
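The difference between the two percentage columns only shows up when x has missing values. A small base R sketch on made-up data (the vector below is hypothetical, and the code is not freq()'s implementation):

```r
# Base R sketch: % is computed over all observations,
# Valid% over the non-missing observations only.
x <- c("a", "b", "a", NA, "b", "a")   # made-up data with one missing value

n_all <- table(x, useNA = "ifany")             # counts, including the NA category
pct <- 100 * n_all / length(x)                 # share of all 6 observations
valid_pct <- 100 * table(x) / sum(!is.na(x))   # share of the 5 non-missing ones

round(pct, 1)
round(valid_pct, 1)
```

Here "a" accounts for 50% of all observations but 60% of the valid ones.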

na_val

Character or numeric. For factors, character or numeric vectors, values to be treated as NA.
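Conceptually, this amounts to recoding the listed values as NA before counting. A base R sketch on made-up data, where 99 is a hypothetical missing-value code (how freq() does this internally is not shown here):

```r
# Base R sketch: recode user-defined missing codes to NA before tabulating.
x <- c(1, 2, 99, 2, 99)   # made-up data; 99 stands for "no answer"
na_codes <- 99            # hypothetical stand-in for the na_val argument

x[x %in% na_codes] <- NA
counts <- table(x, useNA = "ifany")   # 99 no longer appears as a value
counts
```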

rescale_weights

Logical. If FALSE (the default), do not rescale weights. If TRUE, weights are rescaled so that the weighted total count equals the unweighted number of observations in x.
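One common way to rescale is to multiply each weight by n / sum(weights). The base R sketch below assumes that formula; it is not taken from freq()'s source and has only been checked against the mtcars example in the Examples section, whose N column it reproduces. Behaviour with missing values is not covered.

```r
# Base R sketch of weight rescaling (assumed formula: w * n / sum(w)).
w <- mtcars$mpg
w_rescaled <- w * length(w) / sum(w)   # rescaled weights sum to 32, the number of rows

# Weighted counts per gear value
weighted_n <- tapply(w_rescaled, mtcars$gear, sum)
round(weighted_n, 2)
```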

info

Logical. If TRUE (the default), print a title and a note (label and class of x, weight variable, data frame name).

labelled_levels

For labelled variables, controls how values are displayed using labelled::to_factor(levels = "prefixed"):

  • "prefixed" or "p" (default): Show labels as [value] label

  • "labels" or "l": Show only the label

  • "values" or "v": Show only the underlying value

styled

Logical. If TRUE (default), formats the output using print.spicy(), which aligns columns dynamically in a structured three-line table. If FALSE, returns a standard data.frame without formatting.

show_empty_levels

Logical. If FALSE (default), factor levels with N = 0 are removed from the output. Set to TRUE to retain all levels, even those with no observations.
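The same distinction exists in base R, where table() keeps unused factor levels and droplevels() removes them. A short sketch with a made-up factor (illustrative only, not freq()'s code):

```r
# Base R sketch: a factor level with no observations.
f <- factor(c("low", "high", "low"), levels = c("low", "medium", "high"))

with_empty <- table(f)                 # keeps "medium" with a count of 0
without_empty <- table(droplevels(f))  # drops the unused "medium" level

with_empty
without_empty
```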

...

Additional arguments passed to print.spicy(), such as show_all = TRUE.

Value

A formatted data.frame containing the unique values of x, their frequencies (N), percentages (%), and percentages of valid values (Valid%), plus a final "Total" row when total = TRUE.

  • If cum = TRUE, cumulative percentages (%cum and Valid%cum) are included.
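Cumulative percentages are running sums of the percentage column in the displayed row order. A base R sketch on mtcars$cyl, sorted by decreasing frequency (illustrative only, not freq()'s code):

```r
# Base R sketch: cumulative percentages as a running sum of percentages.
n <- sort(table(mtcars$cyl), decreasing = TRUE)   # counts ordered 8, 4, 6
pct <- 100 * n / sum(n)
cum_pct <- cumsum(pct)                            # running total, ends at 100

round(cum_pct, 1)
```

Because the running sum follows the row order, changing sort changes the intermediate cumulative values but not the final 100%.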

Examples

data(iris)
data(mtcars)
freq(iris, Species)
#> Frequency table: Species
#> ────────────────────────────
#>  Values       N     % Valid%
#> ────────────────────────────
#>  setosa      50  33.3   33.3
#>  versicolor  50  33.3   33.3
#>  virginica   50  33.3   33.3
#>  Total      150 100.0  100.0
#> ────────────────────────────
#> Class: factor
#> Data: iris
#> 
iris |> freq(Species, cum = TRUE)
#> Frequency table: Species
#> ────────────────────────────────────────────
#>  Values       N     % Valid%  %cum Valid%cum
#> ────────────────────────────────────────────
#>  setosa      50  33.3   33.3  33.3      33.3
#>  versicolor  50  33.3   33.3  66.7      66.7
#>  virginica   50  33.3   33.3 100.0     100.0
#>  Total      150 100.0  100.0 100.0     100.0
#> ────────────────────────────────────────────
#> Class: factor
#> Data: iris
#> 
freq(mtcars, cyl, sort = "-", cum = TRUE)
#> Frequency table: cyl
#> ───────────────────────────────────────
#>  Values  N     % Valid%  %cum Valid%cum
#> ───────────────────────────────────────
#>  8      14  43.8   43.8  43.8      43.8
#>  4      11  34.4   34.4  78.1      78.1
#>  6       7  21.9   21.9 100.0     100.0
#>  Total  32 100.0  100.0 100.0     100.0
#> ───────────────────────────────────────
#> Class: numeric
#> Data: mtcars
#> 
freq(mtcars, gear, weights = mpg, rescale_weights = TRUE)
#> Frequency table: gear
#> ──────────────────────────
#>  Values     N     % Valid%
#> ──────────────────────────
#>  3      12.03  37.6   37.6
#>  4      14.65  45.8   45.8
#>  5       5.32  16.6   16.6
#>  Total  32.00 100.0  100.0
#> ──────────────────────────
#> Class: numeric
#> Data: mtcars
#> Weight: mpg
#> 

# With labelled variable
library(labelled)
df <- data.frame(
  var1 = set_variable_labels(1:5, label = "Numeric Variable with Label"),
  var2 = labelled(1:5, c("Low" = 1, "Medium" = 2, "High" = 3)),
  var3 = set_variable_labels(
    labelled(1:5, c("Bad" = 1, "Average" = 2, "Good" = 3)),
    label = "Labelled Variable with Label"
  )
)
df |> freq(var2)
#> Frequency table: var2
#> ──────────────────────────
#>  Values     N     % Valid%
#> ──────────────────────────
#>  [1] Low    1  20.0   20.0
#>  [2] Medium 1  20.0   20.0
#>  [3] High   1  20.0   20.0
#>  [4] 4      1  20.0   20.0
#>  [5] 5      1  20.0   20.0
#>  Total      5 100.0  100.0
#> ──────────────────────────
#> Class: haven_labelled, vctrs_vctr, integer
#> Data: df
#> 
df |> freq(var2, labelled_levels = "l")
#> Frequency table: var2
#> ──────────────────────
#>  Values N     % Valid%
#> ──────────────────────
#>  Low    1  20.0   20.0
#>  Medium 1  20.0   20.0
#>  High   1  20.0   20.0
#>  4      1  20.0   20.0
#>  5      1  20.0   20.0
#>  Total  5 100.0  100.0
#> ──────────────────────
#> Class: haven_labelled, vctrs_vctr, integer
#> Data: df
#> 
df |> freq(var2, labelled_levels = "v")
#> Frequency table: var2
#> ──────────────────────
#>  Values N     % Valid%
#> ──────────────────────
#>  1      1  20.0   20.0
#>  2      1  20.0   20.0
#>  3      1  20.0   20.0
#>  4      1  20.0   20.0
#>  5      1  20.0   20.0
#>  Total  5 100.0  100.0
#> ──────────────────────
#> Class: haven_labelled, vctrs_vctr, integer
#> Data: df
#> 
df |> freq(var3)
#> Frequency table: var3
#> ───────────────────────────
#>  Values      N     % Valid%
#> ───────────────────────────
#>  [1] Bad     1  20.0   20.0
#>  [2] Average 1  20.0   20.0
#>  [3] Good    1  20.0   20.0
#>  [4] 4       1  20.0   20.0
#>  [5] 5       1  20.0   20.0
#>  Total       5 100.0  100.0
#> ───────────────────────────
#> Label: Labelled Variable with Label
#> Class: haven_labelled, vctrs_vctr, integer
#> Data: df
#> 
df |> freq(var3, labelled_levels = "v")
#> Frequency table: var3
#> ──────────────────────
#>  Values N     % Valid%
#> ──────────────────────
#>  1      1  20.0   20.0
#>  2      1  20.0   20.0
#>  3      1  20.0   20.0
#>  4      1  20.0   20.0
#>  5      1  20.0   20.0
#>  Total  5 100.0  100.0
#> ──────────────────────
#> Label: Labelled Variable with Label
#> Class: haven_labelled, vctrs_vctr, integer
#> Data: df
#> 
df |> freq(var3, labelled_levels = "l")
#> Frequency table: var3
#> ───────────────────────
#>  Values  N     % Valid%
#> ───────────────────────
#>  Bad     1  20.0   20.0
#>  Average 1  20.0   20.0
#>  Good    1  20.0   20.0
#>  4       1  20.0   20.0
#>  5       1  20.0   20.0
#>  Total   5 100.0  100.0
#> ───────────────────────
#> Label: Labelled Variable with Label
#> Class: haven_labelled, vctrs_vctr, integer
#> Data: df
#>