freq() creates a frequency table for a variable or vector, with options for weighting, sorting, handling missing values, and calculating percentages.
Usage
freq(
  data,
  x = NULL,
  weights = NULL,
  digits = 1,
  cum = FALSE,
  total = TRUE,
  exclude = NULL,
  sort = "",
  valid = TRUE,
  na_val = NULL,
  rescale_weights = FALSE,
  info = TRUE,
  labelled_levels = c("prefixed", "labels", "values"),
  styled = TRUE,
  show_empty_levels = FALSE,
  ...
)
Arguments
- data
A data.frame, vector or factor. If a data.frame is provided, the target variable x must be specified. Matrices are not supported; extract a column or convert to a vector or tibble before use.
- x
A variable (column) of data, used when data is a data.frame.
- weights
A numeric vector of weights. Must be the same length as x.
- digits
Numeric. Number of digits displayed for percentages. Default is 1. For N, 2 digits are displayed if there is a weight variable with non-integer weights or if rescale_weights = TRUE; otherwise 0.
- cum
Logical. If FALSE (the default), cumulative percentages are not displayed. If TRUE, they are displayed.
- total
Logical. If TRUE (the default), add a final row of totals. If FALSE, omit the totals row.
- exclude
Values to exclude (e.g., NA, "Other"). Default is NULL.
- sort
Sorting method for values:
  "" (default): No specific sorting.
  "+": Sort by increasing frequency.
  "-": Sort by decreasing frequency.
  "name+": Sort alphabetically (A-Z).
  "name-": Sort alphabetically (Z-A).
- valid
Logical. If TRUE (the default), display valid percentages (excluding missing values). If FALSE, do not display valid percentages.
- na_val
Character or numeric. For factors and for character or numeric vectors, values to be treated as NA.
- rescale_weights
Logical. If FALSE (the default), do not rescale weights. If TRUE, rescale the weights so that the total count equals the unweighted number of observations in x.
- info
Logical. If TRUE (the default), print a title and a note giving the label and class of x, the weight variable, and the data frame name.
- labelled_levels
For labelled variables, controls how values are displayed using labelled::to_factor(levels = "prefixed"):
  "prefixed" or "p" (default): Show labels as [value] label
  "labels" or "l": Show only the label
  "values" or "v": Show only the underlying value
- styled
Logical. If TRUE (default), formats the output using print.spicy(), which aligns columns dynamically in a structured three-line table. If FALSE, returns a standard data.frame without formatting.
- show_empty_levels
Logical. If FALSE (default), factor levels with N = 0 are removed from the output. Set to TRUE to retain all levels, even those with no observations.
- ...
Additional arguments passed to print.spicy(), such as show_all = TRUE.
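The effect of rescale_weights = TRUE can be illustrated with plain arithmetic. The following is a minimal base R sketch, not the package's internal code: it assumes rescaling means multiplying each weight by n / sum(weights), which is consistent with the weighted mtcars output shown in the Examples section. The names w, w_rescaled, n_weighted and pct are introduced here for illustration only.

```r
# Base R sketch of weight rescaling (assumption: w * n / sum(w)).
data(mtcars)

w <- mtcars$mpg                        # raw weights
w_rescaled <- w * length(w) / sum(w)   # rescaled so the weights sum to n

# Weighted count per category: sum of rescaled weights within each level
n_weighted <- tapply(w_rescaled, mtcars$gear, sum)
pct <- 100 * n_weighted / sum(n_weighted)

round(n_weighted, 2)  # gear 3: 12.03, gear 4: 14.65, gear 5: 5.32
round(pct, 1)         # gear 3: 37.6,  gear 4: 45.8,  gear 5: 16.6
sum(w_rescaled)       # 32, the unweighted number of rows
```

Because rescaling multiplies every weight by the same constant, the percentages are unchanged; only the N column and its total are affected.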
Value
A formatted data.frame containing the unique values of x, their frequencies (N), percentages (%), and percentages of valid values (Valid%), with a "Total" row.
If cum = TRUE, cumulative percentages (%cum and Valid%cum) are included.
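To make the difference between the % and Valid% columns concrete, here is a small base R sketch that computes the same kinds of columns by hand for a vector containing missing values. It is an illustration under assumptions, not freq() itself: the input vector x is hypothetical, and leaving Valid% as NA on the missing-value row is a choice made for this sketch rather than a statement about how freq() prints that row.

```r
# Base R sketch of the table columns (illustrative only).
x <- c("a", "b", "a", NA, "c", "a", NA, "b")  # hypothetical input with 2 NAs

tab <- table(x, useNA = "ifany")   # counts, with a separate entry for NA
values <- names(tab)
n <- as.vector(tab)
is_valid <- !is.na(values)         # TRUE for every row except the NA row

pct <- 100 * n / sum(n)            # %: share of all observations

valid_pct <- rep(NA_real_, length(n))
valid_pct[is_valid] <- 100 * n[is_valid] / sum(n[is_valid])  # Valid%

valid_cum <- valid_pct             # cumulative Valid%, valid rows only
valid_cum[is_valid] <- cumsum(valid_pct[is_valid])

res <- data.frame(
  Values = values,
  N = n,
  `%` = round(pct, 1),
  `Valid%` = round(valid_pct, 1),
  `%cum` = round(cumsum(pct), 1),
  `Valid%cum` = round(valid_cum, 1),
  check.names = FALSE              # keep the % signs in column names
)
print(res)
```

Here % divides by all 8 observations, while Valid% divides by the 6 non-missing ones, so "a" is 37.5% of all values but 50% of valid values.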
Examples
data(iris)
data(mtcars)
freq(iris, Species)
#> Frequency table: Species
#> ────────────────────────────
#> Values N % Valid%
#> ────────────────────────────
#> setosa 50 33.3 33.3
#> versicolor 50 33.3 33.3
#> virginica 50 33.3 33.3
#> Total 150 100.0 100.0
#> ────────────────────────────
#> Class: factor
#> Data: iris
#>
iris |> freq(Species, cum = TRUE)
#> Frequency table: Species
#> ────────────────────────────────────────────
#> Values N % Valid% %cum Valid%cum
#> ────────────────────────────────────────────
#> setosa 50 33.3 33.3 33.3 33.3
#> versicolor 50 33.3 33.3 66.7 66.7
#> virginica 50 33.3 33.3 100.0 100.0
#> Total 150 100.0 100.0 100.0 100.0
#> ────────────────────────────────────────────
#> Class: factor
#> Data: iris
#>
freq(mtcars, cyl, sort = "-", cum = TRUE)
#> Frequency table: cyl
#> ───────────────────────────────────────
#> Values N % Valid% %cum Valid%cum
#> ───────────────────────────────────────
#> 8 14 43.8 43.8 43.8 43.8
#> 4 11 34.4 34.4 78.1 78.1
#> 6 7 21.9 21.9 100.0 100.0
#> Total 32 100.0 100.0 100.0 100.0
#> ───────────────────────────────────────
#> Class: numeric
#> Data: mtcars
#>
freq(mtcars, gear, weights = mpg, rescale_weights = TRUE)
#> Frequency table: gear
#> ──────────────────────────
#> Values N % Valid%
#> ──────────────────────────
#> 3 12.03 37.6 37.6
#> 4 14.65 45.8 45.8
#> 5 5.32 16.6 16.6
#> Total 32.00 100.0 100.0
#> ──────────────────────────
#> Class: numeric
#> Data: mtcars
#> Weight: mpg
#>
# With labelled variable
library(labelled)
df <- data.frame(
  var1 = set_variable_labels(1:5, label = "Numeric Variable with Label"),
  var2 = labelled(1:5, c("Low" = 1, "Medium" = 2, "High" = 3)),
  var3 = set_variable_labels(
    labelled(1:5, c("Bad" = 1, "Average" = 2, "Good" = 3)),
    label = "Labelled Variable with Label"
  )
)
df |> freq(var2)
#> Frequency table: var2
#> ──────────────────────────
#> Values N % Valid%
#> ──────────────────────────
#> [1] Low 1 20.0 20.0
#> [2] Medium 1 20.0 20.0
#> [3] High 1 20.0 20.0
#> [4] 4 1 20.0 20.0
#> [5] 5 1 20.0 20.0
#> Total 5 100.0 100.0
#> ──────────────────────────
#> Class: haven_labelled, vctrs_vctr, integer
#> Data: df
#>
df |> freq(var2, labelled_levels = "l")
#> Frequency table: var2
#> ──────────────────────
#> Values N % Valid%
#> ──────────────────────
#> Low 1 20.0 20.0
#> Medium 1 20.0 20.0
#> High 1 20.0 20.0
#> 4 1 20.0 20.0
#> 5 1 20.0 20.0
#> Total 5 100.0 100.0
#> ──────────────────────
#> Class: haven_labelled, vctrs_vctr, integer
#> Data: df
#>
df |> freq(var2, labelled_levels = "v")
#> Frequency table: var2
#> ──────────────────────
#> Values N % Valid%
#> ──────────────────────
#> 1 1 20.0 20.0
#> 2 1 20.0 20.0
#> 3 1 20.0 20.0
#> 4 1 20.0 20.0
#> 5 1 20.0 20.0
#> Total 5 100.0 100.0
#> ──────────────────────
#> Class: haven_labelled, vctrs_vctr, integer
#> Data: df
#>
df |> freq(var3)
#> Frequency table: var3
#> ───────────────────────────
#> Values N % Valid%
#> ───────────────────────────
#> [1] Bad 1 20.0 20.0
#> [2] Average 1 20.0 20.0
#> [3] Good 1 20.0 20.0
#> [4] 4 1 20.0 20.0
#> [5] 5 1 20.0 20.0
#> Total 5 100.0 100.0
#> ───────────────────────────
#> Label: Labelled Variable with Label
#> Class: haven_labelled, vctrs_vctr, integer
#> Data: df
#>
df |> freq(var3, labelled_levels = "v")
#> Frequency table: var3
#> ──────────────────────
#> Values N % Valid%
#> ──────────────────────
#> 1 1 20.0 20.0
#> 2 1 20.0 20.0
#> 3 1 20.0 20.0
#> 4 1 20.0 20.0
#> 5 1 20.0 20.0
#> Total 5 100.0 100.0
#> ──────────────────────
#> Label: Labelled Variable with Label
#> Class: haven_labelled, vctrs_vctr, integer
#> Data: df
#>
df |> freq(var3, labelled_levels = "l")
#> Frequency table: var3
#> ───────────────────────
#> Values N % Valid%
#> ───────────────────────
#> Bad 1 20.0 20.0
#> Average 1 20.0 20.0
#> Good 1 20.0 20.0
#> 4 1 20.0 20.0
#> 5 1 20.0 20.0
#> Total 5 100.0 100.0
#> ───────────────────────
#> Label: Labelled Variable with Label
#> Class: haven_labelled, vctrs_vctr, integer
#> Data: df
#>
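The three labelled_levels displays in the examples above can be mimicked in base R to show how the row names relate to the underlying values and labels. This is a rough sketch that does not call labelled::to_factor(); the names values, value_labels, lab and prefixed are made up for the illustration, and the fallback for unlabelled values (showing the value itself) is inferred from the outputs above.

```r
# Base R sketch of the three display modes (illustrative only).
values <- 1:5
value_labels <- c(Low = 1, Medium = 2, High = 3)  # label = value pairs

# Look up each value's label; unlabelled values fall back to the value itself
lab <- names(value_labels)[match(values, value_labels)]
missing_label <- is.na(lab)
lab[missing_label] <- as.character(values[missing_label])

prefixed <- paste0("[", values, "] ", lab)  # like labelled_levels = "prefixed"

prefixed              # "[1] Low" "[2] Medium" "[3] High" "[4] 4" "[5] 5"
lab                   # like "labels": "Low" "Medium" "High" "4" "5"
as.character(values)  # like "values": "1" "2" "3" "4" "5"
```

Values 4 and 5 have no label in this data, which is why they appear as "[4] 4" and "[5] 5" in the prefixed display and as bare numbers in the other two.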