Computes descriptive statistics (mean, SD, min, max, confidence interval of the mean, n) for one or many continuous variables selected with tidyselect syntax.
With by, produces grouped summaries and reports a group-comparison
p-value by default (Welch test; change via test). Additional
inferential output is opt-in: test statistics (statistic) and
effect sizes (effect_size / effect_size_ci). Set p_value = FALSE
to suppress the p-value column. Without by, produces one-way
descriptive summaries.
Multiple output formats are available via output: a printed ASCII
table ("default"), a plain data.frame ("data.frame" or
"long" – synonyms for the underlying long-format data, see
Details), or publication-ready tables ("tinytable", "gt",
"flextable", "excel", "clipboard", "word").
This is the descriptive companion to table_continuous_lm(). The
two functions share their layout, alignment, and reporting precision
so descriptive and model-based analyses of the same data look
uniform side by side. Use table_continuous_lm() when you need
robust SE, weighted contrasts, fitted means, or covariate
adjustment.
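As a rough illustration of what one descriptive row contains, the sketch below computes the same quantities in base R. describe_one() is a hypothetical helper, not part of the package, and the t-based interval is an assumption: the documentation only states that a confidence interval of the mean is reported.

```r
# Hypothetical helper for illustration only; not the package's implementation.
describe_one <- function(x, ci_level = 0.95) {
  x <- x[!is.na(x)]                      # summaries use non-missing values
  n <- length(x)
  m <- mean(x)
  s <- stats::sd(x)
  # Assumption: t-based interval with n - 1 degrees of freedom.
  half <- stats::qt(1 - (1 - ci_level) / 2, df = n - 1) * s / sqrt(n)
  data.frame(
    mean = m, sd = s, min = min(x), max = max(x),
    ci_lower = m - half, ci_upper = m + half, n = n
  )
}

describe_one(c(21.4, 25.0, 27.3, NA, 30.1, 24.8))
```

The column names mirror those documented for the raw data.frame output, so the result can be compared side by side with a real call.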
Usage
table_continuous(
  data,
  select = tidyselect::everything(),
  by = NULL,
  exclude = NULL,
  regex = FALSE,
  test = c("welch", "student", "nonparametric"),
  p_value = NULL,
  statistic = FALSE,
  show_n = TRUE,
  effect_size = c("none", "auto", "hedges_g", "eta_sq", "r_rb", "epsilon_sq"),
  effect_size_ci = FALSE,
  ci = TRUE,
  labels = NULL,
  ci_level = 0.95,
  digits = 2,
  effect_size_digits = 2,
  p_digits = 3,
  decimal_mark = ".",
  align = c("decimal", "auto", "center", "right"),
  output = c("default", "data.frame", "long", "tinytable", "gt", "flextable", "excel",
    "clipboard", "word"),
  excel_path = NULL,
  excel_sheet = "Descriptives",
  clipboard_delim = "\t",
  word_path = NULL,
  verbose = FALSE
)
Arguments
- data
A data.frame.
- select
Columns to include. If regex = FALSE, use tidyselect syntax or a character vector of column names (default: tidyselect::everything()). If regex = TRUE, provide a regular expression pattern (character string).
- by
Optional grouping column. Accepts an unquoted column name or a single character column name. Coerced to factor for grouping; non-numeric grouping columns (factor, character, logical) are supported as-is.
- exclude
Columns to exclude. Supports tidyselect syntax and character vectors of column names.
- regex
Logical. If FALSE (the default), uses tidyselect helpers. If TRUE, the select argument is treated as a regular expression.
- test
Character. Statistical test to use when comparing groups. One of "welch" (default), "student", or "nonparametric".
  - "welch": Welch t-test (2 groups) or Welch one-way ANOVA (3+ groups). Does not assume equal variances.
  - "student": Student t-test (2 groups) or classic one-way ANOVA (3+ groups). Assumes equal variances.
  - "nonparametric": Wilcoxon rank-sum / Mann–Whitney U (2 groups) or Kruskal–Wallis H (3+ groups).
Used when by is supplied and at least one of the p-value, the test statistic, or an effect size is displayed (the p-value is shown by default in that case). Ignored when by is not used, or when all three display toggles are turned off.
- p_value
Logical or NULL. If TRUE and by is used, adds a p-value column from the test specified by test. When NULL (the default), the p-value is shown automatically whenever by is supplied, and hidden otherwise. Pass p_value = FALSE to suppress the column explicitly. Ignored when by is not used.
- statistic
Logical. If TRUE and by is used, the test statistic is shown in an additional column (e.g., t(df) = ..., F(df1, df2) = ..., W = ..., or H(df) = ...). p_value and statistic are independent; either or both can be enabled. Defaults to FALSE. Ignored when by is not used.
- show_n
Logical. If TRUE, includes an unweighted n column in the printed ASCII table and in every rendered output (tinytable, gt, flextable, word, excel, clipboard). Set to FALSE to drop the n column structurally from those outputs (no empty placeholder, no spanner). The n column is always present in the raw output = "data.frame" / "long" for downstream programmatic access. Defaults to TRUE.
- effect_size
Effect-size measure to include in the rendered outputs. One of:
  - "none" (default): no effect-size column.
  - "auto": auto-select the canonical measure for the chosen test and group count: Hedges' g (parametric, 2 groups), eta-squared (parametric, 3+ groups), rank-biserial r (nonparametric, 2 groups), epsilon-squared (nonparametric, 3+ groups).
  - "hedges_g": Hedges' g (bias-corrected standardised mean difference, 2 groups, parametric). CI via the Hedges & Olkin normal approximation.
  - "eta_sq": Eta-squared (\(\eta^2\), parametric ANOVA-style SS_between / SS_total). CI via inversion of the noncentral F distribution.
  - "r_rb": Rank-biserial r from the Wilcoxon / Mann-Whitney statistic (2 groups, nonparametric). CI via Fisher z-transform.
  - "epsilon_sq": Epsilon-squared (\(\varepsilon^2\)) from the Kruskal-Wallis statistic (3+ groups, nonparametric). CI via percentile bootstrap (2,000 replicates).
For backward compatibility, effect_size = TRUE is silently coerced to "auto" and effect_size = FALSE to "none". Explicit choices are validated against the active test and the number of groups; an incompatible request (e.g. "eta_sq" with two groups, or "hedges_g" with test = "nonparametric") triggers an actionable error. Ignored when by is not used.
- effect_size_ci
Logical. If TRUE, appends the confidence interval of the effect size in brackets (e.g., g = 0.45 [0.22, 0.68]). Implies a non-"none" effect size: if left at the default effect_size = "none", the function warns and promotes effect_size to "auto" so the requested CI can be shown. Defaults to FALSE.
- ci
Logical. If TRUE, includes the mean confidence interval columns (<level>% CI LL / <level>% CI UL) and their spanner in the printed ASCII table and in every rendered output (tinytable, gt, flextable, word, excel, clipboard). Set to FALSE to drop both columns and the CI spanner structurally from those outputs (no empty placeholders, no border lines under an empty header). The CI bounds are always present as ci_lower / ci_upper in the raw output = "data.frame" / "long" for downstream programmatic access. Defaults to TRUE. The CI level is taken from ci_level.
- labels
An optional named character vector of variable labels. Names must match column names in data. When NULL (the default), labels are auto-detected from variable attributes (e.g., haven labels); if none are found, the column name is used.
- ci_level
Confidence level for the mean confidence interval (default: 0.95). Must be between 0 and 1 exclusive.
- digits
Number of decimal places for descriptive values and test statistics (default: 2).
- effect_size_digits
Number of decimal places for effect-size values in formatted displays (default: 2).
- p_digits
Integer >= 1. Number of decimal places used to render p-values in the p column (default: 3, the APA Publication Manual standard). Both the displayed precision and the small-p threshold derive from this argument: p_digits = 3 prints .045 and <.001; p_digits = 4 prints .0451 and <.0001; p_digits = 2 prints .05 and <.01. Useful for genomics / GWAS contexts with very small p-values, or for journals using a coarser convention. Leading zeros are always stripped, following APA convention.
- decimal_mark
Character used as decimal separator. Either "." (default) or ",".
- align
Horizontal alignment of numeric columns in the printed ASCII table and in the tinytable, gt, flextable, word, and clipboard outputs. The first column (Variable) and Group (when present) are always left-aligned. One of:
  - "decimal" (default): align numeric columns on the decimal mark, the standard scientific-publication convention used by SPSS, SAS, LaTeX siunitx, gt::cols_align_decimal() and tinytable::style_tt(align = "d"). For engines without a native decimal-alignment primitive (flextable, word, clipboard, ASCII print), values are pre-padded with leading and trailing spaces so the dots line up vertically; the body of the flextable / word output additionally uses a monospace font to make character widths uniform.
  - "center": center-align all numeric columns.
  - "right": right-align all numeric columns.
  - "auto": legacy per-column rule (center for the descriptive columns, right for n and p).
The excel output uses the engine's default alignment in any case: cell-string padding does not align decimals under proportional fonts. Same default and semantics as table_continuous_lm().
- output
Output format. One of:
  - "default": a printed ASCII table, returned invisibly.
  - "data.frame" / "long": a plain data.frame with one row per (variable x group) (or one row per variable when by is not used). The two names are synonyms; pick whichever reads better in your pipeline ("long" matches table_continuous_lm()'s naming).
  - "tinytable" (requires tinytable)
  - "gt" (requires gt)
  - "flextable" (requires flextable)
  - "excel" (requires openxlsx2)
  - "clipboard" (requires clipr)
  - "word" (requires flextable and officer)
- excel_path
File path for output = "excel".
- excel_sheet
Sheet name for output = "excel" (default: "Descriptives").
- clipboard_delim
Delimiter for output = "clipboard" (default: "\t").
- word_path
File path for output = "word".
- verbose
Logical. If TRUE, prints messages about excluded non-numeric columns (default: FALSE).
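The p_digits rule described in the argument list (precision and small-p threshold both derived from one number, leading zero stripped) can be sketched in base R as follows. format_p() is a hypothetical helper written for illustration, not the package's internal formatter; it handles one value at a time and ignores decimal_mark.

```r
# Hypothetical formatter; a sketch of the documented rule, not package code.
format_p <- function(p, p_digits = 3) {
  if (is.na(p)) return("")
  threshold <- 10^(-p_digits)             # e.g. 0.001 for p_digits = 3
  strip_zero <- function(s) sub("^0", "", s)
  if (p < threshold) {
    # Below the threshold, report "<" followed by the threshold itself.
    return(paste0("<", strip_zero(formatC(threshold, format = "f", digits = p_digits))))
  }
  strip_zero(formatC(p, format = "f", digits = p_digits))
}

format_p(0.0451)                # ".045"
format_p(0.0004)                # "<.001"
format_p(0.0451, p_digits = 4)  # ".0451"
format_p(0.0451, p_digits = 2)  # ".05"
```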
Value
Depends on output:
- "default": prints a styled ASCII table and returns the underlying data.frame invisibly (S3 class "spicy_continuous_table" / "spicy_table"). The object can be re-coerced via as.data.frame.spicy_continuous_table() or piped into broom::tidy() / broom::glance().
- "data.frame" / "long": a plain data.frame with columns variable, label, group (when by is used), mean, sd, min, max, ci_lower, ci_upper, n. When by is used together with p_value = TRUE, statistic = TRUE, or effect_size != "none", additional columns are appended (populated on the first row of each variable block only):
  - test_type: test identifier (e.g., "welch_t", "welch_anova", "student_t", "anova", "wilcoxon", "kruskal").
  - statistic, df1, df2, p.value: test results.
  - es_type: effect-size identifier ("hedges_g", "eta_sq", "r_rb", or "epsilon_sq"), when effect_size != "none".
  - es_value, es_ci_lower, es_ci_upper: effect-size estimate and confidence interval bounds.
  The two names "data.frame" and "long" are synonyms (the descriptive output is naturally already long). Pick whichever reads better in your code.
- "tinytable": a tinytable object.
- "gt": a gt_tbl object.
- "flextable": a flextable object.
- "excel" / "word": writes to disk and returns the file path invisibly.
- "clipboard": copies the table and returns the display data.frame invisibly.
Tests
The omnibus test is computed only when by is supplied and at
least two groups remain after dropping NAs, with every group
contributing at least two observations. Choice of test family is
driven by test (see the test entry under Arguments for the full
dispatch by number of groups).
For model-based contrasts (heteroskedasticity-consistent SE,
cluster-robust SE, weighted contrasts, fitted means, covariate
adjustment), use table_continuous_lm().
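The dispatch and the guard described in this section can be approximated with base R. run_omnibus() below is a hypothetical helper, and the choice of stats::t.test(), stats::oneway.test(), stats::wilcox.test(), and stats::kruskal.test() is an assumption about which functions correspond to each documented test, not a statement of what the package calls internally.

```r
# Hypothetical helper: maps `test` and the number of groups to a base-R test.
run_omnibus <- function(x, g, test = c("welch", "student", "nonparametric")) {
  test <- match.arg(test)
  keep <- !is.na(x) & !is.na(g)           # drop NAs before counting groups
  x <- x[keep]
  g <- droplevels(factor(g[keep]))
  k <- nlevels(g)
  # Documented guard: at least two groups, each with at least two observations.
  if (k < 2 || any(table(g) < 2)) return(NULL)
  if (test == "nonparametric") {
    if (k == 2) stats::wilcox.test(x ~ g) else stats::kruskal.test(x ~ g)
  } else {
    equal_var <- test == "student"        # Welch does not assume equal variances
    if (k == 2) {
      stats::t.test(x ~ g, var.equal = equal_var)
    } else {
      stats::oneway.test(x ~ g, var.equal = equal_var)
    }
  }
}

run_omnibus(mtcars$mpg, mtcars$am)$p.value                     # Welch t-test
run_omnibus(mtcars$mpg, mtcars$cyl, test = "student")$p.value  # classic ANOVA
```

Returning NULL when the guard fails is a simplification chosen for the sketch; how the package reports a skipped test is not covered here.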
Effect sizes
See the effect_size argument for the dispatch table (canonical
measure for each (test, n_groups) combination) and the
validation rules applied to explicit requests.
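The "auto" mapping from (test, n_groups) to a measure is small enough to write out. pick_effect_size() is a hypothetical helper that restates the documented table; it is not the package's validation code.

```r
# Hypothetical helper restating the documented "auto" dispatch.
pick_effect_size <- function(test = c("welch", "student", "nonparametric"),
                             n_groups) {
  test <- match.arg(test)
  stopifnot(n_groups >= 2)
  if (test == "nonparametric") {
    if (n_groups == 2) "r_rb" else "epsilon_sq"
  } else {
    # "welch" and "student" are both treated as parametric.
    if (n_groups == 2) "hedges_g" else "eta_sq"
  }
}

pick_effect_size("welch", n_groups = 2)          # "hedges_g"
pick_effect_size("nonparametric", n_groups = 3)  # "epsilon_sq"
```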
Confidence intervals (enabled with effect_size_ci = TRUE) use
noncentral F inversion for \(\eta^2\), the Hedges-Olkin
normal approximation for g, the Fisher z-transform for r,
and percentile bootstrap (2,000 replicates) for
\(\varepsilon^2\).
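For the two-group parametric case, a normal-approximation interval for Hedges' g can be sketched from group summaries alone. hedges_g_ci() is a hypothetical helper; the pooled-SD standardiser, the approximate bias-correction factor, and the large-sample standard error are common textbook forms and are assumptions here, so the package's internals may differ in detail.

```r
# Hypothetical helper; one common formulation, not necessarily the package's.
hedges_g_ci <- function(m1, s1, n1, m2, s2, n2, level = 0.95) {
  df <- n1 + n2 - 2
  # Pooled standard deviation and Cohen's d.
  sp <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df)
  d <- (m1 - m2) / sp
  # Small-sample bias correction (approximate J factor).
  g <- d * (1 - 3 / (4 * df - 1))
  # Large-sample standard error, then a symmetric normal interval.
  se <- sqrt((n1 + n2) / (n1 * n2) + g^2 / (2 * (n1 + n2)))
  z <- stats::qnorm(1 - (1 - level) / 2)
  c(g = g, ci_lower = g - z * se, ci_upper = g + z * se)
}

# Rounded group summaries from the wellbeing_score-by-sex example.
round(hedges_g_ci(67.16, 14.80, 620, 71.05, 16.23, 580), 2)
```

With these rounded inputs the result rounds to g = -0.25 [-0.36, -0.14], in line with the ES column printed for that example in the Examples section.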
For Cohen's d, Hays' \(\omega^2\), and Cohen's f\(^2\)
(derived from a fitted, possibly weighted lm()), use the
model-based companion table_continuous_lm().
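The point estimate behind "eta_sq", described under effect_size as SS_between / SS_total, can be checked by hand. eta_squared() is a hypothetical helper for illustration; it covers the point estimate only and does not attempt the noncentral F interval.

```r
# Hypothetical helper: eta-squared as SS_between / SS_total.
eta_squared <- function(x, g) {
  keep <- !is.na(x) & !is.na(g)
  x <- x[keep]
  g <- droplevels(factor(g[keep]))
  group_means <- tapply(x, g, mean)
  group_n <- tapply(x, g, length)
  grand_mean <- mean(x)
  ss_between <- sum(group_n * (group_means - grand_mean)^2)
  ss_total <- sum((x - grand_mean)^2)
  ss_between / ss_total
}

eta_squared(mtcars$mpg, mtcars$cyl)
```

The same ratio can be read off the sums of squares of a one-way anova(lm(...)) table, which is a convenient cross-check.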
Display conventions
Decimal alignment, p-value formatting, and required suggested
packages per output engine are documented under the align,
p_digits, and output arguments respectively.
Non-numeric columns are silently dropped (set verbose = TRUE to
see which columns were excluded). When a constant column is
passed, SD and CI are shown as "--" in the ASCII table.
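The pre-padding used for engines without native decimal alignment (leading and trailing spaces so the marks line up) can be illustrated in a few lines. align_decimal() is a hypothetical helper operating on already-formatted strings; it is a sketch of the idea, not the package's code.

```r
# Hypothetical helper: pad strings so the decimal marks share one column.
align_decimal <- function(x, decimal_mark = ".") {
  pos <- as.integer(regexpr(decimal_mark, x, fixed = TRUE))  # -1 when absent
  has_mark <- pos > 0
  int_part <- ifelse(has_mark, substr(x, 1, pos - 1), x)
  frac_part <- ifelse(has_mark, substr(x, pos, nchar(x)), "")  # keeps the mark
  # Right-justify what precedes the mark, left-justify the mark and what follows.
  paste0(format(int_part, justify = "right"), format(frac_part, justify = "left"))
}

cells <- align_decimal(c("3.72", "25.93", "1188", "<.001"))
writeLines(paste0("[", cells, "]"))
```

All returned strings have the same width, so the alignment only holds visually under a monospace font, which is consistent with the note about the flextable / word body font under align.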
See also
table_continuous_lm() for the model-based companion
(heteroskedasticity-consistent SE, cluster-robust SE, weighted
contrasts, fitted means);
table_categorical() for categorical variables;
freq() for one-way frequency tables;
cross_tab() for two-way cross-tabulations.
Other spicy tables:
table_categorical(),
table_continuous_lm()
Examples
# --- Basic usage ---------------------------------------------------------
# Default: ASCII console table.
table_continuous(
  sochealth,
  select = c(bmi, wellbeing_score)
)
#> Descriptive statistics
#>
#> Variable │ M SD Min Max 95% CI LL
#> ───────────────────────────────┼────────────────────────────────────────
#> Body mass index │ 25.93 3.72 16.00 38.90 25.72
#> WHO-5 wellbeing index (0-100) │ 69.04 15.62 18.70 100.00 68.16
#>
#> Variable │ 95% CI UL n
#> ───────────────────────────────┼─────────────────
#> Body mass index │ 26.14 1188
#> WHO-5 wellbeing index (0-100) │ 69.93 1200
# Grouped by education (Welch p-value added by default).
table_continuous(
  sochealth,
  select = c(bmi, wellbeing_score),
  by = education
)
#> Descriptive statistics
#>
#> Variable │ Group M SD Min Max
#> ───────────────────────────────┼──────────────────────────────────────────────
#> Body mass index │ Lower secondary 28.09 3.47 18.20 38.90
#> │ Upper secondary 26.02 3.43 16.00 37.10
#> │ Tertiary 24.39 3.52 16.00 33.00
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> WHO-5 wellbeing index (0-100) │ Lower secondary 57.22 15.44 18.70 97.90
#> │ Upper secondary 68.97 13.62 26.70 100.00
#> │ Tertiary 76.85 13.23 40.40 100.00
#>
#> Variable │ Group 95% CI LL 95% CI UL n
#> ───────────────────────────────┼────────────────────────────────────────────
#> Body mass index │ Lower secondary 27.66 28.51 260
#> │ Upper secondary 25.73 26.31 534
#> │ Tertiary 24.04 24.74 394
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> WHO-5 wellbeing index (0-100) │ Lower secondary 55.33 59.10 261
#> │ Upper secondary 67.82 70.12 539
#> │ Tertiary 75.55 78.15 400
#>
#> Variable │ Group p
#> ───────────────────────────────┼────────────────────────
#> Body mass index │ Lower secondary <.001
#> │ Upper secondary
#> │ Tertiary
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> WHO-5 wellbeing index (0-100) │ Lower secondary <.001
#> │ Upper secondary
#> │ Tertiary
# Test statistic alongside the p-value.
table_continuous(
  sochealth,
  select = c(bmi, wellbeing_score),
  by = education,
  statistic = TRUE
)
#> Descriptive statistics
#>
#> Variable │ Group M SD Min Max
#> ───────────────────────────────┼──────────────────────────────────────────────
#> Body mass index │ Lower secondary 28.09 3.47 18.20 38.90
#> │ Upper secondary 26.02 3.43 16.00 37.10
#> │ Tertiary 24.39 3.52 16.00 33.00
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> WHO-5 wellbeing index (0-100) │ Lower secondary 57.22 15.44 18.70 97.90
#> │ Upper secondary 68.97 13.62 26.70 100.00
#> │ Tertiary 76.85 13.23 40.40 100.00
#>
#> Variable │ Group 95% CI LL 95% CI UL n
#> ───────────────────────────────┼────────────────────────────────────────────
#> Body mass index │ Lower secondary 27.66 28.51 260
#> │ Upper secondary 25.73 26.31 534
#> │ Tertiary 24.04 24.74 394
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> WHO-5 wellbeing index (0-100) │ Lower secondary 55.33 59.10 261
#> │ Upper secondary 67.82 70.12 539
#> │ Tertiary 75.55 78.15 400
#>
#> Variable │ Group Test p
#> ───────────────────────────────┼───────────────────────────────────────────────
#> Body mass index │ Lower secondary F(2, 654.48) = 87.96 <.001
#> │ Upper secondary
#> │ Tertiary
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> WHO-5 wellbeing index (0-100) │ Lower secondary F(2, 638.59) = 144.35 <.001
#> │ Upper secondary
#> │ Tertiary
# --- Effect sizes -------------------------------------------------------
# Auto-selected effect size with confidence interval (Hedges' g for
# binary `by`, eta-squared for k > 2).
table_continuous(
  sochealth,
  select = wellbeing_score,
  by = sex,
  effect_size = "auto",
  effect_size_ci = TRUE
)
#> Descriptive statistics
#>
#> Variable │ Group M SD Min Max 95% CI LL
#> ───────────────────────────────┼────────────────────────────────────────────────
#> WHO-5 wellbeing index (0-100) │ Female 67.16 14.80 19.60 100.00 65.99
#> │ Male 71.05 16.23 18.70 100.00 69.73
#>
#> Variable │ Group 95% CI UL n p
#> ───────────────────────────────┼───────────────────────────────
#> WHO-5 wellbeing index (0-100) │ Female 68.33 620 <.001
#> │ Male 72.37 580
#>
#> Variable │ Group ES
#> ───────────────────────────────┼──────────────────────────────────
#> WHO-5 wellbeing index (0-100) │ Female g = -0.25 [-0.36, -0.14]
#> │ Male
# Explicit effect-size measure.
table_continuous(
  sochealth,
  select = wellbeing_score,
  by = education,
  effect_size = "eta_sq",
  effect_size_ci = TRUE,
  effect_size_digits = 3
)
#> Descriptive statistics
#>
#> Variable │ Group M SD Min Max
#> ───────────────────────────────┼──────────────────────────────────────────────
#> WHO-5 wellbeing index (0-100) │ Lower secondary 57.22 15.44 18.70 97.90
#> │ Upper secondary 68.97 13.62 26.70 100.00
#> │ Tertiary 76.85 13.23 40.40 100.00
#>
#> Variable │ Group 95% CI LL 95% CI UL n
#> ───────────────────────────────┼────────────────────────────────────────────
#> WHO-5 wellbeing index (0-100) │ Lower secondary 55.33 59.10 261
#> │ Upper secondary 67.82 70.12 539
#> │ Tertiary 75.55 78.15 400
#>
#> Variable │ Group p
#> ───────────────────────────────┼────────────────────────
#> WHO-5 wellbeing index (0-100) │ Lower secondary <.001
#> │ Upper secondary
#> │ Tertiary
#>
#> Variable │ Group ES
#> ───────────────────────────────┼────────────────────────────────────────────
#> WHO-5 wellbeing index (0-100) │ Lower secondary η² = 0.208 [0.169, 0.246]
#> │ Upper secondary
#> │ Tertiary
# --- Selection helpers --------------------------------------------------
# Regex selection.
table_continuous(
  sochealth,
  select = "^life_sat",
  regex = TRUE
)
#> Descriptive statistics
#>
#> Variable │ M SD Min Max 95% CI LL
#> ────────────────────────────────────────────┼───────────────────────────────────
#> Satisfaction with health (1-5) │ 3.55 1.25 1.00 5.00 3.48
#> Satisfaction with work (1-5) │ 3.38 1.18 1.00 5.00 3.31
#> Satisfaction with relationships (1-5) │ 3.72 1.10 1.00 5.00 3.66
#> Satisfaction with standard of living (1-5) │ 3.40 1.16 1.00 5.00 3.33
#>
#> Variable │ 95% CI UL n
#> ────────────────────────────────────────────┼─────────────────
#> Satisfaction with health (1-5) │ 3.62 1192
#> Satisfaction with work (1-5) │ 3.45 1192
#> Satisfaction with relationships (1-5) │ 3.79 1192
#> Satisfaction with standard of living (1-5) │ 3.46 1192
# Pretty labels keyed by column name.
table_continuous(
  sochealth,
  select = c(bmi, life_sat_health),
  labels = c(
    bmi = "Body mass index",
    life_sat_health = "Satisfaction with health"
  )
)
#> Descriptive statistics
#>
#> Variable │ M SD Min Max 95% CI LL 95% CI UL
#> ──────────────────────────┼─────────────────────────────────────────────────
#> Body mass index │ 25.93 3.72 16.00 38.90 25.72 26.14
#> Satisfaction with health │ 3.55 1.25 1.00 5.00 3.48 3.62
#>
#> Variable │ n
#> ──────────────────────────┼──────
#> Body mass index │ 1188
#> Satisfaction with health │ 1192
# --- Output formats -----------------------------------------------------
# The rendered outputs below all wrap the same call:
# table_continuous(sochealth,
# select = c(bmi, wellbeing_score),
# by = sex)
# only `output` changes. Assign each result to a variable -- some
# engines auto-print as a console-friendly text fallback inside
# the `?` help viewer.
# Raw data.frame ("data.frame" and "long" are synonyms): one row per
# (variable x group).
table_continuous(
  sochealth,
  select = c(bmi, wellbeing_score),
  by = sex,
  output = "data.frame"
)
#> variable label group mean sd min
#> 1 bmi Body mass index Female 25.68506 3.781113 16.0
#> 2 bmi Body mass index Male 26.19685 3.638092 16.0
#> 3 wellbeing_score WHO-5 wellbeing index (0-100) Female 67.16194 14.798488 19.6
#> 4 wellbeing_score WHO-5 wellbeing index (0-100) Male 71.04879 16.227304 18.7
#> max ci_lower ci_upper n test_type statistic df1 df2 p.value
#> 1 38.9 25.38588 25.98425 616 welch_t -2.377237 1184.497 NA 1.760093e-02
#> 2 37.7 25.89808 26.49563 572 <NA> NA NA NA NA
#> 3 100.0 65.99480 68.32907 620 welch_t -4.326141 1168.700 NA 1.647005e-05
#> 4 100.0 69.72540 72.37219 580 <NA> NA NA NA NA
# \donttest{
# Rendered HTML / docx objects -- best viewed inside a
# Quarto / R Markdown document or a pkgdown article.
if (requireNamespace("tinytable", quietly = TRUE)) {
  tt <- table_continuous(
    sochealth, select = c(bmi, wellbeing_score), by = sex,
    output = "tinytable"
  )
}
if (requireNamespace("gt", quietly = TRUE)) {
  tbl <- table_continuous(
    sochealth, select = c(bmi, wellbeing_score), by = sex,
    output = "gt"
  )
}
if (requireNamespace("flextable", quietly = TRUE)) {
  ft <- table_continuous(
    sochealth, select = c(bmi, wellbeing_score), by = sex,
    output = "flextable"
  )
}
# Excel and Word: write to a temporary file.
if (requireNamespace("openxlsx2", quietly = TRUE)) {
  tmp <- tempfile(fileext = ".xlsx")
  table_continuous(
    sochealth, select = c(bmi, wellbeing_score), by = sex,
    output = "excel", excel_path = tmp
  )
  unlink(tmp)
}
if (
  requireNamespace("flextable", quietly = TRUE) &&
    requireNamespace("officer", quietly = TRUE)
) {
  tmp <- tempfile(fileext = ".docx")
  table_continuous(
    sochealth, select = c(bmi, wellbeing_score), by = sex,
    output = "word", word_path = tmp
  )
  unlink(tmp)
}
# }
if (FALSE) { # \dontrun{
  # Clipboard: writes to the system clipboard.
  table_continuous(
    sochealth, select = c(bmi, wellbeing_score), by = sex,
    output = "clipboard"
  )
} # }