table_apa() builds publication-ready cross-tabulation
tables for APA-style reporting in social science and data
science research. It crosses one grouping variable with one or more row
variables and automatically handles chi-squared p-values, effect sizes,
confidence intervals, and multi-level headers. Tables can be exported to gt,
tinytable, flextable, Excel, Word, or the clipboard. This vignette walks
through the main features.
Basic usage
At minimum, provide a data frame, one or more row variables, and a grouping variable:
table_apa(
sochealth,
row_vars = c("smoking", "physical_activity", "dentist_12m"),
group_var = "education"
)
#> Variable Level Lower secondary n Lower secondary % Upper secondary n
#> 1 smoking No 179 69.6 415
#> 2 smoking Yes 78 30.4 112
#> 3 physical_activity No 177 67.8 310
#> 4 physical_activity Yes 84 32.2 229
#> 5 dentist_12m No 113 43.3 174
#> 6 dentist_12m Yes 148 56.7 365
#> Upper secondary % Tertiary n Tertiary % Total n Total % p
#> 1 78.7 332 84.9 926 78.8 2.012877e-05
#> 2 21.3 59 15.1 249 21.2 2.012877e-05
#> 3 57.5 163 40.8 650 54.2 8.333584e-12
#> 4 42.5 237 59.2 550 45.8 8.333584e-12
#> 5 32.3 67 16.8 354 29.5 3.883413e-13
#> 6 67.7 333 83.2 846 70.5 3.883413e-13
#> Cramer's V
#> 1 0.1356677
#> 2 0.1356677
#> 3 0.2061986
#> 4 0.2061986
#> 5 0.2182388
#> 6 0.2182388
The default output is "wide" with
style = "raw", which returns a data frame with numeric
values suitable for further processing.
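As a cross-check on the raw values, the p and Cramer's V columns for smoking can be recomputed in base R from the counts printed above. This is a minimal sketch, not the package's internal code: it assumes a plain Pearson chi-squared test and the usual formula V = sqrt(chi2 / (N * (min(rows, cols) - 1))).

```r
# Counts copied from the smoking rows of the output above
# (rows: smoking level, columns: education group).
counts <- matrix(
  c(179, 415, 332,
     78, 112,  59),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    smoking   = c("No", "Yes"),
    education = c("Lower secondary", "Upper secondary", "Tertiary")
  )
)

# Column percentages, corresponding to the "%" columns
col_pct <- round(100 * prop.table(counts, margin = 2), 1)

# Pearson chi-squared test of independence
test <- chisq.test(counts, correct = FALSE)

# Cramer's V from the chi-squared statistic (assumed formula, see lead-in)
cramers_v <- sqrt(
  unname(test$statistic) / (sum(counts) * (min(dim(counts)) - 1))
)

col_pct
test$p.value
cramers_v
```

Under these assumptions the results should agree with the smoking rows of the table to the displayed precision.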
Output formats
table_apa() supports several output formats. The table
below summarizes the options:
| Format | Description |
|---|---|
| "wide" | Data frame, one row per modality (default) |
| "long" | Data frame, one row per modality x group |
| "gt" | Formatted gt table |
| "tinytable" | Formatted tinytable |
| "flextable" | Formatted flextable |
| "excel" | Excel file (requires excel_path) |
| "clipboard" | Copy to clipboard |
| "word" | Word document (requires word_path) |
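To make the difference between "one row per modality" and "one row per modality x group" concrete, here is a small base R sketch that reshapes the smoking counts from the basic example into a long layout. The column names Level, Group, and n are illustrative assumptions; the actual column names of the "long" output are not shown in this vignette.

```r
# Smoking counts by education group, copied from the basic example
counts <- matrix(
  c(179, 415, 332,
     78, 112,  59),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    Level = c("No", "Yes"),
    Group = c("Lower secondary", "Upper secondary", "Tertiary")
  )
)

# Wide layout: one row per level, one column per group
wide <- as.data.frame.matrix(counts)

# Long layout: one row per level x group combination
long <- as.data.frame(as.table(counts), responseName = "n")

nrow(wide)  # 2 rows (one per level)
nrow(long)  # 6 rows (2 levels x 3 groups)
```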
gt output
The "gt" format produces a table with APA-style borders,
column spanners, and proper alignment:
table_apa(
sochealth,
row_vars = c("smoking", "physical_activity", "dentist_12m"),
group_var = "education",
output = "gt"
)
| Variable | Lower secondary n | Lower secondary % | Upper secondary n | Upper secondary % | Tertiary n | Tertiary % | Total n | Total % | p | Cramer's V |
|---|---|---|---|---|---|---|---|---|---|---|
| smoking |  |  |  |  |  |  |  |  | < .001 | .14 |
| No | 179 | 69.6 | 415 | 78.7 | 332 | 84.9 | 926 | 78.8 |  |  |
| Yes | 78 | 30.4 | 112 | 21.3 | 59 | 15.1 | 249 | 21.2 |  |  |
| physical_activity |  |  |  |  |  |  |  |  | < .001 | .21 |
| No | 177 | 67.8 | 310 | 57.5 | 163 | 40.8 | 650 | 54.2 |  |  |
| Yes | 84 | 32.2 | 229 | 42.5 | 237 | 59.2 | 550 | 45.8 |  |  |
| dentist_12m |  |  |  |  |  |  |  |  | < .001 | .22 |
| No | 113 | 43.3 | 174 | 32.3 | 67 | 16.8 | 354 | 29.5 |  |  |
| Yes | 148 | 56.7 | 365 | 67.7 | 333 | 83.2 | 846 | 70.5 |  |  |
tinytable output
table_apa(
sochealth,
row_vars = c("smoking", "physical_activity"),
group_var = "education",
output = "tinytable"
)
| Variable | Lower secondary n | Lower secondary % | Upper secondary n | Upper secondary % | Tertiary n | Tertiary % | Total n | Total % | p | Cramer's V |
|---|---|---|---|---|---|---|---|---|---|---|
| smoking |  |  |  |  |  |  |  |  | < .001 | .14 |
| No | 179 | 69.6 | 415 | 78.7 | 332 | 84.9 | 926 | 78.8 |  |  |
| Yes | 78 | 30.4 | 112 | 21.3 | 59 | 15.1 | 249 | 21.2 |  |  |
| physical_activity |  |  |  |  |  |  |  |  | < .001 | .21 |
| No | 177 | 67.8 | 310 | 57.5 | 163 | 40.8 | 650 | 54.2 |  |  |
| Yes | 84 | 32.2 | 229 | 42.5 | 237 | 59.2 | 550 | 45.8 |  |  |
Data frame output
Use style = "report" with "wide" or
"long" to get formatted character columns (ready for
display), or style = "raw" for numeric values (ready for
analysis):
table_apa(
sochealth,
row_vars = "smoking",
group_var = "education",
output = "wide",
style = "report"
)
#> Variable Lower secondary n Lower secondary % Upper secondary n
#> 1 smoking
#> 2 No 179 69.6 415
#> 3 Yes 78 30.4 112
#> Upper secondary % Tertiary n Tertiary % Total n Total % p Cramer's V
#> 1 < .001 .14
#> 2 78.7 332 84.9 926 78.8
#> 3 21.3 59 15.1 249 21.2
Custom labels
By default, table_apa() uses variable names as row
headers. Use the labels argument to provide human-readable
labels:
table_apa(
sochealth,
row_vars = c("smoking", "physical_activity"),
group_var = "education",
labels = c("Smoking status", "Regular physical activity"),
output = "gt"
)
| Variable | Lower secondary n | Lower secondary % | Upper secondary n | Upper secondary % | Tertiary n | Tertiary % | Total n | Total % | p | Cramer's V |
|---|---|---|---|---|---|---|---|---|---|---|
| Smoking status |  |  |  |  |  |  |  |  | < .001 | .14 |
| No | 179 | 69.6 | 415 | 78.7 | 332 | 84.9 | 926 | 78.8 |  |  |
| Yes | 78 | 30.4 | 112 | 21.3 | 59 | 15.1 | 249 | 21.2 |  |  |
| Regular physical activity |  |  |  |  |  |  |  |  | < .001 | .21 |
| No | 177 | 67.8 | 310 | 57.5 | 163 | 40.8 | 650 | 54.2 |  |  |
| Yes | 84 | 32.2 | 229 | 42.5 | 237 | 59.2 | 550 | 45.8 |  |  |
Association measures and confidence intervals
By default, table_apa() reports Cramer’s V for nominal
variables and automatically switches to Kendall’s Tau-b when both
variables are ordered factors. Override with
assoc_measure:
table_apa(
sochealth,
row_vars = "smoking",
group_var = "education",
assoc_measure = "phi",
output = "tinytable"
)
| Variable | Lower secondary n | Lower secondary % | Upper secondary n | Upper secondary % | Tertiary n | Tertiary % | Total n | Total % | p | Phi |
|---|---|---|---|---|---|---|---|---|---|---|
| smoking |  |  |  |  |  |  |  |  | < .001 |  |
| No | 179 | 69.6 | 415 | 78.7 | 332 | 84.9 | 926 | 78.8 |  |  |
| Yes | 78 | 30.4 | 112 | 21.3 | 59 | 15.1 | 249 | 21.2 |  |  |
Add confidence intervals with assoc_ci = TRUE. In
rendered formats (gt, tinytable, flextable), the CI is shown inline:
table_apa(
sochealth,
row_vars = c("smoking", "physical_activity"),
group_var = "education",
assoc_ci = TRUE,
output = "gt"
)
| Variable | Lower secondary n | Lower secondary % | Upper secondary n | Upper secondary % | Tertiary n | Tertiary % | Total n | Total % | p | Cramer's V |
|---|---|---|---|---|---|---|---|---|---|---|
| smoking |  |  |  |  |  |  |  |  | < .001 | .14 [.08, .19] |
| No | 179 | 69.6 | 415 | 78.7 | 332 | 84.9 | 926 | 78.8 |  |  |
| Yes | 78 | 30.4 | 112 | 21.3 | 59 | 15.1 | 249 | 21.2 |  |  |
| physical_activity |  |  |  |  |  |  |  |  | < .001 | .21 [.15, .26] |
| No | 177 | 67.8 | 310 | 57.5 | 163 | 40.8 | 650 | 54.2 |  |  |
| Yes | 84 | 32.2 | 229 | 42.5 | 237 | 59.2 | 550 | 45.8 |  |  |
In data formats (wide, long, excel, clipboard), separate
CI lower and CI upper columns are added:
table_apa(
sochealth,
row_vars = "smoking",
group_var = "education",
assoc_ci = TRUE,
output = "wide",
style = "report"
)
#> Variable Lower secondary n Lower secondary % Upper secondary n
#> 1 smoking
#> 2 No 179 69.6 415
#> 3 Yes 78 30.4 112
#> Upper secondary % Tertiary n Tertiary % Total n Total % p Cramer's V
#> 1 < .001 .14
#> 2 78.7 332 84.9 926 78.8
#> 3 21.3 59 15.1 249 21.2
#> CI lower CI upper
#> 1 .08 .19
#> 2
#> 3
Weighted tables
Pass survey weights with the weights argument. Use
rescale = TRUE so the total weighted N matches the
unweighted N:
table_apa(
sochealth,
row_vars = c("smoking", "physical_activity"),
group_var = "education",
weights = "weight",
rescale = TRUE,
output = "gt"
)
| Variable | Lower secondary n | Lower secondary % | Upper secondary n | Upper secondary % | Tertiary n | Tertiary % | Total n | Total % | p | Cramer's V |
|---|---|---|---|---|---|---|---|---|---|---|
| smoking |  |  |  |  |  |  |  |  | < .001 | .13 |
| No | 176 | 69.0 | 419 | 78.5 | 325 | 84.4 | 920.9 | 78.4 |  |  |
| Yes | 79 | 31.0 | 115 | 21.5 | 60 | 15.6 | 254.1 | 21.6 |  |  |
| physical_activity |  |  |  |  |  |  |  |  | < .001 | .19 |
| No | 174 | 67.2 | 315 | 57.7 | 166 | 41.9 | 654.8 | 54.6 |  |  |
| Yes | 85 | 32.8 | 231 | 42.3 | 229 | 58.1 | 545.2 | 45.4 |  |  |
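The idea behind rescaling can be illustrated in base R. The sketch below assumes the common convention of multiplying each weight by n / sum(w), so that the weights sum to the number of observations; whether table_apa() rescales in exactly this way is not stated here. The data frame d and its values are made up for illustration.

```r
# Hypothetical toy data: two categorical variables and a survey weight
d <- data.frame(
  smoking   = c("No", "Yes", "No", "No", "Yes", "No"),
  education = c("Low", "Low", "High", "High", "High", "Low"),
  weight    = c(0.5, 1.2, 2.0, 0.8, 1.5, 3.0)
)

# Rescale so that the weights sum to the unweighted number of rows
d$w_rescaled <- d$weight * nrow(d) / sum(d$weight)

# Weighted cross-tabulation: each cell is a sum of rescaled weights
weighted_tab <- xtabs(w_rescaled ~ smoking + education, data = d)

sum(d$w_rescaled)  # equals nrow(d)
weighted_tab
```

Because the cells are sums of weights rather than counts, they are generally not whole numbers, which is consistent with the fractional Total n values in the table above.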
Handling missing values
By default, rows with missing values are dropped
(drop_na = TRUE). Set drop_na = FALSE to
display them as a “(Missing)” category:
table_apa(
sochealth,
row_vars = "income_group",
group_var = "education",
drop_na = FALSE,
output = "gt"
)
| Variable | Lower secondary n | Lower secondary % | Upper secondary n | Upper secondary % | Tertiary n | Tertiary % | Total n | Total % | p | Cramer's V |
|---|---|---|---|---|---|---|---|---|---|---|
| income_group |  |  |  |  |  |  |  |  | < .001 | .18 |
| (Missing) | 3 | 1.1 | 9 | 1.7 | 6 | 1.5 | 18 | 1.5 |  |  |
| High | 21 | 8.0 | 94 | 17.4 | 104 | 26.0 | 219 | 18.2 |  |  |
| Low | 87 | 33.3 | 115 | 21.3 | 45 | 11.2 | 247 | 20.6 |  |  |
| Lower middle | 92 | 35.2 | 186 | 34.5 | 110 | 27.5 | 388 | 32.3 |  |  |
| Upper middle | 58 | 22.2 | 135 | 25.0 | 135 | 33.8 | 328 | 27.3 |  |  |
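For readers who want to see what an explicit missing category amounts to, here is a base R sketch that turns NA into a "(Missing)" factor level before tabulating. It illustrates the general idea only and is not taken from the package source; the vector income is made-up toy data, and the level order here may differ from the table above.

```r
# Hypothetical toy vector with two missing values
income <- factor(c("Low", "High", NA, "Lower middle", "Low", NA))

# Default tabulation silently drops NA
table(income)

# Make NA an explicit level, then give it a readable label
income_na <- addNA(income)
levels(income_na)[is.na(levels(income_na))] <- "(Missing)"

tab <- table(income_na)
tab
```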
Filtering and reordering levels
Use levels_keep to display only specific modalities. The
order you specify controls the display order, which is useful for
placing “(Missing)” last:
table_apa(
sochealth,
row_vars = "income_group",
group_var = "education",
drop_na = FALSE,
levels_keep = c("Low", "High", "(Missing)"),
output = "gt"
)
| Variable | Lower secondary n | Lower secondary % | Upper secondary n | Upper secondary % | Tertiary n | Tertiary % | Total n | Total % | p | Cramer's V |
|---|---|---|---|---|---|---|---|---|---|---|
| income_group |  |  |  |  |  |  |  |  | < .001 | .18 |
| Low | 87 | 33.3 | 115 | 21.3 | 45 | 11.2 | 247 | 20.6 |  |  |
| High | 21 | 8.0 | 94 | 17.4 | 104 | 26.0 | 219 | 18.2 |  |  |
| (Missing) | 3 | 1.1 | 9 | 1.7 | 6 | 1.5 | 18 | 1.5 |  |  |
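Note that p and Cramer's V in the filtered table match the unfiltered one, which suggests that filtering affects only which rows are displayed. The select-and-reorder step itself can be sketched in base R by indexing a named vector with the levels to keep. The counts below are the Total n values from the unfiltered table; the object names are illustrative.

```r
# Total n per level, copied from the unfiltered income_group table
total_n <- c(
  "(Missing)"    = 18,
  "High"         = 219,
  "Low"          = 247,
  "Lower middle" = 388,
  "Upper middle" = 328
)

# Levels to display, in display order
keep <- c("Low", "High", "(Missing)")

# Indexing by name both filters and reorders
shown <- total_n[keep]
shown
```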
Formatting options
Control the number of digits for percentages, p-values, and the association measure:
table_apa(
sochealth,
row_vars = "smoking",
group_var = "education",
percent_digits = 2,
p_digits = 4,
v_digits = 3,
output = "gt"
)
| Variable | Lower secondary n | Lower secondary % | Upper secondary n | Upper secondary % | Tertiary n | Tertiary % | Total n | Total % | p | Cramer's V |
|---|---|---|---|---|---|---|---|---|---|---|
| smoking |  |  |  |  |  |  |  |  | < .001 | .136 |
| No | 179 | 69.60 | 415 | 78.70 | 332 | 84.90 | 926 | 78.80 |  |  |
| Yes | 78 | 30.40 | 112 | 21.30 | 59 | 15.10 | 249 | 21.20 |  |  |
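The APA conventions visible in these tables (no leading zero for statistics bounded by 1, and "< .001" for very small p-values) can be sketched with two small base R helpers. The names fmt_apa() and fmt_p() are hypothetical and are not part of the package; the fixed .001 threshold is an assumption based on the output above, where p still prints as "< .001" with p_digits = 4.

```r
# Format a number with a fixed number of decimals and no leading zero
fmt_apa <- function(x, digits = 2) {
  out <- formatC(x, format = "f", digits = digits)
  sub("^(-?)0\\.", "\\1.", out)
}

# Format a p-value: values below .001 are reported as "< .001" (assumed rule)
fmt_p <- function(p, digits = 3) {
  ifelse(p < 0.001, "< .001", fmt_apa(p, digits))
}

fmt_apa(0.1356677, digits = 2)    # ".14"
fmt_apa(0.1356677, digits = 3)    # ".136"
fmt_p(2.012877e-05, digits = 4)   # "< .001"
fmt_p(0.0312, digits = 3)         # ".031"
```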
Exporting to Excel, Word, or clipboard
For Excel export, provide a file path:
table_apa(
sochealth,
row_vars = c("smoking", "physical_activity", "dentist_12m"),
group_var = "education",
output = "excel",
excel_path = "my_table.xlsx"
)
For Word, use output = "word":
table_apa(
sochealth,
row_vars = c("smoking", "physical_activity", "dentist_12m"),
group_var = "education",
output = "word",
word_path = "my_table.docx"
)
You can also copy directly to the clipboard for pasting into a spreadsheet or a text editor:
