A simulated dataset of 1200 respondents from a fictional social-health survey, designed to illustrate the main features of the spicy package: variable labels, ordered factors, survey weights, association measures, and APA-style reporting.
Format
A tibble with 1200 rows and 24 variables:
- sex
Factor. Sex of the respondent.
- age
Numeric. Age in years (25–75).
- age_group
Ordered factor. Age group (25–34, 35–49, 50–64, 65–75).
- education
Ordered factor. Highest education level (Lower secondary, Upper secondary, Tertiary).
- social_class
Ordered factor. Subjective social class (Lower, Working, Lower middle, Middle, Upper middle).
- region
Factor. Region of residence (6 regions).
- employment_status
Factor. Employment status (Employed, Student, Unemployed, Inactive).
- income_group
Ordered factor. Household income group (Low, Lower middle, Upper middle, High). Contains missing values.
- income
Numeric. Monthly household income in CHF.
- smoking
Factor. Current smoker (No, Yes). Contains missing values.
- physical_activity
Factor. Regular physical activity (No, Yes).
- dentist_12m
Factor. Dentist visit in the last 12 months (No, Yes).
- self_rated_health
Ordered factor. Self-rated health (Poor, Fair, Good, Very good). Contains missing values.
- wellbeing_score
Numeric. WHO-5 wellbeing index (0–100).
- bmi
Numeric. Body mass index. Contains missing values.
- bmi_category
Ordered factor. BMI category (Normal weight, Overweight, Obesity). Contains missing values.
- institutional_trust
Ordered factor. Trust in institutions (Very low, Low, High, Very high).
- political_position
Numeric. Political position on a 0 (left) to 10 (right) scale. Contains missing values.
- life_sat_health
Integer. Satisfaction with own health (1–5 Likert scale). Contains missing values.
- life_sat_work
Integer. Satisfaction with work or main activity (1–5 Likert scale). Contains missing values.
- life_sat_relationships
Integer. Satisfaction with personal relationships (1–5 Likert scale). Contains missing values.
- life_sat_standard
Integer. Satisfaction with standard of living (1–5 Likert scale). Contains missing values.
- response_date
POSIXct. Date and time of survey response (September–November 2024).
- weight
Numeric. Survey design weight.
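Several variables above are documented as containing missing values, and weight holds a design weight, so a quick check of both can be useful before analysis. The following is a minimal base R sketch. It assumes the spicy package is installed so that sochealth can be loaded, and the choice of wellbeing_score as the summarized variable is only illustrative.

```r
# Assumes the spicy package is installed; it ships the sochealth data
data(sochealth, package = "spicy")

# Number of missing values per variable, keeping only affected variables
na_counts <- colSums(is.na(sochealth))
na_counts[na_counts > 0]

# Unweighted vs. design-weighted mean of the wellbeing index
mean(sochealth$wellbeing_score, na.rm = TRUE)
weighted.mean(sochealth$wellbeing_score, w = sochealth$weight, na.rm = TRUE)
```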
Details
All variables carry labels, which are accessible via labelled::var_label() and displayed by varlist(). Several ordered factors are included so that cross_tab() can demonstrate automatic selection of an ordinal association measure (Kendall's Tau-b in the example below).
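As a brief sketch of the points above, labels and factor ordering can be inspected directly. This assumes the spicy and labelled packages are installed. The label mentioned in the comment is the one reported by the freq() output in the Examples.

```r
# Assumes the spicy and labelled packages are installed
data(sochealth, package = "spicy")

# Variable label of a single column; the freq() output in the Examples
# reports "Highest education level" for this variable
labelled::var_label(sochealth$education)

# Ordered factors keep their level order, which is the information an
# ordinal association measure relies on
is.ordered(sochealth$education)
levels(sochealth$self_rated_health)
```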
Examples
data(sochealth)
varlist(sochealth)
#> Non-interactive session: use `tbl = TRUE` to return a tibble.
freq(sochealth, education)
#> Frequency table: education
#>
#>  Category │ Values           Freq.  Percent
#> ──────────┼─────────────────────────────────
#>  Valid    │ Lower secondary    261     21.8
#>           │ Upper secondary    539     44.9
#>           │ Tertiary           400     33.3
#> ──────────┼─────────────────────────────────
#>  Total    │                   1200    100.0
#>
#> Label: Highest education level
#> Class: ordered, factor
#> Data: sochealth
cross_tab(sochealth, education, self_rated_health)
#> Crosstable: education x self_rated_health (N)
#>
#>  Values               │        Poor        Fair        Good   Very good  │      Total
#> ──────────────────────┼─────────────────────────────────────────────────┼────────────
#>  Lower secondary      │          28          86         102          44  │        260
#>  Upper secondary      │          28         118         263         118  │        527
#>  Tertiary             │           5          62         193         133  │        393
#> ──────────────────────┼─────────────────────────────────────────────────┼────────────
#>  Total                │          61         266         558         295  │       1180
#>
#> Chi-2(6) = 73.2, p < 0.001
#> Kendall's Tau-b = 0.20
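The statistics reported under the cross-tabulation can be cross-checked with base R alone. The sketch below re-enters the counts printed above and computes a Pearson chi-squared statistic and Kendall's Tau-b. It is an illustration under stated assumptions, not a description of how cross_tab() works internally. Object names such as counts and tau_b are hypothetical. The table total of 1180, rather than 1200, is consistent with rows missing self_rated_health being dropped.

```r
# Counts copied from the cross_tab() output above
# (rows: education, columns: self_rated_health)
counts <- matrix(
  c(28,  86, 102,  44,
    28, 118, 263, 118,
     5,  62, 193, 133),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    education = c("Lower secondary", "Upper secondary", "Tertiary"),
    self_rated_health = c("Poor", "Fair", "Good", "Very good")
  )
)

# Pearson chi-squared test of independence
chi <- chisq.test(counts)
chi_stat <- unname(chi$statistic)
chi_df <- unname(chi$parameter)

# Expand the table to one row per respondent, coding levels by rank
cells <- as.data.frame(as.table(counts))
idx <- rep(seq_len(nrow(cells)), times = cells$Freq)
edu_rank <- as.integer(cells$education[idx])
srh_rank <- as.integer(cells$self_rated_health[idx])

# cor() with method = "kendall" returns the tie-corrected Tau-b
tau_b <- cor(edu_rank, srh_rank, method = "kendall")

round(chi_stat, 1)
chi_df
round(tau_b, 2)
```

The rounded results should agree with the Chi-2(6) = 73.2 and Tau-b = 0.20 lines printed by cross_tab().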
