A simulated dataset of 1200 respondents from a fictional social-health survey, designed to illustrate the main features of the spicy package: variable labels, ordered factors, survey weights, association measures, and APA-style reporting.

Usage

sochealth

Format

A tibble with 1200 rows and 24 variables:

sex

Factor. Sex of the respondent.

age

Numeric. Age in years (25–75).

age_group

Ordered factor. Age group (25–34, 35–49, 50–64, 65–75).

education

Ordered factor. Highest education level (Lower secondary, Upper secondary, Tertiary).

social_class

Ordered factor. Subjective social class (Lower, Working, Lower middle, Middle, Upper middle).

region

Factor. Region of residence (6 regions).

employment_status

Factor. Employment status (Employed, Student, Unemployed, Inactive).

income_group

Ordered factor. Household income group (Low, Lower middle, Upper middle, High). Contains missing values.

income

Numeric. Monthly household income in CHF.

smoking

Factor. Current smoker (No, Yes). Contains missing values.

physical_activity

Factor. Regular physical activity (No, Yes).

dentist_12m

Factor. Dentist visit in the last 12 months (No, Yes).

self_rated_health

Ordered factor. Self-rated health (Poor, Fair, Good, Very good). Contains missing values.

wellbeing_score

Numeric. WHO-5 wellbeing index (0–100).

bmi

Numeric. Body mass index. Contains missing values.

bmi_category

Ordered factor. BMI category (Normal weight, Overweight, Obesity). Contains missing values.

institutional_trust

Ordered factor. Trust in institutions (Very low, Low, High, Very high).

political_position

Numeric. Political position on a 0 (left) to 10 (right) scale. Contains missing values.

life_sat_health

Integer. Satisfaction with own health (1–5 Likert scale). Contains missing values.

life_sat_work

Integer. Satisfaction with work or main activity (1–5 Likert scale). Contains missing values.

life_sat_relationships

Integer. Satisfaction with personal relationships (1–5 Likert scale). Contains missing values.

life_sat_standard

Integer. Satisfaction with standard of living (1–5 Likert scale). Contains missing values.

response_date

POSIXct. Date and time of survey response (September–November 2024).

weight

Numeric. Survey design weight.

Source

Simulated data for illustration purposes.

Details

All variables carry labels (accessible via labelled::var_label() and displayed by varlist()). Several ordered factors are included so that cross_tab() can demonstrate automatic ordinal measure selection.
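As a rough illustration of what "carrying a label" and "ordered factor" mean in practice, the sketch below builds a tiny stand-in column in base R. The data frame `toy` and its four values are hypothetical and are not taken from `sochealth`. It assumes that the label is stored as the column's `"label"` attribute, which is the attribute `labelled::var_label()` reads.

```r
# Hypothetical stand-in for one column; NOT the real sochealth data.
toy <- data.frame(
  education = factor(
    c("Tertiary", "Lower secondary", "Upper secondary", "Tertiary"),
    levels  = c("Lower secondary", "Upper secondary", "Tertiary"),
    ordered = TRUE
  )
)

# Attach a variable label as the "label" attribute (assumed storage convention).
attr(toy$education, "label") <- "Highest education level"

# Read the label back with base R.
attr(toy$education, "label")

# Ordered factors keep their level order and allow comparisons against a level.
is.ordered(toy$education)
levels(toy$education)
toy$education >= "Upper secondary"
```

On the real dataset, the same checks would presumably be written as `attr(sochealth$education, "label")` and `is.ordered(sochealth$education)`.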

Examples

library(spicy)
data(sochealth)
varlist(sochealth)
#> Non-interactive session: use `tbl = TRUE` to return a tibble.
freq(sochealth, education)
#> Frequency table: education
#> 
#>  Category    Values               Freq.    Percent 
#> ────────────┼───────────────────────────────────────
#>  Valid       Lower secondary        261       21.8 
#>              Upper secondary        539       44.9 
#>              Tertiary               400       33.3 
#> ────────────┼───────────────────────────────────────
#>  Total                             1200      100.0 
#> 
#> Label: Highest education level
#> Class: ordered, factor
#> Data: sochealth
cross_tab(sochealth, education, self_rated_health)
#> Crosstable: education x self_rated_health (N)
#> 
#>  Values               Poor    Fair    Good    Very good    Total 
#> ───────────────────┼─────────────────────────────────────┼─────────
#>  Lower secondary        28      86     102           44      260 
#>  Upper secondary        28     118     263          118      527 
#>  Tertiary                5      62     193          133      393 
#> ───────────────────┼─────────────────────────────────────┼─────────
#>  Total                  61     266     558          295     1180 
#> 
#> Chi-2(6) = 73.2, p <.001
#> Kendall's Tau-b = 0.20
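
The chi-squared statistic printed above can be cross-checked from the displayed counts alone. The sketch below re-enters the 3 x 4 table by hand and passes it to base R's `chisq.test()`. The object names `counts` and `res` are illustrative, and this is only a sanity check, not a description of how `cross_tab()` computes its statistics.

```r
# Cell counts copied from the printed crosstable (rows: education,
# columns: self_rated_health); totals are omitted.
counts <- matrix(
  c(28,  86, 102,  44,
    28, 118, 263, 118,
     5,  62, 193, 133),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    education         = c("Lower secondary", "Upper secondary", "Tertiary"),
    self_rated_health = c("Poor", "Fair", "Good", "Very good")
  )
)

# 1180 complete cases: respondents with a missing value are excluded.
sum(counts)

# Pearson chi-squared test of independence.
res <- chisq.test(counts)
res$statistic  # about 73.2
res$parameter  # 6 degrees of freedom
res$p.value    # well below .001
```

The table total of 1180, rather than 1200, reflects the missing values in `self_rated_health` noted under Format.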