Counts, for each row of a data.frame or matrix, how many
times one or more values appear across selected columns. Supports
type-safe comparison (allow_coercion = FALSE), case-insensitive
string matching (ignore_case = TRUE), and detection of special
values (NA, NaN, Inf, -Inf) via special. Designed to
work inside dplyr::mutate() pipelines.
Usage
count_n(
  data = NULL,
  select = tidyselect::everything(),
  exclude = NULL,
  count = NULL,
  special = NULL,
  allow_coercion = TRUE,
  ignore_case = FALSE,
  regex = FALSE,
  verbose = FALSE
)
Arguments
- data
A data.frame or matrix. Optional inside dplyr::mutate(), where the current data context is used automatically.
- select
Columns to include. Defaults to tidyselect::everything(). Uses tidyselect helpers like tidyselect::starts_with(), etc. If regex = TRUE, select is treated as a regex string.
- exclude
Character vector of column names to exclude after selection. Defaults to NULL (no exclusion).
- count
Value(s) to count. Defaults to NULL. Ignored if special is used. Multiple values are allowed (e.g., count = c(1, 2, 3) or count = c("yes", "no")). R automatically coerces all values in count to a common type (e.g., c(2, "2") becomes c("2", "2")), so all values are expected to be of the same final type. If allow_coercion = FALSE, matching is type-safe using identical(), and the type of count must match that of the values in the data.
- special
Character vector of special values to count: "NA", "NaN", "Inf", "-Inf", or "all". Defaults to NULL. "NA" uses is.na(), and therefore includes both NA and NaN values. "NaN" uses is.nan() to match only actual NaN values.
- allow_coercion
Logical. If TRUE (the default), values are compared after coercion. If FALSE, uses strict matching via identical().
- ignore_case
Logical. If FALSE (the default), comparisons are case-sensitive. If TRUE, performs case-insensitive string comparisons.
- regex
Logical. If FALSE (the default), uses tidyselect helpers. If TRUE, interprets select as a regular expression pattern.
- verbose
Logical. If FALSE (the default), messages are suppressed. If TRUE, prints processing messages.
Strict matching (allow_coercion = FALSE)
Comparison falls back to identical() when types differ, which
also inspects factor levels. Two consequences:
- count = "b" does not match a factor "b" value: pass a factor, e.g. count = factor("b", levels = levels(df$x)).
- Even with a factor count, comparisons against columns whose level set differs will return 0. To guarantee a perfect match (label and levels), reuse a value taken from the data itself (e.g. df$x[2]).
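Both consequences can be reproduced with identical() alone. The sketch below is base R only, with hypothetical factors x and y mirroring the ones used in the examples; it shows how identical() treats class and level sets, not how count_n() is written internally.

```r
# Hypothetical factors with different level sets
x <- factor(c("a", "b", "c"))   # levels: a, b, c
y <- factor(c("b", "B", "a"))   # levels include "B", not "c"

# A character value is never identical to a factor element
chr_vs_factor <- identical("b", x[2])            # FALSE

# A factor built with x's levels matches an element of x ...
b_like_x <- factor("b", levels = levels(x))
same_levels <- identical(b_like_x, x[2])         # TRUE

# ... but not an element of y, whose level set differs
different_levels <- identical(b_like_x, y[1])    # FALSE
```

Reusing a value taken from the column itself, such as x[2], avoids rebuilding the level set by hand.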
Case-insensitive matching (ignore_case = TRUE)
All values are converted to lowercase via tolower() before
matching; factor columns are first coerced to character. This
mode takes precedence over allow_coercion: equality becomes
lowercase string equality, so "b" and "B" match even when
allow_coercion = FALSE.
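A minimal base R sketch of the lowercase comparison described above, assuming a hypothetical factor f and target value; it only illustrates the tolower() step and the factor-to-character coercion, not the function's actual code.

```r
# Hypothetical factor column and value to count
f <- factor(c("b", "B", "a"))
target <- "B"

# Coerce the factor to character, then lowercase both sides before comparing
hits <- tolower(as.character(f)) == tolower(target)

hits        # TRUE TRUE FALSE
sum(hits)   # 2
```

Because both sides become plain lowercase strings, the original type of the column no longer plays a role in the comparison.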
Coercion of count itself
R coerces mixed-type vectors at construction time: count = c(2, "2") becomes c("2", "2") before the function ever sees it.
To get type-sensitive matching, keep count homogeneous.
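The construction-time coercion can be checked directly in base R. This short sketch uses hypothetical variable names and does not call count_n().

```r
# Mixing a number and a string yields a character vector
mixed <- c(2, "2")
mixed_class <- class(mixed)                   # "character"
same_as_chars <- identical(mixed, c("2", "2")) # TRUE

# A homogeneous numeric vector keeps its type
numeric_only <- c(1, 2, 3)
numeric_type <- typeof(numeric_only)          # "double"
```

Passing numeric_only as count therefore keeps numeric semantics, whereas mixed would be compared as strings.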
See also
datawizard::row_count() for a closely related row-wise
counter; count_n() adds element-wise type-safe matching,
multi-value count, and special-value detection.
Examples
library(dplyr)
#>
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#>
#> filter, lag
#> The following objects are masked from ‘package:base’:
#>
#> intersect, setdiff, setequal, union
library(tibble)
library(labelled)
# Basic usage
df <- tibble(
  x = c(1, 2, 2, 3, NA),
  y = c(2, 2, NA, 3, 2),
  z = c("2", "2", "2", "3", "2")
)
count_n(df, count = 2)
#> [1] 2 3 2 0 2
count_n(df, count = 2, allow_coercion = FALSE)
#> [1] 1 2 1 0 1
df |> mutate(num_twos = count_n(count = 2))
#> # A tibble: 5 × 4
#>       x     y z     num_twos
#>   <dbl> <dbl> <chr>    <dbl>
#> 1     1     2 2            2
#> 2     2     2 2            3
#> 3     2    NA 2            2
#> 4     3     3 3            0
#> 5    NA     2 2            2
# Mixed types and special values
df <- tibble(
  num = c(1, 2, NA, -Inf, NaN),
  char = c("a", "B", "b", "a", NA),
  fact = factor(c("a", "b", "b", "a", "c")),
  date = as.Date(c("2023-01-01", "2023-01-01", NA, "2023-01-02", "2023-01-01")),
  lab = labelled(c(1, 2, 1, 2, NA), labels = c(No = 1, Yes = 2)),
  logic = c(TRUE, FALSE, NA, TRUE, FALSE)
)
count_n(df, count = 2)
#> [1] 0 2 0 1 0
count_n(df, count = "b", ignore_case = TRUE)
#> [1] 0 2 2 0 0
count_n(df, count = "a", select = fact)
#> [1] 1 0 0 1 0
count_n(df, count = as.Date("2023-01-01"), select = date)
#> [1] 1 1 0 0 1
# Count special values
count_n(df, special = "NA")
#> [1] 0 0 3 0 3
# Column selection strategies
df <- tibble(
  score_math = c(1, 2, 2, 3, NA),
  score_science = c(2, 2, NA, 3, 2),
  score_lang = c("2", "2", "2", "3", "2"),
  name = c("Jean", "Marie", "Ali", "Zoe", "Nina")
)
count_n(df, select = c(score_math, score_science), count = 2)
#> [1] 1 2 1 0 1
count_n(df, select = starts_with("score_"), exclude = "score_lang", count = 2)
#> [1] 1 2 1 0 1
count_n(df, select = "^score_", regex = TRUE, count = 2)
#> [1] 2 3 2 0 2
df |> mutate(nb_two = count_n(count = 2))
#> # A tibble: 5 × 5
#>   score_math score_science score_lang name  nb_two
#>        <dbl>         <dbl> <chr>      <chr>  <dbl>
#> 1          1             2 2          Jean       2
#> 2          2             2 2          Marie      3
#> 3          2            NA 2          Ali        2
#> 4          3             3 3          Zoe        0
#> 5         NA             2 2          Nina       2
# Strict type-safe matching with factor columns
df <- tibble(
  x = factor(c("a", "b", "c")),
  y = factor(c("b", "B", "a"))
)
# Coercion: character "b" matches both x and y
count_n(df, count = "b")
#> [1] 1 1 0
# Strict match: fails because "b" is character, not factor (returns only 0s)
count_n(df, count = "b", allow_coercion = FALSE)
#> [1] 0 0 0
# Strict match with factor value: works only where levels match
count_n(df, count = factor("b", levels = levels(df$x)), allow_coercion = FALSE)
#> [1] 0 1 0