Replace valid observations with NAs when a given subject has more than max_na missing values.

clean_observations(data, id, var, max_na)

Arguments

data

A data frame, or data frame extension (e.g. a tibble).

id

The bare (unquoted) name of the column that identifies each subject.

var

The bare (unquoted) name of the column to be cleaned.

max_na

An integer indicating the maximum number of NAs allowed per subject in var.

Value

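The original data, with every var observation replaced by NA for each subject that has more than max_na missing values in var. Subjects with max_na or fewer missing values are returned unchanged.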

Examples

set.seed(10)
data <- data.frame(
  id = rep(1:5, each = 4),
  time = rep(1:4, 5),
  score = sample(c(1:5, rep(NA, 2)), 20, replace = TRUE)
)
clean_observations(data, id, score, 1)
#>    id time score
#> 1   1    1     3
#> 2   1    2     1
#> 3   1    3     2
#> 4   1    4     4
#> 5   2    1    NA
#> 6   2    2    NA
#> 7   2    3    NA
#> 8   2    4    NA
#> 9   3    1    NA
#> 10  3    2    NA
#> 11  3    3    NA
#> 12  3    4    NA
#> 13  4    1    NA
#> 14  4    2     2
#> 15  4    3     2
#> 16  4    4     5
#> 17  5    1    NA
#> 18  5    2    NA
#> 19  5    3    NA
#> 20  5    4    NA
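
The rule described above (blank out a subject's var values once that subject exceeds max_na missing values) can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper name clean_observations_sketch is hypothetical, and it takes the column names as strings instead of bare names to avoid any non-standard evaluation.

```r
# Hypothetical helper for illustration; not the package's actual code.
# `id` and `var` are column names given as character strings (an assumption
# made here to keep the sketch in plain base R).
clean_observations_sketch <- function(data, id, var, max_na) {
  # Count the missing values of `var` within each subject and repeat that
  # count on every row belonging to the subject.
  n_missing <- ave(as.numeric(is.na(data[[var]])), data[[id]], FUN = sum)

  # Subjects with more than `max_na` missing values lose all observations.
  data[[var]][n_missing > max_na] <- NA
  data
}

# Small hand-written data set: subject 1 has no NAs, subject 2 has two,
# subject 3 has one.
df <- data.frame(
  id = rep(1:3, each = 3),
  score = c(1, 2, 3, NA, NA, 5, NA, 4, 6)
)

clean_observations_sketch(df, "id", "score", max_na = 1)
```

With max_na = 1, only subject 2 exceeds the threshold, so its remaining valid score (5) becomes NA, while subjects 1 and 3 keep their values.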