These functions test variables for uniqueness.

expect_unique(
  vars,
  exclude = getOption("testdat.miss"),
  flt = TRUE,
  data = get_testdata()
)

expect_unique_across(
  vars,
  exclude = getOption("testdat.miss"),
  flt = TRUE,
  data = get_testdata()
)

expect_unique_combine(
  vars,
  exclude = getOption("testdat.miss"),
  flt = TRUE,
  data = get_testdata()
)

Arguments

vars

<tidy-select> A set of columns to test.

exclude

A vector of values to exclude from the uniqueness check. The testdat.miss option is used by default. To include all values, set exclude = NULL.

flt

<data-masking> A filter specifying a subset of the data frame to test.

data

A data frame to test. The global test data is used by default.

Value

expect_*() functions are mainly called for their side effects. The expectation signals its result (e.g. "success", "failure"), which is logged by the current test reporter. In a non-testing context the expectation will raise an error with class expectation_failure if it fails.
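
Outside a test, a failed expectation can be caught by its condition class rather than left to stop the script. The following is a minimal sketch, assuming testdat is installed; the `ids` data frame and the `outcome` variable are invented for illustration. Because `id` contains a repeated value, the handler branch is the one expected to run.

```r
# Sketch only: `ids` and `outcome` are hypothetical names for this example.
ids <- data.frame(id = c(1, 1, 2))  # `id` deliberately contains a repeat

outcome <- "testdat not installed"
if (requireNamespace("testdat", quietly = TRUE)) {
  outcome <- tryCatch(
    {
      testdat::expect_unique(id, data = ids)
      "passed"
    },
    # The handler is selected by the condition class named in the text above
    expectation_failure = function(cnd) {
      paste("failed:", conditionMessage(cnd))
    }
  )
}
outcome
```

Inside a testthat test this wrapping is unnecessary, since the reporter records the result.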

Details

  • expect_unique() tests a set of columns (vars) and fails if the combined columns do not uniquely identify each row.

  • expect_unique_across() tests a set of columns (vars) and fails if any row contains the same value in more than one of the columns.

  • expect_unique_combine() tests a set of columns (vars) and fails if any value appears more than once across all of them.

By default the uniqueness check excludes missing values (as specified by the testdat.miss option). Setting exclude = NULL will include all values.
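
The difference between the three checks can be sketched in base R. The helpers below (`rows_are_unique`, `unique_across`, `unique_combine`) and the `prefs` data frame are hypothetical illustrations, not testdat's implementation. They model `exclude` only for the across and combine variants and leave out `flt` entirely.

```r
# Hypothetical base-R helpers that mimic the three checks; not testdat's code.

# expect_unique()-style: the combined columns must identify each row
rows_are_unique <- function(data, cols) {
  anyDuplicated(data[cols]) == 0
}

# expect_unique_across()-style: no value may repeat within a single row
unique_across <- function(data, cols, exclude = NA) {
  row_has_dup <- apply(data[cols], 1, function(x) {
    anyDuplicated(x[!x %in% exclude]) > 0
  })
  !any(row_has_dup)
}

# expect_unique_combine()-style: no value may repeat in the pooled columns
unique_combine <- function(data, cols, exclude = NA) {
  pooled <- unlist(data[cols], use.names = FALSE)
  anyDuplicated(pooled[!pooled %in% exclude]) == 0
}

prefs <- data.frame(
  id     = c(1, 2, 3),
  apple  = c(1, 2, 99),
  orange = c(2, 1, 99)
)

rows_are_unique(prefs, "id")                                      # TRUE
unique_across(prefs, c("apple", "orange"))                        # FALSE: row 3 is 99, 99
unique_across(prefs, c("apple", "orange"), exclude = c(99, NA))   # TRUE
unique_combine(prefs, c("apple", "orange"), exclude = c(99, NA))  # FALSE: 1 and 2 repeat
```

The last two calls show why the across and combine checks differ: each row holds distinct values once 99 is excluded, yet the values 1 and 2 still appear in both columns.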

Examples


student_fruit_preferences <- data.frame(
  student_id = c(1:5, NA, NA),
  apple = c(1, 1, 1, 1, 99, NA, NA),
  orange = c(2, 3, 2, 3, 99, NA, NA),
  banana = c(3, 2, 3, 2, 99, NA, NA),
  phone1 = c(123, 456, 789, 987, 654, NA, NA),
  phone2 = c(345, 678, 987, 567, 000, NA, NA)
)

# Check that key is unique, excluding NAs by default
expect_unique(student_id, data = student_fruit_preferences)

# Check that key is unique, including NAs
try(expect_unique(student_id, exclude = NULL, data = student_fruit_preferences))
#> Error : `student_fruit_preferences` has 2 duplicate records on variable `student_id`.
#> Filter: None

# Check each fruit has unique preference number
try(
  expect_unique_across(
    c(apple, orange, banana),
    data = student_fruit_preferences
  )
)
#> Error : `student_fruit_preferences` has 1 records with duplicates across variables `apple, orange, banana`.
#> Filter: None

# Check each fruit has unique preference number, allowing multiple 99 (item
# skipped) codes
expect_unique_across(
  c(apple, orange, banana),
  exclude = c(99, NA), data = student_fruit_preferences
)

# Check that each phone number appears at most once
try(expect_unique_combine(c(phone1, phone2), data = student_fruit_preferences))
#> Error : `student_fruit_preferences` has 2 records with duplicate values across variables `phone1, phone2`.
#> Filter: None