dialr is an R interface to Google’s libphonenumber Java library.
libphonenumber defines the PhoneNumberUtil class, a set of functions for extracting information from and processing a parsed Phonenumber object. A phone number must be parsed before any other operation (e.g. checking validity or formatting) can be performed.
dialr provides an interface to these functions to easily parse and process phone numbers in R.
A phone class vector stores a parsed Java Phonenumber object for further processing, alongside the original raw text phone number and default region. This “default region” is required to determine the processing context for non-international numbers.
To create a phone vector, use the phone() function. This takes a character vector of phone numbers to parse and a default region for any numbers not stored in an international format (i.e. without a leading “+”).
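As a minimal sketch of this step, the following builds the phone vector x that the inspection calls below operate on. The input values are an assumption: they are taken from the formatting example later in this document, which uses the same vector. The name raw is only illustrative.

```r
library(dialr)

# Raw phone numbers; c() coerces the numeric entries to character ("0", "123")
raw <- c(0, 0123, "0404 753 123", "61410123817", "+12015550123")

# Parse with "AU" as the default region for numbers without a leading "+"
x <- phone(raw, "AU")
```

Only the last element carries a leading “+”, so it is the only one parsed independently of the "AU" default region.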
is_parsed(x) # Was the phone number successfully parsed?
#> [1] FALSE TRUE TRUE TRUE TRUE
is_valid(x) # Is the phone number valid?
#> [1] FALSE FALSE TRUE TRUE TRUE
is_possible(x) # Is the phone number possible?
#> [1] FALSE FALSE TRUE TRUE TRUE
get_region(x) # What region (ISO country code) is the phone number from?
#> [1] NA NA "AU" "AU" "US"
get_type(x) # Is the phone number a fixed line, mobile, etc.?
#> [1] NA "UNKNOWN" "MOBILE"
#> [4] "MOBILE" "FIXED_LINE_OR_MOBILE"
Equality comparisons for phone numbers ignore formatting differences and compare the underlying phone number.
phone("0404 753 123", "AU") == phone("+61404753123", "US")
#> [1] TRUE
phone("0404 753 123", "AU") == phone("0404 753 123", "US")
#> [1] FALSE
phone("0404 753 123", "AU") != phone("0404 753 123", "US")
#> [1] TRUE
Parsed phone numbers can also be compared to character phone numbers stored in an international format.
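A minimal sketch of such a comparison, assuming the == operator accepts a character vector on the right-hand side as the sentence above describes. It reuses the Australian mobile number from the preceding examples. The names au_mobile and res are illustrative, and no output is shown because it has not been verified here.

```r
library(dialr)

# Parsed phone number on the left, international-format character string on the right
au_mobile <- phone("0404 753 123", "AU")
res <- au_mobile == "+61404753123"
res
```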
Use is_match() for more customisable comparisons.
is_match(phone("0404 753 123", "AU"), c("+61404753123", "0404753123", "1234"))
#> [1] TRUE FALSE FALSE
is_match(phone("0404 753 123", "AU"), c("+61404753123", "0404753123", "1234"), detailed = TRUE)
#> [1] "EXACT_MATCH" "NSN_MATCH" "NO_MATCH"
is_match(phone("0404 753 123", "AU"), c("+61404753123", "0404753123", "1234"), strict = FALSE)
#> [1] TRUE TRUE FALSE
The phone class has a format() method implementing libphonenumber’s core formatting functionality.
There are four phone number formats used by libphonenumber (see “Further reading” for details): "E164", "NATIONAL", "INTERNATIONAL" and "RFC3966". These can be specified by the format argument, or a default can be specified in the option dialr.format.
If clean = TRUE (the default), all non-numeric characters are removed except for a leading +.
x <- phone(c(0, 0123, "0404 753 123", "61410123817", "+12015550123"), "AU")
format(x, format = "RFC3966")
#> [1] NA "+61123" "+61404753123" "+61410123817" "+12015550123"
format(x, format = "RFC3966", clean = FALSE)
#> [1] NA "tel:+61-123" "tel:+61-404-753-123"
#> [4] "tel:+61-410-123-817" "tel:+1-201-555-0123"
format(x, format = "E164", clean = FALSE)
#> [1] NA "+61123" "+61404753123" "+61410123817" "+12015550123"
format(x, format = "NATIONAL", clean = FALSE)
#> [1] NA "123" "0404 753 123" "0410 123 817"
#> [5] "(201) 555-0123"
format(x, format = "INTERNATIONAL", clean = FALSE)
#> [1] NA "+61 123" "+61 404 753 123" "+61 410 123 817"
#> [5] "+1 201-555-0123"
format(x, format = "RFC3966", clean = FALSE)
#> [1] NA "tel:+61-123" "tel:+61-404-753-123"
#> [4] "tel:+61-410-123-817" "tel:+1-201-555-0123"
# Change the default
getOption("dialr.format")
#> [1] "E164"
format(x)
#> [1] NA "+61123" "+61404753123" "+61410123817" "+12015550123"
options(dialr.format = "NATIONAL")
format(x)
#> [1] NA "123" "0404753123" "0410123817" "2015550123"
options(dialr.format = "E164")
If the home argument is supplied, the phone number is formatted for dialling from the specified country.
format(x, home = "AU")
#> [1] NA "123" "0404753123" "0410123817"
#> [5] "001112015550123"
format(x, home = "US")
#> [1] NA "01161123" "01161404753123" "01161410123817"
#> [5] "12015550123"
format(x, home = "JP")
#> [1] NA "01061123" "01061404753123" "01061410123817"
#> [5] "01012015550123"
If strict = TRUE, invalid phone numbers (as determined by is_valid()) return NA.
format(x)
#> [1] NA "+61123" "+61404753123" "+61410123817" "+12015550123"
format(x, strict = TRUE)
#> [1] NA NA "+61404753123" "+61410123817" "+12015550123"
By default, as.character() returns the raw text phone number. Use raw = FALSE to use the format() method instead.
as.character(x)
#> [1] "0" "123" "0404 753 123" "61410123817" "+12015550123"
as.character(x, raw = FALSE)
#> [1] NA "+61123" "+61404753123" "+61410123817" "+12015550123"
dialr functions are designed to work well in dplyr workflows.
# Use with dplyr
library(dplyr)
y <- tibble(id = 1:4,
            phone1 = c(0, 0123, "0404 753 123", "61410123817"),
            phone2 = c("03 9388 1234", 1234, "+12015550123", 0),
            country = c("AU", "AU", "AU", "AU"))
y %>%
  mutate_at(vars(matches("^phone")), ~phone(., country)) %>%
  mutate_at(vars(matches("^phone")),
            list(valid = is_valid,
                 region = get_region,
                 type = get_type,
                 clean = format))
#> # A tibble: 4 × 12
#> id phone1 phone2 country phone1_valid phone2_…¹ phone…² phone…³
#> <int> <phone> <phone> <chr> <lgl> <lgl> <chr> <chr>
#> 1 1 NA +61393881234 AU FALSE TRUE NA AU
#> 2 2 +61123 +611234 AU FALSE FALSE NA NA
#> 3 3 +61404753123 +12015550123 AU TRUE TRUE AU US
#> 4 4 +61410123817 NA AU TRUE FALSE AU NA
#> # … with 4 more variables: phone1_type <chr>, phone2_type <chr>,
#> # phone1_clean <chr>, phone2_clean <chr>, and abbreviated variable names
#> # ¹phone2_valid, ²phone1_region, ³phone2_region
Further reading
"E164": general format for international telephone numbers from ITU-T Recommendation E.164
"NATIONAL": national notation from ITU-T Recommendation E.123
"INTERNATIONAL": international notation from ITU-T Recommendation E.123
"RFC3966": “tel” URI syntax from IETF RFC 3966, The tel URI for Telephone Numbers