consider changes to NA handling in is_duplicate() #12

Open
ccsarapas opened this issue Jul 7, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@ccsarapas
Owner

Currently, NA and NaN values can count as duplicates if incomparables = TRUE. First, this differs from how incomparables is used in other functions such as match(), merge(), and duplicated(), where the argument lists values that cannot be compared or matched, and FALSE means that all values can be. In particular, is_duplicate(incomparables = FALSE) behaves opposite to, e.g., duplicated(incomparables = FALSE). Second, I'd like to incorporate three options for handling NA and NaN:

  1. never count them as duplicates (i.e., the return value will be FALSE for all NA and NaN elements)
  2. allow them to count as duplicates (i.e., the return value may be TRUE if NA or NaN occurs more than nmax times)
  3. return NA for all NA elements (since in principle these could be a duplicate of another value)
    • the logic here is tricky though -- strictly speaking, shouldn't everything be NA if there's any NA in the vector, since in principle any of the values might match that NA?
    • would also need to think about how to treat NaN in this case, which doesn't have the same implications as NA.
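
To make the three options concrete, here is a rough sketch of how they might behave. This is not the package's is_duplicate(): the function name is_duplicate_sketch, the argument name na_handling, and its values "never", "allow", and "na" are all made up for illustration. It also assumes that every occurrence of a value seen more than nmax times is flagged, and that NaN is never flagged under option 3, which is one possible answer to the open question above rather than a decision.

```r
# Hypothetical sketch only; names and defaults are assumptions, not the real API.
is_duplicate_sketch <- function(x, nmax = 1,
                                na_handling = c("never", "allow", "na")) {
  na_handling <- match.arg(na_handling)

  # match() treats NA as equal to NA and NaN as equal to NaN (but not to each
  # other), so tabulating match(x, x) counts occurrences of every value,
  # including missing ones.
  idx <- match(x, x)
  counts <- tabulate(idx, nbins = length(x))[idx]
  out <- counts > nmax

  if (na_handling == "never") {
    # Option 1: NA and NaN are never duplicates (is.na() is TRUE for both).
    out[is.na(x)] <- FALSE
  } else if (na_handling == "na") {
    # Option 3: NA elements return NA. Assumption: NaN falls back to FALSE.
    out[is.na(x) & !is.nan(x)] <- NA
    out[is.nan(x)] <- FALSE
  }
  # Option 2 ("allow") needs no adjustment: counts already include NA and NaN.
  out
}

x <- c(1, 1, NA, NA, NaN, 2)
is_duplicate_sketch(x, na_handling = "never")  # TRUE TRUE FALSE FALSE FALSE FALSE
is_duplicate_sketch(x, na_handling = "allow")  # TRUE TRUE TRUE  TRUE  FALSE FALSE
is_duplicate_sketch(x, na_handling = "na")     # TRUE TRUE NA    NA    FALSE FALSE
```

Note that the "na" branch only marks the NA elements themselves. The stricter reading in the first sub-bullet (every element becomes NA when any NA is present) would instead set the whole result to NA whenever anyNA(x) is TRUE.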

At the very least, the argument name should be changed and made clearer, and I need to give more thought to the other options.
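
For reference, this is the base R convention that the current argument name clashes with. The snippet uses only base functions; the variable names are arbitrary.

```r
x <- c(1, NA, 2, NA)

# incomparables = FALSE (the default): all values are comparable,
# so the second NA is flagged as a duplicate of the first.
dup_default <- duplicated(x, incomparables = FALSE)
dup_default  # FALSE FALSE FALSE TRUE

# incomparables = NA: NA is excluded from comparison and is never flagged.
dup_excl <- duplicated(x, incomparables = NA)
dup_excl     # FALSE FALSE FALSE FALSE

# match() follows the same convention: NA matches NA unless it is
# listed in incomparables.
m_default <- match(NA, x)
m_default    # 2
m_excl <- match(NA, x, incomparables = NA)
m_excl       # NA
```

So in base R, incomparables = FALSE is the setting under which NA can count as a duplicate, which is why a logical flag with the same name but the reverse meaning is easy to misread.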

@ccsarapas ccsarapas added the enhancement New feature or request label Jul 7, 2024