Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reorder passed info when setting new pop and strat #207

Open
thibautjombart opened this issue Oct 27, 2017 · 6 comments
Open

reorder passed info when setting new pop and strat #207

thibautjombart opened this issue Oct 27, 2017 · 6 comments
Assignees

Comments

@thibautjombart
Copy link
Owner

thibautjombart commented Oct 27, 2017

If foo is named in these operations:

pop(x) <- foo
strata(x) <- foo

then it would make sense to check the names against indNames(x) and:

  1. issue error/warning if mismatches
  2. reorder otherwise

@zkamvar what do you think?

@romunov
Copy link
Collaborator

romunov commented Oct 27, 2017

How about preserving the pop(x) order of names and match names and assign values? Would this warrant a warning or a note?

@zkamvar
Copy link
Collaborator

zkamvar commented Oct 27, 2017

This could definitely be possible. @romunov, I don't think it would be wise to change the order of the data to mach the newly-assigned pop.

I DO think it would be a good idea to add the indNames to the unnamed vectors.

I also think think it would be a good idea to issue a warning if the population or the strata were in the wrong order.

  • check for named vector/data.frames
  • re-order and warn if necessary
  • name vectors that are unnamed
  • add documentation in the tutorials/documentation.

@romunov
Copy link
Collaborator

romunov commented Oct 27, 2017

I would also add a unit test for this.

@zkamvar
Copy link
Collaborator

zkamvar commented Oct 27, 2017 via email

@zkamvar
Copy link
Collaborator

zkamvar commented Oct 30, 2017

One more thing that should be considered with this:

Some monte carlo tests involve shuffling the data by using:

pop(x) <- sample(pop(x))

Under the proposed changes, this would do absolutely nothing, which would effectively break the above procedure. The remedy for this is to recommend users to do the following with monte carlo tests:

pop(x) <- unname(sample(pop(x)))

Perhaps one way to get around this would be to add a check.names = FALSE argument to the pop() strata() and other() constructors, recommending that people use:

pop(x, check.names = TRUE) <- myPop

when setting a new population? I'm still a bit on the fence about it because, on the one hand, you don't break things, but on the other hand, it's harder to enforce good practices. Additionally, there is something slightly unsettling about having an unordered population factor floating around.... but then again, there could be a method called check() for the genind which will reset the population factor/strata/other if they are named....

@thibautjombart
Copy link
Owner Author

Yes I think we want to keep the default behaviour of pop <- so that pop(x) <- sample(pop(x)) still does the same; adding a check.name argument would make sense. How about a second argument reorder which would trigger the automated reordering, defaulting to check.names? The use case would look like:

pop(x) <- sample(pop(x)) # re-shuffle pop
pop(x, check.names = TRUE) <- myPop # assign new pop with automated matching
pop(x, check.names = TRUE, reorder = FALSE) <- myPop # merely a sanity check

I am not sure if we want the last use case. If not, I would rename the check.name into something more explicit, e.g. match.labels or match.names.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants