Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NA-row Handling on timeseries functions #45

Open
lawinslow opened this issue Dec 4, 2014 · 6 comments
Open

NA-row Handling on timeseries functions #45

lawinslow opened this issue Dec 4, 2014 · 6 comments

Comments

@lawinslow
Copy link
Member

We have "na.rm" option, but what happens if an entire row is NA? (#43) If we skip such a row, we will still be returning an NA, breaking the API contract.

Should we drop that row? Should we return an NA? Should we throw an error?

@jzwart
Copy link
Member

jzwart commented Dec 4, 2014

You mean the user has the expectation they will not have any NA's returned to them if they use na.rm?

I think it should still return NA if an entire row is NA; a warning might be helpful to the user or modify the documentation to indicate that the na.rm option makes the tool calculate given physical parameter based on non-NA data, if all data is NA, then it returns an NA.

If it removes a row, then the data.frame dimensions get changed - I find that annoying when that happens.

@lawinslow
Copy link
Member Author

Interesting

sum(c(NA,NA,NA), na.rm=TRUE)
[1] 0

@lawinslow
Copy link
Member Author

max(c(NA,NA,NA), na.rm=TRUE)
[1] -Inf
Warning message:
In max(c(NA, NA, NA), na.rm = TRUE) :
  no non-missing arguments to max; returning -Inf

@jzwart
Copy link
Member

jzwart commented Dec 4, 2014

Huh. This is almost getting philosophical... the sum of nothing is 0 but the max of nothing is -Inf.

I like the way max() returns a warning message

@lawinslow
Copy link
Member Author

Sometimes when defining APIs, it gets philosophical. I just wanted to know what base functions were doing with this challenge. Clearly, it's kinda weird behavior that I don't think we'd want to adopt (return -Inf or 0, nope).

Let's call na.rm "best effort". Warning if it can't handle the input well, but still return an NA. I agree that it is fairly nice if the output is of same nrow as input.

I propose applying this style of handling to all other timeseries functions.

@jzwart
Copy link
Member

jzwart commented Dec 4, 2014

That sounds good to me.

On Thu, Dec 4, 2014 at 5:02 PM, Luke Winslow [email protected]
wrote:

Sometimes when defining APIs, it gets philosophical. I just wanted to know
what base functions were doing with this challenge. Clearly, it's kinda
weird behavior that I don't think we'd want to adopt (return -Inf or 0,
nope).

Let's call na.rm "best effort". Warning if it can't handle the input
well, but still return an NA. I agree that it is fairly nice if the output
is of same nrow as input.

I propose applying this style of handling to all other timeseries
functions.


Reply to this email directly or view it on GitHub
#45 (comment).

Jacob A. Zwart
University of Notre Dame
Jones Lab & MFE
263 Galvin Life Sciences
Notre Dame, IN 46556
269.370.2788

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants