"read_camtrap_dp" duplicates sequenceID when motionDetection and timeLapse are taken simultaneously #297
Comments
Thanks a lot, @lrdijkhuis, for reporting! I will try to look at it this week.
@damianooldoni Here is an example to better understand the bug. Here is the sample data: Here is code to inspect the bug:
dat <- read_camtrap_dp("C:/data/sequencebug-20240213120120/datapackage.json")
dat$data$observations %>%
  select(sequenceID, observationType) %>%
  count(sequenceID) %>%
  arrange(desc(n))
# no duplicate sequenceID in observations
# count number of duplicates without losing captureMethod
dat$data$media %>%
  select(sequenceID, captureMethod) %>%
  distinct() %>%
  group_by(sequenceID) %>%
  mutate(n = n())
# many duplicate sequences in media
# filter some duplicate seq from media
dat$data$media %>%
  filter(sequenceID %in% c("b87211da-aae9-4829-a5f3-ede037518617",
# Inspect one sequence in media: sequenceID is unique when captureMethod is not unique, same meta data.
dat$data$media %>%
  filter(sequenceID %in% c("b87211da-aae9-4829-a5f3-ede037518617"))
# from above selection: media ID is not unique!
dat$data$media %>%
  filter(mediaID %in% c("63bceb5b-8ec7-40cb-bcaf-1a4380edf47a")) %>%
  as.data.frame()
Hi @lrdijkhuis. Sorry for the delay. Today I will work on this, at last! Thanks for the example. Very much appreciated.
We are moving the reading/writing functionality for camtrap Data Packages to a dedicated package, camtrapdp (see also #298). This issue arises during the down-conversion from v1.0 to v0.1.6, which we will stop supporting very soon. As the camtrapdp R package will support camtrap Data Packages from v1.0 onwards, your issue will be automatically solved by using camtrapdp:
Hi @damianooldoni, thanks for your solution; however, it only partially fixes my issue. With the camtrapdp package there is no longer a link connecting the media information to the event (photo sequence). How do you plan to deal with this while keeping the hierarchical structure deployments > observations > media?
Hi @lrdijkhuis. I think this has been solved now in the camtrapdp package. @peterdesmet added 3 weeks ago
library(camtrapdp)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
# read datapackage
dat <- read_camtrapdp("C://Documents and Settings/damiano_oldoni/Documents/sequencebug-20240213120120/datapackage.json")
# Some mediaIDs are duplicated
media(dat) %>%
group_by(mediaID) %>%
add_tally() %>%
filter(n > 1)
#> # A tibble: 232 × 13
#> # Groups: mediaID [116]
#> mediaID deploymentID captureMethod timestamp filePath filePublic
#> <chr> <chr> <fct> <dttm> <chr> <lgl>
#> 1 b8d325f4-… 28906770-05… activityDete… 2019-06-01 17:42:21 https:/… FALSE
#> 2 b8d325f4-… 28906770-05… activityDete… 2019-06-01 17:42:21 https:/… FALSE
#> 3 5e48f4a5-… 28906770-05… timeLapse 2019-06-01 17:42:21 https:/… FALSE
#> 4 5e48f4a5-… 28906770-05… timeLapse 2019-06-01 17:42:21 https:/… FALSE
#> 5 5518912c-… 28906770-05… activityDete… 2019-06-02 13:11:06 https:/… FALSE
#> 6 5518912c-… 28906770-05… activityDete… 2019-06-02 13:11:06 https:/… FALSE
#> 7 c2127096-… 28906770-05… timeLapse 2019-06-02 13:11:06 https:/… FALSE
#> 8 c2127096-… 28906770-05… timeLapse 2019-06-02 13:11:06 https:/… FALSE
#> 9 7d72c1c1-… 28906770-05… activityDete… 2019-06-02 13:11:08 https:/… FALSE
#> 10 7d72c1c1-… 28906770-05… activityDete… 2019-06-02 13:11:08 https:/… FALSE
#> # ℹ 222 more rows
#> # ℹ 7 more variables: fileName <chr>, fileMediatype <chr>, exifData <chr>,
#> # favorite <lgl>, mediaComments <chr>, eventID <chr>, n <int>
# But the pair (mediaID - eventID) is unique: no duplicates!
media(dat) %>%
group_by(mediaID, eventID) %>%
add_tally() %>%
filter(n > 1)
#> # A tibble: 0 × 13
#> # Groups: mediaID, eventID [0]
#> # ℹ 13 variables: mediaID <chr>, deploymentID <chr>, captureMethod <fct>,
#> # timestamp <dttm>, filePath <chr>, filePublic <lgl>, fileName <chr>,
#> # fileMediatype <chr>, exifData <chr>, favorite <lgl>, mediaComments <chr>,
#> # eventID <chr>, n <int>
Created on 2024-05-15 with reprex v2.1.0
In camtraptor we are going to read data packages using camtrapdp under the hood, so the same behavior can be expected once the new camtraptor is released. I will not fix this in the current version of camtraptor, as I would rather work on the refactoring to avoid any down-conversion. Please, @lrdijkhuis, let me know if having
When a timeLapse photo is taken while a motion trigger is active, read_camtrap_dp() now duplicates the eventID of the timeLapse sequence and the activityDetection sequence. The issue seems to be triggered only when a timeLapse photo is taken during an activity trigger. It occurs here in the source code:
camtraptor/R/zzz.R
Line 716 in 45b72a2
A proper way to solve this issue would be to add the eventID identifier back to media.csv (it is now absent) and join on eventID. Alternatively, adding an extra grouping to the event_obs join that differentiates between timeLapse and activityDetection would likely solve the issue. However, it does not seem right to drop a key column like eventID from media.csv, because it holds the much-needed link between asset information and the photo sequences.
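To illustrate the second option, here is a minimal sketch in base R of how a join that ignores the capture method can duplicate media rows, and how an extra join key avoids it. This is not the camtraptor code: the toy tables, the time-window matching, the captureMethod column on the events table, and the helper in_window() are all assumptions made for illustration.

```r
# Toy media table: two files taken in the same second, one per capture method.
# All values are made up for illustration.
media <- data.frame(
  mediaID = c("m1", "m2"),
  deploymentID = "d1",
  captureMethod = c("activityDetection", "timeLapse"),
  timestamp = as.POSIXct("2019-06-01 17:42:21", tz = "UTC"),
  stringsAsFactors = FALSE
)

# Toy events table with two overlapping events.
# The captureMethod column on events is hypothetical.
events <- data.frame(
  eventID = c("e_motion", "e_lapse"),
  deploymentID = "d1",
  captureMethod = c("activityDetection", "timeLapse"),
  eventStart = as.POSIXct(c("2019-06-01 17:42:20", "2019-06-01 17:42:21"), tz = "UTC"),
  eventEnd = as.POSIXct(c("2019-06-01 17:42:25", "2019-06-01 17:42:21"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose media timestamp lies within the event.
in_window <- function(x) {
  x[x$timestamp >= x$eventStart & x$timestamp <= x$eventEnd, ]
}

# Join on deploymentID only: each media file matches both overlapping events,
# so every mediaID appears twice (4 rows in total).
loose <- in_window(merge(media, events, by = "deploymentID"))

# Join on deploymentID and captureMethod: each media file matches only the
# event of its own capture method (2 rows in total).
strict <- in_window(merge(media, events, by = c("deploymentID", "captureMethod")))

table(loose$mediaID)
table(strict$mediaID)
```

Carrying eventID in media.csv would make the time-window matching unnecessary altogether, which is why the first option looks cleaner.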
A dummy project reproducing the issue is available, as is an example script.
See below an example of the issue
Kind regards,
Laurens Dijkhuis