CWA reading - skipping time #81

Closed
muschellij2 opened this issue Oct 21, 2020 · 18 comments

muschellij2 commented Oct 21, 2020

I'm going to open one here and at wadpac/GGIR#369

Below I show 3 things:

  1. Some really weird values from GGIR (way too high).
  2. GGIR and biobankAccelerometerAnalysis do not return the same values.
  3. Time skips in the python3 reading from @activityMonitoring.

GGIR

library(readr)
library(GGIR)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
fname = "data/example_90001_0_0.cwa.gz"
tfile = R.utils::gunzip(fname, 
                        remove = FALSE, temporary = TRUE)

out = GGIR::g.cwaread(tfile, end = Inf)
data = out$data[, c("time", "x", "y", "z")]
data$time = as.POSIXct(data$time, origin = "1970-01-01")
dim(data)
#> [1] 29864812        4
data[22656010:22656030,]
#>                         time        x         y          z
#> 22656010 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656011 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656012 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656013 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656014 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656015 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656016 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656017 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656018 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656019 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656020 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656021 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656022 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656023 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656024 2015-06-22 00:56:04 -0.50000  0.359375   0.609375
#> 22656025 2015-06-22 00:56:04 48.11477 76.729466 123.842931
#> 22656026 2015-06-22 00:56:04 48.09753 76.702373 123.799213
#> 22656027 2015-06-22 00:56:04 48.08029 76.675281 123.755496
#> 22656028 2015-06-22 00:56:04 48.06305 76.648189 123.711779
#> 22656029 2015-06-22 00:56:04 48.04581 76.621096 123.668062
#> 22656030 2015-06-22 00:56:04 48.02857 76.594004 123.624344
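
The GGIR rows above go from about -0.5 to values near 48, 77 and 124 within one sample. Below is a minimal sketch for flagging such rows in the full data. It assumes the readings are in g and that anything beyond 16 g cannot be a real measurement. Both `flag_implausible` and the limit of 16 are illustrative assumptions, not part of GGIR.

```r
# Hypothetical helper: TRUE for rows where any axis exceeds an assumed limit (in g).
flag_implausible <- function(df, axes = c("x", "y", "z"), limit = 16) {
  rowSums(abs(as.matrix(df[, axes, drop = FALSE])) > limit) > 0
}

# Two rows copied from the GGIR output above (rows 22656024 and 22656025).
example <- data.frame(
  x = c(-0.50000, 48.11477),
  y = c(0.359375, 76.729466),
  z = c(0.609375, 123.842931)
)
bad <- flag_implausible(example)
which(bad)  # index of the first flagged row in this small example
```

On the full data, `which(flag_implausible(data))[1]` would give the first affected row, which can then be compared with the row where the other readers show the time jump.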

accProcess from biobankAccelerometerAnalysis

After running

python3 accProcess.py --skipCalibration=True --rawOutput=True data/example_90001_0_0.cwa.gz

we get the CSV, and read it in:

csv = readr::read_csv(sub("cwa", "csv", fname))
#> 
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   time = col_character(),
#>   x = col_double(),
#>   y = col_double(),
#>   z = col_double()
#> )
dim(csv)
#> [1] 29856000        4
head(csv)
#> # A tibble: 6 x 4
#>   time                                              x     y      z
#>   <chr>                                         <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03.771+0100 [Europe/London] -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03.781+0100 [Europe/London] -0.422 0.25   0.703
#> 3 2015-06-19 10:00:03.791+0100 [Europe/London] -0.422 0.281  0.734
#> 4 2015-06-19 10:00:03.801+0100 [Europe/London] -0.406 0.297  0.734
#> 5 2015-06-19 10:00:03.811+0100 [Europe/London] -0.391 0.313  0.734
#> 6 2015-06-19 10:00:03.821+0100 [Europe/London] -0.375 0.328  0.734

Making time an actual time object

csv = csv %>% 
  mutate(time = substr(time, 1, 23),
         time = lubridate::as_datetime(time, 
                                       tz = "Europe/London"))
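
The `substr(time, 1, 23)` step keeps the local wall-clock time and discards the `+0100` offset and the bracketed zone name. Here is a sketch of splitting out all three fields in base R so the discarded parts can be checked. `split_acc_time` is a hypothetical helper, and the fixed character positions assume every row has the layout shown in `head(csv)`.

```r
# Hypothetical helper: split an accProcess timestamp string into its three parts.
split_acc_time <- function(x) {
  data.frame(
    local  = substr(x, 1, 23),                  # "YYYY-mm-dd HH:MM:SS.mmm"
    offset = substr(x, 24, 28),                 # e.g. "+0100"
    zone   = sub("^.*\\[(.*)\\]$", "\\1", x),   # text inside the brackets
    stringsAsFactors = FALSE
  )
}

# First timestamp from head(csv) above.
parts <- split_acc_time("2015-06-19 10:00:03.771+0100 [Europe/London]")

# If these are constant over the whole file, dropping them loses nothing.
unique(parts$offset)
unique(parts$zone)
```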

The timestamps jump from 12:56 AM on 2015-06-22 to 12:46 PM the next day, a gap of 35 hours 50 minutes between consecutive rows. I'm not sure why this would happen or whether it is correct, but it seems odd.

csv[22655995:22656020,]
#> # A tibble: 26 x 4
#>    time                     x      y      z
#>    <dttm>               <dbl>  <dbl>  <dbl>
#>  1 2015-06-22 00:56:03 -0.484  0.359  0.609
#>  2 2015-06-22 00:56:03 -0.484  0.359  0.609
#>  3 2015-06-22 00:56:03 -0.484  0.359  0.609
#>  4 2015-06-22 00:56:03 -0.484  0.359  0.609
#>  5 2015-06-22 00:56:03 -0.484  0.359  0.609
#>  6 2015-06-22 00:56:03 -0.484  0.359  0.609
#>  7 2015-06-23 12:46:03 -0.922 -0.328 -0.5  
#>  8 2015-06-23 12:46:03 -0.922 -0.328 -0.5  
#>  9 2015-06-23 12:46:03 -0.922 -0.328 -0.5  
#> 10 2015-06-23 12:46:03 -0.922 -0.328 -0.5  
#> # … with 16 more rows
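
Rather than finding the jump by eye, it can be located by differencing the time column. This is a minimal sketch in base R. `find_time_gaps` is a hypothetical helper, and the 1-second threshold is an arbitrary choice for illustration.

```r
# Hypothetical helper: indices i where time[i + 1] - time[i] exceeds max_gap seconds.
find_time_gaps <- function(time, max_gap = 1) {
  step <- as.numeric(diff(time), units = "secs")
  which(step > max_gap)
}

# Timestamps on either side of the jump, copied from the output above.
# Parsed as UTC only to keep the arithmetic free of time zone rules.
t <- as.POSIXct(
  c("2015-06-22 00:56:03", "2015-06-22 00:56:03", "2015-06-23 12:46:03"),
  tz = "UTC"
)
gap_at <- find_time_gaps(t)
gap_hours <- as.numeric(diff(t), units = "hours")[gap_at]
```

On the full data, `find_time_gaps(csv$time)` would list every such discontinuity, not only the one shown here.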

Created on 2020-10-21 by the reprex package (v0.3.0.9001)

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.2 (2020-06-22)
#>  os       macOS Catalina 10.15.6      
#>  system   x86_64, darwin17.0          
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       America/New_York            
#>  date     2020-10-21                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source                           
#>  assertthat    0.2.1      2019-03-21 [2] CRAN (R 4.0.0)                   
#>  backports     1.1.10     2020-09-15 [1] CRAN (R 4.0.2)                   
#>  cli           2.1.0      2020-10-12 [1] CRAN (R 4.0.2)                   
#>  crayon        1.3.4      2017-09-16 [2] CRAN (R 4.0.0)                   
#>  data.table    1.13.0     2020-07-24 [2] CRAN (R 4.0.2)                   
#>  digest        0.6.26     2020-10-17 [1] CRAN (R 4.0.2)                   
#>  dplyr       * 1.0.2      2020-08-18 [2] CRAN (R 4.0.2)                   
#>  ellipsis      0.3.1      2020-05-15 [2] CRAN (R 4.0.0)                   
#>  evaluate      0.14       2019-05-28 [2] CRAN (R 4.0.0)                   
#>  fansi         0.4.1      2020-01-08 [2] CRAN (R 4.0.0)                   
#>  fs            1.5.0      2020-07-31 [2] CRAN (R 4.0.2)                   
#>  generics      0.0.2      2018-11-29 [2] CRAN (R 4.0.0)                   
#>  GGIR        * 2.0-1      2020-06-01 [2] Github (wadpac/GGIR@2bd3c76)     
#>  glue          1.4.2      2020-08-27 [1] CRAN (R 4.0.2)                   
#>  highr         0.8        2019-03-20 [2] CRAN (R 4.0.0)                   
#>  hms           0.5.3      2020-01-08 [2] CRAN (R 4.0.0)                   
#>  htmltools     0.5.0      2020-06-16 [2] CRAN (R 4.0.0)                   
#>  knitr         1.30       2020-09-22 [1] CRAN (R 4.0.2)                   
#>  lifecycle     0.2.0      2020-03-06 [2] CRAN (R 4.0.0)                   
#>  lubridate     1.7.9      2020-06-08 [2] CRAN (R 4.0.0)                   
#>  magrittr      1.5        2014-11-22 [2] CRAN (R 4.0.0)                   
#>  pillar        1.4.6      2020-07-10 [2] CRAN (R 4.0.2)                   
#>  pkgconfig     2.0.3      2019-09-22 [2] CRAN (R 4.0.0)                   
#>  purrr         0.3.4      2020-04-17 [2] CRAN (R 4.0.0)                   
#>  R.methodsS3   1.8.1      2020-08-26 [1] CRAN (R 4.0.2)                   
#>  R.oo          1.24.0     2020-08-26 [1] CRAN (R 4.0.2)                   
#>  R.utils       2.10.1     2020-08-26 [1] CRAN (R 4.0.2)                   
#>  R6            2.4.1      2019-11-12 [2] CRAN (R 4.0.0)                   
#>  Rcpp          1.0.5      2020-07-06 [2] CRAN (R 4.0.0)                   
#>  readr       * 1.4.0      2020-10-05 [1] CRAN (R 4.0.2)                   
#>  reprex        0.3.0.9001 2020-09-30 [1] Github (tidyverse/reprex@d3fc4b8)
#>  rlang         0.4.8.9000 2020-10-20 [1] Github (r-lib/rlang@011cb4c)     
#>  rmarkdown     2.4        2020-09-30 [1] CRAN (R 4.0.2)                   
#>  rstudioapi    0.11       2020-02-07 [2] CRAN (R 4.0.0)                   
#>  sessioninfo   1.1.1      2018-11-05 [2] CRAN (R 4.0.0)                   
#>  stringi       1.5.3      2020-09-09 [1] CRAN (R 4.0.2)                   
#>  stringr       1.4.0      2019-02-10 [2] CRAN (R 4.0.0)                   
#>  styler        1.3.2      2020-02-23 [2] CRAN (R 4.0.0)                   
#>  tibble        3.0.4      2020-10-12 [1] CRAN (R 4.0.2)                   
#>  tidyselect    1.1.0      2020-05-11 [2] CRAN (R 4.0.0)                   
#>  utf8          1.1.4      2018-05-24 [2] CRAN (R 4.0.0)                   
#>  vctrs         0.3.4      2020-08-29 [1] CRAN (R 4.0.2)                   
#>  withr         2.3.0      2020-09-22 [1] CRAN (R 4.0.2)                   
#>  xfun          0.18       2020-09-29 [1] CRAN (R 4.0.2)                   
#>  yaml          2.2.1      2020-02-01 [2] CRAN (R 4.0.0)                   
#> 
#> [1] /Users/johnmuschelli/Library/R/4.0/library
#> [2] /Library/Frameworks/R.framework/Versions/4.0/Resources/library
@aidendoherty
Member

Hi @muschellij2 - thanks for raising this. To help us investigate the potential accProcess time-skipping issue, can you show the output of "csv[22655995:22656020,]" before converting the raw input to an R object?

Many thanks,
Aiden

@muschellij2
Author

What do you mean? You want to just read the lines in?

@aidendoherty
Member

I would like to see the lines 22655995:22656020 from your 'csv' dataframe printed out before converting to an R datetime object. Thanks, Aiden

@muschellij2
Author

# python3 accProcess.py --skipCalibration=True --rawOutput=True data/example_90001_0_0.cwa.gz

setwd("~/Dropbox/Packages/biobankAccelerometerAnalysis/")
library(readr)
library(GGIR)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
fname = "data/example_90001_0_0.cwa.gz"
xyz = c("x", "y", "z")
tfile = R.utils::gunzip(fname, 
                        remove = FALSE, temporary = TRUE)

out = GGIR::g.cwaread(tfile, end = Inf)
data = out$data[, c("time", xyz)]
data = tibble::as_tibble(data)
data$time = as.POSIXct(data$time, origin = "1970-01-01")
dim(data)
#> [1] 29864812        4
as.data.frame(data[22656010:22656030,])
#>                   time        x         y          z
#> 1  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 2  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 3  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 4  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 5  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 6  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 7  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 8  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 9  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 10 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 11 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 12 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 13 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 14 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 15 2015-06-22 00:56:04 -0.50000  0.359375   0.609375
#> 16 2015-06-22 00:56:04 48.11477 76.729466 123.842931
#> 17 2015-06-22 00:56:04 48.09753 76.702373 123.799213
#> 18 2015-06-22 00:56:04 48.08029 76.675281 123.755496
#> 19 2015-06-22 00:56:04 48.06305 76.648189 123.711779
#> 20 2015-06-22 00:56:04 48.04581 76.621096 123.668062
#> 21 2015-06-22 00:56:04 48.02857 76.594004 123.624344

csv_file = R.utils::gunzip(sub("cwa", "csv", fname), 
                        remove = FALSE, temporary = TRUE)
csvlines = readr::read_lines(csv_file, skip = 22655996, n_max = 50)
csv = readr::read_csv(csv_file)
#> 
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   time = col_character(),
#>   x = col_double(),
#>   y = col_double(),
#>   z = col_double()
#> )

dim(csv)
#> [1] 29856000        4
head(csv)
#> # A tibble: 6 x 4
#>   time                                              x     y      z
#>   <chr>                                         <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03.771+0100 [Europe/London] -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03.781+0100 [Europe/London] -0.422 0.25   0.703
#> 3 2015-06-19 10:00:03.791+0100 [Europe/London] -0.422 0.281  0.734
#> 4 2015-06-19 10:00:03.801+0100 [Europe/London] -0.406 0.297  0.734
#> 5 2015-06-19 10:00:03.811+0100 [Europe/London] -0.391 0.313  0.734
#> 6 2015-06-19 10:00:03.821+0100 [Europe/London] -0.375 0.328  0.734

as.data.frame(csv[22655995:22656020,])
#>                                            time      x      y      z
#> 1  2015-06-22 00:56:03.711+0100 [Europe/London] -0.484  0.359  0.609
#> 2  2015-06-22 00:56:03.721+0100 [Europe/London] -0.484  0.359  0.609
#> 3  2015-06-22 00:56:03.731+0100 [Europe/London] -0.484  0.359  0.609
#> 4  2015-06-22 00:56:03.741+0100 [Europe/London] -0.484  0.359  0.609
#> 5  2015-06-22 00:56:03.751+0100 [Europe/London] -0.484  0.359  0.609
#> 6  2015-06-22 00:56:03.761+0100 [Europe/London] -0.484  0.359  0.609
#> 7  2015-06-23 12:46:03.771+0100 [Europe/London] -0.922 -0.328 -0.500
#> 8  2015-06-23 12:46:03.781+0100 [Europe/London] -0.922 -0.328 -0.500
#> 9  2015-06-23 12:46:03.791+0100 [Europe/London] -0.922 -0.328 -0.500
#> 10 2015-06-23 12:46:03.801+0100 [Europe/London] -0.922 -0.328 -0.500
#> 11 2015-06-23 12:46:03.811+0100 [Europe/London] -0.922 -0.328 -0.500
#> 12 2015-06-23 12:46:03.821+0100 [Europe/London] -0.922 -0.328 -0.500
#> 13 2015-06-23 12:46:03.831+0100 [Europe/London] -0.922 -0.328 -0.500
#> 14 2015-06-23 12:46:03.841+0100 [Europe/London] -0.922 -0.328 -0.500
#> 15 2015-06-23 12:46:03.851+0100 [Europe/London] -0.922 -0.328 -0.500
#> 16 2015-06-23 12:46:03.861+0100 [Europe/London] -0.922 -0.328 -0.500
#> 17 2015-06-23 12:46:03.871+0100 [Europe/London] -0.922 -0.328 -0.500
#> 18 2015-06-23 12:46:03.881+0100 [Europe/London] -0.922 -0.328 -0.500
#> 19 2015-06-23 12:46:03.891+0100 [Europe/London] -0.922 -0.328 -0.500
#> 20 2015-06-23 12:46:03.901+0100 [Europe/London] -0.922 -0.328 -0.500
#> 21 2015-06-23 12:46:03.911+0100 [Europe/London] -0.922 -0.328 -0.500
#> 22 2015-06-23 12:46:03.921+0100 [Europe/London] -0.922 -0.328 -0.500
#> 23 2015-06-23 12:46:03.931+0100 [Europe/London] -0.922 -0.328 -0.500
#> 24 2015-06-23 12:46:03.941+0100 [Europe/London] -0.922 -0.328 -0.500
#> 25 2015-06-23 12:46:03.951+0100 [Europe/London] -0.922 -0.328 -0.500
#> 26 2015-06-23 12:46:03.961+0100 [Europe/London] -0.922 -0.328 -0.500
csv = csv %>% 
  mutate(time = substr(time, 1, 23),
         time = lubridate::as_datetime(time, 
                                       tz = "Europe/London"))
# csvlines is a character vector holding only the 50 lines read above, so it
# takes a single index relative to that window, not the original row numbers.
csvlines[1:26]

as.data.frame(csv[22655995:22656020,])
#>                   time      x      y      z
#> 1  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 2  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 3  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 4  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 5  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 6  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 7  2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 8  2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 9  2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 10 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 11 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 12 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 13 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 14 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 15 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 16 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 17 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 18 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 19 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 20 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 21 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 22 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 23 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 24 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 25 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 26 2015-06-23 12:46:03 -0.922 -0.328 -0.500

con_fname = sub("cwa", "csv", fname)
con_fname = file.path(dirname(con_fname), paste0("cwaconvert_", basename(con_fname)))
con = readr::read_csv(con_fname, col_names = FALSE)
#> 
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   X1 = col_datetime(format = ""),
#>   X2 = col_double(),
#>   X3 = col_double(),
#>   X4 = col_double()
#> )
colnames(con) = c("time", xyz)
dim(con)
#> [1] 29245200        4
head(con)
#> # A tibble: 6 x 4
#>   time                     x     y      z
#>   <dttm>               <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03 -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03 -0.422 0.25   0.703
#> 3 2015-06-19 10:00:03 -0.422 0.281  0.734
#> 4 2015-06-19 10:00:03 -0.406 0.297  0.734
#> 5 2015-06-19 10:00:03 -0.391 0.312  0.734
#> 6 2015-06-19 10:00:03 -0.375 0.328  0.734

as.data.frame(con[22656010:22656030,])
#>                   time        x        y        z
#> 1  2015-06-23 14:29:32 0.484375 0.656250 0.406250
#> 2  2015-06-23 14:29:32 0.484375 0.640625 0.406250
#> 3  2015-06-23 14:29:32 0.484375 0.640625 0.406250
#> 4  2015-06-23 14:29:32 0.484375 0.640625 0.421875
#> 5  2015-06-23 14:29:32 0.484375 0.625000 0.406250
#> 6  2015-06-23 14:29:32 0.484375 0.625000 0.406250
#> 7  2015-06-23 14:29:32 0.500000 0.640625 0.390625
#> 8  2015-06-23 14:29:32 0.500000 0.625000 0.406250
#> 9  2015-06-23 14:29:32 0.500000 0.625000 0.406250
#> 10 2015-06-23 14:29:32 0.500000 0.625000 0.406250
#> 11 2015-06-23 14:29:32 0.515625 0.625000 0.406250
#> 12 2015-06-23 14:29:32 0.515625 0.625000 0.406250
#> 13 2015-06-23 14:29:32 0.515625 0.625000 0.406250
#> 14 2015-06-23 14:29:32 0.515625 0.625000 0.390625
#> 15 2015-06-23 14:29:32 0.515625 0.625000 0.406250
#> 16 2015-06-23 14:29:32 0.531250 0.625000 0.406250
#> 17 2015-06-23 14:29:32 0.531250 0.625000 0.406250
#> 18 2015-06-23 14:29:32 0.531250 0.609375 0.390625
#> 19 2015-06-23 14:29:32 0.546875 0.609375 0.406250
#> 20 2015-06-23 14:29:32 0.546875 0.609375 0.406250
#> 21 2015-06-23 14:29:32 0.546875 0.609375 0.406250

head(csv)
#> # A tibble: 6 x 4
#>   time                     x     y      z
#>   <dttm>               <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03 -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03 -0.422 0.25   0.703
#> 3 2015-06-19 10:00:03 -0.422 0.281  0.734
#> 4 2015-06-19 10:00:03 -0.406 0.297  0.734
#> 5 2015-06-19 10:00:03 -0.391 0.313  0.734
#> 6 2015-06-19 10:00:03 -0.375 0.328  0.734
head(con)
#> # A tibble: 6 x 4
#>   time                     x     y      z
#>   <dttm>               <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03 -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03 -0.422 0.25   0.703
#> 3 2015-06-19 10:00:03 -0.422 0.281  0.734
#> 4 2015-06-19 10:00:03 -0.406 0.297  0.734
#> 5 2015-06-19 10:00:03 -0.391 0.312  0.734
#> 6 2015-06-19 10:00:03 -0.375 0.328  0.734
round(head(con[, xyz]) - head(csv[, xyz]), 3)
#>   x      y z
#> 1 0  0.000 0
#> 2 0  0.000 0
#> 3 0  0.000 0
#> 4 0  0.000 0
#> 5 0 -0.001 0
#> 6 0  0.000 0
round(head(con[, xyz]) - head(data[, xyz]), 3)
#>        x      y     z
#> 1  0.000  0.000 0.000
#> 2 -0.002 -0.008 0.023
#> 3  0.000  0.001 0.001
#> 4  0.001  0.001 0.000
#> 5  0.001  0.001 0.000
#> 6  0.001  0.001 0.000
head(data)
#> # A tibble: 6 x 4
#>   time                     x     y      z
#>   <dttm>               <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03 -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03 -0.420 0.258  0.680
#> 3 2015-06-19 10:00:03 -0.422 0.280  0.733
#> 4 2015-06-19 10:00:03 -0.407 0.296  0.734
#> 5 2015-06-19 10:00:03 -0.392 0.312  0.734
#> 6 2015-06-19 10:00:03 -0.376 0.327  0.734
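
The differences above are mostly 0.001 or less, with one larger difference in row 2 against GGIR. Here is a sketch of flagging rows that differ by more than a tolerance. It uses the printed (three-decimal) values from `head(con)` and `head(data)`, so its differences may not match the full-precision ones exactly. `rows_differing` and the tolerance of 0.002 are assumptions for illustration.

```r
# Printed values from head(con) and head(data) above, rounded to 3 decimals.
con_head <- data.frame(
  x = c(-0.281, -0.422, -0.422, -0.406, -0.391, -0.375),
  y = c(0.781, 0.25, 0.281, 0.297, 0.312, 0.328),
  z = c(-0.75, 0.703, 0.734, 0.734, 0.734, 0.734)
)
ggir_head <- data.frame(
  x = c(-0.281, -0.420, -0.422, -0.407, -0.392, -0.376),
  y = c(0.781, 0.258, 0.280, 0.296, 0.312, 0.327),
  z = c(-0.75, 0.680, 0.733, 0.734, 0.734, 0.734)
)

# Hypothetical helper: rows where any axis differs by more than tol.
rows_differing <- function(a, b, tol = 0.002) {
  which(apply(abs(as.matrix(a) - as.matrix(b)), 1, max) > tol)
}
differing <- rows_differing(con_head, ggir_head)
```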

Created on 2020-10-21 by the reprex package (v0.3.0.9001)
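
The three readers also return different numbers of rows: 29864812 (GGIR), 29856000 (accProcess) and 29245200 (cwa-convert). Below is a sketch of turning those counts into implied recording lengths. It assumes a 100 Hz sample rate, inferred from the 10 ms spacing of the accProcess timestamps, which may not hold for the other two readers.

```r
# Row counts reported by dim() for each reader above.
n_rows <- c(GGIR = 29864812, accProcess = 29856000, cwa_convert = 29245200)
sample_rate_hz <- 100  # assumption, from the 10 ms spacing in the accProcess output

# Recording length each row count would imply at that rate.
implied_hours <- n_rows / sample_rate_hz / 3600
round(implied_hours, 2)

# Row-count differences relative to accProcess, in samples and in seconds.
extra_rows <- n_rows - n_rows[["accProcess"]]
extra_secs <- extra_rows / sample_rate_hz
```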

@muschellij2
Author

# python3 accProcess.py --skipCalibration=True --rawOutput=True data/example_90001_0_0.cwa.gz

setwd("~/Dropbox/Packages/biobankAccelerometerAnalysis/")
library(readr)
library(GGIR)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
fname = normalizePath("data/example_90001_0_0.cwa.gz")
xyz = c("x", "y", "z")

GGIR

tfile = R.utils::gunzip(fname, 
                        remove = FALSE, temporary = TRUE)

out = GGIR::g.cwaread(tfile, end = Inf)
data = out$data[, c("time", xyz)]
data = tibble::as_tibble(data)
data$time = as.POSIXct(data$time, origin = "1970-01-01")
dim(data)
#> [1] 29864812        4
as.data.frame(data[22656010:22656030,])
#>                   time        x         y          z
#> 1  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 2  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 3  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 4  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 5  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 6  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 7  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 8  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 9  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 10 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 11 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 12 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 13 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 14 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 15 2015-06-22 00:56:04 -0.50000  0.359375   0.609375
#> 16 2015-06-22 00:56:04 48.11477 76.729466 123.842931
#> 17 2015-06-22 00:56:04 48.09753 76.702373 123.799213
#> 18 2015-06-22 00:56:04 48.08029 76.675281 123.755496
#> 19 2015-06-22 00:56:04 48.06305 76.648189 123.711779
#> 20 2015-06-22 00:56:04 48.04581 76.621096 123.668062
#> 21 2015-06-22 00:56:04 48.02857 76.594004 123.624344

biobankAccelerometerAnalysis

csv_file = R.utils::gunzip(sub("cwa", "csv", fname), 
                        remove = FALSE, temporary = TRUE)
csvlines = readr::read_lines(csv_file, skip = 22655996, n_max = 50)
csvlines
#>  [1] "2015-06-22 00:56:03.721+0100 [Europe/London],-0.484,0.359,0.609"  
#>  [2] "2015-06-22 00:56:03.731+0100 [Europe/London],-0.484,0.359,0.609"  
#>  [3] "2015-06-22 00:56:03.741+0100 [Europe/London],-0.484,0.359,0.609"  
#>  [4] "2015-06-22 00:56:03.751+0100 [Europe/London],-0.484,0.359,0.609"  
#>  [5] "2015-06-22 00:56:03.761+0100 [Europe/London],-0.484,0.359,0.609"  
#>  [6] "2015-06-23 12:46:03.771+0100 [Europe/London],-0.922,-0.328,-0.500"
#>  [7] "2015-06-23 12:46:03.781+0100 [Europe/London],-0.922,-0.328,-0.500"
#>  [8] "2015-06-23 12:46:03.791+0100 [Europe/London],-0.922,-0.328,-0.500"
#>  [9] "2015-06-23 12:46:03.801+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [10] "2015-06-23 12:46:03.811+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [11] "2015-06-23 12:46:03.821+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [12] "2015-06-23 12:46:03.831+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [13] "2015-06-23 12:46:03.841+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [14] "2015-06-23 12:46:03.851+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [15] "2015-06-23 12:46:03.861+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [16] "2015-06-23 12:46:03.871+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [17] "2015-06-23 12:46:03.881+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [18] "2015-06-23 12:46:03.891+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [19] "2015-06-23 12:46:03.901+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [20] "2015-06-23 12:46:03.911+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [21] "2015-06-23 12:46:03.921+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [22] "2015-06-23 12:46:03.931+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [23] "2015-06-23 12:46:03.941+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [24] "2015-06-23 12:46:03.951+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [25] "2015-06-23 12:46:03.961+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [26] "2015-06-23 12:46:03.971+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [27] "2015-06-23 12:46:03.981+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [28] "2015-06-23 12:46:03.991+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [29] "2015-06-23 12:46:04.001+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [30] "2015-06-23 12:46:04.011+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [31] "2015-06-23 12:46:04.021+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [32] "2015-06-23 12:46:04.031+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [33] "2015-06-23 12:46:04.041+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [34] "2015-06-23 12:46:04.051+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [35] "2015-06-23 12:46:04.061+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [36] "2015-06-23 12:46:04.071+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [37] "2015-06-23 12:46:04.081+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [38] "2015-06-23 12:46:04.091+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [39] "2015-06-23 12:46:04.101+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [40] "2015-06-23 12:46:04.111+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [41] "2015-06-23 12:46:04.121+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [42] "2015-06-23 12:46:04.131+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [43] "2015-06-23 12:46:04.141+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [44] "2015-06-23 12:46:04.151+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [45] "2015-06-23 12:46:04.161+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [46] "2015-06-23 12:46:04.171+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [47] "2015-06-23 12:46:04.181+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [48] "2015-06-23 12:46:04.191+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [49] "2015-06-23 12:46:04.201+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [50] "2015-06-23 12:46:04.211+0100 [Europe/London],-0.922,-0.328,-0.500"
csv = readr::read_csv(csv_file)
#> 
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   time = col_character(),
#>   x = col_double(),
#>   y = col_double(),
#>   z = col_double()
#> )

dim(csv)
#> [1] 29856000        4
head(csv)
#> # A tibble: 6 x 4
#>   time                                              x     y      z
#>   <chr>                                         <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03.771+0100 [Europe/London] -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03.781+0100 [Europe/London] -0.422 0.25   0.703
#> 3 2015-06-19 10:00:03.791+0100 [Europe/London] -0.422 0.281  0.734
#> 4 2015-06-19 10:00:03.801+0100 [Europe/London] -0.406 0.297  0.734
#> 5 2015-06-19 10:00:03.811+0100 [Europe/London] -0.391 0.313  0.734
#> 6 2015-06-19 10:00:03.821+0100 [Europe/London] -0.375 0.328  0.734

as.data.frame(csv[22655995:22656020,])
#>                                            time      x      y      z
#> 1  2015-06-22 00:56:03.711+0100 [Europe/London] -0.484  0.359  0.609
#> 2  2015-06-22 00:56:03.721+0100 [Europe/London] -0.484  0.359  0.609
#> 3  2015-06-22 00:56:03.731+0100 [Europe/London] -0.484  0.359  0.609
#> 4  2015-06-22 00:56:03.741+0100 [Europe/London] -0.484  0.359  0.609
#> 5  2015-06-22 00:56:03.751+0100 [Europe/London] -0.484  0.359  0.609
#> 6  2015-06-22 00:56:03.761+0100 [Europe/London] -0.484  0.359  0.609
#> 7  2015-06-23 12:46:03.771+0100 [Europe/London] -0.922 -0.328 -0.500
#> 8  2015-06-23 12:46:03.781+0100 [Europe/London] -0.922 -0.328 -0.500
#> 9  2015-06-23 12:46:03.791+0100 [Europe/London] -0.922 -0.328 -0.500
#> 10 2015-06-23 12:46:03.801+0100 [Europe/London] -0.922 -0.328 -0.500
#> 11 2015-06-23 12:46:03.811+0100 [Europe/London] -0.922 -0.328 -0.500
#> 12 2015-06-23 12:46:03.821+0100 [Europe/London] -0.922 -0.328 -0.500
#> 13 2015-06-23 12:46:03.831+0100 [Europe/London] -0.922 -0.328 -0.500
#> 14 2015-06-23 12:46:03.841+0100 [Europe/London] -0.922 -0.328 -0.500
#> 15 2015-06-23 12:46:03.851+0100 [Europe/London] -0.922 -0.328 -0.500
#> 16 2015-06-23 12:46:03.861+0100 [Europe/London] -0.922 -0.328 -0.500
#> 17 2015-06-23 12:46:03.871+0100 [Europe/London] -0.922 -0.328 -0.500
#> 18 2015-06-23 12:46:03.881+0100 [Europe/London] -0.922 -0.328 -0.500
#> 19 2015-06-23 12:46:03.891+0100 [Europe/London] -0.922 -0.328 -0.500
#> 20 2015-06-23 12:46:03.901+0100 [Europe/London] -0.922 -0.328 -0.500
#> 21 2015-06-23 12:46:03.911+0100 [Europe/London] -0.922 -0.328 -0.500
#> 22 2015-06-23 12:46:03.921+0100 [Europe/London] -0.922 -0.328 -0.500
#> 23 2015-06-23 12:46:03.931+0100 [Europe/London] -0.922 -0.328 -0.500
#> 24 2015-06-23 12:46:03.941+0100 [Europe/London] -0.922 -0.328 -0.500
#> 25 2015-06-23 12:46:03.951+0100 [Europe/London] -0.922 -0.328 -0.500
#> 26 2015-06-23 12:46:03.961+0100 [Europe/London] -0.922 -0.328 -0.500
csv = csv %>% 
  mutate(time = substr(time, 1, 23),
         time = lubridate::as_datetime(time, 
                                       tz = "Europe/London"))
# csvlines is a character vector holding only the 50 lines read above, so it
# takes a single index relative to that window, not the original row numbers.
csvlines[1:26]

as.data.frame(csv[22655995:22656020,])
#>                   time      x      y      z
#> 1  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 2  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 3  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 4  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 5  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 6  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 7  2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 8  2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 9  2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 10 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 11 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 12 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 13 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 14 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 15 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 16 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 17 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 18 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 19 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 20 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 21 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 22 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 23 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 24 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 25 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 26 2015-06-23 12:46:03 -0.922 -0.328 -0.500
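
One way to quantify the skip is to compare the timestamp a row would have under uniform sampling with the timestamp actually written. This sketch assumes 100 Hz sampling (inferred from the 10 ms spacing above) and uses row 22656001, the first row after the jump in the slice printed above.

```r
sample_rate_hz <- 100    # assumption, from the 10 ms spacing
# Row 1 timestamp from head(csv); UTC is used only to keep the arithmetic simple.
start <- as.POSIXct("2015-06-19 10:00:03.771", tz = "UTC")
row_index <- 22656001    # first row after the jump (row 7 of the printed slice)

# Timestamp this row would carry if every sample were exactly 10 ms apart.
expected <- start + (row_index - 1) / sample_rate_hz
observed <- as.POSIXct("2015-06-23 12:46:03.771", tz = "UTC")

skipped_secs <- round(as.numeric(difftime(observed, expected, units = "secs")))
skipped_hours <- skipped_secs / 3600
format(expected, "%Y-%m-%d %H:%M:%S")
```

Under these assumptions the expected time falls one sample after the last pre-jump row, so the timestamps are uniform up to the jump and the whole discrepancy appears at that single row.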

cwa-convert

from https://github.com/digitalinteraction/openmovement/tree/master/Software/AX3/cwa-convert

con_fname = sub("cwa", "csv", fname)
con_fname = file.path(dirname(con_fname), paste0("cwaconvert_", basename(con_fname)))
con = readr::read_csv(con_fname, col_names = FALSE)
#> 
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   X1 = col_datetime(format = ""),
#>   X2 = col_double(),
#>   X3 = col_double(),
#>   X4 = col_double()
#> )
colnames(con) = c("time", xyz)
dim(con)
#> [1] 29245200        4
head(con)
#> # A tibble: 6 x 4
#>   time                     x     y      z
#>   <dttm>               <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03 -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03 -0.422 0.25   0.703
#> 3 2015-06-19 10:00:03 -0.422 0.281  0.734
#> 4 2015-06-19 10:00:03 -0.406 0.297  0.734
#> 5 2015-06-19 10:00:03 -0.391 0.312  0.734
#> 6 2015-06-19 10:00:03 -0.375 0.328  0.734

as.data.frame(con[22656010:22656030,])
#>                   time        x        y        z
#> 1  2015-06-23 14:29:32 0.484375 0.656250 0.406250
#> 2  2015-06-23 14:29:32 0.484375 0.640625 0.406250
#> 3  2015-06-23 14:29:32 0.484375 0.640625 0.406250
#> 4  2015-06-23 14:29:32 0.484375 0.640625 0.421875
#> 5  2015-06-23 14:29:32 0.484375 0.625000 0.406250
#> 6  2015-06-23 14:29:32 0.484375 0.625000 0.406250
#> 7  2015-06-23 14:29:32 0.500000 0.640625 0.390625
#> 8  2015-06-23 14:29:32 0.500000 0.625000 0.406250
#> 9  2015-06-23 14:29:32 0.500000 0.625000 0.406250
#> 10 2015-06-23 14:29:32 0.500000 0.625000 0.406250
#> 11 2015-06-23 14:29:32 0.515625 0.625000 0.406250
#> 12 2015-06-23 14:29:32 0.515625 0.625000 0.406250
#> 13 2015-06-23 14:29:32 0.515625 0.625000 0.406250
#> 14 2015-06-23 14:29:32 0.515625 0.625000 0.390625
#> 15 2015-06-23 14:29:32 0.515625 0.625000 0.406250
#> 16 2015-06-23 14:29:32 0.531250 0.625000 0.406250
#> 17 2015-06-23 14:29:32 0.531250 0.625000 0.406250
#> 18 2015-06-23 14:29:32 0.531250 0.609375 0.390625
#> 19 2015-06-23 14:29:32 0.546875 0.609375 0.406250
#> 20 2015-06-23 14:29:32 0.546875 0.609375 0.406250
#> 21 2015-06-23 14:29:32 0.546875 0.609375 0.406250

Comparison

head(csv)
#> # A tibble: 6 x 4
#>   time                     x     y      z
#>   <dttm>               <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03 -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03 -0.422 0.25   0.703
#> 3 2015-06-19 10:00:03 -0.422 0.281  0.734
#> 4 2015-06-19 10:00:03 -0.406 0.297  0.734
#> 5 2015-06-19 10:00:03 -0.391 0.313  0.734
#> 6 2015-06-19 10:00:03 -0.375 0.328  0.734
head(con)
#> # A tibble: 6 x 4
#>   time                     x     y      z
#>   <dttm>               <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03 -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03 -0.422 0.25   0.703
#> 3 2015-06-19 10:00:03 -0.422 0.281  0.734
#> 4 2015-06-19 10:00:03 -0.406 0.297  0.734
#> 5 2015-06-19 10:00:03 -0.391 0.312  0.734
#> 6 2015-06-19 10:00:03 -0.375 0.328  0.734
round(head(con[, xyz]) - head(csv[, xyz]), 3)
#>   x      y z
#> 1 0  0.000 0
#> 2 0  0.000 0
#> 3 0  0.000 0
#> 4 0  0.000 0
#> 5 0 -0.001 0
#> 6 0  0.000 0
round(head(con[, xyz]) - head(data[, xyz]), 3)
#>        x      y     z
#> 1  0.000  0.000 0.000
#> 2 -0.002 -0.008 0.023
#> 3  0.000  0.001 0.001
#> 4  0.001  0.001 0.000
#> 5  0.001  0.001 0.000
#> 6  0.001  0.001 0.000
head(data)
#> # A tibble: 6 x 4
#>   time                     x     y      z
#>   <dttm>               <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03 -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03 -0.420 0.258  0.680
#> 3 2015-06-19 10:00:03 -0.422 0.280  0.733
#> 4 2015-06-19 10:00:03 -0.407 0.296  0.734
#> 5 2015-06-19 10:00:03 -0.392 0.312  0.734
#> 6 2015-06-19 10:00:03 -0.376 0.327  0.734

Created on 2020-10-21 by the reprex package (v0.3.0.9001)

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.2 (2020-06-22)
#>  os       macOS Catalina 10.15.6      
#>  system   x86_64, darwin17.0          
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       America/New_York            
#>  date     2020-10-21                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source                           
#>  assertthat    0.2.1      2019-03-21 [2] CRAN (R 4.0.0)                   
#>  backports     1.1.10     2020-09-15 [1] CRAN (R 4.0.2)                   
#>  cli           2.1.0      2020-10-12 [1] CRAN (R 4.0.2)                   
#>  crayon        1.3.4      2017-09-16 [2] CRAN (R 4.0.0)                   
#>  data.table    1.13.0     2020-07-24 [2] CRAN (R 4.0.2)                   
#>  digest        0.6.26     2020-10-17 [1] CRAN (R 4.0.2)                   
#>  dplyr       * 1.0.2      2020-08-18 [2] CRAN (R 4.0.2)                   
#>  ellipsis      0.3.1      2020-05-15 [2] CRAN (R 4.0.0)                   
#>  evaluate      0.14       2019-05-28 [2] CRAN (R 4.0.0)                   
#>  fansi         0.4.1      2020-01-08 [2] CRAN (R 4.0.0)                   
#>  fs            1.5.0      2020-07-31 [2] CRAN (R 4.0.2)                   
#>  generics      0.0.2      2018-11-29 [2] CRAN (R 4.0.0)                   
#>  GGIR        * 2.0-1      2020-06-01 [2] Github (wadpac/GGIR@2bd3c76)     
#>  glue          1.4.2      2020-08-27 [1] CRAN (R 4.0.2)                   
#>  highr         0.8        2019-03-20 [2] CRAN (R 4.0.0)                   
#>  hms           0.5.3      2020-01-08 [2] CRAN (R 4.0.0)                   
#>  htmltools     0.5.0      2020-06-16 [2] CRAN (R 4.0.0)                   
#>  knitr         1.30       2020-09-22 [1] CRAN (R 4.0.2)                   
#>  lifecycle     0.2.0      2020-03-06 [2] CRAN (R 4.0.0)                   
#>  lubridate     1.7.9      2020-06-08 [2] CRAN (R 4.0.0)                   
#>  magrittr      1.5        2014-11-22 [2] CRAN (R 4.0.0)                   
#>  pillar        1.4.6      2020-07-10 [2] CRAN (R 4.0.2)                   
#>  pkgconfig     2.0.3      2019-09-22 [2] CRAN (R 4.0.0)                   
#>  purrr         0.3.4      2020-04-17 [2] CRAN (R 4.0.0)                   
#>  R.methodsS3   1.8.1      2020-08-26 [1] CRAN (R 4.0.2)                   
#>  R.oo          1.24.0     2020-08-26 [1] CRAN (R 4.0.2)                   
#>  R.utils       2.10.1     2020-08-26 [1] CRAN (R 4.0.2)                   
#>  R6            2.4.1      2019-11-12 [2] CRAN (R 4.0.0)                   
#>  Rcpp          1.0.5      2020-07-06 [2] CRAN (R 4.0.0)                   
#>  readr       * 1.4.0      2020-10-05 [1] CRAN (R 4.0.2)                   
#>  reprex        0.3.0.9001 2020-09-30 [1] Github (tidyverse/reprex@d3fc4b8)
#>  rlang         0.4.8.9000 2020-10-20 [1] Github (r-lib/rlang@011cb4c)     
#>  rmarkdown     2.4        2020-09-30 [1] CRAN (R 4.0.2)                   
#>  rstudioapi    0.11       2020-02-07 [2] CRAN (R 4.0.0)                   
#>  sessioninfo   1.1.1      2018-11-05 [2] CRAN (R 4.0.0)                   
#>  stringi       1.5.3      2020-09-09 [1] CRAN (R 4.0.2)                   
#>  stringr       1.4.0      2019-02-10 [2] CRAN (R 4.0.0)                   
#>  styler        1.3.2      2020-02-23 [2] CRAN (R 4.0.0)                   
#>  tibble        3.0.4      2020-10-12 [1] CRAN (R 4.0.2)                   
#>  tidyselect    1.1.0      2020-05-11 [2] CRAN (R 4.0.0)                   
#>  utf8          1.1.4      2018-05-24 [2] CRAN (R 4.0.0)                   
#>  vctrs         0.3.4      2020-08-29 [1] CRAN (R 4.0.2)                   
#>  withr         2.3.0      2020-09-22 [1] CRAN (R 4.0.2)                   
#>  xfun          0.18       2020-09-29 [1] CRAN (R 4.0.2)                   
#>  yaml          2.2.1      2020-02-01 [2] CRAN (R 4.0.0)                   
#> 
#> [1] /Users/johnmuschelli/Library/R/4.0/library
#> [2] /Library/Frameworks/R.framework/Versions/4.0/Resources/library

@aidendoherty
Member

Hi @muschellij2 - thanks for sharing this. Since each package reads a different number of samples from the raw file, it is possible that this particular CWA file contains a faulty time block header that the packages handle differently. Would it be possible to share a copy of this file with us so that we can look at it in greater depth?
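One way to test the faulty-header idea is to decode the timestamp of every data block and flag any that are invalid or go backwards. The sketch below is an illustration only: it assumes the packed 32-bit date/time layout used by the Open Movement CWA format (6 bits year since 2000, 4 bits month, 5 bits day, 5 bits hour, 6 bits minute, 6 bits second), and `pack_timestamp`, `unpack_timestamp` and `suspicious_blocks` are hypothetical helper names, not functions from any of the packages discussed here. Reading the raw values out of the file is left out.

```python
from datetime import datetime

def pack_timestamp(t):
    """Pack a datetime into the assumed 32-bit CWA layout (used here only to build test values)."""
    return (((t.year - 2000) & 0x3F) << 26 | (t.month & 0x0F) << 22 |
            (t.day & 0x1F) << 17 | (t.hour & 0x1F) << 12 |
            (t.minute & 0x3F) << 6 | (t.second & 0x3F))

def unpack_timestamp(value):
    """Decode a packed timestamp; return None if the fields do not form a valid date."""
    try:
        return datetime(2000 + ((value >> 26) & 0x3F),
                        (value >> 22) & 0x0F,
                        (value >> 17) & 0x1F,
                        (value >> 12) & 0x1F,
                        (value >> 6) & 0x3F,
                        value & 0x3F)
    except ValueError:
        return None

def suspicious_blocks(packed_values):
    """Return indices of blocks whose timestamp is invalid or earlier than the last valid one."""
    flagged = []
    previous = None
    for index, value in enumerate(packed_values):
        current = unpack_timestamp(value)
        if current is None or (previous is not None and current < previous):
            flagged.append(index)
        if current is not None:
            previous = current
    return flagged

# Synthetic example: two plausible blocks, one that goes backwards, one all-zero header.
first = pack_timestamp(datetime(2015, 6, 22, 0, 56, 3))
second = pack_timestamp(datetime(2015, 6, 23, 12, 46, 3))
print(suspicious_blocks([first, second, first, 0]))
```

A forward jump such as the one in this file is not flagged by this check; it only catches headers that cannot be valid or that run backwards in time.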

It would be best to email a secure link to either myself ([email protected]) or @chanshing ([email protected])

@vincentvanhees

vincentvanhees commented Oct 21, 2020

Not sure if this is related, but just to let you know that in GGIR an issue recently emerged with reading the AX3 cwa demo file from the Axivity website, which I resolved here. The problem was the extraction of sample frequency from page headers.

@vincentvanhees

vincentvanhees commented Oct 21, 2020

... this was surprising because others and I have processed that demo file with GGIR without problems in the past. So file corruption was also my first thought.

@muschellij2
Author

muschellij2 commented Oct 22, 2020

I believe it still persists - just tried on the dev version of GGIR.

# python3 accProcess.py --skipCalibration=True --rawOutput=True data/example_90001_0_0.cwa.gz

setwd("~/Dropbox/Packages/biobankAccelerometerAnalysis/")
library(readr)
library(GGIR)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
fname = normalizePath("data/example_90001_0_0.cwa.gz")
xyz = c("x", "y", "z")

GGIR

Look at row 16 onward:

tfile = R.utils::gunzip(fname, 
                        remove = FALSE, temporary = TRUE)

out = GGIR::g.cwaread(tfile, end = Inf)
data = out$data[, c("time", xyz)]
data = tibble::as_tibble(data)
data$time = as.POSIXct(data$time, origin = "1970-01-01")
dim(data)
#> [1] 29864812        4
as.data.frame(data[22656010:22656030,])
#>                   time        x         y          z
#> 1  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 2  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 3  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 4  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 5  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 6  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 7  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 8  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 9  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 10 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 11 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 12 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 13 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 14 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 15 2015-06-22 00:56:04 -0.50000  0.359375   0.609375
#> 16 2015-06-22 00:56:04 48.11477 76.729466 123.842931
#> 17 2015-06-22 00:56:04 48.09753 76.702373 123.799213
#> 18 2015-06-22 00:56:04 48.08029 76.675281 123.755496
#> 19 2015-06-22 00:56:04 48.06305 76.648189 123.711779
#> 20 2015-06-22 00:56:04 48.04581 76.621096 123.668062
#> 21 2015-06-22 00:56:04 48.02857 76.594004 123.624344
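
Rows like these can be flagged automatically with a simple plausibility check on the magnitudes. A minimal sketch, assuming a limit of 16 g as the widest plausible accelerometer range (an assumption; the configured range of this device is not stated in the thread). `flag_out_of_range` is a hypothetical helper, not part of GGIR:

```python
MAX_ABS_G = 16.0  # assumption: widest plausible accelerometer range, in g

def flag_out_of_range(rows, limit=MAX_ABS_G):
    """Return indices of (x, y, z) rows where any axis exceeds the limit in magnitude."""
    return [i for i, row in enumerate(rows) if any(abs(v) > limit for v in row)]

# Rows 15 to 17 of the GGIR output above.
samples = [
    (-0.50000, 0.359375, 0.609375),
    (48.11477, 76.729466, 123.842931),
    (48.09753, 76.702373, 123.799213),
]
print(flag_out_of_range(samples))
```

The check says nothing about why the values are wrong; it only marks where to look.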

biobankAccelerometerAnalysis

The timestamps jump by roughly 36 hours between rows 5 and 6 (from 2015-06-22 00:56:03.761 to 2015-06-23 12:46:03.771).

csv_file = R.utils::gunzip(sub("cwa", "csv", fname), 
                           remove = FALSE, temporary = TRUE)
csvlines = readr::read_lines(csv_file, skip = 22655996, n_max = 50)
csvlines
#>  [1] "2015-06-22 00:56:03.721+0100 [Europe/London],-0.484,0.359,0.609"  
#>  [2] "2015-06-22 00:56:03.731+0100 [Europe/London],-0.484,0.359,0.609"  
#>  [3] "2015-06-22 00:56:03.741+0100 [Europe/London],-0.484,0.359,0.609"  
#>  [4] "2015-06-22 00:56:03.751+0100 [Europe/London],-0.484,0.359,0.609"  
#>  [5] "2015-06-22 00:56:03.761+0100 [Europe/London],-0.484,0.359,0.609"  
#>  [6] "2015-06-23 12:46:03.771+0100 [Europe/London],-0.922,-0.328,-0.500"
#>  [7] "2015-06-23 12:46:03.781+0100 [Europe/London],-0.922,-0.328,-0.500"
#>  [8] "2015-06-23 12:46:03.791+0100 [Europe/London],-0.922,-0.328,-0.500"
#>  [9] "2015-06-23 12:46:03.801+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [10] "2015-06-23 12:46:03.811+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [11] "2015-06-23 12:46:03.821+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [12] "2015-06-23 12:46:03.831+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [13] "2015-06-23 12:46:03.841+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [14] "2015-06-23 12:46:03.851+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [15] "2015-06-23 12:46:03.861+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [16] "2015-06-23 12:46:03.871+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [17] "2015-06-23 12:46:03.881+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [18] "2015-06-23 12:46:03.891+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [19] "2015-06-23 12:46:03.901+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [20] "2015-06-23 12:46:03.911+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [21] "2015-06-23 12:46:03.921+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [22] "2015-06-23 12:46:03.931+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [23] "2015-06-23 12:46:03.941+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [24] "2015-06-23 12:46:03.951+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [25] "2015-06-23 12:46:03.961+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [26] "2015-06-23 12:46:03.971+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [27] "2015-06-23 12:46:03.981+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [28] "2015-06-23 12:46:03.991+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [29] "2015-06-23 12:46:04.001+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [30] "2015-06-23 12:46:04.011+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [31] "2015-06-23 12:46:04.021+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [32] "2015-06-23 12:46:04.031+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [33] "2015-06-23 12:46:04.041+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [34] "2015-06-23 12:46:04.051+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [35] "2015-06-23 12:46:04.061+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [36] "2015-06-23 12:46:04.071+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [37] "2015-06-23 12:46:04.081+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [38] "2015-06-23 12:46:04.091+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [39] "2015-06-23 12:46:04.101+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [40] "2015-06-23 12:46:04.111+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [41] "2015-06-23 12:46:04.121+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [42] "2015-06-23 12:46:04.131+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [43] "2015-06-23 12:46:04.141+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [44] "2015-06-23 12:46:04.151+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [45] "2015-06-23 12:46:04.161+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [46] "2015-06-23 12:46:04.171+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [47] "2015-06-23 12:46:04.181+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [48] "2015-06-23 12:46:04.191+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [49] "2015-06-23 12:46:04.201+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [50] "2015-06-23 12:46:04.211+0100 [Europe/London],-0.922,-0.328,-0.500"
csv = readr::read_csv(csv_file)
#> 
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   time = col_character(),
#>   x = col_double(),
#>   y = col_double(),
#>   z = col_double()
#> )
dim(csv)
#> [1] 29856000        4
head(csv)
#> # A tibble: 6 x 4
#>   time                                              x     y      z
#>   <chr>                                         <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03.771+0100 [Europe/London] -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03.781+0100 [Europe/London] -0.422 0.25   0.703
#> 3 2015-06-19 10:00:03.791+0100 [Europe/London] -0.422 0.281  0.734
#> 4 2015-06-19 10:00:03.801+0100 [Europe/London] -0.406 0.297  0.734
#> 5 2015-06-19 10:00:03.811+0100 [Europe/London] -0.391 0.313  0.734
#> 6 2015-06-19 10:00:03.821+0100 [Europe/London] -0.375 0.328  0.734

as.data.frame(csv[22655995:22656020,])
#>                                            time      x      y      z
#> 1  2015-06-22 00:56:03.711+0100 [Europe/London] -0.484  0.359  0.609
#> 2  2015-06-22 00:56:03.721+0100 [Europe/London] -0.484  0.359  0.609
#> 3  2015-06-22 00:56:03.731+0100 [Europe/London] -0.484  0.359  0.609
#> 4  2015-06-22 00:56:03.741+0100 [Europe/London] -0.484  0.359  0.609
#> 5  2015-06-22 00:56:03.751+0100 [Europe/London] -0.484  0.359  0.609
#> 6  2015-06-22 00:56:03.761+0100 [Europe/London] -0.484  0.359  0.609
#> 7  2015-06-23 12:46:03.771+0100 [Europe/London] -0.922 -0.328 -0.500
#> 8  2015-06-23 12:46:03.781+0100 [Europe/London] -0.922 -0.328 -0.500
#> 9  2015-06-23 12:46:03.791+0100 [Europe/London] -0.922 -0.328 -0.500
#> 10 2015-06-23 12:46:03.801+0100 [Europe/London] -0.922 -0.328 -0.500
#> 11 2015-06-23 12:46:03.811+0100 [Europe/London] -0.922 -0.328 -0.500
#> 12 2015-06-23 12:46:03.821+0100 [Europe/London] -0.922 -0.328 -0.500
#> 13 2015-06-23 12:46:03.831+0100 [Europe/London] -0.922 -0.328 -0.500
#> 14 2015-06-23 12:46:03.841+0100 [Europe/London] -0.922 -0.328 -0.500
#> 15 2015-06-23 12:46:03.851+0100 [Europe/London] -0.922 -0.328 -0.500
#> 16 2015-06-23 12:46:03.861+0100 [Europe/London] -0.922 -0.328 -0.500
#> 17 2015-06-23 12:46:03.871+0100 [Europe/London] -0.922 -0.328 -0.500
#> 18 2015-06-23 12:46:03.881+0100 [Europe/London] -0.922 -0.328 -0.500
#> 19 2015-06-23 12:46:03.891+0100 [Europe/London] -0.922 -0.328 -0.500
#> 20 2015-06-23 12:46:03.901+0100 [Europe/London] -0.922 -0.328 -0.500
#> 21 2015-06-23 12:46:03.911+0100 [Europe/London] -0.922 -0.328 -0.500
#> 22 2015-06-23 12:46:03.921+0100 [Europe/London] -0.922 -0.328 -0.500
#> 23 2015-06-23 12:46:03.931+0100 [Europe/London] -0.922 -0.328 -0.500
#> 24 2015-06-23 12:46:03.941+0100 [Europe/London] -0.922 -0.328 -0.500
#> 25 2015-06-23 12:46:03.951+0100 [Europe/London] -0.922 -0.328 -0.500
#> 26 2015-06-23 12:46:03.961+0100 [Europe/London] -0.922 -0.328 -0.500

csv = csv %>% 
  mutate(time = substr(time, 1, 23),
         time = lubridate::as_datetime(time, 
                                       tz = "Europe/London"))
csvlines[1:50]
#>  [1] "2015-06-22 00:56:03.721+0100 [Europe/London],-0.484,0.359,0.609"  
#>  [2] "2015-06-22 00:56:03.731+0100 [Europe/London],-0.484,0.359,0.609"  
#>  [3] "2015-06-22 00:56:03.741+0100 [Europe/London],-0.484,0.359,0.609"  
#>  [4] "2015-06-22 00:56:03.751+0100 [Europe/London],-0.484,0.359,0.609"  
#>  [5] "2015-06-22 00:56:03.761+0100 [Europe/London],-0.484,0.359,0.609"  
#>  [6] "2015-06-23 12:46:03.771+0100 [Europe/London],-0.922,-0.328,-0.500"
#>  [7] "2015-06-23 12:46:03.781+0100 [Europe/London],-0.922,-0.328,-0.500"
#>  [8] "2015-06-23 12:46:03.791+0100 [Europe/London],-0.922,-0.328,-0.500"
#>  [9] "2015-06-23 12:46:03.801+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [10] "2015-06-23 12:46:03.811+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [11] "2015-06-23 12:46:03.821+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [12] "2015-06-23 12:46:03.831+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [13] "2015-06-23 12:46:03.841+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [14] "2015-06-23 12:46:03.851+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [15] "2015-06-23 12:46:03.861+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [16] "2015-06-23 12:46:03.871+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [17] "2015-06-23 12:46:03.881+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [18] "2015-06-23 12:46:03.891+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [19] "2015-06-23 12:46:03.901+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [20] "2015-06-23 12:46:03.911+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [21] "2015-06-23 12:46:03.921+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [22] "2015-06-23 12:46:03.931+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [23] "2015-06-23 12:46:03.941+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [24] "2015-06-23 12:46:03.951+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [25] "2015-06-23 12:46:03.961+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [26] "2015-06-23 12:46:03.971+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [27] "2015-06-23 12:46:03.981+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [28] "2015-06-23 12:46:03.991+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [29] "2015-06-23 12:46:04.001+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [30] "2015-06-23 12:46:04.011+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [31] "2015-06-23 12:46:04.021+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [32] "2015-06-23 12:46:04.031+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [33] "2015-06-23 12:46:04.041+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [34] "2015-06-23 12:46:04.051+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [35] "2015-06-23 12:46:04.061+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [36] "2015-06-23 12:46:04.071+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [37] "2015-06-23 12:46:04.081+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [38] "2015-06-23 12:46:04.091+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [39] "2015-06-23 12:46:04.101+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [40] "2015-06-23 12:46:04.111+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [41] "2015-06-23 12:46:04.121+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [42] "2015-06-23 12:46:04.131+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [43] "2015-06-23 12:46:04.141+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [44] "2015-06-23 12:46:04.151+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [45] "2015-06-23 12:46:04.161+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [46] "2015-06-23 12:46:04.171+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [47] "2015-06-23 12:46:04.181+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [48] "2015-06-23 12:46:04.191+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [49] "2015-06-23 12:46:04.201+0100 [Europe/London],-0.922,-0.328,-0.500"
#> [50] "2015-06-23 12:46:04.211+0100 [Europe/London],-0.922,-0.328,-0.500"

as.data.frame(csv[22655995:22656020,])
#>                   time      x      y      z
#> 1  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 2  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 3  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 4  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 5  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 6  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 7  2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 8  2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 9  2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 10 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 11 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 12 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 13 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 14 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 15 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 16 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 17 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 18 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 19 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 20 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 21 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 22 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 23 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 24 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 25 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 26 2015-06-23 12:46:03 -0.922 -0.328 -0.500

cwa-convert

from https://github.com/digitalinteraction/openmovement/tree/master/Software/AX3/cwa-convert

con_fname = sub("cwa", "csv", fname)
con_fname = file.path(dirname(con_fname), paste0("cwaconvert_", basename(con_fname)))
con = readr::read_csv(con_fname, col_names = FALSE)
#> 
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   X1 = col_datetime(format = ""),
#>   X2 = col_double(),
#>   X3 = col_double(),
#>   X4 = col_double()
#> )

colnames(con) = c("time", xyz)
dim(con)
#> [1] 29245200        4
head(con)
#> # A tibble: 6 x 4
#>   time                     x     y      z
#>   <dttm>               <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03 -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03 -0.422 0.25   0.703
#> 3 2015-06-19 10:00:03 -0.422 0.281  0.734
#> 4 2015-06-19 10:00:03 -0.406 0.297  0.734
#> 5 2015-06-19 10:00:03 -0.391 0.312  0.734
#> 6 2015-06-19 10:00:03 -0.375 0.328  0.734

as.data.frame(con[22656010:22656030,])
#>                   time        x        y        z
#> 1  2015-06-23 14:29:32 0.484375 0.656250 0.406250
#> 2  2015-06-23 14:29:32 0.484375 0.640625 0.406250
#> 3  2015-06-23 14:29:32 0.484375 0.640625 0.406250
#> 4  2015-06-23 14:29:32 0.484375 0.640625 0.421875
#> 5  2015-06-23 14:29:32 0.484375 0.625000 0.406250
#> 6  2015-06-23 14:29:32 0.484375 0.625000 0.406250
#> 7  2015-06-23 14:29:32 0.500000 0.640625 0.390625
#> 8  2015-06-23 14:29:32 0.500000 0.625000 0.406250
#> 9  2015-06-23 14:29:32 0.500000 0.625000 0.406250
#> 10 2015-06-23 14:29:32 0.500000 0.625000 0.406250
#> 11 2015-06-23 14:29:32 0.515625 0.625000 0.406250
#> 12 2015-06-23 14:29:32 0.515625 0.625000 0.406250
#> 13 2015-06-23 14:29:32 0.515625 0.625000 0.406250
#> 14 2015-06-23 14:29:32 0.515625 0.625000 0.390625
#> 15 2015-06-23 14:29:32 0.515625 0.625000 0.406250
#> 16 2015-06-23 14:29:32 0.531250 0.625000 0.406250
#> 17 2015-06-23 14:29:32 0.531250 0.625000 0.406250
#> 18 2015-06-23 14:29:32 0.531250 0.609375 0.390625
#> 19 2015-06-23 14:29:32 0.546875 0.609375 0.406250
#> 20 2015-06-23 14:29:32 0.546875 0.609375 0.406250
#> 21 2015-06-23 14:29:32 0.546875 0.609375 0.406250

Comparison

head(csv)
#> # A tibble: 6 x 4
#>   time                     x     y      z
#>   <dttm>               <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03 -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03 -0.422 0.25   0.703
#> 3 2015-06-19 10:00:03 -0.422 0.281  0.734
#> 4 2015-06-19 10:00:03 -0.406 0.297  0.734
#> 5 2015-06-19 10:00:03 -0.391 0.313  0.734
#> 6 2015-06-19 10:00:03 -0.375 0.328  0.734
head(con)
#> # A tibble: 6 x 4
#>   time                     x     y      z
#>   <dttm>               <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03 -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03 -0.422 0.25   0.703
#> 3 2015-06-19 10:00:03 -0.422 0.281  0.734
#> 4 2015-06-19 10:00:03 -0.406 0.297  0.734
#> 5 2015-06-19 10:00:03 -0.391 0.312  0.734
#> 6 2015-06-19 10:00:03 -0.375 0.328  0.734
round(head(con[, xyz]) - head(csv[, xyz]), 3)
#>   x      y z
#> 1 0  0.000 0
#> 2 0  0.000 0
#> 3 0  0.000 0
#> 4 0  0.000 0
#> 5 0 -0.001 0
#> 6 0  0.000 0
round(head(con[, xyz]) - head(data[, xyz]), 3)
#>        x      y     z
#> 1  0.000  0.000 0.000
#> 2 -0.002 -0.008 0.023
#> 3  0.000  0.001 0.001
#> 4  0.001  0.001 0.000
#> 5  0.001  0.001 0.000
#> 6  0.001  0.001 0.000
head(data)
#> # A tibble: 6 x 4
#>   time                     x     y      z
#>   <dttm>               <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03 -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03 -0.420 0.258  0.680
#> 3 2015-06-19 10:00:03 -0.422 0.280  0.733
#> 4 2015-06-19 10:00:03 -0.407 0.296  0.734
#> 5 2015-06-19 10:00:03 -0.392 0.312  0.734
#> 6 2015-06-19 10:00:03 -0.376 0.327  0.734

Created on 2020-10-22 by the reprex package (v0.3.0.9001)

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.2 (2020-06-22)
#>  os       macOS Catalina 10.15.6      
#>  system   x86_64, darwin17.0          
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       America/New_York            
#>  date     2020-10-22                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source                           
#>  assertthat    0.2.1      2019-03-21 [2] CRAN (R 4.0.0)                   
#>  backports     1.1.10     2020-09-15 [1] CRAN (R 4.0.2)                   
#>  cli           2.1.0      2020-10-12 [1] CRAN (R 4.0.2)                   
#>  crayon        1.3.4      2017-09-16 [2] CRAN (R 4.0.0)                   
#>  data.table    1.13.2     2020-10-19 [1] CRAN (R 4.0.2)                   
#>  digest        0.6.26     2020-10-17 [1] CRAN (R 4.0.2)                   
#>  dplyr       * 1.0.2      2020-08-18 [2] CRAN (R 4.0.2)                   
#>  ellipsis      0.3.1      2020-05-15 [2] CRAN (R 4.0.0)                   
#>  evaluate      0.14       2019-05-28 [2] CRAN (R 4.0.0)                   
#>  fansi         0.4.1      2020-01-08 [2] CRAN (R 4.0.0)                   
#>  fs            1.5.0      2020-07-31 [2] CRAN (R 4.0.2)                   
#>  generics      0.0.2      2018-11-29 [2] CRAN (R 4.0.0)                   
#>  GGIR        * 2.1-3      2020-10-22 [1] Github (wadpac/GGIR@49aedcd)     
#>  glue          1.4.2      2020-08-27 [1] CRAN (R 4.0.2)                   
#>  highr         0.8        2019-03-20 [2] CRAN (R 4.0.0)                   
#>  hms           0.5.3      2020-01-08 [2] CRAN (R 4.0.0)                   
#>  htmltools     0.5.0      2020-06-16 [2] CRAN (R 4.0.0)                   
#>  knitr         1.30       2020-09-22 [1] CRAN (R 4.0.2)                   
#>  lifecycle     0.2.0      2020-03-06 [2] CRAN (R 4.0.0)                   
#>  lubridate     1.7.9      2020-06-08 [2] CRAN (R 4.0.0)                   
#>  magrittr      1.5        2014-11-22 [2] CRAN (R 4.0.0)                   
#>  pillar        1.4.6      2020-07-10 [2] CRAN (R 4.0.2)                   
#>  pkgconfig     2.0.3      2019-09-22 [2] CRAN (R 4.0.0)                   
#>  purrr         0.3.4      2020-04-17 [2] CRAN (R 4.0.0)                   
#>  R.methodsS3   1.8.1      2020-08-26 [1] CRAN (R 4.0.2)                   
#>  R.oo          1.24.0     2020-08-26 [1] CRAN (R 4.0.2)                   
#>  R.utils       2.10.1     2020-08-26 [1] CRAN (R 4.0.2)                   
#>  R6            2.4.1      2019-11-12 [2] CRAN (R 4.0.0)                   
#>  Rcpp          1.0.5      2020-07-06 [2] CRAN (R 4.0.0)                   
#>  readr       * 1.4.0      2020-10-05 [1] CRAN (R 4.0.2)                   
#>  reprex        0.3.0.9001 2020-09-30 [1] Github (tidyverse/reprex@d3fc4b8)
#>  rlang         0.4.8.9000 2020-10-22 [1] Github (r-lib/rlang@7a36238)     
#>  rmarkdown     2.4        2020-09-30 [1] CRAN (R 4.0.2)                   
#>  rstudioapi    0.11       2020-02-07 [2] CRAN (R 4.0.0)                   
#>  sessioninfo   1.1.1      2018-11-05 [2] CRAN (R 4.0.0)                   
#>  stringi       1.5.3      2020-09-09 [1] CRAN (R 4.0.2)                   
#>  stringr       1.4.0      2019-02-10 [2] CRAN (R 4.0.0)                   
#>  styler        1.3.2      2020-02-23 [2] CRAN (R 4.0.0)                   
#>  tibble        3.0.4      2020-10-12 [1] CRAN (R 4.0.2)                   
#>  tidyselect    1.1.0      2020-05-11 [2] CRAN (R 4.0.0)                   
#>  utf8          1.1.4      2018-05-24 [2] CRAN (R 4.0.0)                   
#>  vctrs         0.3.4      2020-08-29 [1] CRAN (R 4.0.2)                   
#>  withr         2.3.0      2020-09-22 [1] CRAN (R 4.0.2)                   
#>  xfun          0.18       2020-09-29 [1] CRAN (R 4.0.2)                   
#>  yaml          2.2.1      2020-02-01 [2] CRAN (R 4.0.0)                   
#> 
#> [1] /Users/johnmuschelli/Library/R/4.0/library
#> [2] /Library/Frameworks/R.framework/Versions/4.0/Resources/library

@chanshing
Member

chanshing commented Dec 18, 2020

The timestamps go from 12:56 AM to 12:46 PM the next day. I'm not sure why this would be, or whether it is correct, but it seems odd.

csv[22655995:22656020,]
#> # A tibble: 26 x 4
#>    time                     x      y      z
#>    <dttm>               <dbl>  <dbl>  <dbl>
#>  1 2015-06-22 00:56:03 -0.484  0.359  0.609
#>  2 2015-06-22 00:56:03 -0.484  0.359  0.609
#>  3 2015-06-22 00:56:03 -0.484  0.359  0.609
#>  4 2015-06-22 00:56:03 -0.484  0.359  0.609
#>  5 2015-06-22 00:56:03 -0.484  0.359  0.609
#>  6 2015-06-22 00:56:03 -0.484  0.359  0.609
#>  7 2015-06-23 12:46:03 -0.922 -0.328 -0.5  
#>  8 2015-06-23 12:46:03 -0.922 -0.328 -0.5  
#>  9 2015-06-23 12:46:03 -0.922 -0.328 -0.5  
#> 10 2015-06-23 12:46:03 -0.922 -0.328 -0.5  
#> # … with 16 more rows

Created on 2020-10-21 by the reprex package (v0.3.0.9001)

Session info

Hi @muschellij2

Sorry for the late reply. I looked into the file you mentioned. The skips mean that the recording was interrupted during wear. This can happen if the person plugged the device into a computer or, in the worst case, if the device malfunctioned. In this case, it seems likely that the device malfunctioned. Here's a plot of predicted activities. Yellow means missing (interrupts or non-wear):

[Image: plot of predicted activities; yellow marks missing data]

Do you have outputs for the openmovement software? I'd be interested to see what happens after 2015-06-22 00:56:03.761+0100
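
For reference, interrupts like this can be located in the exported CSV by scanning for consecutive timestamps that are further apart than the sample period. A minimal sketch, assuming the timestamp format shown in the CSV lines earlier in this thread and a nominal 10 ms spacing (the 15 ms threshold is an arbitrary tolerance). `parse_time` and `find_gaps` are hypothetical helper names, not part of biobankAccelerometerAnalysis:

```python
from datetime import datetime, timedelta

def parse_time(field):
    """Parse '2015-06-22 00:56:03.761+0100 [Europe/London]' into an aware datetime."""
    stamp = field.split(" [")[0]  # drop the trailing '[Europe/London]' label
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f%z")

def find_gaps(lines, max_step=timedelta(milliseconds=15)):
    """Return (row_index, previous_time, current_time) for each step larger than max_step."""
    gaps = []
    previous = None
    for index, line in enumerate(lines):
        current = parse_time(line.split(",")[0])
        if previous is not None and current - previous > max_step:
            gaps.append((index, previous, current))
        previous = current
    return gaps

# Two rows copied from the CSV output earlier in this thread.
rows = [
    "2015-06-22 00:56:03.761+0100 [Europe/London],-0.484,0.359,0.609",
    "2015-06-23 12:46:03.771+0100 [Europe/London],-0.922,-0.328,-0.500",
]
for index, before, after in find_gaps(rows):
    print(index, after - before)
```

On a full file the same function would be fed the data lines only, with the header row skipped.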

@muschellij2
Author

I believe this mirrors what cwa-convert from open-movement does. I wrapped the C code into an R package. We see the same breaks in time.

There are 29856000 results from biobankAccelerometerAnalysis but only 29245200 from open-movement. I don't have time right now (trying to get you results quickly), but I will look into the differences.
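
To put the count mismatch in perspective, the differences can be converted to durations. A rough sketch, assuming a 100 Hz rate (inferred from the 10 ms timestamp spacing in the CSV above, and not confirmed for the cwa-convert output); the dictionary keys are only labels:

```python
from datetime import timedelta

SAMPLE_RATE_HZ = 100  # assumption: inferred from the 10 ms spacing in the CSV rows

# Row counts reported by dim() and by the converter log in this thread.
counts = {
    "GGIR": 29864812,
    "biobankAccelerometerAnalysis": 29856000,
    "cwa-convert": 29245200,
}

reference = counts["cwa-convert"]
for name, n in counts.items():
    extra = n - reference
    duration = timedelta(seconds=extra / SAMPLE_RATE_HZ)
    print(f"{name}: {extra} extra samples, about {duration} at {SAMPLE_RATE_HZ} Hz")
```

Under that assumption, biobankAccelerometerAnalysis returns 610800 more samples than cwa-convert, or roughly 1 h 42 min of data.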

I think there is still an issue in GGIR @wadpac with both the first few lines of the data set (see above) as well as the large values that correspond to these interrupts (it may be trying to resample/impute in there, but a scaling factor seems off)

library(read.cwa)
x = read.cwa::read_cwa("example_90001_0_0.cwa.gz")
#> Converting the CWA to CSV
#> Reading 243712 sectors (offset 0, file 243712)...
#> [MD].
#> Wrote 1507159295 bytes of data (29245200 samples).

#> Reading in the CSV: /var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T//RtmpBoGhAU/file73e134f6a875.csv

as.data.frame(x$data[22241005:(22241005+100),])
#>                    time         X         Y         Z
#> 1   2015-06-22 00:56:02 -0.500000  0.359375  0.609375
#> 2   2015-06-22 00:56:02 -0.500000  0.359375  0.593750
#> 3   2015-06-22 00:56:02 -0.500000  0.359375  0.593750
#> 4   2015-06-22 00:56:02 -0.484375  0.359375  0.609375
#> 5   2015-06-22 00:56:02 -0.484375  0.359375  0.593750
#> 6   2015-06-22 00:56:02 -0.500000  0.359375  0.609375
#> 7   2015-06-22 00:56:02 -0.500000  0.359375  0.609375
#> 8   2015-06-22 00:56:02 -0.500000  0.359375  0.609375
#> 9   2015-06-22 00:56:02 -0.500000  0.359375  0.593750
#> 10  2015-06-22 00:56:02 -0.500000  0.359375  0.609375
#> 11  2015-06-22 00:56:02 -0.500000  0.359375  0.609375
#> 12  2015-06-22 00:56:02 -0.500000  0.359375  0.609375
#> 13  2015-06-22 00:56:02 -0.500000  0.359375  0.609375
#> 14  2015-06-22 00:56:02 -0.484375  0.359375  0.609375
#> 15  2015-06-22 00:56:02 -0.484375  0.359375  0.609375
#> 16  2015-06-22 00:56:02 -0.500000  0.359375  0.609375
#> 17  2015-06-22 00:56:02 -0.500000  0.359375  0.609375
#> 18  2015-06-22 00:56:03 -0.500000  0.359375  0.593750
#> 19  2015-06-22 00:56:03 -0.500000  0.359375  0.593750
#> 20  2015-06-22 00:56:03 -0.500000  0.359375  0.609375
#> 21  2015-06-22 00:56:03 -0.500000  0.359375  0.593750
#> 22  2015-06-22 00:56:03 -0.500000  0.359375  0.609375
#> 23  2015-06-22 00:56:03 -0.484375  0.359375  0.593750
#> 24  2015-06-22 00:56:03 -0.484375  0.359375  0.609375
#> 25  2015-06-22 00:56:03 -0.500000  0.359375  0.609375
#> 26  2015-06-22 00:56:03 -0.500000  0.359375  0.609375
#> 27  2015-06-22 00:56:03 -0.500000  0.359375  0.609375
#> 28  2015-06-22 00:56:03 -0.500000  0.359375  0.593750
#> 29  2015-06-22 00:56:03 -0.500000  0.359375  0.609375
#> 30  2015-06-22 00:56:03 -0.500000  0.359375  0.609375
#> 31  2015-06-22 00:56:03 -0.500000  0.359375  0.609375
#> 32  2015-06-22 00:56:03 -0.500000  0.359375  0.609375
#> 33  2015-06-22 00:56:03 -0.500000  0.359375  0.609375
#> 34  2015-06-22 00:56:03 -0.484375  0.359375  0.593750
#> 35  2015-06-22 00:56:03 -0.500000  0.359375  0.593750
#> 36  2015-06-22 00:56:03 -0.484375  0.359375  0.609375
#> 37  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 38  2015-06-23 12:46:32 -0.875000 -0.312500 -0.468750
#> 39  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 40  2015-06-23 12:46:32 -0.921875 -0.312500 -0.484375
#> 41  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 42  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 43  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 44  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 45  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 46  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 47  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 48  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 49  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 50  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 51  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 52  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 53  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 54  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 55  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 56  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 57  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 58  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 59  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 60  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 61  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 62  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 63  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 64  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 65  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 66  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 67  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 68  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 69  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 70  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 71  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 72  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 73  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 74  2015-06-23 12:46:32 -0.921875 -0.328125 -0.484375
#> 75  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 76  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 77  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 78  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 79  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 80  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 81  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 82  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 83  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 84  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 85  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 86  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 87  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 88  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 89  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 90  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 91  2015-06-23 12:46:32 -0.921875 -0.328125 -0.500000
#> 92  2015-06-23 12:46:33 -0.921875 -0.328125 -0.500000
#> 93  2015-06-23 12:46:33 -0.921875 -0.328125 -0.484375
#> 94  2015-06-23 12:46:33 -0.921875 -0.328125 -0.500000
#> 95  2015-06-23 12:46:33 -0.921875 -0.328125 -0.500000
#> 96  2015-06-23 12:46:33 -0.921875 -0.328125 -0.500000
#> 97  2015-06-23 12:46:33 -0.921875 -0.328125 -0.500000
#> 98  2015-06-23 12:46:33 -0.921875 -0.328125 -0.500000
#> 99  2015-06-23 12:46:33 -0.921875 -0.328125 -0.484375
#> 100 2015-06-23 12:46:33 -0.921875 -0.328125 -0.500000
#> 101 2015-06-23 12:46:33 -0.921875 -0.328125 -0.500000

Created on 2020-12-18 by the reprex package (v0.3.0.9001)

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.2 (2020-06-22)
#>  os       macOS Catalina 10.15.7      
#>  system   x86_64, darwin17.0          
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       America/New_York            
#>  date     2020-12-18                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source                           
#>  assertthat    0.2.1      2019-03-21 [2] CRAN (R 4.0.0)                   
#>  backports     1.2.0      2020-11-02 [1] CRAN (R 4.0.2)                   
#>  cli           2.2.0      2020-11-20 [1] CRAN (R 4.0.2)                   
#>  crayon        1.3.4      2017-09-16 [2] CRAN (R 4.0.0)                   
#>  data.table    1.13.2     2020-10-19 [1] CRAN (R 4.0.2)                   
#>  digest        0.6.27     2020-10-24 [1] CRAN (R 4.0.2)                   
#>  ellipsis      0.3.1      2020-05-15 [2] CRAN (R 4.0.0)                   
#>  evaluate      0.14       2019-05-28 [2] CRAN (R 4.0.0)                   
#>  fansi         0.4.1      2020-01-08 [2] CRAN (R 4.0.0)                   
#>  fs            1.5.0      2020-07-31 [2] CRAN (R 4.0.2)                   
#>  GGIR          2.1-3      2020-11-10 [1] Github (wadpac/GGIR@2bd6b40)     
#>  glue          1.4.2      2020-08-27 [1] CRAN (R 4.0.2)                   
#>  highr         0.8        2019-03-20 [2] CRAN (R 4.0.0)                   
#>  hms           0.5.3      2020-01-08 [2] CRAN (R 4.0.0)                   
#>  htmltools     0.5.0      2020-06-16 [2] CRAN (R 4.0.0)                   
#>  knitr         1.30       2020-09-22 [1] CRAN (R 4.0.2)                   
#>  lifecycle     0.2.0      2020-03-06 [2] CRAN (R 4.0.0)                   
#>  magrittr      2.0.1      2020-11-17 [1] CRAN (R 4.0.2)                   
#>  pillar        1.4.7      2020-11-20 [1] CRAN (R 4.0.2)                   
#>  pkgconfig     2.0.3      2019-09-22 [2] CRAN (R 4.0.0)                   
#>  purrr         0.3.4      2020-04-17 [2] CRAN (R 4.0.0)                   
#>  R.methodsS3   1.8.1      2020-08-26 [1] CRAN (R 4.0.2)                   
#>  R.oo          1.24.0     2020-08-26 [1] CRAN (R 4.0.2)                   
#>  R.utils       2.10.1     2020-08-26 [1] CRAN (R 4.0.2)                   
#>  R6            2.5.0      2020-10-28 [1] CRAN (R 4.0.2)                   
#>  Rcpp          1.0.5      2020-07-06 [1] CRAN (R 4.0.2)                   
#>  read.cwa    * 0.2.1      2020-10-26 [1] local                            
#>  readr         1.4.0      2020-10-05 [1] CRAN (R 4.0.2)                   
#>  reprex        0.3.0.9001 2020-09-30 [1] Github (tidyverse/reprex@d3fc4b8)
#>  rlang         0.4.9.9000 2020-12-11 [1] Github (r-lib/rlang@1939a71)     
#>  rmarkdown     2.5        2020-10-21 [1] CRAN (R 4.0.2)                   
#>  rstudioapi    0.13       2020-11-12 [1] CRAN (R 4.0.2)                   
#>  sessioninfo   1.1.1      2018-11-05 [2] CRAN (R 4.0.0)                   
#>  stringi       1.5.3      2020-09-09 [1] CRAN (R 4.0.2)                   
#>  stringr       1.4.0      2019-02-10 [2] CRAN (R 4.0.0)                   
#>  styler        1.3.2      2020-02-23 [2] CRAN (R 4.0.0)                   
#>  tibble        3.0.4      2020-10-12 [1] CRAN (R 4.0.2)                   
#>  vctrs         0.3.5      2020-11-17 [1] CRAN (R 4.0.2)                   
#>  withr         2.3.0      2020-09-22 [1] CRAN (R 4.0.2)                   
#>  xfun          0.19       2020-10-30 [1] CRAN (R 4.0.2)                   
#>  yaml          2.2.1      2020-02-01 [2] CRAN (R 4.0.0)                   
#> 
#> [1] /Users/johnmuschelli/Library/R/4.0/library
#> [2] /Library/Frameworks/R.framework/Versions/4.0/Resources/library

@chanshing
Member

Interesting, so there's a difference of about 2% in the number of rows. This is just a guess, but it might be due to buffer size differences and the many interrupts in this cwa. When a buffer of data is read and doesn't pass some quality checks, the whole buffer is discarded. I'd be interested to see if this also happens with non-faulty cwas.
@aidendoherty do you have any other ideas why there might be differences?
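For reference, the "about 2%" figure can be checked directly from the two row counts reported above. This is only the arithmetic; the variable names are made up for illustration.

```python
# Row counts as reported earlier in this thread.
n_biobank = 29_856_000       # biobankAccelerometerAnalysis
n_openmovement = 29_245_200  # open-movement cwa-convert (via read.cwa)

diff = n_biobank - n_openmovement
pct_of_biobank = 100 * diff / n_biobank
pct_of_openmovement = 100 * diff / n_openmovement

print(f"{diff} extra rows")
print(f"{pct_of_biobank:.2f}% of the biobankAccelerometerAnalysis rows")
print(f"{pct_of_openmovement:.2f}% of the open-movement rows")
```

Either way of normalising gives roughly 2%, from a difference of 610800 rows.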

@chanshing
Copy link
Member

Closing due to inactivity, but please reopen if the problem persists.

@vincentvanhees

Does this mean you checked that the resampling algorithm in biobankAccelerometerAnalysis is correct? The observations I shared in wadpac/GGIR#369 (comment) indicated to me that the resampling algorithm as implemented in OMGUI/cwa-convert and biobankAccelerometerAnalysis might have a bug. For OMGUI/cwa-convert I could confirm this with a test. For biobankAccelerometerAnalysis the output is similar to OMGUI, which would give the impression that it has the same bug... or am I somehow making a mistake? Note that this issue may not apply to all cwa data: I only observed it in the cwa demo file that Axivity has on their website, not in data collected with more recent AX3 devices.

@chanshing
Member

chanshing commented Jan 15, 2021

Hi @vincentvanhees
I haven't looked into that, but the main issue here was that there were "skips" whose cause @muschellij2 didn't understand, and these are now explained. Regarding the 2% difference in the total number of rows between biobankAccelerometerAnalysis and cwa-convert, I now recall that cwa-convert doesn't resample at all (@danielgjackson is this correct?) while biobankAccelerometerAnalysis does by default. Maybe this explains the 2% difference?
I will look into wadpac/GGIR#369 (comment) soon

@vincentvanhees

hi @chanshing My comment relates to item 2 at the top of this thread, which is why I replied here and didn't create a new issue. Initially it appeared as if it was a GGIR issue, but I came to the conclusion that it must be an OMGUI issue and possibly also biobankAccelerometerAnalysis. Note that OMGUI uses cwa-convert internally and has the option to export raw data with or without resampling. I did both and compared them as I discuss in wadpac/GGIR#369 (comment).

Well, hopefully it is all a misunderstanding, but if it truly is a bug then it will impact other analyses. This is why I am trying to draw attention to it. For example, I am wondering whether this could be the cause underlying the comparability problem @aidendoherty and Scott Small observed in their recent pre-print: https://www.medrxiv.org/content/10.1101/2020.10.22.20217927v1.full.

@chanshing chanshing reopened this Jan 15, 2021
@aidendoherty
Member

Many thanks for your suggestions @vincentvanhees
I'm going to close this issue now. The resampling based on linear interpolation could well be affected by the very small number of files that have large interrupted periods of data as helpfully identified by @muschellij2. Almost all of these cases are unlikely to pass other QC checks. In addition, the output values still seem reasonable and are compatible with OMGUI. However, we'll keep an eye on openmovementproject/openmovement#41
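To make the concern about linear interpolation over interrupts concrete, here is a toy sketch (not the biobankAccelerometerAnalysis or cwa-convert code) that resamples irregular samples onto a uniform grid. The function `resample_linear`, the 2 Hz rate and the 3-second gap are all assumptions chosen to keep the example small; the x values are taken from either side of the skip shown earlier.

```python
def resample_linear(times, values, rate_hz):
    """Linearly interpolate (times, values) onto a uniform grid at rate_hz.

    `times` are seconds, strictly increasing. Illustrative only.
    """
    step = 1.0 / rate_hz
    n = int((times[-1] - times[0]) * rate_hz) + 1
    out_t, out_v = [], []
    j = 0  # index of the raw sample at or just before the current grid time
    for i in range(n):
        t = times[0] + i * step
        while j + 1 < len(times) - 1 and times[j + 1] <= t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        w = (t - t0) / (t1 - t0)
        out_t.append(t)
        out_v.append(values[j] + w * (values[j + 1] - values[j]))
    return out_t, out_v


# Five raw samples with an interrupt between t=1.0 and t=4.0.
raw_t = [0.0, 0.5, 1.0, 4.0, 4.5]
raw_x = [-0.5, -0.5, -0.5, -0.921875, -0.921875]

grid_t, grid_x = resample_linear(raw_t, raw_x, rate_hz=2)
in_gap = [(t, x) for t, x in zip(grid_t, grid_x) if 1.0 < t < 4.0]

print(len(raw_t), "raw samples ->", len(grid_t), "resampled samples")
print(len(in_gap), "of them fall inside the interrupt")
```

In this toy case the 5 raw samples become 10 resampled ones, and the 5 inside the interrupt are values the device never recorded, sitting on a straight line between the two sides of the gap. Whether the real pipelines fill, mask or drop such periods is exactly what is being discussed here, so this should be read as an illustration of the mechanism only.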

Finally, this part of the code will soon be redundant as @chanshing is working on a major refactor of the codebase to reduce our reliance on Java. This will allow us to more easily piggyback on standard signal processing libraries (in Python).
