Task: Reorganize readers #17

Closed
3 tasks done
ReinderVosDeWael opened this issue Apr 26, 2024 · 10 comments

@ReinderVosDeWael
Contributor

ReinderVosDeWael commented Apr 26, 2024

Description

The readers need to be rewritten to account for the WatchData class changes and there should be a convenience reader function that selects the correct function based on file extension.

These should all be placed inside wristpy/io/readers.py.

Tasks

  • Rewrite read_gt3x to return WatchData and accept a pathlib.Path | str.
  • Rewrite read_gene_activ to return WatchData and accept a pathlib.Path | str.
  • Add a read_watch_data that takes the same input and output and selects the correct reader based on file extension, or raises an error for unknown file extensions.

Freeform Notes

No response

@ReinderVosDeWael ReinderVosDeWael added the task A development task intended for Github Projects label Apr 26, 2024
@Asanto32 Asanto32 self-assigned this Apr 29, 2024
@nx10
Contributor

nx10 commented Apr 29, 2024

Add a read_watch_data that takes the same input and output and selects the correct reader based on file extension, or raises an error for unknown file extensions.

This will work for now but not forever: many watches use .bin as the extension.

@ReinderVosDeWael
Contributor Author

Of course they do... @Asanto32 perhaps we want to omit the wrapper read function

@nx10
Contributor

nx10 commented Apr 29, 2024

I can offer adding one to actfast that reads a few bytes and returns the type
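The idea of reading a few bytes and returning the type could look roughly like the sketch below. This is not the actfast implementation; the function name sniff_format and its return labels are made up for illustration. The ZIP signature is the standard one, while the GENEActiv text prefix is an assumption about how those files begin and should be checked against real recordings.

```python
from __future__ import annotations

import pathlib

# Standard ZIP local-file-header signature.
ZIP_MAGIC = b"PK\x03\x04"
# ASSUMPTION: GENEActiv .bin files start with a text header beginning like this.
GENEACTIV_PREFIX = b"Device Identity"


def sniff_format(file_name: pathlib.Path | str) -> str:
    """Guess a file type from its first bytes instead of its extension."""
    with open(file_name, "rb") as handle:
        head = handle.read(16)
    if head.startswith(ZIP_MAGIC):
        return "zip"
    if head.startswith(GENEACTIV_PREFIX):
        return "geneactiv"
    return "unknown"
```

A caller could fall back to the extension only when the sniffed type is "unknown", which sidesteps the shared .bin suffix.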

@Asanto32
Collaborator

Asanto32 commented Apr 29, 2024

Ahh, I already started with something simple as follows, but I can remove it.

def read_watch_data(file_name: pathlib.Path | str) -> WatchData:
    """Read watch data from a file.

    This function selects the correct loader based on the file extension.
    Raises a ValueError for any other extension.

    Args:
        file_name: The filename to read the watch data from.

    Returns:
        input_data: The raw sensor data.
    """
    filename = pathlib.Path(file_name)
    if filename.suffix == ".gt3x":
        input_data = gt3x_loader(filename.as_posix())
    elif filename.suffix == ".bin":
        input_data = geneActiv_loader(filename.as_posix())
    else:
        raise ValueError(f"Unsupported file extension: {filename.suffix}")

    return input_data
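One possible variation on the snippet above is a dispatch table keyed on the lower-cased suffix, so that ".GT3X" and ".gt3x" are treated alike. This is only a sketch: the two reader functions are hypothetical stand-ins that return a string rather than WatchData, and read_by_extension is a made-up name.

```python
from __future__ import annotations

import pathlib
from typing import Callable


# Hypothetical stand-ins for the real readers; they only report which one ran.
def _read_gt3x(path: str) -> str:
    return f"gt3x:{path}"


def _read_geneactiv(path: str) -> str:
    return f"geneactiv:{path}"


# Lower-cased suffix -> reader. Adding a format means adding one entry here.
_READERS: dict[str, Callable[[str], str]] = {
    ".gt3x": _read_gt3x,
    ".bin": _read_geneactiv,
}


def read_by_extension(file_name: pathlib.Path | str) -> str:
    """Select a reader from the file suffix; raise ValueError if unknown."""
    path = pathlib.Path(file_name)
    reader = _READERS.get(path.suffix.lower())
    if reader is None:
        raise ValueError(f"Unsupported file extension: {path.suffix}")
    return reader(str(path))
```

The table still maps .bin to a single reader, so it shares the ambiguity raised earlier in the thread.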

@Asanto32 Asanto32 reopened this Apr 29, 2024
@Asanto32
Collaborator

Asanto32 commented Apr 29, 2024

Also, this is what the reader for GGIR looks like. It seems the other watch with .bin is Movisens?

https://github.com/wadpac/GGIR/blob/master/R/g.readaccfile.R

@Asanto32
Collaborator

Asanto32 commented Apr 29, 2024

@ReinderVosDeWael @nx10 if I'm not mistaken, the actfast readers only work with str. So if I use the wrapper above in read_watch_data, we can pass a pathlib.Path or string directly. This goes against Tasks 1 and 2, but captures the idea by using the wrapper. Let me know your thoughts.

@nx10
Contributor

nx10 commented Apr 29, 2024

Yeah, so far they only work with strings, but I will probably make them compatible with Arrow's filesystem abstraction; then you can stream directly from stuff like S3.

Either way just use

actfast.read_x_y(str(path))

for now.

@nx10
Contributor

nx10 commented Apr 30, 2024

I'm also open to moving whatever wrapper you come up with to actfast eventually.

PyO3 currently can't generate stubs (that's why actfast has no in-editor autocomplete), so it would make sense to also ship a thin Python layer.

@ReinderVosDeWael
Contributor Author

Yeah I concur with Florian here; you can take in a pathlib.Path | str and convert it with str() where needed. This means that users of this function (including power end-users) can use all the goodies of pathlib without being bothered by the conversion themselves.
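As a rough illustration of that conversion, the sketch below wraps a reader that only accepts str. The reader here is a hypothetical stand-in, not the actfast API, and read_any_path is a made-up name. It uses os.fspath instead of a bare str() so that non-path values such as None raise a TypeError rather than being silently turned into the string "None".

```python
from __future__ import annotations

import os
import pathlib


def _str_only_reader(path: str) -> str:
    """Hypothetical stand-in for a reader that only accepts str paths."""
    if not isinstance(path, str):
        raise TypeError("path must be a str")
    return path


def read_any_path(file_name: pathlib.Path | str) -> str:
    """Accept a pathlib.Path or str and hand a plain str to the reader."""
    # os.fspath returns str unchanged, converts os.PathLike objects,
    # and raises TypeError for anything that is not path-like.
    return _str_only_reader(os.fspath(file_name))
```

Callers can then pass pathlib objects freely, and the conversion lives in one place.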

@nx10
Contributor

nx10 commented Apr 30, 2024

Also, this is what the reader for GGIR looks like. It seems the other watch with .bin is Movisens?

https://github.com/wadpac/GGIR/blob/master/R/g.readaccfile.R

Just for reference from the GGIR docs:

2.2 Prepare folder structure
GGIR works with the following accelerometer brands and formats:
GENEActiv .bin
Axivity AX3 and AX6 .cwa
ActiGraph .csv and .gt3x (.gt3x only the newer format generated with firmware versions above 2.5.0. Serial numbers that start with “NEO” or “MRA” and have firmware version of 2.5.0 or earlier use an older format of the .gt3x file). Note for Actigraph users: If you want to work with .csv exports via the commercial ActiLife software then note that you have the option to export data with timestamps. Please do not do this as this causes memory issues for GGIR. To cope with the absence of timestamps GGIR will calculate timestamps from the sample frequency, the start time and start date as presented in the file header.
Movisens .bin files with data stored in folders. GGIR expects that each participant’s folder contains at least a file named acc.bin.
Any other accelerometer brand that generates csv output, see documentation for functions read.myacc.csv and argument rmc.noise in the GGIR function documentation (pdf). Note that functionality for the following file formats was part of GGIR but has been deprecated as it required a significant maintenance effort without a clear use case or community support: (1) .bin for the Genea monitor by Unilever Discover, an accelerometer that was used for some studies between 2007 and 2012, and (2) .wav files as can be exported by the Axivity Ltd OMGUI software. Please contact us if you think these data formats should be facilitated by GGIR again and if you are interested in supporting their ongoing maintenance.
All accelerometer data that needs to be analysed should be stored in one folder, or subfolders of that folder.
Give the folder an appropriate name, preferable with a reference to the study or project it is related to rather than just ‘data’, because the name of this folder will be used later on as an identifier of the dataset.

( https://cran.r-project.org/web/packages/GGIR/vignettes/GGIR.html )

So it seems that GENEActiv, Movisens, and Genea all use the .bin file extension for their different formats (Axivity uses .cwa, and previously .wav). Additionally, ActiGraph .gt3x files are of course also just zip archives that contain a .bin file.
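The point about .gt3x files being zip archives can be checked with the standard library alone. The helper below is a minimal sketch with a made-up name, list_bin_members; it only lists archive members ending in .bin and does not parse any sensor data.

```python
from __future__ import annotations

import pathlib
import zipfile


def list_bin_members(archive_path: pathlib.Path | str) -> list[str]:
    """Return the names of .bin members if the file is a zip archive."""
    # is_zipfile inspects the file contents, not the extension.
    if not zipfile.is_zipfile(archive_path):
        return []
    with zipfile.ZipFile(archive_path) as archive:
        return [
            name for name in archive.namelist()
            if name.lower().endswith(".bin")
        ]
```

An empty result means the file is either not a zip archive or contains no .bin member, so an older-format .gt3x would need separate handling.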

@Asanto32 Asanto32 closed this as completed May 6, 2024