Parsing and categorization #6

Open
9 of 10 tasks
rmanaem opened this issue Jun 6, 2024 · 0 comments

Comments


rmanaem commented Jun 6, 2024

Outcome

The model should be able to parse the uploaded participants.tsv and, when one is provided, the data dictionary (it is optional in the annotation tool) in order to:

  • describe which columns the TSV contains
  • validate those columns against the ones in the data dictionary (if there is one)
  • map columns to Neurobagel annotation categories
    • the main challenge here is most likely diagnosis and assessment tools
    • automated coding of participant_id, age, and sex with an LLM agent would be the simplest case
    • temporal / longitudinal fields (session / visit) should probably be ignored for now
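
The first two outcomes (describing the columns and validating them against the data dictionary) do not need a model at all. The following is a minimal sketch, assuming the data dictionary is a BIDS-style JSON object whose top-level keys are column names; the function names `describe_columns` and `validate_against_dictionary` and the sample data are hypothetical, not part of the project.

```python
import csv
import io
import json


def describe_columns(tsv_text):
    """Return the column names from the header row of a TSV, in order."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return next(reader, [])


def validate_against_dictionary(columns, data_dictionary):
    """Compare TSV columns with the top-level keys of a data dictionary.

    Assumption: the dictionary maps column names to their descriptions.
    The dictionary is optional, so None yields an empty report.
    """
    if data_dictionary is None:
        return {"missing_from_dictionary": [], "missing_from_tsv": []}
    return {
        "missing_from_dictionary": [c for c in columns if c not in data_dictionary],
        "missing_from_tsv": [k for k in data_dictionary if k not in columns],
    }


# Made-up example input, for illustration only.
tsv_text = "participant_id\tage\tsex\tdiagnosis\nsub-01\t25\tF\tHC\n"
data_dictionary = json.loads(
    '{"participant_id": {"Description": "ID"},'
    ' "age": {"Description": "Age"},'
    ' "group": {"Description": "Group"}}'
)

columns = describe_columns(tsv_text)
report = validate_against_dictionary(columns, data_dictionary)
print(columns)
print(report)
```

Columns reported under `missing_from_dictionary` are the ones the model would have to categorize from the column name and values alone.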

Resources:

Utilities

  • llama3
  • Gemma
  • hand each column to the model separately
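
Handing each column to the model separately could look roughly like the sketch below. It is an illustration under assumptions, not the project's implementation: the `CATEGORIES` labels are taken from the categories named in this issue, and `ask_model` is a hypothetical callable (prompt string in, answer string out) that would wrap whichever model is chosen, such as llama3 or Gemma. A stub stands in for the model so the example runs without one.

```python
import csv
import io

# Hypothetical label set based on the categories mentioned in this issue.
CATEGORIES = ["participant_id", "age", "sex", "diagnosis", "assessment_tool", "unknown"]


def column_samples(tsv_text, max_values=5):
    """Map each column name to up to `max_values` distinct non-empty values."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    samples = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name, value in row.items():
            # Skip overflow cells (name not in header) and empty cells.
            if name in samples and value and value not in samples[name]:
                if len(samples[name]) < max_values:
                    samples[name].append(value)
    return samples


def build_prompt(name, values):
    """Build one prompt that covers a single column."""
    return (
        "Classify this tabular column into exactly one of: "
        + ", ".join(CATEGORIES)
        + f".\nColumn name: {name}\nExample values: {', '.join(values)}"
        + "\nAnswer with the category only."
    )


def categorize_columns(tsv_text, ask_model):
    """Call `ask_model` once per column; unrecognized answers become 'unknown'."""
    results = {}
    for name, values in column_samples(tsv_text).items():
        answer = ask_model(build_prompt(name, values)).strip().lower()
        results[name] = answer if answer in CATEGORIES else "unknown"
    return results


def fake_model(prompt):
    """Stand-in for a real LLM call: echoes the column name if it is a category."""
    name = prompt.split("Column name: ")[1].split("\n")[0]
    return name if name in CATEGORIES else "no idea"


example = "participant_id\tage\tvisit\nsub-01\t25\tV1\nsub-02\t31\tV1\n"
result = categorize_columns(example, fake_model)
print(result)
```

Restricting the answer to a fixed label set and falling back to `unknown` keeps free-form model output from leaking into the annotations; a column such as `visit` lands in `unknown`, which matches ignoring longitudinal fields for now.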

Tasks

rmanaem added the Milestone label (used to track other issues that are required to complete the milestone) on Jun 6, 2024