Ticket_id addition to file name #376

Open
DLBPointon opened this issue Mar 6, 2025 · 13 comments
Assignees
Labels
enhancement New feature or request

Comments

@DLBPointon
Contributor

Description of feature

When a ticket_id is present, the file/folder structure should include it to help differentiate outputs.

tolid - ticket_id - normal treeval folder outputs

or

tolid - normal structure - files with ticket_id inserted

@DLBPointon added the enhancement (New feature or request) label on Mar 6, 2025
@DLBPointon
Contributor Author

DLBPointon commented Mar 6, 2025

Personally, I'm in favour of both options:

iyTipFemo1
    - GRIT-XXX
        -- pretextmaps
            --- pretext_nr_GRIT-XXX.pretext
        -- accessoryfiles
    - RC-XXXX
        -- pretextmaps
            --- pretext_nr_RC-XXXX.pretext
        -- accessoryfiles

I suggest that we re-use project-id, as it isn't used in its current form.
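
To make the layout above concrete, here is a minimal Python sketch of how the per-ticket paths could be assembled. The helper name `ticket_output_paths` and its arguments are hypothetical and not part of TreeVal; only the folder and file names are taken from the example tree.

```python
from pathlib import PurePosixPath


def ticket_output_paths(tolid: str, ticket_id: str) -> dict:
    """Build the proposed per-ticket paths (hypothetical helper, not TreeVal code)."""
    # Each ticket gets its own folder under the ToLID, so two tickets
    # for the same ToLID never write into the same directory.
    run_dir = PurePosixPath(tolid) / ticket_id
    return {
        "run_dir": run_dir,
        # The ticket ID is also inserted into the file name itself.
        "pretext_nr": run_dir / "pretextmaps" / f"pretext_nr_{ticket_id}.pretext",
        "accessoryfiles": run_dir / "accessoryfiles",
    }


paths = ticket_output_paths("iyTipFemo1", "GRIT-XXX")
print(paths["pretext_nr"])
```

Combining the folder and the file-name insertion means a file stays identifiable even if it is copied out of its ticket folder.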

@DLBPointon
Contributor Author

Or add it as a new param, as after the re-write everything will be a param anyway.

@yumisims
Contributor

yumisims commented Mar 6, 2025

Just to clarify the description:

The change should aim to prevent confusion between ONT and HiFi assemblies. When TreeVal generates mapping output, the entry ID is often formatted as tolid_runversion. This can make it difficult for curators to determine the assembly type, as some ONT assemblies also have HiFi/PacBio assemblies at the same time.

To address this, it would be beneficial for the TreeVal entry ID to follow this format: tolid_jiraid_runversion. This change should be sufficient to distinguish assembly type and run version, making it easier to differentiate hap1 and merged assemblies based on the entry name. @DLBPointon @weaglesBio @mcshane

@DLBPointon
Contributor Author

DLBPointon commented Mar 6, 2025

OK, so I'm posting what I sent to Slack earlier:

Image

So you would want it as it was being generated, but with file names like bAnaCre_GRIT-100_1?

@DLBPointon
Contributor Author

> Just to clarify the description:
>
> The change should aim to prevent confusion between ONT and HiFi assemblies. When TreeVal generates mapping output, the entry ID is often formatted as tolid_runversion. This can make it difficult for curators to determine the assembly type, as some ONT assemblies also have HiFi/PacBio assemblies at the same time.
>
> To address this, it would be beneficial for the TreeVal entry ID to follow this format: tolid_jiraid_runversion. This change should be sufficient to distinguish assembly type and run version, making it easier to differentiate hap1 and merged assemblies based on the entry name. @DLBPointon @weaglesBio @mcshane

@additive3 Tagging Jo as well as it affects his team.

@additive3

Thanks @DLBPointon

I think using the Jira ticket resolves the issue well: [tolid].[jira_ticket].[resolution].pretext
It prevents files from being overwritten where we have a previous assembly release, as any new assembly will have a new ticket.
I think it would be good to use the ticket ID in the run directory too, in a similar vein.

All assembly data information is carried in the ticket, so there is no need to specify it in the map file name.

Any map file regeneration would just be a case of overwriting the existing files. No need for the runversion because there will be a single (combined) map produced.

@yumisims @weaglesBio @mcshane

@additive3

> To address this, it would be beneficial for the TreeVal entry ID to follow this format: tolid_jiraid_runversion. This change should be sufficient to distinguish assembly type and run version, making it easier to differentiate hap1 and merged assemblies based on the entry name.

I don't want multiple assorted map files produced. Curators will be working from combined haplotype maps only (available as standard/high resolution).

@yumisims
Contributor

yumisims commented Mar 6, 2025

If there is only one ticket per map, we could stick to tolid_jiraid_resolution for map name, and tolid_jiraid as treeval entry name.

@DLBPointon
Contributor Author

In that case, we can get away with using sample_id: {tolid}.{ticket_id} in the YAML rather than adding an update for this, even if I prefer the folder approach above.

So we'll end up with:

bAnaCre1.GRIT-101
     - Usual output but everything is now labelled in the style of bAnaCre1.GRIT-101.nr.pretext
bAnaCre1.GRIT-202
     - Usual output but everything is now labelled in the style of bAnaCre1.GRIT-202.nr.pretext

In that case it'll just be a job for @weaglesBio to:

  • update the script to concat the values
  • not generate maps for pri/hap in combined cases
    • @additive3 do you still want HiGlass for the pri/hap? At the minute TreeVal is set up so we can exclude all HIC_MAPPING but not specific mappings (pretext/HiGlass).
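
For reference, the "concat the values" step could look roughly like the Python sketch below. The function names `make_sample_id` and `map_file_name` are hypothetical, and falling back to the bare ToLID when no ticket is given is an assumption rather than something agreed in this thread; the `{tolid}.{ticket_id}` pattern and the `.nr.pretext` example come from the comment above.

```python
def make_sample_id(tolid: str, ticket_id: str = "") -> str:
    """Join ToLID and ticket ID as {tolid}.{ticket_id} (hypothetical helper)."""
    # Assumption: with no ticket ID, keep the plain ToLID as the sample_id.
    return f"{tolid}.{ticket_id}" if ticket_id else tolid


def map_file_name(sample_id: str, resolution: str) -> str:
    """Name a pretext map as {sample_id}.{resolution}.pretext (hypothetical helper)."""
    return f"{sample_id}.{resolution}.pretext"


sample_id = make_sample_id("bAnaCre1", "GRIT-101")
print(sample_id)
print(map_file_name(sample_id, "nr"))
```

Because the ticket ID is part of sample_id, a new ticket for the same ToLID yields new file names, while regenerating maps for the same ticket overwrites the existing ones.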

@additive3

@DLBPointon
No, I think there is no need to keep producing the HiGlass maps for single hap assemblies either.

@additive3

> even if I prefer the folder approach above.

@DLBPointon I'm happy for the run directories to be just the ticket_id

@yumisims
Contributor

yumisims commented Mar 7, 2025

@additive3 Could you please clarify whether this applies to full TreeVal tickets too?

@DLBPointon
Contributor Author

> > even if I prefer the folder approach above.
>
> @DLBPointon I'm happy for the run directories to be just the ticket_id

That makes the most sense actually, just a small update for @weaglesBio when he gets back then.

Projects
None yet
Development

No branches or pull requests

4 participants