Preparing for future bioconda submission #455
Hi team, I would like to add TileDB-VCF as an nf-core module for Nextflow pipelines. I believe that making your work available in the nf-core repository will be valuable to the community. |
Hi @atrigila, our existing package can be viewed on the Anaconda website, here. If you would like to discuss some ideas of how this module would be used in existing nf-core pipelines such as Sarek and variant catalogue, we might be able to help. @leipzig can be reached either on the nf-core Slack channel or via email: [email protected]. We would also be happy to work with you if you wish to assist in creating a Bioconda recipe. Feel free to send an email to [email protected] to discuss further; we can arrange a call as well. |
Hi Adam, thanks for the reply! I will surely send an E-mail on Monday to discuss this further. |
Quick update. The main motivation for submitting this recipe to bioconda was the htslib dependency. However, in order to build conda binaries for the latest release (0.21.0, TileDB-Inc/tiledb-vcf-feedstock#62), we had to switch to vendoring htslib (ie building it from source as part of the build; TileDB-Inc/tiledb-vcf-feedstock#64). Thus from a maintenance perspective of the entire TileDB stack, it would be easier to submit TileDB-VCF to conda-forge as well. We could use the existing recipe in https://github.com/TileDB-Inc/tiledb-vcf-feedstock as is, and we could in the future build binaries for additional platforms (eg win, osx-arm64, linux-aarch64) that are supported by conda-forge but not bioconda. |
Another update. A big benefit of submitting to bioconda is that it automatically creates biocontainers (ie Docker images with the single tool installed). These biocontainers are strongly preferred by nf-core compared to self-maintained Docker images. However, even if we submit to conda-forge instead of bioconda, we can still automatically build biocontainers for the conda binaries by submitting a PR to https://github.com/BioContainers/mulled |
Another twist: building htslib as part of the superbuild was laborious. Building for all the variants of libdeflate and openssl required many builds (TileDB-Inc/tiledb-vcf-feedstock#70; purely for the sake of htslib), the CMake commands to install htslib weren't properly linking libdeflate (TileDB-Inc/tiledb-vcf-feedstock#74), and vendoring a dynamically linked library can easily break a conda env (TileDB-Inc/tiledb-vcf-feedstock#76). For now we are building htslib in a dedicated TileDB-Inc feedstock and uploading it to the tiledb channel (TileDB-Inc/htslib-feedstock). This gives us full control of the build variants (eg libdeflate, openssl), and opens up the possibility of building for arm in the future (TileDB-Inc/tiledb-vcf-feedstock#42; TileDB-Inc/tiledb-vcf-feedstock#66), which bioconda doesn't support. And given that we already have to maintain our bespoke build of m2w64-htslib, it makes sense to also continue maintaining our own build of htslib. Overall, this conda recipe doesn't fit well into the requirements of either bioconda or conda-forge, so unless there is a very compelling motivation, I think it should remain in the tiledb channel. |
New user here. I appreciate the pain of the builds in bioconda, etc. However, in the current setup it's difficult to build an environment with other reasonable dependencies, i.e. it's natural to want a Conda environment that has recent builds of htslib tools (samtools, bcftools, etc) alongside tiledbvcf, general variant calling packages, etc. While I did manage to solve an environment with the dependencies I needed, the latest
If I try to upgrade I run into dependency issues. Unsurprisingly, it's ultimately the openssl dependency, but this is likely due to the older htslib 1.16. I didn't look at exactly how these are defined in the recipe or whether there is some reason that you need to pin htslib 1.16 (August 2022). I thought all of the bioconda openssl issues with samtools/htslib had more recently been fixed (I don't recall exactly which versions this impacted).
I guess I can try to build from source in an environment that has the latest htslib, etc and see if that works. |
I can reproduce the solver error. The problem is that we build libtiledbvcf against the latest htslib created by TileDB-Inc/htslib-feedstock, which is still at 1.16. Because of the run exports of the htslib recipe:

    build:
      number: 0
      run_exports:
        - {{ pin_subpackage('htslib', max_pin='x.x') }}

the runtime requirement is pinned to the version series it was built against. The good news is that this can be fixed. I'll update our htslib recipe to 1.19, and then we can rebuild libtiledbvcf. In the short term, you'll need to use the 1.16 versions |
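To illustrate why a `max_pin='x.x'` run export on htslib 1.16 blocks newer htslib builds in the same environment, here is a rough Python sketch of the resulting version range. It is not conda-build's implementation: `pin_bounds` and `satisfies` are hypothetical helper names, only plain numeric versions are handled, and the exact constraint string conda-build emits may differ in detail.

```python
from typing import Tuple


def pin_bounds(version: str, max_pin: str = "x.x") -> Tuple[str, str]:
    """Approximate the (lower, upper) bounds implied by a max_pin expression.

    Hypothetical helper: 'x.x' keeps major.minor and bumps the last kept field.
    """
    parts = [int(p) for p in version.split(".")]
    keep = len(max_pin.split("."))            # 'x.x' -> keep 2 fields
    head = parts[:keep] + [0] * max(0, keep - len(parts))
    upper = head[:-1] + [head[-1] + 1]        # e.g. [1, 16] -> [1, 17]
    return version, ".".join(str(p) for p in upper)


def satisfies(candidate: str, lower: str, upper: str) -> bool:
    """Check lower <= candidate < upper using numeric tuple comparison."""
    def as_tuple(v: str) -> Tuple[int, ...]:
        return tuple(int(p) for p in v.split("."))
    return as_tuple(lower) <= as_tuple(candidate) < as_tuple(upper)


if __name__ == "__main__":
    lower, upper = pin_bounds("1.16")
    print(f"approximate runtime constraint: htslib >={lower},<{upper}")
    for candidate in ("1.16", "1.16.1", "1.19"):
        print(candidate, "allowed" if satisfies(candidate, lower, upper) else "rejected")
```

Under these assumptions, anything built against htslib 1.16 accepts only 1.16.x at runtime, so a tool that requires htslib 1.19 cannot be solved into the same environment until libtiledbvcf is rebuilt against the newer htslib.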
Thanks @jdblischak! |
I've created a new Issue to track updating the htslib version linked to libtiledbvcf TileDB-Inc/tiledb-vcf-feedstock#106 |
This isn't urgent, but I recently investigated the potential submission to bioconda, so I wanted to create this reminder and share my notes:
xref: #47