Cannot submit_and_finalize query #663

Open
dotinspace opened this issue Feb 13, 2024 · 2 comments

@dotinspace

Hi!
I am trying to ingest samples, but run into the following problem:

_40b48cf415984c70a4ab49e225d7c8ef_21 samples = SAMPLE
[2024-02-13 09:59:31.061] [tiledb-vcf] [...] [debug] Finalizing last contig batch of [1, 1]
[2024-02-13 09:59:31.061] [tiledb-vcf] [...] [debug] AlleleCount: Finalize query with 0 records
[2024-02-13 09:59:31.104] [tiledb-vcf] [...] [debug] VariantStats: Finalize query with 0 records
[2024-02-13 09:59:31.146] [tiledb-vcf] [...] [debug] Query buffer for 'contig' contains 3272 elements
[2024-02-13 09:59:31.146] [tiledb-vcf] [...] [critical] Cannot submit_and_finalize query with buffers set.

As far as I can tell, one of the following steps fails. Is there anything I can try tweaking to make this work, or does something else stand out as the obvious culprit?

    File: libtiledbvcf/src/stats/allele_count.cc
      if (contig_records_ > 0) {
        if (utils::query_buffers_set(query_.get())) {
          LOG_FATAL("Cannot submit_and_finalize query with buffers set.");
        }
        query_->submit_and_finalize();

    File: libtiledbvcf/src/stats/variant_stats.cc
      if (contig_records_ > 0) {
        if (utils::query_buffers_set(query_.get())) {
          LOG_FATAL("Cannot submit_and_finalize query with buffers set.");
        }
        query_->submit_and_finalize();

@gspowley
Member

Hi @dotinspace,

This error looks like a data-dependent edge case related to the AlleleCount and VariantStats stats having 0 records.

To work around the issue, please create the dataset with AlleleCount and VariantStats disabled:

tiledbvcf create --disable-allele-count --disable-variant-stats ...

If you can share the ingested VCF file, it would help us debug the issue (I know that is not always possible). Otherwise, we will try to reproduce the condition that causes this error.

@dotinspace
Author

Hi, thanks for the swift response.

The multisample VCF, ALL.chr1.phase3_shapeit2_mvncall_integrated_v5b.20130502.genotypes.vcf.gz, was taken from 1000Genomes. It was then split with vcf-split, block compressed (bgzip), and indexed (tabix) before being ingested into a TileDB-VCF dataset. I wouldn't be surprised if the VCF files, or the splitting process, caused some issue with those two stats arrays. Unfortunately, our current testing relies on variant_stats.
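
To help narrow down whether the split files are unusual, a minimal sketch like the one below could count the data records per contig, and the records carrying a non-reference genotype, in one split file. It assumes single-sample files in which GT is the first FORMAT subfield. `count_records` and `count_records_in_file` are hypothetical helper names, not part of TileDB-VCF, and whether an all-reference or empty file is what triggers the 0-record condition is an untested assumption.

```python
import gzip
from collections import Counter


def count_records(lines):
    """Return (records per contig, records with a non-reference call).

    Hypothetical helper for inspecting one single-sample VCF; it only
    looks at the first sample column.
    """
    per_contig = Counter()
    non_ref = 0
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        per_contig[fields[0]] += 1
        # Column 10 (index 9) holds the first sample. Assumption: GT is
        # the first subfield of that column.
        if len(fields) > 9:
            gt = fields[9].split(":")[0]
            alleles = gt.replace("|", "/").split("/")
            if any(a not in ("0", ".") for a in alleles):
                non_ref += 1
    return per_contig, non_ref


def count_records_in_file(path):
    # bgzip output is gzip-compatible, so the standard gzip module can read it.
    with gzip.open(path, "rt") as handle:
        return count_records(handle)
```

Calling `count_records_in_file` on each split file and comparing the totals for samples that fail against samples that ingest cleanly might show whether the failure correlates with files that have no non-reference calls.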

Anyway, nice to know what is going on for future reference.
