ValueError: could not broadcast input array from shape (3065,) into shape (2777,) #325

Closed
cwarlysolsberg opened this issue Jun 24, 2024 · 6 comments · Fixed by #341
Labels
bug Something isn't working user-query User queries & requests

@cwarlysolsberg

cwarlysolsberg commented Jun 24, 2024

Description of the bug

I have run pgsc_calc many times on different files, but this is the only time I am running it with both --liftover and --run_ancestry. Without --run_ancestry it runs fine. With it, the run fails in fraposa.py.

Join mismatch for the following entries: key=[chrom:ALL, n:0, effect_type:additive] values=[]

Loading study data...
Traceback (most recent call last):
  File "/venv/bin/fraposa", line 8, in <module>
    sys.exit(main())
  File "/venv/lib/python3.10/site-packages/fraposa_pgsc/fraposa_runner.py", line 56, in main
    fp.pca(ref_filepref=ref_filepref, stu_filepref=stu_filepref, stu_filt_iid=stu_filt_iid, out_filepref=out_filepref,
  File "/venv/lib/python3.10/site-packages/fraposa_pgsc/fraposa.py", line 520, in pca
    W, W_bim, W_fam = read_bed(stu_filepref, dtype=np.int8, filt_iid=stu_filt_iid)
  File "/venv/lib/python3.10/site-packages/fraposa_pgsc/fraposa.py", line 148, in read_bed
    bed[i,:] = genotypes[i_extract]
ValueError: could not broadcast input array from shape (3065,) into shape (2777,)
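
For readers unfamiliar with this message: NumPy raises it when a 1-D array is assigned into a slice of a different length. The sketch below only reproduces the generic error with the two sizes from the traceback. It is not fraposa's code, and the names `assign_row`, `n_allocated`, and `n_selected` are made up for illustration.

```python
import numpy as np


def assign_row(n_allocated: int, n_selected: int) -> str:
    """Assign n_selected values into a row with n_allocated slots."""
    # Output matrix sized for n_allocated samples (hypothetical stand-in for `bed`).
    bed = np.zeros((1, n_allocated), dtype=np.int8)
    # Values actually selected for this row (hypothetical stand-in for `genotypes[i_extract]`).
    genotypes = np.ones(n_selected, dtype=np.int8)
    try:
        bed[0, :] = genotypes
    except ValueError as err:
        return str(err)
    return "ok"


message = assign_row(2777, 3065)
print(message)
```

In other words, the code selected 3065 samples from the study data but had allocated room for only 2777, so the two sample counts disagree somewhere upstream.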

Command used and terminal output

sudo nextflow run pgscatalog/pgsc_calc -r ccfd6367d55eee6d81c36541248d757ebacf6c7e -profile docker \
    --input $path/samplesheet.csv \
    --scorefile $path/scorefile_reformatted.txt \
    --liftover \
    --target_build GRCh38 \
    --hg19_chain $path/hg19ToHg38.over.chain.gz \
    --hg38_chain $path//hg38ToHg19.over.chain.gz \
    --run_ancestry $path/pgsc_HGDP+1kGP_v1.tar.zst \
    --outdir $path/results

Relevant files

No response

System information

No response

cwarlysolsberg added the bug (Something isn't working) label Jun 24, 2024
@nebfield
Member

Thanks for the bug report 😄 This is a strange problem that I can't reproduce. It might be caused by the cache if you've successfully run pgsc_calc many times before. Could you try rm -r work before retrying?

@cwarlysolsberg
Author

cwarlysolsberg commented Jun 25, 2024

Yes, I removed work and results before retrying. Not sure if this matters, but I also get a file named GRCh38_out_oriented_out_splitfamab.pcs, which is weird because in the past I've always seen GRCh38_out_oriented_out_splitfamaa.pcs.

I have also tried running this on several releases to make sure it's not an issue with the newest release, and I get the same error each time.

@nebfield
Member

Thanks for the extra details - the different file name is interesting. Could you please attach the .nextflow.log file from a broken run? The log gets created in the same directory where you run the workflow - it just contains metadata about the state of the workflow.

@cwarlysolsberg
Author

cwarlysolsberg commented Jun 26, 2024

Attached is the log file.
nextflow (4).log

@cwarlysolsberg
Author

I figured it out. I had combined a bunch of cohorts, and some of them used the same IDs. For some reason, those duplicate IDs triggered this error. Maybe just add a simple FAM check to confirm there are no duplicate IDs before moving forward, haha. Everything else ran fine because the other packages consider both FID and IID, so I only got this ambiguous error. I was finally able to track it down using the pygsc with some error handling added to the .py code.
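
A minimal sketch of the kind of FAM check suggested above, assuming the standard whitespace-separated PLINK .fam layout (FID, IID, father, mother, sex, phenotype). The function name `find_duplicate_ids` and the sample rows are made up for illustration. This is not part of pgsc_calc or fraposa.

```python
from collections import Counter
from typing import Iterable


def find_duplicate_ids(fam_lines: Iterable[str]) -> dict:
    """Report IIDs and (FID, IID) pairs that occur more than once."""
    iids = Counter()
    pairs = Counter()
    for line in fam_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        fid, iid = fields[0], fields[1]
        iids[iid] += 1
        pairs[(fid, iid)] += 1
    return {
        "iid": sorted(i for i, n in iids.items() if n > 1),
        "fid_iid": sorted(p for p, n in pairs.items() if n > 1),
    }


# Hypothetical merged cohorts: S1 appears in both, under different FIDs.
example = [
    "cohortA S1 0 0 1 -9",
    "cohortA S2 0 0 2 -9",
    "cohortB S1 0 0 2 -9",
]
report = find_duplicate_ids(example)
print(report)
```

Here S1 is flagged as a duplicate IID even though every (FID, IID) pair is unique, which matches the situation described: tools that key on FID plus IID see no problem, while a step that keys on IID alone sees repeated samples. To check a real file, pass an open file handle instead of the list.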

@nebfield
Member

nebfield commented Jul 1, 2024

Thanks for debugging! I was quite confused 😅
