Error Messages when Running Ancestral_State_Reconstruction_Roary.py and Kp.gwas_runs.sh #2

Open
erin-thei opened this issue Feb 9, 2023 · 5 comments

Comments

@erin-thei

Hello,

Thank you for this tool. I am attempting to use this to do power calculations for pan-genome GWAS. These are the commands I have run:

python3 ./scripts/annotate_nodes_newick.py --input_tree RAxML_bipartitions.bootmap --output_tree tree.annotated.nwk

python3 ./scripts/roary_to_pastml_matrix.py --gene_presence_absence gene_presence_absence.Rtab --input_format 2 --output_table pastml.csv

python3 ./scripts/roary_to_plink_files.py --gene_presence_absence gene_presence_absence.Rtab --input_format 2 --output_prefix kp.pg

python3 ./scripts/ancestral_state_reconstruction_roary.py --input_pastml_table pastml.csv --input_tree tree.annotated.nwk --output_table ancestral.csv --output_steps ancestral_steps.csv --process 8 # this command produced an error

python3 ./pyseer/scripts/phylogeny_distance.py --calc-C --midpoint tree.annotated.nwk > tree_distances.csv

python3 ./scripts/prepare_gwas_runs_roary.py --roary_table gene_presence_absence.Rtab --input_format 2 --parameters_file paramters.binary.efs.txt --code_directory ./scripts/ --pyseer_path ./pyseer/pyseer-runner.py --similarity tree_distances.csv --plink_prefix kp.pg --pastml_steps_file ancestral_steps.csv --output_dir output_dir --output_prefix kp

bash kp.gwas_runs.sh

python3 ./scripts/process_gwas_runs.py --gwas_runs_in_table kp.gwas_runs.csv --variant_type r --output_dir ./output_dir/ --gwas_runs_out_table kp.ph.gwas_runs.results.csv

Rscript ./scripts/plot_gwas_runs.R --input_table kp.ph.gwas_runs.results.csv --parameters_file paramters.binary.efs.txt --plot_type 1 -v 12034 --output_plot kp.gwas_runs.results.plot.pdf

When I ran ancestral_state_reconstruction_roary.py I got some error messages about files not being found. I was still able to get the necessary outputs from this step, but I am wondering if these error messages affected the final plot. Here is the output from the command:
ancestral_state_error.txt

Additionally, when I ran 'bash kp.gwas_runs.sh' I received another set of errors. Again, the script did not exit so I was able to obtain the necessary outputs, but I am wondering what these error messages are and how they may have affected the power calculations.

This is a truncated version of the error message when running kp.gwas_runs.sh as the full output is too large. I can send you the entire error message, if necessary. Just trying to understand these errors and if/how they affect the final power calculations. Thanks!

No observations of group_5941 in selected samples
No observations of group_2493 in selected samples
10722 loaded variants
4093 pre-filtered variants
6629 tested variants
10722 printed variants
2023-02-06 11:21:04,013 INFO: Saving number of homoplasies (steps) per gene...
2023-02-06 11:21:04,016 INFO: Opening gene_presence_absence.Rtab file and calculating gene frequency...
2023-02-06 11:21:04,123 INFO: Opening gene_presence_absence file and calculating gene frequency. DONE.
2023-02-06 11:21:04,123 INFO: Selecting variants meeting criteria...
2023-02-06 11:21:04,132 INFO: Selecting variants meeting criteria. DONE.
2023-02-06 11:21:04,132 INFO: Randomly sampling of variants meeting criteria.
2023-02-06 11:21:04,547 INFO: Reading causal variant file /scr1/users/theillere/usda37/powerbacgwas2/output_dir/HWFRVODMREFP.causal_variants.txt...
2023-02-06 11:21:04,548 INFO: Opening gene_presence_absence.Rtab file and calculating gene frequency...
2023-02-06 11:21:04,594 INFO: Opening gene_presence_absence file and calculating gene frequency. DONE.
2023-02-06 11:21:04,594 INFO: Extracting samples with and without causal variant(s) (mutated and wild-type)...
2023-02-06 11:21:04,594 INFO: Calculating number of cases and controls, with and without the causal variant, to achieve the chosen odds ratio...
Traceback (most recent call last):
File "/scr1/users/theillere/usda37/powerbacgwas2/scripts/simulate_binary_phenotype_roary.py", line 406, in <module>
_main()
File "/scr1/users/theillere/usda37/powerbacgwas2/scripts/simulate_binary_phenotype_roary.py", line 358, in _main
args=(len(roary_samples_mut), len(roary_samples_wt), float(args.allele_frequency), int(sample_size), float(odds_ratio)),
ValueError: could not convert string to float: 'NA'
mv: cannot stat '/scr1/users/theillere/usda37/powerbacgwas2/output_dir/HWFRVODMREFP.phen': No such file or directory
Traceback (most recent call last):
File "./pyseer/pyseer-runner.py", line 8, in <module>
main()
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/main.py", line 305, in main
p = load_phenotypes(options.phenotypes, options.phenotype_column)
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/input.py", line 37, in load_phenotypes
p = pd.read_csv(infile, index_col=0, sep='\t')
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 688, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 454, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 948, in __init__
self._make_engine(self.engine)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 1180, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 2010, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: '/scr1/users/theillere/usda37/powerbacgwas2/output_dir/HWFRVODMREFP.pyseer.phen'
Traceback (most recent call last):
File "./pyseer/pyseer-runner.py", line 8, in <module>
main()
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/main.py", line 305, in main
p = load_phenotypes(options.phenotypes, options.phenotype_column)
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/input.py", line 37, in load_phenotypes
p = pd.read_csv(infile, index_col=0, sep='\t')
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 688, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 454, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 948, in __init__
self._make_engine(self.engine)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 1180, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 2010, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: '/scr1/users/theillere/usda37/powerbacgwas2/output_dir/HWFRVODMREFP.pyseer.phen'
Traceback (most recent call last):
File "./pyseer/pyseer-runner.py", line 8, in <module>
main()
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/main.py", line 305, in main
p = load_phenotypes(options.phenotypes, options.phenotype_column)
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/input.py", line 37, in load_phenotypes
p = pd.read_csv(infile, index_col=0, sep='\t')
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 688, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 454, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 948, in __init__
self._make_engine(self.engine)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 1180, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 2010, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: '/scr1/users/theillere/usda37/powerbacgwas2/output_dir/HWFRVODMREFP.pyseer.phen'
Traceback (most recent call last):
File "./pyseer/pyseer-runner.py", line 8, in <module>
main()
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/main.py", line 305, in main
p = load_phenotypes(options.phenotypes, options.phenotype_column)
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/input.py", line 37, in load_phenotypes
p = pd.read_csv(infile, index_col=0, sep='\t')
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 688, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 454, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 948, in __init__
self._make_engine(self.engine)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 1180, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 2010, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: '/scr1/users/theillere/usda37/powerbacgwas2/output_dir/HWFRVODMREFP.pyseer.phen'
Traceback (most recent call last):
File "./pyseer/pyseer-runner.py", line 8, in <module>
main()
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/main.py", line 305, in main
p = load_phenotypes(options.phenotypes, options.phenotype_column)
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/input.py", line 37, in load_phenotypes
p = pd.read_csv(infile, index_col=0, sep='\t')
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 688, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 454, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 948, in __init__
self._make_engine(self.engine)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 1180, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 2010, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: '/scr1/users/theillere/usda37/powerbacgwas2/output_dir/HWFRVODMREFP.pyseer.phen'
2023-02-06 11:21:09,910 INFO: Saving number of homoplasies (steps) per gene...
2023-02-06 11:21:09,913 INFO: Opening gene_presence_absence.Rtab file and calculating gene frequency...
2023-02-06 11:21:10,000 INFO: Opening gene_presence_absence file and calculating gene frequency. DONE.
2023-02-06 11:21:10,000 INFO: Selecting variants meeting criteria...
2023-02-06 11:21:10,009 INFO: Selecting variants meeting criteria. DONE.
2023-02-06 11:21:10,009 INFO: Randomly sampling of variants meeting criteria.
2023-02-06 11:21:10,430 INFO: Reading causal variant file /scr1/users/theillere/usda37/powerbacgwas2/output_dir/KNMUBUQIHASD.causal_variants.txt...
2023-02-06 11:21:10,431 INFO: Opening gene_presence_absence.Rtab file and calculating gene frequency...
2023-02-06 11:21:10,476 INFO: Opening gene_presence_absence file and calculating gene frequency. DONE.
2023-02-06 11:21:10,476 INFO: Extracting samples with and without causal variant(s) (mutated and wild-type)...
2023-02-06 11:21:10,477 INFO: Calculating number of cases and controls, with and without the causal variant, to achieve the chosen odds ratio...
Traceback (most recent call last):
File "/scr1/users/theillere/usda37/powerbacgwas2/scripts/simulate_binary_phenotype_roary.py", line 406, in <module>
_main()
File "/scr1/users/theillere/usda37/powerbacgwas2/scripts/simulate_binary_phenotype_roary.py", line 358, in _main
args=(len(roary_samples_mut), len(roary_samples_wt), float(args.allele_frequency), int(sample_size), float(odds_ratio)),
ValueError: could not convert string to float: 'NA'
mv: cannot stat '/scr1/users/theillere/usda37/powerbacgwas2/output_dir/KNMUBUQIHASD.phen': No such file or directory
Traceback (most recent call last):
File "./pyseer/pyseer-runner.py", line 8, in <module>
main()
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/main.py", line 305, in main
p = load_phenotypes(options.phenotypes, options.phenotype_column)
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/input.py", line 37, in load_phenotypes
p = pd.read_csv(infile, index_col=0, sep='\t')
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 688, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 454, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 948, in __init__
self._make_engine(self.engine)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 1180, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 2010, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: '/scr1/users/theillere/usda37/powerbacgwas2/output_dir/KNMUBUQIHASD.pyseer.phen'
Traceback (most recent call last):
File "./pyseer/pyseer-runner.py", line 8, in <module>
main()
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/main.py", line 305, in main
p = load_phenotypes(options.phenotypes, options.phenotype_column)
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/input.py", line 37, in load_phenotypes
p = pd.read_csv(infile, index_col=0, sep='\t')
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 688, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 454, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 948, in __init__
self._make_engine(self.engine)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 1180, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 2010, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: '/scr1/users/theillere/usda37/powerbacgwas2/output_dir/KNMUBUQIHASD.pyseer.phen'
Traceback (most recent call last):
File "./pyseer/pyseer-runner.py", line 8, in <module>
main()
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/main.py", line 305, in main
p = load_phenotypes(options.phenotypes, options.phenotype_column)
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/input.py", line 37, in load_phenotypes
p = pd.read_csv(infile, index_col=0, sep='\t')
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 688, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 454, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 948, in __init__
self._make_engine(self.engine)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 1180, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 2010, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: '/scr1/users/theillere/usda37/powerbacgwas2/output_dir/KNMUBUQIHASD.pyseer.phen'
Traceback (most recent call last):
File "./pyseer/pyseer-runner.py", line 8, in <module>
main()
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/main.py", line 305, in main
p = load_phenotypes(options.phenotypes, options.phenotype_column)
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/input.py", line 37, in load_phenotypes
p = pd.read_csv(infile, index_col=0, sep='\t')
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 688, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 454, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 948, in __init__
self._make_engine(self.engine)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 1180, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 2010, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: '/scr1/users/theillere/usda37/powerbacgwas2/output_dir/KNMUBUQIHASD.pyseer.phen'
Traceback (most recent call last):
File "./pyseer/pyseer-runner.py", line 8, in <module>
main()
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/main.py", line 305, in main
p = load_phenotypes(options.phenotypes, options.phenotype_column)
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/input.py", line 37, in load_phenotypes
p = pd.read_csv(infile, index_col=0, sep='\t')
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 688, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 454, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 948, in __init__
self._make_engine(self.engine)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 1180, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 2010, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: '/scr1/users/theillere/usda37/powerbacgwas2/output_dir/KNMUBUQIHASD.pyseer.phen'
2023-02-06 11:21:15,844 INFO: Saving number of homoplasies (steps) per gene...
2023-02-06 11:21:15,847 INFO: Opening gene_presence_absence.Rtab file and calculating gene frequency...
2023-02-06 11:21:15,935 INFO: Opening gene_presence_absence file and calculating gene frequency. DONE.
2023-02-06 11:21:15,935 INFO: Selecting variants meeting criteria...
2023-02-06 11:21:15,944 INFO: Selecting variants meeting criteria. DONE.
2023-02-06 11:21:15,944 INFO: Randomly sampling of variants meeting criteria.
2023-02-06 11:21:16,354 INFO: Reading causal variant file /scr1/users/theillere/usda37/powerbacgwas2/output_dir/XPZEDQRQDHGK.causal_variants.txt...
2023-02-06 11:21:16,355 INFO: Opening gene_presence_absence.Rtab file and calculating gene frequency...
2023-02-06 11:21:16,401 INFO: Opening gene_presence_absence file and calculating gene frequency. DONE.
2023-02-06 11:21:16,401 INFO: Extracting samples with and without causal variant(s) (mutated and wild-type)...
2023-02-06 11:21:16,401 INFO: Calculating number of cases and controls, with and without the causal variant, to achieve the chosen odds ratio...
Traceback (most recent call last):
File "/scr1/users/theillere/usda37/powerbacgwas2/scripts/simulate_binary_phenotype_roary.py", line 406, in <module>
_main()
File "/scr1/users/theillere/usda37/powerbacgwas2/scripts/simulate_binary_phenotype_roary.py", line 358, in _main
args=(len(roary_samples_mut), len(roary_samples_wt), float(args.allele_frequency), int(sample_size), float(odds_ratio)),
ValueError: could not convert string to float: 'NA'
mv: cannot stat '/scr1/users/theillere/usda37/powerbacgwas2/output_dir/XPZEDQRQDHGK.phen': No such file or directory
Traceback (most recent call last):
File "./pyseer/pyseer-runner.py", line 8, in <module>
main()
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/main.py", line 305, in main
p = load_phenotypes(options.phenotypes, options.phenotype_column)
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/input.py", line 37, in load_phenotypes
p = pd.read_csv(infile, index_col=0, sep='\t')
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 688, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 454, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 948, in __init__
self._make_engine(self.engine)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 1180, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 2010, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: '/scr1/users/theillere/usda37/powerbacgwas2/output_dir/XPZEDQRQDHGK.pyseer.phen'
Traceback (most recent call last):
File "./pyseer/pyseer-runner.py", line 8, in <module>
main()
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/main.py", line 305, in main
p = load_phenotypes(options.phenotypes, options.phenotype_column)
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/input.py", line 37, in load_phenotypes
p = pd.read_csv(infile, index_col=0, sep='\t')
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 688, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 454, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 948, in __init__
self._make_engine(self.engine)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 1180, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 2010, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: '/scr1/users/theillere/usda37/powerbacgwas2/output_dir/XPZEDQRQDHGK.pyseer.phen'
Traceback (most recent call last):
File "./pyseer/pyseer-runner.py", line 8, in <module>
main()
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/main.py", line 305, in main
p = load_phenotypes(options.phenotypes, options.phenotype_column)
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/input.py", line 37, in load_phenotypes
p = pd.read_csv(infile, index_col=0, sep='\t')
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 688, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 454, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 948, in __init__
self._make_engine(self.engine)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 1180, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 2010, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: '/scr1/users/theillere/usda37/powerbacgwas2/output_dir/XPZEDQRQDHGK.pyseer.phen'
Traceback (most recent call last):
File "./pyseer/pyseer-runner.py", line 8, in <module>
main()
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/main.py", line 305, in main
p = load_phenotypes(options.phenotypes, options.phenotype_column)
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/input.py", line 37, in load_phenotypes
p = pd.read_csv(infile, index_col=0, sep='\t')
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 688, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 454, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 948, in __init__
self._make_engine(self.engine)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 1180, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 2010, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: '/scr1/users/theillere/usda37/powerbacgwas2/output_dir/XPZEDQRQDHGK.pyseer.phen'
Traceback (most recent call last):
File "./pyseer/pyseer-runner.py", line 8, in <module>
main()
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/main.py", line 305, in main
p = load_phenotypes(options.phenotypes, options.phenotype_column)
File "/scr1/users/theillere/usda37/powerbacgwas2/pyseer/pyseer/input.py", line 37, in load_phenotypes
p = pd.read_csv(infile, index_col=0, sep='\t')
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 688, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 454, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 948, in __init__
self._make_engine(self.engine)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 1180, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/home/theillere/miniconda3/envs/powerbacgwas/lib/python3.6/site-packages/pandas/io/parsers.py", line 2010, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: '/scr1/users/theillere/usda37/powerbacgwas2/output_dir/XPZEDQRQDHGK.pyseer.phen'
2023-02-06 11:21:21,694 INFO: Saving number of homoplasies (steps) per gene...
2023-02-06 11:21:21,698 INFO: Opening gene_presence_absence.Rtab file and calculating gene frequency...
2023-02-06 11:21:21,786 INFO: Opening gene_presence_absence file and calculating gene frequency. DONE.
2023-02-06 11:21:21,786 INFO: Selecting variants meeting criteria...
2023-02-06 11:21:21,795 INFO: Selecting variants meeting criteria. DONE.
2023-02-06 11:21:21,795 INFO: Randomly sampling of variants meeting criteria.
2023-02-06 11:21:22,211 INFO: Reading causal variant file /scr1/users/theillere/usda37/powerbacgwas2/output_dir/NRRUMMWMXBFS.causal_variants.txt...
2023-02-06 11:21:22,212 INFO: Opening gene_presence_absence.Rtab file and calculating gene frequency...
2023-02-06 11:21:22,257 INFO: Opening gene_presence_absence file and calculating gene frequency. DONE.
2023-02-06 11:21:22,257 INFO: Extracting samples with and without causal variant(s) (mutated and wild-type)...
2023-02-06 11:21:22,258 INFO: Calculating number of cases and controls, with and without the causal variant, to achieve the chosen odds ratio...
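
Reading the log above, each run appears to fail in the same order: simulate_binary_phenotype_roary.py raises a ValueError before any .phen file is written, the following mv then finds nothing to move, and every pyseer call ends in FileNotFoundError for the missing .pyseer.phen file. The sketch below only reproduces the first step. It is an illustration, not the script's actual code: the function names are hypothetical and the three conversions are copied from the traceback line. It suggests that the 'NA' reached either the allele frequency or the odds ratio, because a non-numeric sample size would produce a different message.

```python
def parse_run_parameters(allele_frequency, sample_size, odds_ratio):
    """Apply the same conversions as the failing line in the traceback.

    Hypothetical helper: only the float()/int() calls come from the log.
    """
    return float(allele_frequency), int(sample_size), float(odds_ratio)


def conversion_error(allele_frequency, sample_size, odds_ratio):
    """Return the ValueError message for these inputs, or None if they parse."""
    try:
        parse_run_parameters(allele_frequency, sample_size, odds_ratio)
    except ValueError as exc:
        return str(exc)
    return None


# An 'NA' odds ratio gives the message seen in the log.
print(conversion_error("0.1", "100", "NA"))
# An 'NA' sample size fails in int() and gives a different message.
print(conversion_error("0.1", "NA", "2.0"))
```

If this reading is right, checking which column of the run table or parameters file holds 'NA' would be the next thing to look at.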

@francesccoll
Owner

Hi,

Thanks for using PowerBacGWAS for your research.

Would you be able to share the original phylogenetic tree file (RAxML_bipartitions.bootmap) and pan-genome table (gene_presence_absence.Rtab)? I can give it a go to see if I can reproduce this error.

You may also want to try the Docker/Nextflow implementation to see if you get the same error (just to rule out dependency issues).

@erin-thei
Author

Thanks for the quick reply! I've attached the two input files I used. As for trying the Docker/Nextflow implementation, I've been having issues with using Nextflow on our cluster. I'm trying to work through them, but needed a quicker option - which is why I opted to use the individual commands.

powerbacGWAS_input_files.zip

@francesccoll
Owner

francesccoll commented Feb 14, 2023

It looks like this was not a dependency issue but a problem with the format of the input file (gene_presence_absence.Rtab): the ~ symbol in some gene names made pastml crash. After editing the input pan-genome file:
cat gene_presence_absence.Rtab | sed 's/~/_/g' > gene_presence_absence.edited.Rtab
the command:
python3 ./scripts/ancestral_state_reconstruction_roary.py --input_pastml_table pastml.csv --input_tree tree.annotated.nwk --output_table ancestral.csv --output_steps ancestral_steps.csv --process 8
ran without errors. Make sure all PowerBacGWAS commands before this one are also run with the edited input file (without ~ symbols). The output table ancestral_steps.csv should have the same number of lines as the input file gene_presence_absence.edited.Rtab.
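
For anyone who prefers to do the edit and the line-count check from Python, here is a minimal sketch. It assumes the same plain-text replacement as the sed command above (every ~ becomes _); the function names `sanitize_rtab` and `count_lines` are hypothetical and are not part of PowerBacGWAS.

```python
def sanitize_rtab(in_path, out_path, bad="~", replacement="_"):
    """Copy in_path to out_path, replacing every `bad` character.

    Mirrors sed 's/~/_/g'. Returns the number of lines that were changed.
    """
    changed = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            fixed = line.replace(bad, replacement)
            if fixed != line:
                changed += 1
            dst.write(fixed)
    return changed


def count_lines(path):
    """Count the lines in a text file."""
    with open(path) as handle:
        return sum(1 for _ in handle)
```

Calling `sanitize_rtab("gene_presence_absence.Rtab", "gene_presence_absence.edited.Rtab")` reports how many rows contained the symbol, and comparing `count_lines("ancestral_steps.csv")` with `count_lines("gene_presence_absence.edited.Rtab")` performs the check described above.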

@erin-thei
Author

Great, thanks so much - that worked for the ancestral_state_reconstruction_roary.py step! It was able to produce the csv files with no error and ancestral_steps.csv had the same number of lines as the edited Rtab file.

I am still having issues when running 'bash kp.gwas_runs.sh'. When I ran this command with the edited Rtab, I still got the same error messages that I posted in my initial message. When you tried to reproduce the issue, did this occur for you? I can try to post the full error message, but it is very lengthy and repetitive.

Thanks for your help!

@francesccoll
Owner

francesccoll commented Mar 3, 2023

I can attempt to reproduce this issue. Can you also share the file 'paramters.binary.efs.txt'?
It may also be worth running a single command line from kp.gwas_runs.sh on its own before running them all.
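
One way to pull out a single command for such a test is sketched below. It assumes that kp.gwas_runs.sh holds one complete command per line, which has not been confirmed in this thread; `first_commands` is a hypothetical helper, not a PowerBacGWAS script.

```python
def first_commands(script_path, n=1):
    """Return the first n lines of a shell script that are commands.

    Blank lines and lines starting with '#' (comments, shebang) are skipped.
    """
    commands = []
    with open(script_path) as handle:
        for line in handle:
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                commands.append(stripped)
                if len(commands) == n:
                    break
    return commands
```

Printing `first_commands("kp.gwas_runs.sh")[0]` and pasting the result into the shell runs just that one command, so its full output can be read without the repetition of the complete run.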
