Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generalization of tgp_chr20.trees rule to other chromosomes fails #40

Open
lukashuebner opened this issue Apr 3, 2023 · 0 comments
Open

Comments

@lukashuebner
Copy link

I tried extending your workflow described in the Makefile of https://github.com/awohns/unified_genealogy_paper/tree/master/all-data to the other chromosomes by generalizing your rule

tgp_chr20.trees: tgp_chr20.samples
        python3 ../src/run_inference.py $^ -t ${NUM_THREADS} -A 1 -S 1
        python3 tsutil.py simplify tgp_chr20.nosimplify.trees $@

to

tgp_chr%.trees: tgp_chr%.samples
        python3 ../src/run_inference.py $^ -t ${NUM_THREADS} -A 1 -S 1
        python3 tsutil.py simplify tgp_chr$*.nosimplify.trees $@

This seems to work (i.e., there is no error) for chromosomes 8, 13, 19, 21, and 22 but sadly not for the others. For them, I get the following error:

python3 ../src/run_inference.py tgp_chr18.samples -t 0 -A 1 -S 1
ga-add   (1/2)100%|██████████| 1.98M/1.98M [01:39, 19.9kit/s]
ga-gen   (2/2)100%|██████████| 887k/887k [45:26, 325it/s]
Using chr18 from HapMapII_GRCh37 for the recombination map
Traceback (most recent call last):
  File "../src/run_tsinfer.py", line 230, in <module>
    main()
  File "../src/run_tsinfer.py", line 206, in main
    r_prob, m_prob = get_rho(anc, genetic_map, args.prefix)
  File "../src/run_tsinfer.py", line 109, in get_rho
    genetic_dists = tsinfer.Matcher.recombination_rate_to_dist(rmap, inference_pos)
  File "/home/huebner/projects/tree-sequences/unified_genealogy_paper/venv/lib/python3.8/site-packages/tsinfer/inference.py", line 1160, in recombination_rate_to_dist
    return np.diff(rho.get_cumulative_mass(positions))
  File "/home/huebner/projects/tree-sequences/unified_genealogy_paper/venv/lib/python3.8/site-packages/msprime/intervals.py", line 253, in get_cumulative_mass
    raise ValueError(f"Cannot have positions < 0 or > {self.sequence_length}")
ValueError: Cannot have positions < 0 or > 62949445.0
Traceback (most recent call last):
  File "../src/run_tsinfer.py", line 230, in <module>
    main()
  File "../src/run_tsinfer.py", line 212, in main
    inferred_anc_ts = tskit.load(args.prefix + ".atrees")
  File "/home/huebner/projects/tree-sequences/unified_genealogy_paper/venv/lib/python3.8/site-packages/tskit/trees.py", line 3292, in load
    return TreeSequence.load(
  File "/home/huebner/projects/tree-sequences/unified_genealogy_paper/venv/lib/python3.8/site-packages/tskit/trees.py", line 4087, in load
    file, local_file = util.convert_file_like_to_open_file(file_or_path, "rb")
  File "/home/huebner/projects/tree-sequences/unified_genealogy_paper/venv/lib/python3.8/site-packages/tskit/util.py", line 692, in convert_file_like_to_open_file
    _file = open(path, mode)
FileNotFoundError: [Errno 2] No such file or directory: 'tgp_chr18.atrees'
Running inference with Params(sample_data=<tsinfer.formats.SampleData object at 0x7f57ab7198b0>, filename='tgp_chr18.samples', genetic_map='None', ma_mut_rate=1.0, ms_mut_rate=1.0, precision=15, num_threads=0)
GA done (ma_mut: 1.0, ms_mut: 1.0)
Starting 1.0 1.0 precision 15
MA done (ma_mut:1.0 ms_mut1.0)
Traceback (most recent call last):
  File "../src/run_inference.py", line 238, in <module>
    result = run(params)
  File "../src/run_inference.py", line 99, in run
    inferred_ts = tskit.load(prefix + ".nosimplify.trees")
  File "/home/huebner/projects/tree-sequences/unified_genealogy_paper/venv/lib/python3.8/site-packages/tskit/trees.py", line 3292, in load
    return TreeSequence.load(
  File "/home/huebner/projects/tree-sequences/unified_genealogy_paper/venv/lib/python3.8/site-packages/tskit/trees.py", line 4087, in load
    file, local_file = util.convert_file_like_to_open_file(file_or_path, "rb")
  File "/home/huebner/projects/tree-sequences/unified_genealogy_paper/venv/lib/python3.8/site-packages/tskit/util.py", line 692, in convert_file_like_to_open_file
    _file = open(path, mode)
FileNotFoundError: [Errno 2] No such file or directory: 'tgp_chr18.nosimplify.trees'
make: *** [Makefile:43: tgp_chr18.trees] Error 1

As it prints "Using chr18 from HapMapII_GRCh37 for the recombination map", it probably took the broken code path in

map_file = os.path.join(map.map_cache_dir, map.file_pattern.format(id="20"))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant