Pair_Frags...read error with HiSIF command #2

Open
yaojiayingJenny opened this issue Feb 25, 2021 · 5 comments

@yaojiayingJenny

I got a "Pair_Frags read error" at the HiSIF step. It seems my input files are not in the correct format, but I followed the pipeline tutorial instructions strictly. Does anyone know the reason? Thanks.

$head test/chr1.tmp
1 10000001 1 1 10001638 1
1 10000018 1 1 9996231 1
1 10000020 1 1 9383708 1
1 10000021 1 1 9564208 1
1 100000267 0 1 103050925 0
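
A quick way to sanity-check the column layout of a .tmp file is sketched below. It assumes, based only on the sample lines above, that every record has six whitespace-separated integer fields; that layout is inferred from the sample, not taken from HiSIF documentation, and the function name check_tmp_columns is hypothetical.

```shell
# Hypothetical helper (not part of HiSIF): count the lines that do NOT
# consist of exactly six whitespace-separated integer fields.
# The six-field layout is an assumption inferred from the sample above.
check_tmp_columns() {
    awk '
        {
            bad = (NF != 6)
            for (i = 1; i <= NF && !bad; i++)
                if ($i !~ /^[0-9]+$/) bad = 1
            if (bad) n++
        }
        END { print n + 0 }
    ' "$1"
}
```

Running `check_tmp_columns test/chr1.tmp` prints 0 when every line matches the assumed layout, and the number of non-matching lines otherwise.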

command:
HiSIF -g hg19_bowtie2_index -c hg19.MboI.bed -p 1 29 -w 50 500 5000 -T 1 -s 0.1 -i 2 -m 1 -x 5 -o test/result test/

out.log:
(=:...........Start processing files...........:=)
cuttingSiteTotal == 7127585
<-----Parsed enzyme cutting site map----->
<-----Extended cutting site region----->
<-----Combining Data from Child Processes----->
<-----Reading Vector Sizes for Bootstrapping----->
<-----Found 25 files----->
<-----Reading sum pipe----->
<-----Performing filtration----->
<-----Writing to test__PoisMix.txt----->
<-----Main Process 1 finished writing distrubitions----->
<-----Finished Vector Sizes for Bootstrapping----->
<-----Child Processes Finished----->
<-----Beginning Bootstrapping----->
<-----Using samplesize of 13555475 elements----->
<-----Random Dataset 1----->
Iteration: 1--> LLH: -3.26381
Iteration: 2--> LLH: -2.88868
Iteration: 3--> LLH: -2.69257
Iteration: 4--> LLH: -2.58041
Iteration: 5--> LLH: -2.51447
<-----Random Dataset 2----->
Iteration: 1--> LLH: -3.47266
Iteration: 2--> LLH: -3.7416
Iteration: 3--> LLH: -3.9588
<-----Proximate ligation events extracted from mixture model----->
<-----Starting Frequency Generation----->

@yufanzhouonline
Owner

yufanzhouonline commented Mar 3, 2021

Hi yaojiayingJenny,

If you are running HiSIF on only one chromosome, please also create empty files for the other chromosomes. You can do this with a shell loop like the following:

# Linux shell: create empty .tmp files for chromosomes 2 to 23
for chrno in $(seq 2 23)
do
touch chr${chrno}.tmp
done
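
Before creating placeholders, it may help to see which files are actually missing. The following is a minimal sketch, assuming the files are named chrN.tmp with N running from 1 to 23 as in the loop above; the upper bound is taken from that loop rather than from a documented HiSIF requirement, and list_missing_tmp is a hypothetical name.

```shell
# Hypothetical helper: print the names of chrN.tmp files (N = 1..last)
# that are missing from a directory. The default of 23 mirrors the loop
# above; it is an assumption, not a documented HiSIF requirement.
list_missing_tmp() {
    dir=$1
    last=${2:-23}
    for chrno in $(seq 1 "$last"); do
        [ -e "$dir/chr${chrno}.tmp" ] || echo "chr${chrno}.tmp"
    done
}
```

For example, `list_missing_tmp test/` lists one missing file name per line and prints nothing when all 23 files are present.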

Thanks.

@yaojiayingJenny
Author

Thanks for your reply.

Actually, my input contains 25 files, which were generated from a validPairs file by the "proc" command (HiSIF_V1.00/bin/proc split/ test/ -t).
(<-----Found 25 files----->)
$ls test
chr10.tmp chr12.tmp chr14.tmp chr16.tmp chr18.tmp chr1.tmp chr21.tmp chr23.tmp chr25.tmp chr3.tmp chr5.tmp chr7.tmp chr9.tmp
chr11.tmp chr13.tmp chr15.tmp chr17.tmp chr19.tmp chr20.tmp chr22.tmp chr24.tmp chr2.tmp chr4.tmp chr6.tmp chr8.tmp

I also tried your suggestion with 23 files (chr1.tmp to chr23.tmp) and with 24 files (chr1.tmp to chr24.tmp), but neither run succeeded.

Please let me know if there are any other rules I need to follow. Alternatively, could you upload the test data together with the command used to run it, if that is convenient for you? Thanks a lot.

Here is some more information about my command:
-g hg19_bowtie2_index (my index path)
├── hg19.1.bt2
├── hg19.2.bt2
├── hg19.3.bt2
├── hg19.4.bt2
├── hg19.rev.1.bt2
└── hg19.rev.2.bt2
chromosome information is:
chr1
...
chr22
chrX

-c hg19.MboI.bed (from HiSIF_V1.00/resources/ after the "cat" command)
chr1
...
chr22
chrM
chrX
chrY

@yufanzhouonline
Owner

Hi yaojiayingJenny,

I have uploaded the example data. You can find it at:

https://github.com/yufanzhouonline/HiSIF/tree/master/HiSIF_V1.00/example

Please try to run HiSIF with these example data.

Please let me know if it works.

Thanks.

@jiayingyao

jiayingyao commented Mar 16, 2021

Hi yufanzhouonline,
Thanks for your example data. I tested it and got the same "Pair_Frags read error".
However, I noticed that the peakThreshold option is named differently: mine is "-T", yours is "-t". Is that because we are using different versions?

your command:
bin/HiSIF -g genome/hg19 -c resources/hg19.HindIII.bed -w 36 500 20000 -p 1 29 -t 1 -i 2 hESC

my command (the -m parameter is mandatory in my version):
bin/HiSIF -g genome/hg19 -c resources/hg19.HindIII.bed -w 36 500 20000 -p 1 29 -T 1 -i 2 -m 2 -o output hESC

Program: HiSIF - HiC Significant Interaction Fragments
Version: 1.0
HiSIF [options]

-g <DIR>                 reference genome directory
-p <INT> <INT>           poisson mixture model parameters
-w <INT> <INT> <INT>     readLength, cuttingSiteExtent, fragmentExtent
-T <INT>                 peakThreshold value
-s <0.0-1.0>             percentage of dataset for bootstrapping
-i <INT>                 bootstrapping iterations
-c <FILE>                cutting sites map .bed file
-o <FILE>                outputfile
-x                       limit number of child processes used

Experimental:
-m <1 or 2>              use file i/o to save main memory, 1 saves some, 2 saves more
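
To compare two builds, the usage text can be checked for the spelling of the peakThreshold option. This is only a sketch: it assumes the usage message lists one option per line with the flag as the first field, as in the output above, and that you pipe or paste that text into the function yourself. The name threshold_flag is hypothetical.

```shell
# Hypothetical helper: read a usage message on stdin and report which
# spelling of the peakThreshold option it lists ("-T", "-t", or "none").
threshold_flag() {
    awk '
        $1 == "-T" { print "-T"; found = 1; exit }
        $1 == "-t" { print "-t"; found = 1; exit }
        END { if (!found) print "none" }
    '
}
```

Fed the usage text above, it reports -T.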

@yufanzhouonline
Owner

Hi yaojiayingJenny,

Please download the latest version and install HiSIF as described (just run "make" in the HiSIF folder).

Then use my command rather than yours.

Thanks.
