#!/usr/bin/perl -w

#####################################
#####################################
#####################################
###                               ###
### NEXT-RNAi program description ###
###                               ###
#####################################
#####################################
#####################################

=head1 NAME

NEXT-RNAi Version 1.41 (10/19/2011) - Designing and evaluating genome-wide libraries for RNAi screens

=head1 DESCRIPTION

NEXT-RNAi is a software package for the design and evaluation of genome-wide RNAi libraries. It performs all steps
from the prediction of specific and efficient RNAi target sites to the visualization of designed reagents
in their genomic context. The software supports the design and evaluation of siRNAs and long dsRNAs and is
implemented in an organism-independent manner, allowing designs for all sequenced and annotated genomes.

Please visit http://www.nextrnai.org/ for complete documentation of NEXT-RNAi.

=head1 SYNOPSIS

perl nextrnai.pl -i <inputfile> [-s <split inputfile>] -r <reagent> -d <Bowtie database> -e <evaluation> [-o <optionsfile>] [-n <probe name>] [-h help] [-p interactive mode]
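
A minimal sketch of a de novo design run for long dsRNAs, using only the switches listed above (all file and index names are hypothetical placeholders, not files shipped with NEXT-RNAi):

  perl nextrnai.pl -i targets.fasta -r d -d transcriptome_index -e NO -o options.txt -n MyLibrary

To evaluate existing siRNAs instead, the same call would presumably use '-r s -e OLIGO' with a FASTA file of siRNA sequences as input.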

=head1 OPTIONS

=over 2

-i <inputfile>

=over 1

Inputfile containing target sequences (in FASTA format)

=back 1

-s <int>

=over 1

<int> number of features (FASTA sequences) from input file that are processed at once (optional, default=4000)

=back 1

-r <reagent>

=over 1

Reagent type (d = long dsRNA, s = short interfering RNA) designed or evaluated

=back 1

-d <Bowtie database/index>

=over 1

Location of Bowtie database/index file (pre-built with bowtie-build); multiple inputs are allowed, separated by '+' (optional; if set to 'nodb', NEXT-RNAi will run without 'off-target' evaluation)

=back 1

-e <evaluation>

=over 1

NO: de novo design of RNAi reagents

OLIGO: evaluation of primers for long dsRNAs (-r d) or siRNAs (-r s)

DSRNA: evaluation of long dsRNAs (-r d)

DSRNA+OLIGO: evaluation of long dsRNAs and underlying primers (-r d)

=back 1

-o <optionsfile>

=over 1

File containing further settings for RNAi reagent design/evaluation in a TAG=VALUE format (optional)

=back 1

-n <probe name>

=over 1

Name tag for files generated by NEXT-RNAi (optional, default=Probe)

=back 1

-h <help>

=over 1

Show help (optional)

=back 1

-p <interactive mode>

=over 1

Start interactive setting of NEXT-RNAi options (optional)

=back 1

=back 2

=head1 PARAMETERS FOR OPTIONS FILE

=over 2

=head2 Program locations (NEXT-RNAi dependencies)

=back 4

=over 3

PRIMER3

Set location of primer3_core script required for primer designs during the design and evaluation of long dsRNAs (default = /usr/bin/). Primer3 settings can be influenced in an additional options file (see PRIMER3OPT below).

BOWTIE

Set location of bowtie script required by NEXT-RNAi (default = /usr/bin/). Bowtie is used for mappings to determine the specificity of an RNAi reagent (against the database defined with -d) and for mappings to determine the location of an RNAi reagent in the genome. The mapping of RNAi reagents is a prerequisite for generation of GFF and AFF output files and for the calculation of FEATURE contents and requires the definition of a mapping database (GENOMEBOWTIE).

LOWCOMPEVAL

Set location of mdust program for the evaluation of low-complexity regions in the input sequences (default = disabled).

BLAT

Set location of blat program for mapping RNAi reagents to the genome (default = disabled). By default BOWTIE is used for mappings. However, if reagents were designed on CDS (SOURCE=CDS) Blat is required to allow for gapped alignments to the genome. BLAT mapping can be influenced by a set of further options (see GENOMEFASTA, BLATALIGN, BLATSPLIT, BLATPROGRAM, BLATHOST, BLATPORT below). The mapping of RNAi reagents is a prerequisite for generation of GFF and AFF output files and for the calculation of FEATURE contents.

HOMOLOGY

Set location of blastall program, the FASTA database used to determine homology (needs prior formatting to a FASTA database with formatdb command included in Blast package) and the e-value cutoff for homology (e.g. 1e-10). These three input parameters are separated by comma (default = disabled).

VIENNA

Set location of RNAfold.pl script that belongs to the Vienna RNA package (default = /usr/bin/). This program is required only for efficiency predictions using the RATIONAL method (see EFFICIENCY below).

=back 4

=over 2

=head2 Design settings

=back 4

=over 3

SIRNALENGTH

Set length [nt] of siRNAs used for off-target evaluation (default = 19).

DESIGNWINDOW

Set minimal and maximal length of desired RNAi reagents (default = '80,500') separated by comma.

DESIGNNUM

Set number of RNAi reagents to be designed for each identified specific region in the queried target sequences (default = 50).

OUTPUTNUM

Set number of RNAi reagents to be returned for each queried target sequence (default = 1).

PRIMER3OPT

Set location of file with options for PRIMER3 program in 'TAG=VALUE' format (visit Primer3 documentation for help), default settings are used otherwise (default = disabled).

PRIMERTAG

Set sequence to be added 5' to both forward and reverse primer sequences (for the design of long dsRNAs), e.g. a T7- or SP6-tag for in vitro transcription (default = disabled).

EFFICIENCY

Set efficiency calculation method and efficiency cutoff score separated by comma (e.g. 'EFFICIENCY=SIR,50'). Available calculation methods are 'RATIONAL' for calculations according to Reynolds et al. (requires VIENNA software), or 'SIR' according to Shah et al. (default = 'SIR'). The efficiency cutoff defines the minimal required efficiency for a siRNA to be selected (only for de novo designs, -e NO). Further documentation about efficiency prediction is available at http://www.nextrnai.org/.

TARGETSEQ

If set to 'FULL' NEXT-RNAi is forced to use the complete input target sequences as design template, otherwise only calculated specific regions are considered (default = CALC for de novo designs, default = FULL for evaluations).

FEATURE

Set location of file containing feature location information. A tab-delimited file with headers 'FeatureName', 'FeatureLoc' (location in GENOMEBOWTIE / GENOMEFASTA database), 'FeatureStart' (start of feature in GENOMEBOWTIE / GENOMEFASTA) and 'FeatureEnd' (end of feature in GENOMEBOWTIE / GENOMEFASTA database). Requires mapping of reagents to GENOMEBOWTIE or GENOMEFASTA databases (default = disabled).

LOWCOMPEVAL

Set location of mdust program for the evaluation of low-complexity regions in the input sequences (default = disabled).

CANEVAL

Option for calculation of CA[ACGT] tandem trinucleotide repeats in target or reagent sequences. This option is enabled by setting the minimal number of CAN repeats (e.g. 6) to be detected (default = disabled).

SEEDMATCH

Calculation of seed matches from siRNA sense strand (starting at position 2) to a defined FASTA file OR a Bowtie database/index file (if a FASTA file was provided, NEXT-RNAi expects the bowtie-build script for building the bowtie index in the BOWTIE folder). This option requires setting the length of the seed region (between 6 and 8), the maximal seed complement frequency allowed (for filtering of target sequences) and the location of the FASTA file or Bowtie database/index (pre-built with bowtie-build) separated by comma (default = disabled).

MIRSEED

Calculation of (e.g. miRNA-) seeds within a long dsRNA or siRNA from a given FASTA file containing miRNA sequences. Requires the length of the seed region (between 6 and 8, starting from position 2 in miRNA sense sequences) and the location of the FASTA file, separated by comma. siRNAs containing seeds will be excluded from designs (default = disabled).

POOL

Results for siRNA evaluations can be summarized for pools of sequences. This option requires setting the location of a tab-delimited file containing the headers 'siRNAID' and 'POOLID' to define connections between query siRNA identifiers and corresponding siRNA-pool identifiers. This option is only available for the evaluation of siRNAs (default = disabled).

INDEPENDENT

Set location of FASTA file or Bowtie database/index containing sequences that should be avoided for independent reagent designs (file is appended to the off-target database) (default = disabled). In case a FASTA file was provided the 'bowtie-build' script is required in the location defined for BOWTIE (to build the Bowtie index).

INTRON

Percentage of nucleotides of a long dsRNA allowed to target intronic regions (default = 25).

RANKD

Long dsRNA designs are by default ranked for percent specificity in first place and number of contained siRNAs predicted to be efficient in second place. NEXT-RNAi can be forced to rank designs for the absolute number of specific siRNAs contained in the long dsRNAs in second place (RANKD = SPEC), which maximizes the length of long dsRNA designs (default = EFF for efficiency ranking in second place).

TARGETTYPE

For the design of RNAi reagents against non-annotated genes this parameter should be set to 'NA' (Not Annotated). This affects the specificity ranking of designs in a way to avoid targeting any annotated gene (for designs against annotated genes (default) the ranking maximizes the specificity for the intended target gene).

REDESIGN

Define whether NEXT-RNAi is allowed to enter a (re-)design method (REDESIGN = ON) to enable the design of RNAi reagents for input sequences that do not meet the user-defined quality measures (specificity (SIRNALENGTH), EFFICIENCY, LOWCOMPEVAL, CANEVAL, SEEDMATCH and MIRSEED) (default = OFF).

OTEEVAL

For evaluation of designed RNAi reagents for 'off-target' effects in additional databases. This option requires the location of a Bowtie database/index, the siRNA length [nt] for mappings, and whether off-target effects should be evaluated by positional information ('pos'; the database has to be the same as in GENOMEBOWTIE / GENOMEFASTA) or by target information ('target'; uses the target groups defined in TARGETGROUPS). Database, siRNA length and evaluation option are separated by comma. Multiple evaluations can be queried (default = disabled).

=back 4

=over 2

=head2 Mapping reagents to the 'off-target' (-d) database

=back 4

=over 3

TARGETGROUPS

Location of file defining which sequences in the database file (-d option) belong to one group (e.g. splice variants of a gene) (default = disabled). A tab-delimited file containing the headers 'Target' (e.g. transcripts) and 'TargetGroup' (e.g. the gene the transcript belongs to) is required. NEXT-RNAi will then consider e.g. siRNAs that target multiple transcripts of the same gene as specific for this gene. Multiple files containing targetgroups can be defined in the options file.

EXCLUDED

Location of file containing identifiers from the off-target database (-d option) that should be excluded as target sites, but not considered as real off-targets in case they were hit (e.g. UTR regions). A text file with the header 'Exclude' listing identifiers to be excluded is required. Multiple 'EXCLUDED' files can be queried (default = disabled).

INTENDED

Location of file containing sequence identifiers from the input file connected to their intended target (same as 'TargetGroup' identifier in TARGETGROUPS file) that forces NEXT-RNAi always to output this gene as the primary, intended target of the reagent. A tab-delimited file with the headers 'Query' and 'Intended' listing the identifiers is required. Multiple 'INTENDED' files can be queried (default = disabled).

=back 4
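
As an illustration of the tab-delimited layouts described above, a TARGETGROUPS file might look as follows (all identifiers are invented for this sketch; columns are separated by a single tab):

  Target	TargetGroup
  geneA-RA	geneA
  geneA-RB	geneA
  geneB-RA	geneB

An INTENDED file would pair input identifiers with one of these target groups in the same way:

  Query	Intended
  ampliconA	geneA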

=over 2

=head2 Mapping reagents to the genome using Bowtie

=back 4

=over 3

GENOMEBOWTIE

Set location of mapping database/index for Bowtie. Bowtie needs mapping databases (indices) that were built with the bowtie-build script from FASTA files. The mapping of RNAi reagents is a prerequisite for generation of GFF and AFF output files and for the calculation of FEATURE contents.

=back 4

=over 2

=head2 Mapping reagents to the genome using blat or gfClient


=back 4

=over 3

SOURCE

Set the type of source from which the target sequences were retrieved ('GENOMIC' for genomic (unspliced) sources, 'CDS' for spliced sources). It affects the type of mapping: for 'CDS' sources BLAT is required, for 'GENOMIC' sources BOWTIE is used (default = GENOMIC).

BLATPROGRAM

Set either to 'blat' for local Blat alignments or to 'gfClient' for alignments using a running Blat server (default = blat). The 'blat' option requires setting of a FASTA database with the GENOMEFASTA option, the 'gfClient' option requires BLATHOST and BLATPORT settings to connect to the Blat server.

GENOMEFASTA

Set location of FASTA mapping database for Blat. The mapping of RNAi reagents is a prerequisite for generation of GFF and AFF output files and for the calculation of FEATURE contents.

BLATHOST

Name of server that runs the Blat server (gfServer), required to run Blat mappings using the BLATPROGRAM gfClient.

BLATPORT

Port to connect to a particular instance (database) of the Blat server defined in BLATHOST. Required to run Blat mappings using the BLATPROGRAM gfClient.

BLATSPLIT

Split parameter for a large FASTA database defined in GENOMEFASTA (using blat as BLATPROGRAM). The FASTA database will be split into parts, each containing at most the defined number of sequences (default = 0, meaning no splitting).

BLATALIGN

If set to 'PERFECT', NEXT-RNAi only allows perfect matches during mapping of sequences to the genome with blat or gfClient. If set to 'PARTIAL', also partial mappings are evaluated (default = PERFECT).

TXNFASTA

Set location of off-target database (as used for -d option) in FASTA format. This option is required for gapped alignments using Blat (e.g. to map siRNAs spanning exon-exon boundaries). NEXT-RNAi can use the mapping information from the off-target database to extend the reagent's sequence and re-map it to the genome (default = disabled).

=back 4

=over 2

=head2 Output settings

=back 4

=over 3

OUTPUT

Set output folder for files created by NEXT-RNAi (default location is input file location)

GFF

Enables output of general feature format (GFF) file by choosing either 'GFF2' or 'GFF3' format and requires prior mapping of reagents (see BOWTIE and BLAT options) (default = disabled).

GBROWSEBASE

Set URL to a generic genome browser (GBrowse) instance for visualization of designed reagents in their genomic context (default = disabled). The URL needs to be a link to the 'gbrowse_img' script of the GBrowse instance, e.g. for accessing our Drosophila melanogaster genome browser use http://www.dkfz.de/signaling/cgi-bin/gbrowse_img/flybase/. The visualization requires prior mapping of reagents (see BOWTIE and BLAT options) and further tracks can be added by setting of the GBROWSETRACK option.

GBROWSETRACK

Set generic genome browser (GBrowse) tracks to be visualized with the designed RNAi reagents (default = disabled). Multiple tracks can be enabled by '+' concatenation (e.g. 'GENE+TXN' for showing genes and transcripts in our Drosophila melanogaster GBrowse, see GBROWSEBASE option). The visualization requires prior mapping of reagents (see BOWTIE and BLAT options) and setting of the GBROWSEBASE URL.

AFF

Set to 'YES' for generation of an annotations file that allows for the direct upload of design results to GBrowse (default = disabled). This requires prior mapping of reagents (see BOWTIE and BLAT options).

=back 2
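
A short options file (passed with -o) could look like the sketch below. It mostly restates documented defaults plus the EFFICIENCY example given above; the GENOMEBOWTIE path and the GFF choice are assumptions made for illustration only:

  PRIMER3=/usr/bin/
  BOWTIE=/usr/bin/
  SIRNALENGTH=19
  DESIGNWINDOW=80,500
  DESIGNNUM=50
  OUTPUTNUM=1
  EFFICIENCY=SIR,50
  INTRON=25
  REDESIGN=OFF
  GENOMEBOWTIE=/path/to/genome_index
  GFF=GFF3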

=head1 AUTHOR

Thomas Horn (t.horn@dkfz.de) and Michael Boutros (m.boutros@dkfz.de)

=cut

use warnings;
use diagnostics;
use strict;
use Getopt::Long;
use Pod::Usage;

########################
########################
########################
###                  ###
### Global variables ###
###                  ###
########################
########################
########################

# store input/output/report file locations
our %fileLocs = ();

###################
###################
###################
###             ###
### Subroutines ###
###             ###
###################
###################
###################

##
## Remove newline characters of various operating systems
##

sub cleanLine {
    my $line = $_[0];
# strip Unix (\n), classic Mac (\r) and Windows (\r\n) line endings
    $line =~ s/[\r\n]//g;
    return $line;
}

##
## Administrate queried options / parameters and check for valid input
##

sub options {
    my ($option,$value,$options,$UserOptions,$error) = @_;
# valid option
    if (exists $$options{$option}){
	my @optionLen = scalar(@{ $$options{$option} });
# options where only a single input is allowed
	if ((scalar(@optionLen) eq 1) && (!exists $$UserOptions{$option})){
            $$options{$option}[0] = $value;
	    $$UserOptions{$option} = 1;
        }
        else {
# options with multiple inputs allowed
	    if (($option eq "TARGETGROUPS") || ($option eq "EXCLUDED") || ($option eq "INTENDED") || ($option eq "GENOMEBOWTIE") || ($option eq "GENOMEFASTA") || ($option eq "OTEEVAL") || ($option eq "INDEPENDENT")){
		push (@{ $$options{$option} }, $value);
		$$UserOptions{$option}++;
	    } 
	    else {
		print $error "$option\tMultiple occurrences in options file not allowed for this parameter (only first one used)\n";
	    }
        }
    }
    else {
# invalid option
        print $error "$option\tInvalid option\n";
    }
}

##
## Organization of input/output files
##

sub fileLoc{
    my ($option,$loc,$name) = @_;
    if (($option eq 'Unlink') || ($option eq 'Looped')){
# collect files to unlink
	if (!exists $fileLocs{$option}){
	    $fileLocs{$option} = [$loc, ];
	}
	else {
	    push (@{ $fileLocs{$option} }, $loc);
	}
    }
    else {
# save file locations for output
	$fileLocs{$option}{$name} = $loc;
    }
}

##
## Split FASTA DB files
##

sub splitDB {
    my ($split,$databaseFile,$splitfiles,$error) = @_;
# count sequences (FASTA header lines starting with '>') in DB file with grep
    my $counter = `grep -c "^>" "$databaseFile"`;
    $counter = &cleanLine($counter);
    if ($counter > $split){
# define output file name
	my $index = 1;
	my $dbOut = $databaseFile.'_'.$index;
	while (-e $dbOut){
	    $index++;
	    $dbOut = $databaseFile.'_'.$index;
	}
# write queried number of features per file
	open (DBOUT, ">$dbOut") || die "Cannot open DBOUT: $!\n";
	push (@$splitfiles, $dbOut);
	&fileLoc('Unlink',$dbOut);
	open (DBFILE, "<$databaseFile") || die "Cannot open DBFILE: $!\n";
	my $featcount = 0;
	while (my $line = <DBFILE>){
	    $line = &cleanLine($line);
	    if ($line=~/^>.*/){
		if ($featcount < $split){
		    print DBOUT "$line\n";
		    $featcount++;
		}
		else {
		    close DBOUT;
		    $featcount = 0;
		    $index++;
		    my $dbOut = $databaseFile.'_'.$index;
		    while (-e $dbOut){
			$index++;
			$dbOut = $databaseFile.'_'.$index;
		    }
		    open (DBOUT, ">$dbOut") || die "Cannot open DBOUT: $!\n";
		    push (@$splitfiles, $dbOut);
		    &fileLoc('Unlink',$dbOut);
		    print DBOUT "$line\n";
		    $featcount++;
		}
	    }
	    else {
		print DBOUT "$line\n";
	    }
	}
	close DBFILE;
	close DBOUT;
    }
}

##
## Read identifiers and sequences from FASTA file
##

sub readFASTA {
    my ($input,$IDSeq,$error,$option) = @_;
    open (FASTA, "<$input") || die "Cannot open FASTA: $!\n";
    my $ID = "";
    while (my $line = <FASTA>){
	$line = &cleanLine($line);
# read input until first space only
	if ($line=~/^>(\S+)/){
	    $ID = $1;
	    if (!exists $$IDSeq{$ID}){
		$$IDSeq{$ID} = "";
	    }
	    else {
		delete $$IDSeq{$ID};
		print $error "$ID\tIdentifier is not unique in file $input and is not considered for further calculations\n";
	    }
	}
	else {
	    $line=~s/\s//g;
	    if (exists $$IDSeq{$ID}){
# replace 'U'/'u' with 'T' in case RNA sequence was queried
		$line=~s/u/T/gi;
# sequence must only contain A, C, G, T or N (upper or lower case) if option is 'strict'
		if ($option eq 'strict'){
		    if ($line=~/^[ACGTNacgtn]+$/){
			$$IDSeq{$ID}.= $line;
		    }
		    else {
			delete $$IDSeq{$ID};
			print $error "$ID\t$input: invalid sequence (only A,C,G,T,N are allowed), not considered for further calculations\n";
		    }
		}
		elsif ($option eq 'permissive'){
		    $$IDSeq{$ID}.= $line;
		}
	    }
	}
    }
    close FASTA;
}

##
## In silico DICER to generate siRNAs of desired length from target sequences
##

sub edicer {
    my ($dicer,$fraglength,$IDSeq,$IDSeqKeys) = @_;
    open (EDICER, ">$dicer") || die "Cannot open EDICER: $!\n";
    for (my $z=0;$z<scalar(@$IDSeqKeys);$z++){
        for (my $i=0;$i+($fraglength-1)<length($$IDSeq{$$IDSeqKeys[$z]});$i++){
            my $j = $i+1;
            my $siRNA = substr($$IDSeq{$$IDSeqKeys[$z]},$i,$fraglength);
            print EDICER ">$$IDSeqKeys[$z]\_$j\n$siRNA\n";
        }
    }
    close EDICER;
}
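
##
## Worked example (illustrative sketch, the output file name is hypothetical):
## a sequence of length L yields L - fraglength + 1 overlapping fragments,
## named <ID>_<1-based start position>. For a 21 nt sequence and fraglength 19:
##
##   my %IDSeq = ('seq1' => 'ACGTACGTACGTACGTACGTA');
##   &edicer('example_edicer.fasta', 19, \%IDSeq, ['seq1']);
##
## writes the FASTA entries seq1_1, seq1_2 and seq1_3 (19 nt each).
##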

##
## Build index from a file
##

sub build_index {
    my ($data_file,$index_file,$input,$name,$IDindex) = @_;
    my $offset = 0;

    my $lineNum = 1;
    while (my $line = <$data_file>){
        $line = &cleanLine($line);
        my @columns = split(/\t/,$line);
	if ($input eq 'bowtie'){
	    if (!exists $$IDindex{$columns[0]}{$name}{$lineNum}){
		$$IDindex{$columns[0]}{$name}{$lineNum} = "";
	    }
	}
	elsif ($input eq 'blat'){
	    if ((scalar(@columns) eq 21) && ($columns[0]=~/^\d+$/)){
		if (!exists $$IDindex{$columns[9]}{$name}{$lineNum}){
		    $$IDindex{$columns[9]}{$name}{$lineNum} = "";
		}
	    }
	}
	elsif ($input eq 'mapped'){
	    if (!exists $$IDindex{$columns[0]}{$name}{$lineNum}){
		$$IDindex{$columns[0]}{$name}{$lineNum} = "";
	    }
        }
	elsif ($input eq 'primer-mapped'){
	    my $id = $columns[0].'_1';
            if (!exists $$IDindex{$id}{$name}{$lineNum}){
                $$IDindex{$id}{$name}{$lineNum} = "";
            }
        }
# append the byte offset of the current line to the index (4-byte big-endian integer)
        print $index_file pack("N", $offset);
        $offset = tell($data_file);
        $lineNum++;
    }
}

##
## Get lines from a file via index
##

sub line_with_index {
    my ($data_file,$index_file,$line_number) = @_;

    my $size;               # size of an index entry
    my $i_offset;           # offset into the index of the entry
    my $entry;              # index entry
    my $d_offset;           # offset into the data file

    $size = length(pack("N", 0));
    $i_offset = $size * ($line_number-1);
    seek($index_file, $i_offset, 0) or return;
    read($index_file, $entry, $size);
    $d_offset = unpack("N", $entry);
    seek($data_file, $d_offset, 0);
    return scalar(<$data_file>);
}
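
##
## Index layout used by build_index / line_with_index (sketch for orientation):
## every line of the data file gets one 4-byte big-endian unsigned integer
## (pack template "N") holding the byte offset at which that line starts.
## The entry for line n therefore sits at byte 4*(n-1) of the index file,
## e.g. line 3 is found by reading index bytes 8..11 and seeking the data
## file to the unpacked offset. Offsets beyond 2**32-1 bytes (about 4 GB)
## cannot be represented with "N".
##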

##
## Bowtie parsing, count hits of siRNAs in off-target database
##

sub BowtieTarget {
    my ($KeysRef,$bowtie,$TargetExclude,$InputTarget,$InputsiRNATarget,$siRNATargetExclude,$reagent,$evaluation,$siRNAPos) = @_;
    my $input = "";
    my $inputsiRNA = "";
    if (-e $bowtie){
	open (BOWTIE, "<$bowtie") || die "Cannot open $bowtie: $!\n";
	while (my $line = <BOWTIE>){
	    $line = &cleanLine($line);
	    my (@columns) = ();
	    @columns = split(/\t/, $line);
	    if (scalar(@columns) eq 7){
		if ($columns[0] =~/(\S+)(_\d+)/){
		    $input = $1;
		    $inputsiRNA = $1.$2;
		}
# number of matches of complete input sequence to a certain sequence in the off-target database
		if ((!defined $TargetExclude) || (!%$TargetExclude) || (!exists $$TargetExclude{$columns[2]})){
		    if (!exists $$InputTarget{$input}{$columns[2]}){
			$$InputTarget{$input}{$columns[2]} = 1;
		    }
		    else {
			$$InputTarget{$input}{$columns[2]}++;
		    }
# number of matches of each single siRNA to a certain sequence in the off-target database
		    if (!exists $$InputsiRNATarget{$input}{$inputsiRNA}{$columns[2]}){
			$$InputsiRNATarget{$input}{$inputsiRNA}{$columns[2]} = 1;
		    }
		    else {
			$$InputsiRNATarget{$input}{$inputsiRNA}{$columns[2]}++;
		    }
		}
		else {
		    if (!exists $$siRNATargetExclude{$inputsiRNA}{$columns[2]}){
			$$siRNATargetExclude{$inputsiRNA}{$columns[2]} = 1;
		    }
		    else {
			$$siRNATargetExclude{$inputsiRNA}{$columns[2]}++;
		    }
		}
# save target position in off-target database for evaluation of siRNAs
		if (((defined $reagent) && ($reagent eq 's')) && ((defined $evaluation) && ($evaluation eq 'OLIGO'))){
		    if (!exists $$siRNAPos{$inputsiRNA}{$columns[2]}){
			$$siRNAPos{$inputsiRNA}{$columns[2]} = [ $columns[3]+1, ];
		    }
		    else {
			push (@{ $$siRNAPos{$inputsiRNA}{$columns[2]} },$columns[3]+1);
		    }
		}
	    }
	}
	close BOWTIE;
    }
# input sequences with no target are assigned to 'NA' target
    for (my $i=0;$i<scalar(@$KeysRef);$i++){
        if (!exists $$InputTarget{$$KeysRef[$i]}){
            $$InputTarget{$$KeysRef[$i]}{"NA"} = 0;
        }
    }
}

##
## Connect target groups to input sequences
##

sub targetGroups {
    my ($KeysRef,$targetGroups,$InputTarget,$InputtargetGroups,$error) = @_;
    for (my $i=0;$i<scalar(@$KeysRef);$i++){
	my @TargetKeys = keys %{ $$InputTarget{$$KeysRef[$i]} };
# input sequences with no target are defined as own target group, number of hits is set to 0
	if ((scalar(@TargetKeys) eq 1) && ($TargetKeys[0] eq "NA")){
	    undef @TargetKeys;
	    push (@TargetKeys,$$KeysRef[$i]);
	    $$InputTarget{$$KeysRef[$i]}{$$KeysRef[$i]} = 0;
	}
# sort targets according to number of hits to identify the 'real' target
        my @TargetHits = ();
        for (my $j=0;$j<scalar(@TargetKeys);$j++){
            push (@TargetHits,$$InputTarget{$$KeysRef[$i]}{$TargetKeys[$j]});
        }
        @TargetKeys = @TargetKeys[ sort {$TargetHits[$b] <=> $TargetHits[$a]} 0 .. $#TargetKeys];
	@TargetHits = sort {$b <=> $a} (@TargetHits);
# set target group for best hit, important for identification of off-target effects
        if (exists $$targetGroups{$TargetKeys[0]}){
            $$InputtargetGroups{$$KeysRef[$i]} = $$targetGroups{$TargetKeys[0]};
        }
        else {
            $$InputtargetGroups{$$KeysRef[$i]} = $$KeysRef[$i];
	    if ($TargetHits[0] > 0){
		print $error "$$KeysRef[$i]\tNo target group defined for $TargetKeys[0], target group set to $$KeysRef[$i]\n";
	    }
	}
    }
}

##
## Parse siRNA features (efficiency, seed matches, low-complexity, CAN repeats)
##

sub featsiRNA {
    my ($input,$pos,$FilterPos) = @_;
# collect the distinct filter tags recorded for this siRNA position
    my %unspec = map { $_ => '' } split(/\|/, $$FilterPos{$input}{$pos});
# write signature for each siRNA in binary mode |efficiency|seedmatch|lowcomplexity|CANrepeat|mirseed
    my $unspec = '';
    foreach my $feature ('eff','seed','low','can','mirseed'){
	$unspec.= (exists $unspec{$feature}) ? '|1' : '|0';
    }
    return $unspec;
}

##
## Find specific regions in input sequences
##

sub SpecRegion {
    my ($IDSeq,$IDSeqKeys,$InputsiRNATarget,$targetGroups,$InputtargetGroups,$siRNATargetExclude,$InputSpecRegion,$InputSpecsiRNA,$fraglength,$evaluation,$specreg,$FilterPos) = @_;
# get query keys
    my @input = keys %{ $InputsiRNATarget };
    for (my $i=0;$i<scalar(@input);$i++){
# get siRNA keys
	my @inputsiRNA = keys %{ $$InputsiRNATarget{$input[$i]}};
	my @num = ();
	for (my $j=0;$j<scalar(@inputsiRNA);$j++){
	    if ($inputsiRNA[$j]=~/_(\d+)$/){
		push (@num, $1);
	    }
	}
# sort siRNA for their position in query sequence
	@inputsiRNA = @inputsiRNA[ sort {$num[$a] <=> $num[$b]} 0 .. $#inputsiRNA ];
	@num = sort {$a<=>$b}(@num);
	for (my $j=0;$j<scalar(@inputsiRNA);$j++){
# fill NAs for non-targeting regions
	    if (((!exists $$InputSpecsiRNA{$input[$i]}) && ($num[$j] ne 1)) || ((exists $$InputSpecsiRNA{$input[$i]}) && (($num[$j]-1) > scalar(@{ $$InputSpecsiRNA{$input[$i]} })))){
                my $diff = 0;
		if (exists $$InputSpecsiRNA{$input[$i]}){
		    $diff = $num[$j] - 1 - scalar(@{ $$InputSpecsiRNA{$input[$i]} });
		}
		else {
		    $$InputSpecsiRNA{$input[$i]} = [];
		    $diff = $num[$j] - 1;
		}
		my $position = 0;
                for (my $k=0;$k<$diff;$k++){
		    if (($k eq 0) && (exists $$InputSpecsiRNA{$input[$i]})){
			$position = scalar(@{ $$InputSpecsiRNA{$input[$i]} }) + 1;
		    }
		    else {
			$position++;
		    }
                    my $unspec2 = 'NA';
                    if (exists $$FilterPos{$input[$i]}{$position}){
                        $unspec2.= &featsiRNA($input[$i],$position,$FilterPos);
                    }
                    else {
                        $unspec2.= '|0|0|0|0|0';
                    }
                    push (@{ $$InputSpecsiRNA{$input[$i]} }, $unspec2);
                }
            }
# get target keys
	    my @targets = keys %{ $$InputsiRNATarget{$input[$i]}{$inputsiRNA[$j]} };
	    my $specpointer = 0;
	    my $filterpointer = 0;
	    my $excludepointer = 0;
	  SPECCHECK:
	    for (my $k=0;$k<scalar(@targets);$k++){
# set target group if defined, otherwise target is target group
		my $ref = "";
		if (exists $$targetGroups{$targets[$k]}){
		    $ref = $$targetGroups{$targets[$k]};
		}
		else {
		    $ref = $targets[$k];
		}
# compare targetgroup of actual target with targetgroup of query, define queried filters as off-targets
		if ($ref ne $$InputtargetGroups{$input[$i]}){
		    $specpointer = 1;
		    last SPECCHECK;
		}
	    }
# check whether siRNA contains any unwanted feature
	    if (exists $$FilterPos{$input[$i]}{$num[$j]}){
		$filterpointer = 1;
	    }
# check whether siRNA has any unwanted target
	    if (exists $$siRNATargetExclude{$inputsiRNA[$j]}){
		$excludepointer = 1;
	    }
	    
	    if (($specpointer eq 0) && ($filterpointer eq 0) && ($excludepointer eq 0)){
# create new entry for 'specific' siRNA
		if (!exists $$InputSpecRegion{$input[$i]}){
		    $$InputSpecRegion{$input[$i]} = [ [ $num[$j], $num[$j] ], ];
# new entry in first position
		    if ($num[$j] eq 1){
			$$InputSpecsiRNA{$input[$i]} = [ 1, ];
		    }
		    else {
			push (@{ $$InputSpecsiRNA{$input[$i]} }, 1);
		    }
		}
		else {
# create new specific region if the region before is unspecific, otherwise expand existing specific region
		    if ($$InputSpecRegion{$input[$i]}[-1][1] eq "un"){
			$$InputSpecRegion{$input[$i]}[-1][0] = $num[$j];
			$$InputSpecRegion{$input[$i]}[-1][1] = $num[$j];
			push (@{ $$InputSpecsiRNA{$input[$i]} }, 1);
		    }
		    else {
			$$InputSpecRegion{$input[$i]}[-1][1] = $num[$j];
			push (@{ $$InputSpecsiRNA{$input[$i]} }, 1);
		    }
		}
	    }
            else {
                my $unspec = "";
# although region is in FilterPos, it can be specific for target
                if ($specpointer eq 0){
                    $unspec = 1;
                }
                else {
                    $unspec = 0;
                }
                if (exists $$FilterPos{$input[$i]}{$num[$j]}){
                    $unspec.= &featsiRNA($input[$i],$num[$j],$FilterPos);
                }
                else {
                    $unspec.= '|0|0|0|0|0';
                }
# create new entry for 'unspecific' siRNA
                if (!exists $$InputSpecRegion{$input[$i]}){
                    $$InputSpecRegion{$input[$i]} = [ [ $num[$j], "un" ], ];
# new entry in first position
                    if ($num[$j] eq 1){
                        $$InputSpecsiRNA{$input[$i]} = [ $unspec, ];
                    }
		    else {
			push (@{ $$InputSpecsiRNA{$input[$i]} }, $unspec);
		    }
                }
                else {
# create new unspecific region if the region before is specific, otherwise expand existing unspecific region
                    if ($$InputSpecRegion{$input[$i]}[-1][1] ne "un"){
# new unspecific region created
                        push (@{ $$InputSpecRegion{$input[$i]} }, [ $num[$j], "un" ]);
                        push (@{ $$InputSpecsiRNA{$input[$i]} }, $unspec);
                    }
                    else {
# unspecific region expanded
			$$InputSpecRegion{$input[$i]}[-1][0] = $num[$j];
			push (@{ $$InputSpecsiRNA{$input[$i]} }, $unspec);
                    }
                }
            }
        }
    }
# evaluate calculations for each input sequence
    for (my $i=0;$i<scalar(@$IDSeqKeys);$i++){
        if (!exists $$InputSpecRegion{$$IDSeqKeys[$i]}){
# define whole sequence as specific region, so that primer design is possible
            my $end = length($$IDSeq{$$IDSeqKeys[$i]}) - $fraglength;
            my $siRNA = $end + 1;
            $$InputSpecRegion{$$IDSeqKeys[$i]} = [ [1,$siRNA], ];
# add NAs to specificity array
            $$InputSpecsiRNA{$$IDSeqKeys[$i]} = [];
            for (my $j=0;$j<$end;$j++){
                my $position = $j + 1;
                my $unspec2 = 'NA';
                if (exists $$FilterPos{$$IDSeqKeys[$i]}{$position}){
                    $unspec2.= &featsiRNA($$IDSeqKeys[$i],$position,$FilterPos);
                }
                else {
                    $unspec2.= '|0|0|0|0|0';
                }
                push (@{ $$InputSpecsiRNA{$$IDSeqKeys[$i]} }, $unspec2);
            }
        }
        else {
# remove 'unspecific ends'
            if ($$InputSpecRegion{$$IDSeqKeys[$i]}[-1][1] eq "un"){
                if (scalar @{ $$InputSpecRegion{$$IDSeqKeys[$i]} } eq 1){
                    my $end = length($$IDSeq{$$IDSeqKeys[$i]}) - $fraglength + 1;
                    $$InputSpecRegion{$$IDSeqKeys[$i]} = [ [1,$end], ];
                }
                else {
                    pop(@{ $$InputSpecRegion{$$IDSeqKeys[$i]} });
                }
            }
        }
# add last 'NAs' if last part of target region is located within a region with no target
        my $siRNA = length($$IDSeq{$$IDSeqKeys[$i]}) - $fraglength + 1;
        if (scalar(@{ $$InputSpecsiRNA{$$IDSeqKeys[$i]} }) < $siRNA){
	    my $diff = $siRNA - scalar(@{ $$InputSpecsiRNA{$$IDSeqKeys[$i]} });
	    my $position = 0;
	    for (my $k=0;$k<$diff;$k++){
		if ($k eq 0){
		    $position = scalar(@{ $$InputSpecsiRNA{$$IDSeqKeys[$i]} }) + 1;
		}
		else {
		    $position++;
		}
		my $unspec2 = 'NA';
		if (exists $$FilterPos{$$IDSeqKeys[$i]}{$position}){
		    $unspec2.= &featsiRNA($$IDSeqKeys[$i],$position,$FilterPos);
		}
		else {
		    $unspec2.= '|0|0|0|0|0';
		}
		push (@{ $$InputSpecsiRNA{$$IDSeqKeys[$i]} }, $unspec2);
	    }
        }
# if evaluation was queried or a design on the full target sequence is forced with specreg, the specific region is set to the complete sequence
        if (($evaluation ne "NO") || ($specreg eq "FULL")){
            $$InputSpecRegion{$$IDSeqKeys[$i]} = [ [1,$siRNA], ];
        }
    }
}
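
=pod

The region bookkeeping above can be illustrated in isolation. The following is
a minimal sketch, not the routine used by NEXT-RNAi: it assumes a simplified
input (a plain list of 0/1 flags, one per siRNA start position, 1-based) and a
hypothetical helper named spec_regions. It only shows how runs of consecutive
specific siRNAs collapse into [start, end] pairs.

```perl

    use strict;
    use warnings;

    # Hypothetical helper (not part of NEXT-RNAi): collapse per-position flags
    # (1 = specific, 0 = unspecific) into [start, end] pairs of consecutive
    # specific siRNA start positions (1-based).
    sub spec_regions {
        my @flags = @_;
        my @regions;
        for my $i (0 .. $#flags) {
            next unless $flags[$i];
            my $pos = $i + 1;
            if (@regions && $regions[-1][1] == $pos - 1) {
                $regions[-1][1] = $pos;          # extend the open region
            }
            else {
                push @regions, [ $pos, $pos ];   # start a new region
            }
        }
        return @regions;
    }

    my @regions = spec_regions(1, 1, 0, 0, 1, 1, 1, 0);
    print join(' ', map { "$_->[0]..$_->[1]" } @regions), "\n";   # prints "1..2 5..7"

```

Unlike the subroutine above, this sketch does not keep placeholder entries for
unspecific stretches and does not record 'NA' positions.

=cut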

##
## Calculate local foldings of siRNAs with Vienna RNAfold perl script
##

sub ViennaRNA {
# create siRNA file in input format for Vienna RNAfold
    my ($identifier,$sequence,$fraglength,$outfolder,$Viennaloc,$sirnas,$hairpins) = @_;
    my $outsiRNA = $outfolder.'NEXT-RNAi_'.$identifier.'.siRNA';
    open (SIRNA, ">$outsiRNA") || die "Cannot open SIRNA: $!\n";
    for (my $i=0;$i+($fraglength-1)<length($sequence);$i++){
        my $siRNA = substr($sequence,$i,$fraglength);
        print SIRNA "$siRNA\n";
    }
    close SIRNA;
# run Vienna RNAfold.pl
    my $outVienna = $outfolder.'NEXT-RNAi_'.$identifier.'.RNAfold';
    system ("$Viennaloc\\RNAfold.pl $outsiRNA >$outVienna") == 0 || die "Failed to run RNAfold.pl: $?\n";

# parse RNAfold.pl output
    open (HAIRPINS, "<$outVienna") || die "Cannot open HAIRPINS: $!\n";
    while (my $line = <HAIRPINS>){
	$line = &cleanLine($line);
        if ($line=~/\./){
# store hairpin information
            push (@$hairpins, $line);
        }
        else {
# store siRNA information
            push (@$sirnas, $line);
        }
    }
    close HAIRPINS;
    unlink ($outsiRNA,$outVienna);
}
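
=pod

The siRNA file written above contains every window of $fraglength bases of the
target sequence. A minimal sketch of that enumeration follows; the helper name
sirna_windows is hypothetical and the example sequence is made up. A sequence
of length L yields L - fraglength + 1 windows, which is the count the
specificity and efficiency arrays are checked against elsewhere in this script.

```perl

    use strict;
    use warnings;

    # Hypothetical helper (not part of NEXT-RNAi): return all overlapping
    # windows of $fraglength bases, shifted by one base each.
    sub sirna_windows {
        my ($sequence, $fraglength) = @_;
        my @windows;
        for (my $i = 0; $i + $fraglength <= length($sequence); $i++) {
            push @windows, substr($sequence, $i, $fraglength);
        }
        return @windows;
    }

    my @w = sirna_windows('ACGTACGTAC', 8);
    print scalar(@w), " windows: @w\n";   # prints "3 windows: ACGTACGT CGTACGTA GTACGTAC"

```

=cut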

##
## Calculation of siRNA efficiency according to Shah et al. (2007)
##

sub siR {
    my ($sirna,$ID,$pos,$effCut,$effPos) = @_;
    my $RawScore = 0;
#
# Scoring of siRNAs (sense strand)
#

# A1: A/U at position 1 of sense strand
    if (substr($sirna,0,1)=~/A|T/i){
        $RawScore-= 1.4;
    }
# A2: G/C at position 1 of sense strand
    if (substr($sirna,0,1)=~/G|C/i){
        $RawScore+= 1.11;
    }
# A3: A at position 6 of sense strand
    if (substr($sirna,5,1)=~/A/i){
        $RawScore+= 0.7;
    }
# A4: U at position 10 of sense strand
    if (substr($sirna,9,1)=~/T/i){
        $RawScore+= 0.25;
    }
# A5: G at position 13 of sense strand
    if (substr($sirna,12,1)=~/G/i){
        $RawScore-= 1.66;
    }
# A6: U at position 13 of sense strand
    if (substr($sirna,12,1)=~/T/i){
        $RawScore+= 0.31;
    }
# A7: A/U in position 4 before end of sense strand
    if (substr($sirna,-4,1)=~/A|T/i){
        $RawScore+= 0.74;
    }
# A8: A/U in position 3 before end of sense strand 
    if (substr($sirna,-3,1)=~/A|T/i){
        $RawScore+= 1.2;
    }
# A9: A/U in position 2 before end of sense strand 
    if (substr($sirna,-2,1)=~/A|T/i){
        $RawScore+= 1.44;
    }
# A10: A/U at last position of sense strand
    if (substr($sirna,-1,1)=~/A|T/i){
        $RawScore+= 0.87;
    }
# A11: G/C at last position of sense strand
    if (substr($sirna,-1,1)=~/G|C/i){
        $RawScore-= 1.02;
    }
# A12: GC content between 30% and 55%
    my $GC = ($sirna =~ tr/GCgc//);
    my $GCcontent = ($GC/length($sirna))*100;
    if (($GCcontent >= 30) && ($GCcontent <= 55)){
	$RawScore+= 0.42;
    }
# calculate percentage efficiency: rescale the raw score from its attainable range (-4.08 to 7.04) to 0-100%
    my $FinalScore = (($RawScore + 4.08)/(7.04 + 4.08))*100;
# save positions of low efficiency
    if ($FinalScore < $effCut){
	if (!exists $$effPos{$ID}{$pos}){
	    $$effPos{$ID}{$pos} = 'eff';
	}
	else {
	    $$effPos{$ID}{$pos}.= '|eff';
	}
    }
    return $FinalScore;
}
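
=pod

The constants 4.08 and 7.04 in the final rescaling step of siR() can be
reproduced from the rule weights. The sketch below is illustrative only: the
weights are copied from the subroutine above, while the helper name
rescale_percent is hypothetical. It assumes that the rule pairs A1/A2, A5/A6
and A10/A11 are mutually exclusive, so that the lowest and highest raw scores
are the sums of the negative and of the positive weights, respectively.

```perl

    use strict;
    use warnings;
    use List::Util qw(sum);

    # Rule weights A1..A12 as used in siR().
    my @weights = (-1.4, 1.11, 0.7, 0.25, -1.66, 0.31, 0.74, 1.2, 1.44, 0.87, -1.02, 0.42);
    my $max = sum(grep { $_ > 0 } @weights);   # best attainable raw score
    my $min = sum(grep { $_ < 0 } @weights);   # worst attainable raw score

    # Hypothetical helper: map a raw score linearly onto 0..100.
    sub rescale_percent {
        my ($raw) = @_;
        return ($raw - $min) / ($max - $min) * 100;
    }

    printf "min=%.2f max=%.2f\n", $min, $max;
    printf "raw score 0 => %.2f%%\n", rescale_percent(0);

```

=cut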

##
## Calculation of siRNA efficiency according to Reynolds et al. (2004)
##

sub siRNAEfficiency {
    my ($sirna,$hairpin,$ID,$pos,$effCut,$effPos) = @_;
    my $scoresiRNA = 0;
    my ($seq,$seq_hp,$GC,$GC_hp,$GCcontent,$GCcontent_hp,$Tm,$Tm_hp);
    for (my $k=0;$k<length($sirna);$k++){
        if (substr($hairpin,$k,1) eq '('){
            $seq_hp.= substr($sirna,$k,1);
        }
    }
    if (defined $seq_hp){
        $GC_hp = ($seq_hp =~ tr/GCgc//);
        $GCcontent_hp = ($GC_hp/length($seq_hp))*100;
# calculate melting temperature (according to Sambrook et al.) with 0.05 M salt concentration
        $Tm_hp = int (81.5 - 16.6*1.3 + 41 * ($GCcontent_hp/100) - (500/length($seq_hp)));
    }
    else {
        $Tm_hp = 0;
    }
#
# Scoring of siRNAs (sense strand)
#

# Score +1 if hairpin melting temperature is < 20 degrees Celsius
    if ($Tm_hp < 20){
        $scoresiRNA++;
    }
    $seq = $sirna;
    $GC = ($seq =~ tr/GCgc//);
    $GCcontent = ($GC/length($seq))*100;
# Score +1 if GC content of siRNA is between 30% and 52%
    if (($GCcontent >= 30) && ($GCcontent <= 52)){
        $scoresiRNA++;
    }
# check for certain positions
# Score +1 if base at position 3 eq A
    if (substr($seq,2,1)=~/A/i){
        $scoresiRNA++;
    }
# Score +1 if base at position 10 eq T
    if (substr($seq,9,1)=~/T/i){
        $scoresiRNA++;
    }
# Score -1 if base at position 13 eq G
    if (substr($seq,12,1)=~/G/i){
        $scoresiRNA--;
    }
# Score +1 for every A or T base in last 5 positions (maximum score of +5)
    if (substr($seq,-5,1)=~/A|T/i){
        $scoresiRNA++;
    }
    if (substr($seq,-4,1)=~/A|T/i){
        $scoresiRNA++;
    }
    if (substr($seq,-3,1)=~/A|T/i){
        $scoresiRNA++;
    }
    if (substr($seq,-2,1)=~/A|T/i){
        $scoresiRNA++;
    }
    if (substr($seq,-1,1)=~/A|T/i){
        $scoresiRNA++;
    }
# Score -1 if last position eq G or C
    if (substr($seq,-1,1)=~/G|C/i){
        $scoresiRNA--;
    }
# Score +1 (extra) if base at last position eq A
    if (substr($seq,-1,1)=~/A/i){
        $scoresiRNA++;
    }
# calculate percentage efficiency: rescale the score from its attainable range (-2 to 10) to 0-100%
    my $FinalScore = (($scoresiRNA + 2)/(10 + 2))*100;
# save positions of low efficiency
    if ($FinalScore < $effCut){
	$$effPos{$ID}{$pos} = '';
    }
    return $FinalScore;
}
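
=pod

The hairpin melting temperature used in siRNAEfficiency() follows the
salt-adjusted formula Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 500/N.
The sketch below spells this out with an explicit salt concentration; the
helper name stem_tm and the example sequence are hypothetical. The subroutine
above hard-codes log10(0.05) as -1.3 and truncates the result with int(), so
its values can differ slightly from this sketch.

```perl

    use strict;
    use warnings;

    # Hypothetical helper (not part of NEXT-RNAi): melting temperature of the
    # paired bases of a hairpin at a given molar Na+ concentration.
    sub stem_tm {
        my ($stem, $na_molar) = @_;
        my $len = length $stem;
        return 0 unless $len;                      # no paired bases, as above
        my $gc = ($stem =~ tr/GCgc//);             # count G and C
        my $log10_na = log($na_molar) / log(10);   # Perl's log() is natural log
        return 81.5 + 16.6 * $log10_na + 41 * ($gc / $len) - 500 / $len;
    }

    printf "%.1f\n", stem_tm('GCGCATAT', 0.05);

```

=cut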

##
## primer design with primer3 software 
##

sub primer3 {
    my ($identifier,$outfolder,$primer3loc,$fraglength,$IDSeq,$IDSeqKeys,$InputSpecRegion,$lenMin,$lenMax,$InputSpecPrimer,$error,$evaluation,$primer,$designnum,$primer3opt) = @_;
# define primer3 default options in hash and overwrite with user queries
    my %param = (
	PRIMER_OPT_SIZE => [ 20, ],
	PRIMER_MIN_SIZE => [ 18, ],
	PRIMER_MAX_SIZE => [ 27, ],
	PRIMER_PRODUCT_SIZE_RANGE => [ $lenMin, $lenMax, ],
	PRIMER_NUM_RETURN => [ $designnum, ],
	EXCLUDED_REGION => [ 25, "", ],
	PRIMER_MIN_TM => [ 35, ],
	PRIMER_MAX_TM => [ 80, ],
	PRIMER_SELF_ANY => [ 20.00, ],
	PRIMER_SELF_END => [ 20.00, ],
	PRIMER_MAX_POLY_X => [ 20, ],
	PRIMER_MIN_GC => [ 1.0, ],
	PRIMER_MAX_GC => [ 100.0, ],
	PRIMER_MAX_END_STABILITY => [ 999.999, ],
	PRIMER_PAIR_PENALTY => [ "empty", ],
	);
    if ($primer3opt ne 'empty'){
	if (-e $primer3opt){
	    open (OPT,$primer3opt) || die "Cannot open OPT: $!\n";
	    while (my $line = <OPT>){
		$line = &cleanLine($line);
		if ($line=~/^(\S+)=(\S+)/){
		    my $option = $1;
		    my $value = $2;
		    if (($option eq 'PRIMER_PRODUCT_SIZE_RANGE') || ($option eq 'EXCLUDED_REGION')){
			my @values = split(/,/,$value);
			$param{$option}[0] = $values[0];
			$param{$option}[1] = $values[1];
		    }
		    else {
			if (exists $param{$option}){
			    $param{$option}[0] = $value;
			}
		    }
		}
	    }
	    close OPT;
	}
	else {
	    print $error "$primer3opt\tprimer3 options file not found, default settings are used\n";
	    print "primer3 options file $primer3opt not found, default settings are used.\n";
	}
    }
# primer3 input/output files
    my $outprimer3 = $outfolder.'NEXT-RNAi_'.$identifier.'.primer3in';
    my $outprimer3out = $outfolder.'NEXT-RNAi_'.$identifier.'.primer3out';
# check if input/output files exist (e.g. for redesign round), index file names
    my $index = 1;
    my $in = $outprimer3;
    while (-e $in){
	$in = $outprimer3.'_'.$index;
	$index++;
    }
    $outprimer3 = $in;
    $index = 1;
    my $out = $outprimer3out;
    while (-e $out){
	$out = $outprimer3out.'_'.$index;
	$index++;
    }
    $outprimer3out = $out;
    &fileLoc('Unlink',$outprimer3);
    open (PRIMER3,">$outprimer3") || die "Cannot open PRIMER3: $!\n";
    for (my $i=0;$i<scalar(@$IDSeqKeys);$i++){
        my $primerID_f = $$IDSeqKeys[$i].'_f';
	my $primerID_r = $$IDSeqKeys[$i].'_r';
	if ($evaluation eq "NO"){
	    for (my $j=0;$j<scalar(@{ $$InputSpecRegion{$$IDSeqKeys[$i]} });$j++){
		my $seqlen = $$InputSpecRegion{$$IDSeqKeys[$i]}[$j][1] - $$InputSpecRegion{$$IDSeqKeys[$i]}[$j][0] + 1 + $fraglength - 1;
		my $primer3Seq = substr($$IDSeq{$$IDSeqKeys[$i]},$$InputSpecRegion{$$IDSeqKeys[$i]}[$j][0]-1,$seqlen);
		my $primer3ID = $$InputSpecRegion{$$IDSeqKeys[$i]}[$j][0].'..'.$$InputSpecRegion{$$IDSeqKeys[$i]}[$j][1];
		if (!exists $$InputSpecPrimer{$$IDSeqKeys[$i]}{$primer3ID}){
		    $$InputSpecPrimer{$$IDSeqKeys[$i]}{$primer3ID} = [ [],[],[],[],[],[],[],[],[],[],[],[],[], ];
		}
		else {
		    print $error "$$IDSeqKeys[$i]\tError in parsing of specific regions, $primer3ID not unique\n";
		}
# change de novo design settings here (see box above)
		print PRIMER3 "PRIMER_SEQUENCE_ID=$$IDSeqKeys[$i]|$primer3ID\nSEQUENCE=$primer3Seq\nPRIMER_OPT_SIZE=$param{PRIMER_OPT_SIZE}[0]\nPRIMER_MIN_SIZE=$param{PRIMER_MIN_SIZE}[0]\nPRIMER_MAX_SIZE=$param{PRIMER_MAX_SIZE}[0]\nPRIMER_PRODUCT_SIZE_RANGE=$param{PRIMER_PRODUCT_SIZE_RANGE}[0]-$param{PRIMER_PRODUCT_SIZE_RANGE}[1]\nPRIMER_NUM_RETURN=$param{PRIMER_NUM_RETURN}[0]\n=\n";
	    }
	}
	elsif ($evaluation eq "DSRNA"){
# change evaluation settings for 'DSRNA' here (see box above)
	    my $seqlen = length($$IDSeq{$$IDSeqKeys[$i]});
	    my $primer3Seq = $$IDSeq{$$IDSeqKeys[$i]};
# fixed primers, otherwise products might differ in length from query
	    my $primer_f = substr($$IDSeq{$$IDSeqKeys[$i]},0,20);
            my $primer_r = substr($$IDSeq{$$IDSeqKeys[$i]},length($$IDSeq{$$IDSeqKeys[$i]})-20,20);
            $primer_r = reverse $primer_r;
            $primer_r =~ tr/ACGTacgt/TGCAtgca/;
	    my $primer3ID = $$IDSeqKeys[$i];
	    my $lentemplate = length($primer3Seq) + 50;
	    my $startex = length($primer_f) + 1;
            my $lenex = length($primer3Seq) - length($primer_f) - length($primer_r) - 2;
	    $param{"EXCLUDED_REGION"}[0] = $startex;
            $param{"EXCLUDED_REGION"}[1] = $lenex;
	    $param{"PRIMER_PRODUCT_SIZE_RANGE"}[0] = 40;
	    $param{"PRIMER_PRODUCT_SIZE_RANGE"}[1] = $lentemplate;
	    $param{"PRIMER_NUM_RETURN"}[0] = 1;
# primer designs are forced to the start and end of the sequence
	    print PRIMER3 "PRIMER_SEQUENCE_ID=$$IDSeqKeys[$i]|$primer3ID\nSEQUENCE=$primer3Seq\nPRIMER_LEFT_INPUT=$primer_f\nPRIMER_RIGHT_INPUT=$primer_r\nEXCLUDED_REGION=$param{EXCLUDED_REGION}[0],$param{EXCLUDED_REGION}[1]\nPRIMER_MIN_TM=$param{PRIMER_MIN_TM}[0]\nPRIMER_MAX_TM=$param{PRIMER_MAX_TM}[0]\nPRIMER_SELF_ANY=$param{PRIMER_SELF_ANY}[0]\nPRIMER_SELF_END=$param{PRIMER_SELF_END}[0]\nPRIMER_MAX_POLY_X=$param{PRIMER_MAX_POLY_X}[0]\nPRIMER_MIN_GC=$param{PRIMER_MIN_GC}[0]\nPRIMER_MAX_GC=$param{PRIMER_MAX_GC}[0]\nPRIMER_MAX_END_STABILITY=$param{PRIMER_MAX_END_STABILITY}[0]\nPRIMER_OPT_SIZE=$param{PRIMER_OPT_SIZE}[0]\nPRIMER_MIN_SIZE=$param{PRIMER_MIN_SIZE}[0]\nPRIMER_MAX_SIZE=$param{PRIMER_MAX_SIZE}[0]\nPRIMER_PRODUCT_SIZE_RANGE=$param{PRIMER_PRODUCT_SIZE_RANGE}[0]-$param{PRIMER_PRODUCT_SIZE_RANGE}[1]\nPRIMER_NUM_RETURN=$param{PRIMER_NUM_RETURN}[0]\n=\n";
	}
	else {
# change evaluation settings for 'DSRNA+OLIGO' here (see box above)
	    my $seqlen = length($$IDSeq{$$IDSeqKeys[$i]});
            my $primer3Seq = $$IDSeq{$$IDSeqKeys[$i]};
            my $primer3ID = $$IDSeqKeys[$i];
# check for right annotation of forward and reverse primers
	    my $primer_f = "";
	    my $primer_r = "";
	    if ($$primer{$primerID_f} eq substr($primer3Seq,0,length($$primer{$primerID_f}))){
		$primer_f = $$primer{$primerID_f};
		$primer_r = $$primer{$primerID_r};
	    }
	    else {
		$primer_f = $$primer{$primerID_r};
		$primer_r = $$primer{$primerID_f};
	    }
            my $lentemplate = length($primer3Seq) + 50;
	    $param{"PRIMER_PRODUCT_SIZE_RANGE"}[0] = 40;
            $param{"PRIMER_PRODUCT_SIZE_RANGE"}[1] = $lentemplate;
	    my $startex = length($primer_f) + 1;
	    my $lenex = length($primer3Seq) - length($primer_f) - length($primer_r) - 2;
	    $param{"EXCLUDED_REGION"}[0] = $startex;
	    $param{"EXCLUDED_REGION"}[1] = $lenex;
	    $param{"PRIMER_NUM_RETURN"}[0] = 1;
# primer3 is forced to evaluate the primers handed by PRIMER_LEFT_INPUT and PRIMER_RIGHT_INPUT
	    print PRIMER3 "PRIMER_SEQUENCE_ID=$$IDSeqKeys[$i]|$primer3ID\nSEQUENCE=$primer3Seq\nPRIMER_LEFT_INPUT=$primer_f\nPRIMER_RIGHT_INPUT=$primer_r\nEXCLUDED_REGION=$param{EXCLUDED_REGION}[0],$param{EXCLUDED_REGION}[1]\nPRIMER_MIN_TM=$param{PRIMER_MIN_TM}[0]\nPRIMER_MAX_TM=$param{PRIMER_MAX_TM}[0]\nPRIMER_SELF_ANY=$param{PRIMER_SELF_ANY}[0]\nPRIMER_SELF_END=$param{PRIMER_SELF_END}[0]\nPRIMER_MAX_POLY_X=$param{PRIMER_MAX_POLY_X}[0]\nPRIMER_MIN_GC=$param{PRIMER_MIN_GC}[0]\nPRIMER_MAX_GC=$param{PRIMER_MAX_GC}[0]\nPRIMER_MAX_END_STABILITY=$param{PRIMER_MAX_END_STABILITY}[0]\nPRIMER_OPT_SIZE=$param{PRIMER_OPT_SIZE}[0]\nPRIMER_MIN_SIZE=$param{PRIMER_MIN_SIZE}[0]\nPRIMER_MAX_SIZE=$param{PRIMER_MAX_SIZE}[0]\nPRIMER_PRODUCT_SIZE_RANGE=$param{PRIMER_PRODUCT_SIZE_RANGE}[0]-$param{PRIMER_PRODUCT_SIZE_RANGE}[1]\nPRIMER_NUM_RETURN=$param{PRIMER_NUM_RETURN}[0]\n=\n";
	}
    }
    close PRIMER3;
# run primer3
    system ("$primer3loc\\primer3_core < $outprimer3 > $outprimer3out") == 0 || die "Failed to run primer3_core: $?\n";
    &fileLoc('Unlink',$outprimer3out);
# parse primer3 output for primer designs
    open (PRIMER3, "<$outprimer3out") || die "Cannot open PRIMER3: $!\n";
    my $sequence = "";
    my $inputID = "";
    my $primer3ID = "";
    while (my $line = <PRIMER3>){
	$line = &cleanLine($line);
# get input sequence to calculate resulting probe sequences
	if ($line =~/^PRIMER_SEQUENCE_ID=(\S+)\|(\S+)$/){
	    $inputID = $1;
	    $primer3ID = $2;
	}
	if ($line =~/^SEQUENCE=(\S+)$/){
	    $sequence = $1;
	}
# overall primer penalty
        if ($line =~/^PRIMER_PAIR_PENALTY.*=(\S+)/){
            push (@{ $$InputSpecPrimer{$inputID}{$primer3ID}[0] }, $1);
        }
# left-primer sequence
        if ($line =~/^PRIMER_LEFT.*SEQUENCE=(\S+)/){
            push (@{ $$InputSpecPrimer{$inputID}{$primer3ID}[1] }, $1);
        }
# right-primer sequence
        if ($line =~/^PRIMER_RIGHT.*SEQUENCE=(\S+)/){
            push (@{ $$InputSpecPrimer{$inputID}{$primer3ID}[2] }, $1);
        }
# start and length of left primer
        if ($line =~/^PRIMER_LEFT.*=(\d+),(\d+)/){
            push (@{ $$InputSpecPrimer{$inputID}{$primer3ID}[3] }, $1);
            push (@{ $$InputSpecPrimer{$inputID}{$primer3ID}[4] }, $2);
        }
# start and length of right primer
        if ($line =~/^PRIMER_RIGHT.*=(\d+),(\d+)/){
            push (@{ $$InputSpecPrimer{$inputID}{$primer3ID}[5] }, $1);
            push (@{ $$InputSpecPrimer{$inputID}{$primer3ID}[6] }, $2);
	    my $probelen = $1 - $$InputSpecPrimer{$inputID}{$primer3ID}[3][-1] + 1;
	    my $probeseq = substr($sequence,$$InputSpecPrimer{$inputID}{$primer3ID}[3][-1],$probelen);
	    push (@{ $$InputSpecPrimer{$inputID}{$primer3ID}[12] }, $probeseq);
        }
# Tm of left primer
        if ($line =~/^PRIMER_LEFT.*TM=(\S+)/){
            push (@{ $$InputSpecPrimer{$inputID}{$primer3ID}[7] }, $1);
        }
# Tm of right primer
        if ($line =~/^PRIMER_RIGHT.*TM=(\S+)/){
            push (@{ $$InputSpecPrimer{$inputID}{$primer3ID}[8] }, $1);
        }
# GC content of left primer
        if ($line =~/^PRIMER_LEFT.*GC_PERCENT=(\S+)/){
            push (@{ $$InputSpecPrimer{$inputID}{$primer3ID}[9] }, $1);
        }
# GC content of right primer
        if ($line =~/^PRIMER_RIGHT.*GC_PERCENT=(\S+)/){
            push (@{ $$InputSpecPrimer{$inputID}{$primer3ID}[10] }, $1);
        }
# primer product (amplicon) size
        if (($line =~/^PRIMER_PRODUCT_SIZE.*=(\d+)/) && ($line !~/^PRIMER_PRODUCT_SIZE_RANGE=/)){
            push (@{ $$InputSpecPrimer{$inputID}{$primer3ID}[11] }, $1);
        }
    }
    close PRIMER3;
# check for primer design errors
    for (my $i=0;$i<scalar(@$IDSeqKeys);$i++){
        if ($evaluation eq "NO"){
	    for (my $j=0;$j<scalar(@{ $$InputSpecRegion{$$IDSeqKeys[$i]} });$j++){
		my $primer3ID = $$InputSpecRegion{$$IDSeqKeys[$i]}[$j][0].'..'.$$InputSpecRegion{$$IDSeqKeys[$i]}[$j][1];
		my @arraylength = ();
		my @splice = ();
		for (my $k=0;$k<scalar(@{ $$InputSpecPrimer{$$IDSeqKeys[$i]}{$primer3ID} });$k++){
# collect entries with primer pair penalty over cut-off
		    my $splicecount = 0;
		    if (($k eq 0) && ($param{"PRIMER_PAIR_PENALTY"}[0] ne 'empty')){
			for (my $l=0;$l<scalar(@{ $$InputSpecPrimer{$$IDSeqKeys[$i]}{$primer3ID}[0] });$l++){
			    if ($$InputSpecPrimer{$$IDSeqKeys[$i]}{$primer3ID}[0][$l] >= $param{"PRIMER_PAIR_PENALTY"}[0]){
				push (@splice, $l - $splicecount);
                                $splicecount++;
			    }
			}
		    }
# splice out entries with primer pair penalty over cut-off
		    for (my $l=0;$l<scalar(@splice);$l++){
			if (($splice[$l] + 1) eq scalar(@ { $$InputSpecPrimer{$$IDSeqKeys[$i]}{$primer3ID}[$k] })){
			    pop(@{ $$InputSpecPrimer{$$IDSeqKeys[$i]}{$primer3ID}[$k] });
			}
			else {
			    splice(@{ $$InputSpecPrimer{$$IDSeqKeys[$i]}{$primer3ID}[$k] }, $splice[$l], 1);
			}
		    }
		    push (@arraylength,scalar(@{ $$InputSpecPrimer{$$IDSeqKeys[$i]}{$primer3ID}[$k] }));
		}
		my $wrongArrayLength = 0;
		for (my $k=1;$k<scalar(@arraylength);$k++){
		    if ($arraylength[$k] ne $arraylength[0]){
			$wrongArrayLength++;
		    }
		}
		unless ($wrongArrayLength eq 0){
		    print $error "$$IDSeqKeys[$i]=>$primer3ID\tERROR in primer3 output, columns do not have the same length: @arraylength\n";
		}
	    }
	}
	else {
	    my $primer3ID = $$IDSeqKeys[$i];
	    my @arraylength = ();
	    if (exists $$InputSpecPrimer{$$IDSeqKeys[$i]}{$primer3ID}){
		for (my $k=0;$k<scalar(@{ $$InputSpecPrimer{$$IDSeqKeys[$i]}{$primer3ID} });$k++){
		    push (@arraylength,scalar(@{ $$InputSpecPrimer{$$IDSeqKeys[$i]}{$primer3ID}[$k] }));
		}
		my $wrongArrayLength = 0;
		for (my $k=1;$k<scalar(@arraylength);$k++){
		    if ($arraylength[$k] ne $arraylength[0]){
			$wrongArrayLength++;
		    }
		}
		unless ($wrongArrayLength eq 0){
		    print $error "$$IDSeqKeys[$i]=>$primer3ID\tERROR in primer3 output, columns do not have the same length: @arraylength\n";
		}
	    }
	    else {
# if no primer design was possible, just take the first and last 20 bases as primers
		my $seqlen = length($$IDSeq{$$IDSeqKeys[$i]});
		my $primer3Seq = $$IDSeq{$$IDSeqKeys[$i]};
		my $primer_l = substr($primer3Seq,0,20);
		my $end = $seqlen - 20;
		my $primer_r = substr($primer3Seq,$end,20);
		$primer_r = reverse $primer_r;
		$primer_r =~ tr/ACGTacgt/TGCAtgca/;
		$$InputSpecPrimer{$$IDSeqKeys[$i]}{$primer3ID} = [ ['NA',],[$primer_l,],[$primer_r,],['1',],['20',],[$end,],['20',],['NA',],['NA',],['NA',],['NA',],[$seqlen,],[$$IDSeq{$$IDSeqKeys[$i]},], ];
		print $error "$$IDSeqKeys[$i]\tNo primer design was possible, 20nt at start and end were defined as primers\n";
	    }
	}
    }
}
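
=pod

For the 'DSRNA' evaluation and for the fallback case above, the primers are
simply the first 20 bases of the sequence and the reverse complement of the
last 20 bases. A minimal sketch of that step follows; the helper names revcomp
and end_primers and the example sequence are hypothetical, and a primer length
of 4 is used only to keep the example short.

```perl

    use strict;
    use warnings;

    # Hypothetical helper: reverse complement of a DNA sequence.
    sub revcomp {
        my ($seq) = @_;
        my $rc = reverse $seq;            # scalar context reverses the string
        $rc =~ tr/ACGTacgt/TGCAtgca/;     # complement, other symbols unchanged
        return $rc;
    }

    # Hypothetical helper: forward primer from the start of the sequence,
    # reverse primer as reverse complement of its end.
    sub end_primers {
        my ($seq, $len) = @_;
        return (substr($seq, 0, $len), revcomp(substr($seq, -$len)));
    }

    my ($fwd, $rev) = end_primers('AAAACCCCGGGGTTTTAACC', 4);
    print "$fwd $rev\n";   # prints "AAAA GGTT"

```

=cut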

##
## Assemble all information from specificity, efficiency and primer3 calculations for each input target sequence for long dsRNA (-r d) queries
##

sub assembleResults{
    my ($fraglength,$designnum,$designwindow,$IDSeq,$IDSeqKeys,$InputSpecRegion,$InputSpecsiRNA,$InputEffsiRNA,$effOption,$InputSpecPrimer,$Designs,$DesignsBest,$DesignsBad,$DesignsLeftover,$DesignsFailed,$targetGroups,$Groupstarget,$InputsiRNATarget,$error,$evaluation,$intron,$query,$nodb,$rankd,$targetType) = @_;
    my @IDSeqKeys = ();
    if ($query eq "Design"){
	@IDSeqKeys = @$IDSeqKeys;
    }
    else {
	@IDSeqKeys = keys %$InputSpecRegion;
	print $error "Re-design\tEnter a re-design round\n";
    }
    for (my $i=0;$i<scalar(@IDSeqKeys);$i++){
# iterate over all specific regions identified for a certain target region
	for (my $j=0;$j<scalar(@{ $$InputSpecRegion{$IDSeqKeys[$i]} });$j++){
	    my $SpecRegion = "";
	    if ($evaluation eq "NO"){
		$SpecRegion = $$InputSpecRegion{$IDSeqKeys[$i]}[$j][0].'..'.$$InputSpecRegion{$IDSeqKeys[$i]}[$j][1];
	    }
	    else {
		$SpecRegion = $IDSeqKeys[$i];
	    }
# arrays to collect results
	    my @TakeThat = ( [],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[], );
            my @TrashThat = ( [],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[], );
# iterate over all amplicons (primer designs) for this specific region
	  PRIMERDESIGN:
	    for (my $k=0;$k<scalar(@{ $$InputSpecPrimer{$IDSeqKeys[$i]}{$SpecRegion}[11] });$k++){
                my $TakeorTrash = "";
# differentiate between 'good' and 'bad' regions according to the desired product length
# regions/designs below this length are saved in case no good regions/designs are possible
                if ($$InputSpecPrimer{$IDSeqKeys[$i]}{$SpecRegion}[11][$k] > $$designwindow[0]){
                    $TakeorTrash = \@TakeThat;
                }
                else {
                    $TakeorTrash = \@TrashThat;
                }
# add primer information
                for (my $l=0;$l<scalar(@{ $$InputSpecPrimer{$IDSeqKeys[$i]}{$SpecRegion} });$l++){
# for position of primer, add location of specific region
		    if ($l eq 3){
			push (@{ $$TakeorTrash[$l] }, $$InputSpecPrimer{$IDSeqKeys[$i]}{$SpecRegion}[$l][$k] + $$InputSpecRegion{$IDSeqKeys[$i]}[$j][0]);
		    }
		    elsif ($l eq 5){
			push (@{ $$TakeorTrash[$l] }, $$InputSpecPrimer{$IDSeqKeys[$i]}{$SpecRegion}[$l][$k] + $$InputSpecRegion{$IDSeqKeys[$i]}[$j][0]);
		    }
		    else {
			push (@{ $$TakeorTrash[$l] }, $$InputSpecPrimer{$IDSeqKeys[$i]}{$SpecRegion}[$l][$k]);
		    }
                }
# calculate percent and absolute efficiency and add efficiency array for the amplicon
                if ($$effOption[0] ne "empty"){
		    if ((scalar(@{ $$InputEffsiRNA{$IDSeqKeys[$i]} }) + $fraglength - 1) ne length($$IDSeq{$IDSeqKeys[$i]})){
			my $scal = scalar(@{ $$InputEffsiRNA{$IDSeqKeys[$i]} }) + $fraglength - 1;
			my $len = length($$IDSeq{$IDSeqKeys[$i]});
			print $error "$IDSeqKeys[$i]\tNumber of efficiency values not correct (siRNAs: $scal <=> length: $len)\n";
		    }
		    my $eff_start = $$InputSpecPrimer{$IDSeqKeys[$i]}{$SpecRegion}[3][$k] + $$InputSpecRegion{$IDSeqKeys[$i]}[$j][0] - 1;
		    my $eff_end = $$InputSpecPrimer{$IDSeqKeys[$i]}{$SpecRegion}[5][$k] + $$InputSpecRegion{$IDSeqKeys[$i]}[$j][0] - $fraglength + 1 - 1;
		    my @eff = @{ $$InputEffsiRNA{$IDSeqKeys[$i]} }[ $eff_start .. $eff_end ];
		    my $effsiRNA = 0;
		    for (my $l=0;$l<scalar(@eff);$l++){
			if ($eff[$l] >= $$effOption[1]){
			    $effsiRNA++;
			}
		    }
#		    my $effsiRNAPerc = sprintf("%.2f", ($effsiRNA / scalar(@eff) * 100));
# add array of efficiencies
		    push (@{ $$TakeorTrash[13] }, [ @eff ]);
		    my $eff_sum = 0;
		    for (my $l=0;$l<scalar(@eff);$l++){
			$eff_sum+= $eff[$l];
		    }
		    my $eff_avg = sprintf("%.2f", ($eff_sum / scalar(@eff)));
# add name of efficiency method
		    push (@{ $$TakeorTrash[14] }, $$effOption[0]);
# add number of efficient siRNAs and average percent efficiency (over all siRNAs) of this region
		    push (@{ $$TakeorTrash[15] }, "$effsiRNA\|$eff_avg");
		}
# calculate percent and absolute specificity and add specificity array for the amplicon
		if ((scalar(@{ $$InputSpecsiRNA{$IDSeqKeys[$i]} }) + $fraglength - 1) ne length($$IDSeq{$IDSeqKeys[$i]})){
		    my $scal = scalar(@{ $$InputSpecsiRNA{$IDSeqKeys[$i]} }) + $fraglength - 1;
		    my $len = length($$IDSeq{$IDSeqKeys[$i]});
		    print $error "$IDSeqKeys[$i]\tNumber of specificity values not correct (siRNAs: $scal <=> length: $len)\n";
		}
		my $spec_start = $$InputSpecPrimer{$IDSeqKeys[$i]}{$SpecRegion}[3][$k] + $$InputSpecRegion{$IDSeqKeys[$i]}[$j][0] - 1;
		my $spec_end = $$InputSpecPrimer{$IDSeqKeys[$i]}{$SpecRegion}[5][$k] + $$InputSpecRegion{$IDSeqKeys[$i]}[$j][0] - $fraglength + 1 - 1;
		my @spec = @{ $$InputSpecsiRNA{$IDSeqKeys[$i]} }[ $spec_start .. $spec_end ];
# add array of specificities
		push (@{ $$TakeorTrash[16] }, [ @spec ]);
                my $spec_len = scalar(@spec);
		my $NA = 0;
                my $spec_siRNA = 0;
                my $unspec_siRNA = 0;
		my $seed = 0;
		my $low = 0;
		my $can = 0;
		my $mirseed = 0;
		for (my $l=0;$l<scalar(@spec);$l++){
                    if ($spec[$l] eq 1){
                        $spec_siRNA++;
                    }
                    else {
			my @unspec = split(/\|/,$spec[$l]);
			if ($unspec[0] eq 'NA'){
			    $spec_len--;
			    $NA++;
			}
			else {
			    if ($unspec[0] eq 0){
				$unspec_siRNA++;
			    }
			    else {
				$spec_siRNA++;
			    }
			}
			$seed+= $unspec[2];
			$low+= $unspec[3];
			$can+= $unspec[4];
			$mirseed+= $unspec[5];
                    }
                }
                if ($spec_siRNA ne 0){
                    my $spec_perc = sprintf("%.2f", (($spec_siRNA / $spec_len) * 100));
                    my $spec_abs = scalar(@spec).'/'.$spec_siRNA.'/'.$unspec_siRNA.'/'.$NA.'/'.$seed.'/'.$low.'/'.$can.'/'.$mirseed;
# add absolute quality: number of siRNAs/number of specific siRNAs/number of unspecific siRNAs/number of
# non-targeting siRNAs/number of siRNAs with seed matches above the defined threshold/number of low-complexity
# regions/number of CAN repeats/number of conserved (miRNA) seeds
                    push (@{ $$TakeorTrash[17] }, $spec_abs);
# add percent specificity (number of specific siRNAs over number of all siRNAs)
                    push (@{ $$TakeorTrash[18] }, $spec_perc);
                }
                else {
# in case specificity is 0
                    my $spec_abs = scalar(@spec).'/0/'.$unspec_siRNA.'/'.$NA.'/'.$seed.'/'.$low.'/'.$can.'/'.$mirseed;
                    push (@{ $$TakeorTrash[17] }, $spec_abs);
                    push (@{ $$TakeorTrash[18] }, 0);
                }
# intron filter (only for de novo design)
		my $intronPerc = ($NA / scalar(@spec)) * 100;
		if (($intronPerc > $intron) && ($evaluation eq "NO")){
		    for (my $l=0;$l<19;$l++){
			pop (@{ $$TakeorTrash[$l] });
		    }
		    next PRIMERDESIGN;
		}
# add target and target group information (e.g. target transcripts => target gene)
		my $siRNARange_start = $$InputSpecPrimer{$IDSeqKeys[$i]}{$SpecRegion}[3][$k] + $$InputSpecRegion{$IDSeqKeys[$i]}[$j][0] - 1 + 1;
		my $siRNARange_end = $$InputSpecPrimer{$IDSeqKeys[$i]}{$SpecRegion}[5][$k] - 1 + $$InputSpecRegion{$IDSeqKeys[$i]}[$j][0] - $fraglength + 1 + 1;
		my @siRNARange = ($siRNARange_start .. $siRNARange_end);
		my %siRNATarget = ();
		for (my $l=0;$l<scalar(@siRNARange);$l++){
		    my $siRNA = $IDSeqKeys[$i].'_'.$siRNARange[$l];
		    my @targetKeys = keys %{ $$InputsiRNATarget{$IDSeqKeys[$i]}{$siRNA} };
		    for (my $m=0;$m<scalar(@targetKeys);$m++){
# targetgroup exists
			if (exists $$targetGroups{$targetKeys[$m]}){
			    if (!exists $siRNATarget{$$targetGroups{$targetKeys[$m]}}{$targetKeys[$m]}){
				$siRNATarget{$$targetGroups{$targetKeys[$m]}}{$targetKeys[$m]} = $$InputsiRNATarget{$IDSeqKeys[$i]}{$siRNA}{$targetKeys[$m]};
			    }
			    else {
				$siRNATarget{$$targetGroups{$targetKeys[$m]}}{$targetKeys[$m]}+= $$InputsiRNATarget{$IDSeqKeys[$i]}{$siRNA}{$targetKeys[$m]};
			    }
			}
			else {
# no targetgroup exists
			    if (!exists $siRNATarget{$targetKeys[$m]}{$targetKeys[$m]}){
                                $siRNATarget{$targetKeys[$m]}{$targetKeys[$m]} = $$InputsiRNATarget{$IDSeqKeys[$i]}{$siRNA}{$targetKeys[$m]};
                            }
                            else {
                                $siRNATarget{$targetKeys[$m]}{$targetKeys[$m]}+= $$InputsiRNATarget{$IDSeqKeys[$i]}{$siRNA}{$targetKeys[$m]};
                            }
			}
		    }
		}
		my @siRNATargetKeys = keys %siRNATarget;
# check, whether all members (e.g. transcripts) of a target group are covered, if not, set to 0
		for (my $l=0;$l<scalar(@siRNATargetKeys);$l++){
		    if (exists $$Groupstarget{$siRNATargetKeys[$l]}){
			for (my $m=0;$m<scalar(@{ $$Groupstarget{$siRNATargetKeys[$l]} });$m++){
			    if (!exists $siRNATarget{$siRNATargetKeys[$l]}{$$Groupstarget{$siRNATargetKeys[$l]}[$m]}){
				$siRNATarget{$siRNATargetKeys[$l]}{$$Groupstarget{$siRNATargetKeys[$l]}[$m]} = 0;
			    }
			}
		    }
		}
# sort targets according to number of hits
		my @siRNATargetHits = ();
		for (my $l=0;$l<scalar(@siRNATargetKeys);$l++){
		    my @TargetKeys = keys %{ $siRNATarget{$siRNATargetKeys[$l]} };
                    my @TargetSpecs = ();
                    for (my $m=0;$m<scalar(@TargetKeys);$m++){
                        push (@TargetSpecs, $siRNATarget{$siRNATargetKeys[$l]}{$TargetKeys[$m]});
                    }
# sort targets within a certain group
		    @TargetKeys = @TargetKeys[ sort {$TargetSpecs[$b] <=> $TargetSpecs[$a]} 0 .. $#TargetKeys ];
                    @TargetSpecs = sort {$b<=>$a}(@TargetSpecs);
		    push (@siRNATargetHits, $TargetSpecs[0]);
		}
# sort groups according to number of hits to best target within a group, to identify primary/intended target
		@siRNATargetKeys = @siRNATargetKeys[ sort {$siRNATargetHits[$b] <=> $siRNATargetHits[$a]} 0 .. $#siRNATargetKeys ];
		my $groupTargets = "";
		my $Target = "";
		my $TargetSpecs = "";
		for (my $l=0;$l<scalar(@siRNATargetKeys);$l++){
		    if ($l eq 0){
			$groupTargets = $siRNATargetKeys[$l];
		    }
		    else {
			$groupTargets.= '&'.$siRNATargetKeys[$l];
		    }
		    my @TargetKeys = keys %{ $siRNATarget{$siRNATargetKeys[$l]} };
		    my @TargetSpecs = ();
		    for (my $m=0;$m<scalar(@TargetKeys);$m++){
			push (@TargetSpecs, $siRNATarget{$siRNATargetKeys[$l]}{$TargetKeys[$m]});
		    }
# sort targets of target group according to number of hits
		    @TargetKeys = @TargetKeys[ sort {$TargetSpecs[$b] <=> $TargetSpecs[$a]} 0 .. $#TargetKeys ];
		    @TargetSpecs = sort {$b<=>$a}(@TargetSpecs);
		    for (my $m=0;$m<scalar(@TargetKeys);$m++){
			if ($l eq 0){
			    if ($m eq 0){
				$Target = $TargetKeys[$m];
				$TargetSpecs = $TargetSpecs[$m];
			    }
			    else {
				$Target.= '+'.$TargetKeys[$m];
				$TargetSpecs.= '+'.$TargetSpecs[$m];
			    }
			}
			else {
			    if ($m eq 0){
                                $Target.= '&'.$TargetKeys[$m];
				$TargetSpecs.= '&'.$TargetSpecs[$m];
                            }
                            else {
                                $Target.= '+'.$TargetKeys[$m];
				$TargetSpecs.= '+'.$TargetSpecs[$m];
                            }
			}
		    }
		}
		if ($groupTargets eq ""){
		    $groupTargets = 'NA';
		}
		if ($Target eq ""){
                    $Target = 'NA';
                }
		if ($TargetSpecs eq ""){
                    $TargetSpecs = 'NA';
                }
# add group name (e.g. gene)
		push (@{ $$TakeorTrash[19] }, $groupTargets);
# add target name (e.g. transcript)
		push (@{ $$TakeorTrash[20] }, $Target);
# add number of hits to target
		push (@{ $$TakeorTrash[21] }, $TargetSpecs);
            }
#####################################################################
#                                                                   #
# Sorting of results, affecting the reagents printed to the output! #
# Here, sorting for:                                                #
# 1. Percent specificity (to get 100% specificity)                  #
# 2. Number of efficient siRNAs (or average efficiency if cutoff 0) #
# or (if efficiency not calculated or specificity was selected)     #
# 2. Absolute specificity (to maximize length of specific reagent)  #
#                                                                   #
#####################################################################
	    my @specSort = @{ $TakeThat[17] };
	    my @specSort2 = @{ $TakeThat[17] };
	    for (my $k=0;$k<scalar(@specSort);$k++){
                my @specSplit = split(/\//,$specSort[$k]);
                $specSort[$k] = $specSplit[1];
		$specSort2[$k] = $specSplit[2];
            }
	    my @relSort = @{ $TakeThat[18] };
	    if ($$effOption[0] ne "empty"){
		my @effSort = @{ $TakeThat[15] };
		for (my $k=0;$k<scalar(@effSort);$k++){
		    my @effSplit = split(/\|/,$effSort[$k]);
# if efficiency cutoff is 0, sort according to percent efficiency		    
		    if ($$effOption[1] eq 0){
			$effSort[$k] = $effSplit[1];
		    }
		    else {
			$effSort[$k] = $effSplit[0];
		    }
		}
		if ($rankd eq 'SPEC'){
		    for (my $k=0;$k<scalar(@TakeThat);$k++){
			my $TakeThatlen = scalar(@{ $TakeThat[$k] }) - 1;
			if ($targetType eq 'NA'){
			    @{ $TakeThat[$k] } = @{ $TakeThat[$k] } [ sort { 
				$specSort[$a] <=> $specSort[$b]
				    ||
				    $specSort2[$a] <=> $specSort2[$b]
				    ||
				    $effSort[$b] <=> $effSort[$a]
								      } 0 .. $TakeThatlen ];
			}
			else {
			    @{ $TakeThat[$k] } = @{ $TakeThat[$k] } [ sort {
                                $relSort[$b] <=> $relSort[$a]
                                    ||
                                    $specSort[$b] <=> $specSort[$a]
                                    ||
                                    $effSort[$b] <=> $effSort[$a]
                                                                      } 0 .. $TakeThatlen ];
			}
		    }
		}
		elsif ($rankd eq 'EFF'){
		    for (my $k=0;$k<scalar(@TakeThat);$k++){
                        my $TakeThatlen = scalar(@{ $TakeThat[$k] }) - 1;
                        if ($targetType eq 'NA'){
			    @{ $TakeThat[$k] } = @{ $TakeThat[$k] } [ sort {
				$specSort[$a] <=> $specSort[$b]
				    ||
				    $specSort2[$a] <=> $specSort2[$b]
				    ||
				    $effSort[$b] <=> $effSort[$a]
								      } 0 .. $TakeThatlen ];
			}
			else {
			    @{ $TakeThat[$k] } = @{ $TakeThat[$k] } [ sort {
                                $relSort[$b] <=> $relSort[$a]
                                    ||
                                    $effSort[$b] <=> $effSort[$a]
                                    ||
                                    $specSort[$b] <=> $specSort[$a]
                                                                      } 0 .. $TakeThatlen ];
			}
		    }
		}
	    }
	    else {
                for (my $k=0;$k<scalar(@TakeThat);$k++){
                    my $TakeThatlen = scalar(@{ $TakeThat[$k] }) - 1;
                    if ($targetType eq 'NA'){
			@{ $TakeThat[$k] } = @{ $TakeThat[$k] } [ sort {
			    $specSort[$a] <=> $specSort[$b]
				||
				$specSort2[$a] <=> $specSort2[$b]
								  } 0 .. $TakeThatlen ];
		    }
		    else {
			@{ $TakeThat[$k] } = @{ $TakeThat[$k] } [ sort {
                            $relSort[$b] <=> $relSort[$a]
                                ||
                                $specSort[$b] <=> $specSort[$a]
                                                                  } 0 .. $TakeThatlen ];
		    }
		}
	    }

#
# Re-sort according to absolute specific siRNAs if number of specific siRNAs of the best design is 0
#
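	    if (defined $TakeThat[17][0]){
		my @specSplit = split(/\//,$TakeThat[17][0]);
		if (($specSplit[1] == 0) && ($targetType ne 'NA')){
# @TakeThat was already re-ordered above, so the sort keys are rebuilt from its current order
		    my @specResort = ();
		    for (my $k=0;$k<scalar(@{ $TakeThat[17] });$k++){
			my @resortSplit = split(/\//,$TakeThat[17][$k]);
			push (@specResort, $resortSplit[1]);
		    }
		    for (my $k=0;$k<scalar(@TakeThat);$k++){
			my $TakeThatlen = scalar(@{ $TakeThat[$k] }) - 1;
			@{ $TakeThat[$k] } = @{ $TakeThat[$k] } [ sort {
			    $specResort[$b] <=> $specResort[$a]
								  } 0 .. $TakeThatlen ];
		    }
		}
	    }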
            if (!exists $$Designs{$IDSeqKeys[$i]}{$SpecRegion}){
                $$Designs{$IDSeqKeys[$i]}{$SpecRegion} = [ @TakeThat ];
            }
            else {
                print ERROR "$IDSeqKeys[$i]=>$SpecRegion\tSpecific region $SpecRegion occurs multiple times in design\n";
            }
# save "bad" designs (in case there are no "good" designs)
            undef @specSort;
	    undef @relSort;
# Sorting of "bad" designs according to absolute and relative specificity
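	    @specSort = @{ $TrashThat[17] };
# rebuild the second sort key as well, otherwise it would still refer to the "good" designs
	    @specSort2 = @{ $TrashThat[17] };
	    for (my $k=0;$k<scalar(@specSort);$k++){
                my @specSplit = split(/\//,$specSort[$k]);
                $specSort[$k] = $specSplit[1];
		$specSort2[$k] = $specSplit[2];
            }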
            @relSort = @{ $TrashThat[11] };
	    for (my $k=0;$k<scalar(@TrashThat);$k++){
		my $TrashThatlen = scalar(@{ $TrashThat[$k] }) - 1;
		if ($targetType eq 'NA'){
		    @{ $TrashThat[$k] } = @{ $TrashThat[$k] }[ sort {
			$specSort[$a] <=> $specSort[$b]
			    ||
			    $specSort2[$a] <=> $specSort2[$b]
							       } 0 .. $TrashThatlen];
		}
		else {
		    @{ $TrashThat[$k] } = @{ $TrashThat[$k] }[ sort {
                        $specSort[$b] <=> $specSort[$a]
                            ||
                            $relSort[$b] <=> $relSort[$a]
                                                               } 0 .. $TrashThatlen];
		}
	    }
# Re-sort according to absolute specific siRNAs if number of specific siRNAs of the best design is 0
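	    if (defined $TrashThat[17][0]){
		my @specSplit = split(/\//,$TrashThat[17][0]);
		if (($specSplit[1] == 0) && ($targetType ne 'NA')){
# @TrashThat was already re-ordered above, so the sort keys are rebuilt from its current order
		    my @specResort = ();
		    for (my $k=0;$k<scalar(@{ $TrashThat[17] });$k++){
			my @resortSplit = split(/\//,$TrashThat[17][$k]);
			push (@specResort, $resortSplit[1]);
		    }
		    for (my $k=0;$k<scalar(@TrashThat);$k++){
			my $TrashThatlen = scalar(@{ $TrashThat[$k] }) - 1;
			@{ $TrashThat[$k] } = @{ $TrashThat[$k] } [ sort {
			    $specResort[$b] <=> $specResort[$a]
								    } 0 .. $TrashThatlen ];
		    }
		}
	    }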
	    if (!exists $$DesignsBad{$IDSeqKeys[$i]}){
                $$DesignsBad{$IDSeqKeys[$i]} = [ [],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[], ];
            }
            for (my $k=0;$k<scalar(@TrashThat);$k++){
                for (my $l=0;$l<scalar(@{ $TrashThat[$k] });$l++){
                    push (@{ $$DesignsBad{$IDSeqKeys[$i]}[$k] }, $TrashThat[$k][$l]);
                }
            }
# save best results from TakeThat array
            if (!exists $$DesignsBest{$IDSeqKeys[$i]}){
                $$DesignsBest{$IDSeqKeys[$i]} = [ [],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[] ];
            }
	    my $bestnum = 0;
	    if (scalar(@{ $TakeThat[0] }) >= $designnum){
		$bestnum = $designnum;
	    }
	    else {
		$bestnum = scalar(@{ $TakeThat[0] });
		print ERROR "$IDSeqKeys[$i]=>$SpecRegion\tOnly $bestnum RNAi reagents could be designed for this specific region\n";
	    }
            for (my $k=0;$k<scalar(@TakeThat);$k++){
                for (my $l=0;$l<$bestnum;$l++){
		    if (exists $$DesignsBest{$IDSeqKeys[$i]}){
			push (@{ $$DesignsBest{$IDSeqKeys[$i]}[$k] }, $TakeThat[$k][$l]);
		    }
                }
            }	    
	}
    }
# identify input sequences, for which RNAi reagent design was not possible and, if possible, fuse specific regions for re-design
    for (my $i=0;$i<scalar(@IDSeqKeys);$i++){
        if (!defined $$DesignsBest{$IDSeqKeys[$i]}[0][0]){
# if only one specific region exists, expand to whole sequence (if it is not already), otherwise design failed
	    if ((scalar(@{ $$InputSpecRegion{$IDSeqKeys[$i]} }) eq 1) && (($$InputSpecRegion{$IDSeqKeys[$i]}[0][1] + $fraglength - $$InputSpecRegion{$IDSeqKeys[$i]}[0][0]) eq (length($$IDSeq{$IDSeqKeys[$i]})))){
# these designs failed, try to get at least "bad" designs
		$$DesignsFailed{$IDSeqKeys[$i]} = $$DesignsBad{$IDSeqKeys[$i]};
		if (exists $$DesignsLeftover{$IDSeqKeys[$i]}){
		    delete $$DesignsLeftover{$IDSeqKeys[$i]};
		    print ERROR "$IDSeqKeys[$i]\tRe-design failed\n";
		}
		else {
		    print ERROR "$IDSeqKeys[$i]\tDesign failed\n";
		}
            }
            else {
		if (!exists $$DesignsLeftover{$IDSeqKeys[$i]}){
		    $$DesignsLeftover{$IDSeqKeys[$i]} = $$IDSeq{$IDSeqKeys[$i]};
		    print ERROR "$IDSeqKeys[$i]\tRe-design possible\n";
		}
		else {
		    print ERROR "$IDSeqKeys[$i]\tFurther re-design possible\n";
		}
            }
        }
	if ((defined $$DesignsBest{$IDSeqKeys[$i]}[0][0]) && (exists $$DesignsLeftover{$IDSeqKeys[$i]})){
	    delete $$DesignsLeftover{$IDSeqKeys[$i]};
	    print ERROR "$IDSeqKeys[$i]\tRe-design was successful\n";
	}
    }
}
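
##
## Illustrative sketch (not called by NEXT-RNAi): re-order parallel result columns with one shared index order
##
# The sorting blocks in the sub above permute every column of a result table with the same comparator.
# The helper below is a minimal sketch of that idea and not part of the original program: the name
# sketchReorderColumns and its arguments are hypothetical. It assumes that $columns is a reference to
# an array of array references (like @TakeThat), that $order is a reference to a list of row indices,
# and that all non-empty columns have the same number of rows.
#
# Possible usage (keys are computed once, from the current row order):
#   my @spec = map { (split(/\//,$_))[1] } @{ $TakeThat[17] };
#   my @order = sort { $spec[$b] <=> $spec[$a] } 0 .. $#spec;
#   &sketchReorderColumns(\@TakeThat,\@order);

sub sketchReorderColumns {
    my ($columns,$order) = @_;
    for (my $k=0;$k<scalar(@$columns);$k++){
# columns that were never filled stay empty
	next if (scalar(@{ $$columns[$k] }) == 0);
# array slice: apply the same permutation to every filled column
	@{ $$columns[$k] } = @{ $$columns[$k] }[ @$order ];
    }
}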

##
## Assemble all information from specificity, efficiency and primer3 calculations for each input target sequence for siRNA (-r s) queries
##

sub assembleResultsiRNA{
    my ($fraglength,$designnum,$designwindow,$IDSeq,$IDSeqKeys,$InputSpecRegion,$InputSpecsiRNA,$InputEffsiRNA,$effOption,$seedNum,$Designs,$DesignsBest,$targetGroups,$Groupstarget,$InputsiRNATarget,$error,$evaluation,$query,$targetType) = @_;
    my @IDSeqKeys = ();
    if ($query eq "Design"){
        @IDSeqKeys = @$IDSeqKeys;
    }
    else {
        @IDSeqKeys = keys %$InputSpecRegion;
	print ERROR "Re-design\tEnter a re-design round\n";
    }
    my $seedNumScal = scalar( keys %$seedNum );
    for (my $i=0;$i<scalar(@IDSeqKeys);$i++){
# iterate over all specific regions identified for a certain target region
        for (my $j=0;$j<scalar(@{ $$InputSpecRegion{$IDSeqKeys[$i]} });$j++){
            my $SpecRegion = "";
            if ($evaluation eq "NO"){
                $SpecRegion = $$InputSpecRegion{$IDSeqKeys[$i]}[$j][0].'..'.$$InputSpecRegion{$IDSeqKeys[$i]}[$j][1];
            }
            else {
                $SpecRegion = $IDSeqKeys[$i];
            }
	    my @TakeThat = ( [],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[], );
# iterate over all siRNAs within a specific target region
	    for (my $k=$$InputSpecRegion{$IDSeqKeys[$i]}[$j][0];$k<=$$InputSpecRegion{$IDSeqKeys[$i]}[$j][1];$k++){
                my $TakeorTrash = \@TakeThat;
# calculate position and sequence
		my $pos = $k - 1;
		my $seq = substr($$IDSeq{$IDSeqKeys[$i]},$pos,$fraglength);
# add position
		push (@{ $$TakeorTrash[10] }, $k);
# add length
		push (@{ $$TakeorTrash[11] }, $fraglength);
# add sequence
		push (@{ $$TakeorTrash[12] }, $seq);
# efficiency
		if ($effOption ne "empty"){
		    if ((scalar(@{ $$InputEffsiRNA{$IDSeqKeys[$i]} }) + $fraglength - 1) ne length($$IDSeq{$IDSeqKeys[$i]})){
			my $scal = scalar(@{ $$InputEffsiRNA{$IDSeqKeys[$i]} }) + $fraglength - 1;
			my $len = length($$IDSeq{$IDSeqKeys[$i]});
			print ERROR "$IDSeqKeys[$i]\tNumber of efficiency values not correct (siRNAs: $scal <=> length: $len)\n";
		    }
# add name of efficiency method
		    push (@{ $$TakeorTrash[14] }, $effOption);
# add siRNA efficiency
		    push (@{ $$TakeorTrash[15] }, sprintf("%.2f", $$InputEffsiRNA{$IDSeqKeys[$i]}[$pos]));
		}
# specificity
                if ((scalar(@{ $$InputSpecsiRNA{$IDSeqKeys[$i]} }) + $fraglength - 1) ne length($$IDSeq{$IDSeqKeys[$i]})){
		    my $scal = scalar(@{ $$InputSpecsiRNA{$IDSeqKeys[$i]} }) + $fraglength - 1;
                    my $len = length($$IDSeq{$IDSeqKeys[$i]});
                    print ERROR "$IDSeqKeys[$i]\tNumber of specificity values not correct (siRNAs: $scal <=> length: $len)\n";
                }
		my $spec = "";
		if ($$InputSpecsiRNA{$IDSeqKeys[$i]}[$pos] eq 1){
		    $spec = '1/1/0/0/0/0/0/0';
		}
		else {
		    my @unspec = split(/\|/,$$InputSpecsiRNA{$IDSeqKeys[$i]}[$pos]);
		    if ($unspec[0] eq 'NA'){
			$spec = "1/0/0/1/$unspec[2]/$unspec[3]/$unspec[4]/$unspec[5]";
		    }
		    else {
			if ($unspec[0] eq 1){
			    $spec = "1/1/0/0/$unspec[2]/$unspec[3]/$unspec[4]/$unspec[5]";
			}
			else {
			    $spec = "1/0/1/0/$unspec[2]/$unspec[3]/$unspec[4]/$unspec[5]";
			}
		    }
		}
# add absolute quality: number of siRNAs/number of specific siRNAs/number of unspecific siRNAs/number of
# non-targeting siRNAs/number of siRNAs with seed matches above the defined threshold/number of low complexity regions
# /number of CAN repeats/number of conserved (miRNA) seeds
		push (@{ $$TakeorTrash[17] }, $spec);
# add seed complement frequency information
		my $siRNA = $IDSeqKeys[$i].'_'.$k;
		if ($seedNumScal ne 0){
		    push (@{ $$TakeorTrash[18] }, $$seedNum{$siRNA});
		}
# add target and target group information (e.g. target transcripts => target gene)
		my %siRNATarget = ();
		my @targetKeys = keys %{ $$InputsiRNATarget{$IDSeqKeys[$i]}{$siRNA} };
		for (my $m=0;$m<scalar(@targetKeys);$m++){
# targetgroup exists
		    if (exists $$targetGroups{$targetKeys[$m]}){
			if (!exists $siRNATarget{$$targetGroups{$targetKeys[$m]}}{$targetKeys[$m]}){
			    $siRNATarget{$$targetGroups{$targetKeys[$m]}}{$targetKeys[$m]} = $$InputsiRNATarget{$IDSeqKeys[$i]}{$siRNA}{$targetKeys[$m]};
			}
			else {
			    $siRNATarget{$$targetGroups{$targetKeys[$m]}}{$targetKeys[$m]}+= $$InputsiRNATarget{$IDSeqKeys[$i]}{$siRNA}{$targetKeys[$m]};
			}
		    }
		    else {
# no targetgroup exists
			if (!exists $siRNATarget{$targetKeys[$m]}{$targetKeys[$m]}){
			    $siRNATarget{$targetKeys[$m]}{$targetKeys[$m]} = $$InputsiRNATarget{$IDSeqKeys[$i]}{$siRNA}{$targetKeys[$m]};
			}
			else {
			    $siRNATarget{$targetKeys[$m]}{$targetKeys[$m]}+= $$InputsiRNATarget{$IDSeqKeys[$i]}{$siRNA}{$targetKeys[$m]};
			}
		    }
		}
                my @siRNATargetKeys = keys %siRNATarget;
# check, whether all members (e.g. transcripts) of a target group are covered, if not, set to 0
                for (my $l=0;$l<scalar(@siRNATargetKeys);$l++){
                    if (exists $$Groupstarget{$siRNATargetKeys[$l]}){
                        for (my $m=0;$m<scalar(@{ $$Groupstarget{$siRNATargetKeys[$l]} });$m++){
                            if (!exists $siRNATarget{$siRNATargetKeys[$l]}{$$Groupstarget{$siRNATargetKeys[$l]}[$m]}){
                                $siRNATarget{$siRNATargetKeys[$l]}{$$Groupstarget{$siRNATargetKeys[$l]}[$m]} = 0;
                            }
                        }
                    }
                }
# sort targets according to number of hits
                my @siRNATargetHits = ();
                for (my $l=0;$l<scalar(@siRNATargetKeys);$l++){
                    my @TargetKeys = keys %{ $siRNATarget{$siRNATargetKeys[$l]} };
                    my @TargetSpecs = ();
                    for (my $m=0;$m<scalar(@TargetKeys);$m++){
                        push (@TargetSpecs, $siRNATarget{$siRNATargetKeys[$l]}{$TargetKeys[$m]});
                    }
# sort targets within a certain group
                    @TargetKeys = @TargetKeys[ sort {$TargetSpecs[$b] <=> $TargetSpecs[$a]} 0 .. $#TargetKeys ];
                    @TargetSpecs = sort {$b<=>$a}(@TargetSpecs);
                    push (@siRNATargetHits, $TargetSpecs[0]);
                }
# sort groups according to number of hits to best target within a group, to identify primary/intended target
                @siRNATargetKeys = @siRNATargetKeys[ sort {$siRNATargetHits[$b] <=> $siRNATargetHits[$a]} 0 .. $#siRNATargetKeys ];
                my $groupTargets = "";
                my $Target = "";
                my $TargetSpecs = "";
                for (my $l=0;$l<scalar(@siRNATargetKeys);$l++){
                    if ($l eq 0){
                        $groupTargets = $siRNATargetKeys[$l];
                    }
                    else {
                        $groupTargets.= '&'.$siRNATargetKeys[$l];
                    }
                    my @TargetKeys = keys %{ $siRNATarget{$siRNATargetKeys[$l]} };
		    my @TargetSpecs = ();
                    for (my $m=0;$m<scalar(@TargetKeys);$m++){
                        push (@TargetSpecs, $siRNATarget{$siRNATargetKeys[$l]}{$TargetKeys[$m]});
                    }
# sort targets of target group according to number of hits
                    @TargetKeys = @TargetKeys[ sort {$TargetSpecs[$b] <=> $TargetSpecs[$a]} 0 .. $#TargetKeys ];
                    @TargetSpecs = sort {$b<=>$a}(@TargetSpecs);
                    for (my $m=0;$m<scalar(@TargetKeys);$m++){
                        if ($l eq 0){
                            if ($m eq 0){
                                $Target = $TargetKeys[$m];
                                $TargetSpecs = $TargetSpecs[$m];
                            }
                            else {
                                $Target.= '+'.$TargetKeys[$m];
                                $TargetSpecs.= '+'.$TargetSpecs[$m];
                            }
                        }
                        else {
                            if ($m eq 0){
                                $Target.= '&'.$TargetKeys[$m];
                                $TargetSpecs.= '&'.$TargetSpecs[$m];
                            }
                            else {
                                $Target.= '+'.$TargetKeys[$m];
                                $TargetSpecs.= '+'.$TargetSpecs[$m];
                            }
                        }
                    }
                }
                if ($groupTargets eq ""){
                    $groupTargets = 'NA';
                }
                if ($Target eq ""){
                    $Target = 'NA';
                }
                if ($TargetSpecs eq ""){
                    $TargetSpecs = 'NA';
                }
# add group name (e.g. gene)
                push (@{ $$TakeorTrash[19] }, $groupTargets);
# add target name (e.g. transcript)
		push (@{ $$TakeorTrash[20] }, $Target);
# add number of hits to target
		push (@{ $$TakeorTrash[21] }, $TargetSpecs);
            }
#####################################################################
#                                                                   #
# Sorting of results, affecting the reagents printed to the output! #
# Here, sorting for:                                                #
# 1. Absolute specificity (which is either 0 or 1 for an siRNA)     #
# 2. Percentage efficiency according to the queried method          #
# 3. Seed complement frequency (if queried)                         #
#                                                                   #
#####################################################################
	    if (($seedNumScal ne 0) && ($effOption ne "empty")){
		my @specSort = @{ $TakeThat[17] };
		my @specSort2 = @{ $TakeThat[17] };
		for (my $k=0;$k<scalar(@specSort);$k++){
		    my @specSplit = split(/\//,$specSort[$k]);
		    $specSort[$k] = $specSplit[1];
		    $specSort2[$k] = $specSplit[2];
		}
		my @effSort = @{ $TakeThat[15] };
		my @SCFSort = @{ $TakeThat[18] };
		for (my $k=0;$k<scalar(@TakeThat);$k++){
		    my $TakeThatlen = scalar(@{ $TakeThat[$k] }) - 1;
		    if ($targetType eq 'NA'){

			@{ $TakeThat[$k] } = @{ $TakeThat[$k] } [ sort { 
			    $specSort[$a] <=> $specSort[$b]
				||
				$specSort2[$a] <=> $specSort2[$b]
				||
				$effSort[$b] <=> $effSort[$a]
				||
				$SCFSort[$a] <=> $SCFSort[$b]
								  } 0 .. $TakeThatlen ];
		    }
		    else {
			@{ $TakeThat[$k] } = @{ $TakeThat[$k] } [ sort {
                            $specSort[$b] <=> $specSort[$a]
                                ||
                                $effSort[$b] <=> $effSort[$a]
                                ||
                                $SCFSort[$a] <=> $SCFSort[$b]
                                                                  } 0 .. $TakeThatlen ];
		    }
		}
		if (!exists $$Designs{$IDSeqKeys[$i]}{$SpecRegion}){
		    $$Designs{$IDSeqKeys[$i]}{$SpecRegion} = [ @TakeThat ];
		}
		else {
		    print ERROR "$IDSeqKeys[$i]=>$SpecRegion\tSpecific region $SpecRegion occurs multiple times in design\n";
		}
	    }
	    elsif (($seedNumScal eq 0) && ($effOption ne "empty")) {
		my @specSort = @{ $TakeThat[17] };
		my @specSort2 = @{ $TakeThat[17] };
		for (my $k=0;$k<scalar(@specSort);$k++){
		    my @specSplit = split(/\//,$specSort[$k]);
		    $specSort[$k] = $specSplit[1];
		    $specSort2[$k] = $specSplit[2];
		}
		my @effSort = @{ $TakeThat[15] };
		for (my $k=0;$k<scalar(@TakeThat);$k++){
                    my $TakeThatlen = scalar(@{ $TakeThat[$k] }) - 1;
                    if ($targetType eq 'NA'){
			@{ $TakeThat[$k] } = @{ $TakeThat[$k] } [ sort {
			    $specSort[$a] <=> $specSort[$b]
				||
				$specSort2[$a] <=> $specSort2[$b]
				||
				$effSort[$b] <=> $effSort[$a]
								  } 0 .. $TakeThatlen ];
		    }
		    else {
			@{ $TakeThat[$k] } = @{ $TakeThat[$k] } [ sort {
                            $specSort[$b] <=> $specSort[$a]
                                ||
                                $effSort[$b] <=> $effSort[$a]
                                                                  } 0 .. $TakeThatlen ];
		    }
		}
                if (!exists $$Designs{$IDSeqKeys[$i]}{$SpecRegion}){
		    $$Designs{$IDSeqKeys[$i]}{$SpecRegion} = [ @TakeThat ];
                }
                else {
                    print ERROR "$IDSeqKeys[$i]=>$SpecRegion\tSpecific region $SpecRegion occurs multiple times in design\n";
                }
	    }
	    elsif (($seedNumScal ne 0) && ($effOption eq "empty")) {
		my @specSort = @{ $TakeThat[17] };
		my @specSort2 = @{ $TakeThat[17] };
                for (my $k=0;$k<scalar(@specSort);$k++){
                    my @specSplit = split(/\//,$specSort[$k]);
                    $specSort[$k] = $specSplit[1];
		    $specSort2[$k] = $specSplit[2];
                }
                my @SCFSort = @{ $TakeThat[18] };
                for (my $k=0;$k<scalar(@TakeThat);$k++){
                    my $TakeThatlen = scalar(@{ $TakeThat[$k] }) - 1;
# there is a bug when analyzing the shRNA-mir library!!!
		    if ($targetType eq 'NA'){
			@{ $TakeThat[$k] } = @{ $TakeThat[$k] } [ sort {
			    $specSort[$a] <=> $specSort[$b]
				||
				$specSort2[$a] <=> $specSort2[$b]
				||
				$SCFSort[$a] <=> $SCFSort[$b]
								  } 0 .. $TakeThatlen ];
		    }
		    else {
			@{ $TakeThat[$k] } = @{ $TakeThat[$k] } [ sort {
                            $specSort[$b] <=> $specSort[$a]
				||
                                $SCFSort[$a] <=> $SCFSort[$b]
                                                                  } 0 .. $TakeThatlen ];
		    }
		}
                if (!exists $$Designs{$IDSeqKeys[$i]}{$SpecRegion}){
                    $$Designs{$IDSeqKeys[$i]}{$SpecRegion} = [ @TakeThat ];
                }
                else {
                    print ERROR "$IDSeqKeys[$i]=>$SpecRegion\tSpecific region $SpecRegion occurs multiple times in design\n";
                }
	    }
	    else {
		my @specSort = @{ $TakeThat[17] };
		my @specSort2 = @{ $TakeThat[17] };
                for (my $k=0;$k<scalar(@specSort);$k++){
                    my @specSplit = split(/\//,$specSort[$k]);
                    $specSort[$k] = $specSplit[1];
		    $specSort2[$k] = $specSplit[2];
                }
                for (my $k=0;$k<scalar(@TakeThat);$k++){
                    my $TakeThatlen = scalar(@{ $TakeThat[$k] }) - 1;
		    if ($targetType eq 'NA'){
			@{ $TakeThat[$k] } = @{ $TakeThat[$k] } [ sort {
			    $specSort[$a] <=> $specSort[$b]
				||
				$specSort2[$a] <=> $specSort2[$b]
								  } 0 .. $TakeThatlen ];
		    }
		    else {
			@{ $TakeThat[$k] } = @{ $TakeThat[$k] } [ sort {
                            $specSort[$b] <=> $specSort[$a] } 0 .. $TakeThatlen ];
		    }
		}
                if (!exists $$Designs{$IDSeqKeys[$i]}{$SpecRegion}){
                    $$Designs{$IDSeqKeys[$i]}{$SpecRegion} = [ @TakeThat ];
                }
                else {
                    print ERROR "$IDSeqKeys[$i]=>$SpecRegion\tSpecific region $SpecRegion occurs multiple times in design\n";
                }
	    }
# save best results from TakeThat array
            if (!exists $$DesignsBest{$IDSeqKeys[$i]}){
                $$DesignsBest{$IDSeqKeys[$i]} = [ [],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[] ];
            }
            
            my $bestnum = 0;
            if (scalar(@{ $TakeThat[10] }) >= $designnum){
                $bestnum = $designnum;
            }
            else {
                $bestnum = scalar(@{ $TakeThat[10] });
                print ERROR "$IDSeqKeys[$i]=>$SpecRegion\tOnly $bestnum RNAi reagents could be designed for this specific region\n";
            }
            for (my $k=0;$k<scalar(@TakeThat);$k++){
                for (my $l=0;$l<$bestnum;$l++){
                    if (exists $$DesignsBest{$IDSeqKeys[$i]}){
                        if (defined $TakeThat[$k][$l]){
			    push (@{ $$DesignsBest{$IDSeqKeys[$i]}[$k] }, $TakeThat[$k][$l]);
			}
			else {
			    push (@{ $$DesignsBest{$IDSeqKeys[$i]}[$k] }, 'NA');
			}
                    }
                }
            }       
        }
    }
}
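
##
## Illustrative sketch (not called by NEXT-RNAi): split an absolute quality string into named fields
##
# The quality string stored in column 17 (e.g. '1/1/0/0/0/0/0/0') is split on '/' wherever results are
# sorted. The helper below is a minimal sketch and not part of the original program: the name
# sketchParseQuality and the hash keys are hypothetical labels. The field order follows the comment
# in assembleResultsiRNA that describes the absolute quality string.
#
# Possible usage:
#   my $quality = &sketchParseQuality('1/0/1/0/0/0/0/0');
#   print "unspecific siRNAs: $$quality{unspecific}\n";

sub sketchParseQuality {
    my ($spec) = @_;
    my @names = ('siRNAs','specific','unspecific','nontargeting','seedmatches','lowcomplexity','CANrepeats','conservedseeds');
    my @values = split(/\//,$spec);
    my %quality = ();
    for (my $i=0;$i<scalar(@names);$i++){
# missing fields are reported as 'NA' instead of undef
	$quality{$names[$i]} = (defined $values[$i]) ? $values[$i] : 'NA';
    }
    return \%quality;
}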

##
## Re-design of RNAi reagents
##

sub redesign {
    print ERROR "Status\tNew round of re-design\n";
    my ($IDSeq,$InputSpecRegion,$DesignsLeftover,$LeftoverSpecRegion,$fraglength) = @_;
    my @LeftoverKeys = keys %$DesignsLeftover;
    for (my $i=0;$i<scalar(@LeftoverKeys);$i++){
	if (scalar(@{ $$InputSpecRegion{$LeftoverKeys[$i]} }) > 1){
	    my @regions = @{ $$InputSpecRegion{$LeftoverKeys[$i]} };
	    my $start = 0;
	    my $end = 1;
	    my $bestdiff = $regions[1][0] - $regions[0][1];
# identify the pair of neighbouring regions separated by the shortest gap; these two are fused
	    for (my $j=1;$j<(scalar(@regions)-1);$j++){
		if (($regions[$j+1][0] - $regions[$j][1]) < $bestdiff){
		    $start = $j;
		    $end = $j+1;
		    $bestdiff = $regions[$j+1][0] - $regions[$j][1];
		}
	    }
# re-write region boundaries
	    $regions[$start][1] = $regions[$end][1];
	    splice(@regions,$end,1);
	    $$InputSpecRegion{$LeftoverKeys[$i]} = [ @regions, ];
	    $$LeftoverSpecRegion{$LeftoverKeys[$i]} = [ [$regions[$start][0],$regions[$start][1]], ];
	}
	else {
# if only one region is leftover, expand target region to full sequence
	    my $end = length($$IDSeq{$LeftoverKeys[$i]}) - $fraglength + 1;
	    $$InputSpecRegion{$LeftoverKeys[$i]} = [ [1,$end], ];
	    $$LeftoverSpecRegion{$LeftoverKeys[$i]} = [ [1,$end], ];
	}
    }
}
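
##
## Illustrative sketch (not called by NEXT-RNAi): fuse the two closest neighbouring regions
##
# The sub redesign fuses the two neighbouring specific regions with the smallest gap between them.
# The helper below is a minimal, side-effect free sketch of that step and not part of the original
# program: the name sketchFuseClosestRegions is hypothetical. It assumes $regions is a reference to
# a list of [start,end] pairs sorted by start position, and it returns a new list reference.
#
# Possible usage:
#   my $fused = &sketchFuseClosestRegions([ [1,10],[15,30],[60,80] ]);
#   # gaps are 5 and 30, so the first two regions are fused: ([1,30],[60,80])

sub sketchFuseClosestRegions {
    my ($regions) = @_;
# copy the pairs, so that the input is left untouched
    my @fused = map { [ @$_ ] } @$regions;
    return \@fused if (scalar(@fused) < 2);
    my $start = 0;
    my $bestdiff = $fused[1][0] - $fused[0][1];
    for (my $j=1;$j<(scalar(@fused)-1);$j++){
	my $diff = $fused[$j+1][0] - $fused[$j][1];
	if ($diff < $bestdiff){
	    $start = $j;
	    $bestdiff = $diff;
	}
    }
# extend the left region to the end of its right neighbour and drop the neighbour
    $fused[$start][1] = $fused[$start+1][1];
    splice(@fused,$start+1,1);
    return \@fused;
}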

##
## Parse Bowtie output for RNAi reagent specificity by comparing siRNA hits to the mapping of the reagent (to the same database)
##

sub BowtieParsePos {
    my ($KeysRef,$bowtie,$RNAiloc,$OTE,$mapped,$mappedIndex,$mappedFile) = @_;
    open (BOWTIE, "<$bowtie") || die "Cannot open $bowtie: $!\n";
    my %OTEnum = ();
    my %siRNA = ();
    my $ID = "";
    my $siRNAID = "";
    while (my $line = <BOWTIE>){
	$line = &cleanLine($line);
	my (@columns) = ();
        @columns = split(/\t/, $line);
	if (scalar(@columns) eq 7){
	    if ($columns[0]=~/^((\S+)_\d+)/){
		$siRNAID = $1;
		$ID = $2;
	    }
	    if (exists $$RNAiloc{$ID}){
# compare positions of current siRNA with mappings of RNAi reagent
		my $in = 0;
# check for multiple mappings of RNAi reagent
		my @chroms = ();
		my @starts = ();
		my @ends = ();
		my @orientations = ();
		&ParseMAPPING($RNAiloc,$ID,$mapped,$mappedIndex,$mappedFile,\@chroms,\@starts,\@ends,\@orientations);
		for (my $i=0;$i<scalar(@chroms);$i++){
# check for mappings containing gaps
		    my @starts2 = split(/\,/,$starts[$i]);
		    my @ends2 = split(/\,/,$ends[$i]);
		    if (($chroms[$i] eq $columns[2]) && ($columns[3] >= $starts2[0]) && ($columns[3] <= $ends2[-1])){
			if (!exists $OTEnum{$ID}){
# [0] = overall number of "off-target" hits (hits outside the mapping of the RNAi reagent), [1] = number of siRNAs with "off-targets"
			    $OTEnum{$ID} = [ 0, 0, ];
			}
			$in = 1;
		    }
		}
		if ($in eq 0){
		    if (!exists $OTEnum{$ID}){
			$OTEnum{$ID} = [ 1, 1, ];
			$siRNA{$siRNAID} = 1;
		    }
		    else {
			$OTEnum{$ID}[0]++;
			if (!exists $siRNA{$siRNAID}){
			    $OTEnum{$ID}[1]++;
			    $siRNA{$siRNAID} = 1;
			}
		    }
		}
	    }
	    else {
		if (!exists $OTEnum{$ID}){
		    $OTEnum{$ID} = [ 1, 1, ];
		    $siRNA{$siRNAID} = 1;
		}
		else {
		    $OTEnum{$ID}[0]++;
		    if (!exists $siRNA{$siRNAID}){
			$OTEnum{$ID}[1]++;
			$siRNA{$siRNAID} = 1;
		    }
		}
	    }
	}
    }
    close BOWTIE;
    for (my $i=0;$i<scalar(@$KeysRef);$i++){
# write output format
	if (exists $OTEnum{$$KeysRef[$i]}){
	    my $OTEvalue = $OTEnum{$$KeysRef[$i]}[1].'/'.$OTEnum{$$KeysRef[$i]}[0];
	    if (!exists $$OTE{$$KeysRef[$i]}){
		$$OTE{$$KeysRef[$i]} = [ $OTEvalue, ];
	    }
	    else {
		push (@{ $$OTE{$$KeysRef[$i]} }, $OTEvalue);
	    }
	}
	else {
# if siRNAs have no target at all, write 0/0
	    my $OTEvalue = '0/0';
	    if (!exists $$OTE{$$KeysRef[$i]}){
		$$OTE{$$KeysRef[$i]} = [ $OTEvalue, ];
	    }
	    else {
		push (@{ $$OTE{$$KeysRef[$i]} }, $OTEvalue);
	    }
	}
    }
}
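
##
## Illustrative sketch (not called by NEXT-RNAi): test whether a Bowtie hit lies inside a reagent mapping
##
# BowtieParsePos treats a hit as on-target if it is on the same chromosome as the reagent mapping and
# starts between the first start and the last end of a (possibly gapped) mapping. The helper below is
# a minimal sketch of that test and not part of the original program: the name sketchHitInsideMapping
# and its arguments are hypothetical. It assumes that $mapStarts and $mapEnds are comma-separated
# coordinate lists, as they are split in BowtieParsePos.
#
# Possible usage:
#   if (&sketchHitInsideMapping('chr2L',150,'chr2L','100,300','200,400')){
#       # hit starts within 100..400 on the same chromosome, so it is not counted as "off-target"
#   }

sub sketchHitInsideMapping {
    my ($hitChrom,$hitPos,$mapChrom,$mapStarts,$mapEnds) = @_;
    my @starts = split(/\,/,$mapStarts);
    my @ends = split(/\,/,$mapEnds);
# without coordinates there is nothing to compare against
    return 0 if ((scalar(@starts) == 0) || (scalar(@ends) == 0));
    if (($hitChrom eq $mapChrom) && ($hitPos >= $starts[0]) && ($hitPos <= $ends[-1])){
	return 1;
    }
    return 0;
}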

##
## Parse Bowtie output for RNAi reagent specificity by comparing siRNA hits to the identified target of the reagent
##

sub BowtieParseTarget {
    my ($KeysRef,$bowtie,$targetGroups,$probeTargetGroups,$OTE) = @_;
    open (BOWTIE, "<$bowtie") || die "Cannot open $bowtie: $!\n";
    my %OTEnum = ();
    my %OTEgroup = ();
    my %siRNAID = ();
    my $ID = "";
    my $siRNAID = "";
    while (my $line = <BOWTIE>){
        $line = &cleanLine($line);
	my (@columns) = ();
        @columns = split(/\t/, $line);
        if (scalar(@columns) eq 7){
	    if ($columns[0]=~/((\S+)_\d+)/){
		$siRNAID = $1;
		$ID = $2;
	    }
	    my $target = $columns[2];
# if no targetgroup defined, set to target ID
	    my $ref = "";
	    if (exists $$targetGroups{$target}){
		$ref = $$targetGroups{$target};
	    }
	    else {
		$ref = $target;
	    }
	    if ($ref ne $$probeTargetGroups{$ID}){
		if (!exists $OTEnum{$ID}){
# number of siRNAs with "off-targets", number of overall "off-targets" (having targets besides the calculated primary target)
		    $OTEnum{$ID} = [ 1, 1, ];
		    $OTEgroup{$ID}{$ref} = 1;
		    $siRNAID{$siRNAID} = 1;
		}
		else {
# siRNAs with multiple off-targets to the same targetgroup
		    if ((exists $OTEgroup{$ID}{$ref}) && (exists $siRNAID{$siRNAID})){
			$OTEgroup{$ID}{$ref}++;
		    }
		    elsif ((exists $OTEgroup{$ID}{$ref}) && (!exists $siRNAID{$siRNAID})){
# another siRNA hits the same off-target
			$siRNAID{$siRNAID} = 1;
			$OTEgroup{$ID}{$ref}++;
			$OTEnum{$ID}[0]++;
		    }
		    elsif ((!exists $OTEgroup{$ID}{$ref}) && (exists $siRNAID{$siRNAID})){
# siRNAs with multiple off-targets to different targetgroups
			$OTEgroup{$ID}{$ref} = 1;
			$OTEnum{$ID}[1]++;
		    }
		    else {
# new siRNA with new off-target
			$OTEgroup{$ID}{$ref} = 1;
			$siRNAID{$siRNAID} = 1;
			$OTEnum{$ID}[0]++;
			$OTEnum{$ID}[1]++;
		    }
		}
	    }
	}
    }
    close BOWTIE;
    for (my $i=0;$i<scalar(@$KeysRef);$i++){
# write output format: <number of siRNAs with "off-targets">/<number of "off-targets">, or 0/0 if no "off-target" was found
        my $OTEvalue = '0/0';
        if (exists $OTEnum{$$KeysRef[$i]}){
            $OTEvalue = $OTEnum{$$KeysRef[$i]}[0].'/'.$OTEnum{$$KeysRef[$i]}[1];
        }
        push (@{ $$OTE{$$KeysRef[$i]} }, $OTEvalue);
    }
}

##
## Mapping of sequences using BLAST
##

sub BLASTHomology {
    my ($BLAST,$BLASTDB,$cutoff,$outBLAST,$probes,$Homology,$targetgroups,$error) = @_;
# run BLAST (blastn, tabular output '-m 8', e-value threshold 1, query filtering switched off)
    system ("$BLAST/blastall -p blastn -d $BLASTDB -m 8 -i $probes -o $outBLAST -e 1 -F F") == 0 || die "Failed to run blastall: $?\n";
# parse BLAST
    open (BLASTOUT, "<$outBLAST") || die "Cannot open BLASTOUT: $!\n";
    my %BestHom = ();
    while (my $line = <BLASTOUT>){
        $line = &cleanLine($line);
        my (@columns) = ();
        @columns = split(/\t/, $line);
        if ($columns[10] < $cutoff){
# use the target group of the hit if one is defined, otherwise the hit ID itself
	    my $ref = $columns[1];
	    if (exists $$targetgroups{$columns[1]}){
		$ref = $$targetgroups{$columns[1]};
	    }
# keep the lowest e-value per query and target (group)
	    if ((!exists $BestHom{$columns[0]}{$ref}) || ($BestHom{$columns[0]}{$ref} > $columns[10])){
		$BestHom{$columns[0]}{$ref} = $columns[10];
	    }
	}
    }
    close BLASTOUT;
    my @query = keys %BestHom;
    for (my $i=0;$i<scalar(@query);$i++){
	my @target = keys %{ $BestHom{$query[$i]} };
	my @evalue = ();
	for (my $j=0;$j<scalar(@target);$j++){
	    push (@evalue, $BestHom{$query[$i]}{$target[$j]});
	}
	@target = @target[ sort {$evalue[$a] <=> $evalue[$b]} 0 .. $#target ];
	for (my $j=0;$j<scalar(@target);$j++){
	    if (!exists $$Homology{$query[$i]}){
		$$Homology{$query[$i]} = [ "$target[$j]\($BestHom{$query[$i]}{$target[$j]}\)", ];
	    }
	    else {
		push (@{ $$Homology{$query[$i]} }, "$target[$j]\($BestHom{$query[$i]}{$target[$j]}\)");
	    }
	}
    }
}
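
=begin comment

The best-hit bookkeeping of BLASTHomology can be illustrated with a small,
self-contained sketch. It is documentation only (this POD block is not
executed). The helper name rank_hits, the cutoff and the hits are invented for
illustration; target groups are left out to keep the example short.

```perl
use strict;
use warnings;

# Hypothetical helper: keeps the lowest e-value per query and target and
# returns, for every query, its targets ranked from best to worst e-value.
sub rank_hits {
    my ($hits, $cutoff) = @_;
    my %best = ();
    for my $hit (@$hits){
        my ($query, $target, $evalue) = @$hit;
        # ignore hits that do not pass the e-value cutoff
        next unless ($evalue < $cutoff);
        if ((!exists $best{$query}{$target}) || ($best{$query}{$target} > $evalue)){
            $best{$query}{$target} = $evalue;
        }
    }
    my %ranked = ();
    for my $query (keys %best){
        my @targets = sort { $best{$query}{$a} <=> $best{$query}{$b} } keys %{ $best{$query} };
        $ranked{$query} = [ map { "$_($best{$query}{$_})" } @targets ];
    }
    return \%ranked;
}

# invented hits: [ query, target, e-value ]
my $ranked = rank_hits([
    [ 'probe_1', 'geneB', 1e-05 ],
    [ 'probe_1', 'geneA', 1e-30 ],
    [ 'probe_1', 'geneB', 1e-12 ],
    [ 'probe_1', 'geneC', 0.5 ],
], 0.01);
print join(", ", @{ $ranked->{'probe_1'} }), "\n";
```

In this sketch geneA is listed before geneB, and geneC is dropped because its
e-value does not pass the cutoff.

=end comment

=cut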

##
## Mapping of sequences using BLAT
##

sub BLATMapping {
    my ($BLAT,$BLATDBs,$BLATDBFile,$split,$outPrimer,$outBLAT,$error,$program,$host,$port) = @_;
    if ($program eq 'blat'){
	if (scalar(@$BLATDBs) eq 1){
	    $BLATDBFile = $$BLATDBs[0];
	    unless (-e $BLATDBFile){
		print $error "$BLATDBFile\tGENOMEFASTA database for mapping with 'blat' does not exist\n";
		print "GENOMEFASTA database $BLATDBFile for mapping with 'blat' does not exist\n";
		return 'GENOMEFASTA database for BLAT does not exist';
	    }
	}
	else {
	    my @BLATDB = ();
	    for (my $i=0;$i<scalar(@$BLATDBs);$i++){
		if (-e $$BLATDBs[$i]){
		    push (@BLATDB, $$BLATDBs[$i]);
		}
		else {
		    print $error "$$BLATDBs[$i]\tGENOMEFASTA database for mapping with 'blat' does not exist\n";
		    print "GENOMEFASTA database $$BLATDBs[$i] for mapping with 'blat' does not exist\n";
		}
	    }
	    if (-e $BLATDBFile){
# re-use an existing concatenated database file
		print $error "$BLATDBFile\tWARNING: Database file for 'blat' already exists\n";
		print "WARNING: $BLATDBFile database file for 'blat' already exists and will be re-used\n";
	    }
	    else {
# concatenate all BLAT DB files
		if (scalar(@BLATDB) != 0){
		    open (BLATDBFILE, ">$BLATDBFile") || die "Cannot open BLATDBFILE: $!\n";
		    for (my $i=0;$i<scalar(@BLATDB);$i++){
			open (BLATDB, "<$BLATDB[$i]") || die "Cannot open BLATDB: $!\n";
			while (my $line = <BLATDB>){
			    $line = &cleanLine($line);
			    print BLATDBFILE "$line\n";
			}
			close BLATDB;
		    }
		    close BLATDBFILE;
		    &fileLoc('Unlink',$BLATDBFile);
		}
		else {
		    print $error "GENOMEFASTA\tNo databases for mapping with 'blat' found\n";
		    print "No GENOMEFASTA databases for mapping with 'blat' found\n";
		    return 'No GENOMEFASTA databases for mapping with blat found';
		}
	    }
	}
    }
# split database file if requested
    if (($split ne 0) && ($program eq 'blat')){
        my @splitfiles = ();
        &splitDB($split,$BLATDBFile,\@splitfiles,$error);
        if (scalar(@splitfiles) > 0){
# run BLAT program, consider split option
            my @BLATfiles = ();
            for (my $i=0;$i<scalar(@splitfiles);$i++){
                my $outBLA = $outBLAT.'_'.$i;
                push (@BLATfiles,$outBLA);
                if (-e $outBLA){
                    print $error "$outBLA\tRunning BLAT omitted, output file already exists\n";
                }
                else {
		    system ("$BLAT/blat -stepSize=5 -repMatch=2253 -minScore=0 -minIdentity=100 $splitfiles[$i] $outPrimer $outBLA") eq 0 || die "Failed to open blat: $?\n";
                    &fileLoc('Unlink',$outBLA);
                }
            }
# concatenate BLAT output
	    open (OUTBLAT, ">$outBLAT") || die "Cannot open OUTBLAT: $!\n";
	    for (my $i=0;$i<scalar(@BLATfiles);$i++){
		open (OUT, "<$BLATfiles[$i]") || die "Cannot open OUT: $!\n";
		while (my $line = <OUT>){
		    $line = &cleanLine($line);
		    print OUTBLAT "$line\n";
		}
		close OUT;
	    }
	    close OUTBLAT;
        }
        else {
	    system ("$BLAT/blat -stepSize=5 -repMatch=2253 -minScore=0 -minIdentity=100 $BLATDBFile $outPrimer $outBLAT") eq 0 || die "Failed to open blat: $?\n";
        }
    }
    else {
	if ($program eq 'gfClient'){
	    system ("$BLAT/gfClient $host $port / $outPrimer $outBLAT -minScore=0 -minIdentity=0") eq 0 || die "Failed to open gfClient: $?\n";
	}
	elsif ($program eq 'blat'){
	    system ("$BLAT/blat -stepSize=5 -repMatch=2253 -minScore=0 -minIdentity=100 $BLATDBFile $outPrimer $outBLAT") eq 0 || die "Failed to open blat: $?\n";
	}
    }
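    if (-e $outBLAT){
        &fileLoc('Unlink',$outBLAT);
        return 'Success';
    }
    else {
# report a failure explicitly (as done in BowtieMapping) instead of returning an empty value
        return 'No BLAT output generated, mapping failed';
    }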
}

##
## Mapping of sequences using Bowtie
##

sub BowtieMapping {
    my ($MAPDB,$bowtie,$outPrimer,$outBowtie,$error) = @_;
    my $index = 0;
    for (my $i=0;$i<scalar(@$MAPDB);$i++){
	if ((-e "$$MAPDB[$i]\.1\.ebwt") && (-e "$$MAPDB[$i]\.2\.ebwt") && (-e "$$MAPDB[$i]\.3\.ebwt") && (-e "$$MAPDB[$i]\.4\.ebwt") && (-e "$$MAPDB[$i]\.rev.1\.ebwt") && (-e "$$MAPDB[$i]\.rev.2\.ebwt")){
	    if ($index eq 0){
		system ("$bowtie\\bowtie -p 4 -f -v 0 -k 3 -m 3 $$MAPDB[$i] $outPrimer > $outBowtie") eq 0 || die "Failed to open bowtie: $?\n";
	    }
	    else {
		system ("$bowtie\\bowtie -p 4 -f -v 0 -k 3 -m 3 $$MAPDB[$i] $outPrimer >> $outBowtie") eq 0 || die "Failed to open bowtie: $?\n";
	    }
	    $index++;
	}
	else {
	    print $error "$$MAPDB[$i]\tNot all Bowtie index files found for this database\n";
	    print "Not all Bowtie index files found for $$MAPDB[$i]\n";
	}
    }
    if (-e $outBowtie){
	&fileLoc('Unlink',$outBowtie);
	return 'Success';
    }
    else {
	return 'No Bowtie index/database found, mapping failed';
    }
}

##
## Parse single line from Bowtie mapping
##

sub ParseBowtieLine{
    my $line = $_[0];
    $line = &cleanLine($line);
    my (@columns) = ();
    @columns = split(/\t/, $line);
    if (scalar(@columns) eq 7){
	my $start = $columns[3] + 1;
	my $end = $start + length($columns[4]) - 1;
	return ($columns[2],$start,$end,$columns[1]);
    }
}
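
=begin comment

The coordinate conversion in ParseBowtieLine can be illustrated with a minimal,
self-contained sketch. It is documentation only (this POD block is not
executed) and assumes seven tab-separated columns per hit, with the strand in
column 2, the reference name in column 3, a 0-based offset in column 4 and the
sequence in column 5. The helper name bowtie_hit_to_location and the example
hit are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring ParseBowtieLine: converts one tab-separated
# hit into (reference, 1-based start, 1-based end, strand).
sub bowtie_hit_to_location {
    my ($line) = @_;
    $line =~ s/[\r\n]+$//;
    my @columns = split(/\t/, $line);
    return unless (scalar(@columns) == 7);
    # the offset is 0-based; add 1 to obtain a 1-based start
    my $start = $columns[3] + 1;
    # the end is inclusive, hence the -1
    my $end = $start + length($columns[4]) - 1;
    return ($columns[2], $start, $end, $columns[1]);
}

# invented hit: a 19 nt siRNA at 0-based offset 99 on the '-' strand
my $hit = join("\t", 'probe_1', '-', 'chr2L', 99, 'ACGTACGTACGTACGTACG', 'I' x 19, 0);
my ($chrom, $start, $end, $strand) = bowtie_hit_to_location($hit);
print "$chrom:$start..$end ($strand)\n";    # prints chr2L:100..118 (-)
```

=end comment

=cut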

##
## Parse Bowtie output of primer/reagent mappings
##

sub ParseBowtieMap{
    my ($outPrimer2,$name,$PrimerMapped,$PrimerNotMapped,$RNAiloc,$NOTmapped,$outBO,$indexBO) = @_;
# get length of sequences to compare them with mapping length in Bowtie result
    my %PrimerLen = ();
    my %PrimerSeq = ();
    my %PrimerIDs = ();
    open (OUTPRIME2, "<$outPrimer2") || die "Cannot open OUTPRIME2: $!\n";
    while (my $line = <OUTPRIME2>){
        $line = &cleanLine($line);
        my (@columns) = ();
        @columns = split(/\t/, $line);
# calculate sequence length
        my $len = length($columns[1]);
        if (!exists $PrimerLen{$columns[0]}){
            $PrimerLen{$columns[0]} = $len;
            $PrimerSeq{$columns[0]} = $columns[1];
        }
        if (($columns[0]=~/^(\S+).*_f$/) || ($columns[0]=~/^(\S+).*_r$/)){
            if (!exists $PrimerIDs{$1}){
                $PrimerIDs{$1} = 1;
            }
        }
    }
    close OUTPRIME2;

# check for mappings of reagents and write results to file
    open (MAPPED, ">$PrimerMapped") || die "Cannot open MAPPED: $!\n";
    open (NOTMAPPED, ">$PrimerNotMapped") || die "Cannot open NOTMAPPED: $!\n";
    my @PrimerIDs = keys %PrimerIDs;
# no primer information available
    if (scalar(@PrimerIDs) eq 0){
	my @PrimerLen = keys %PrimerLen;
	for (my $i=0;$i<scalar(@PrimerLen);$i++){
# no match
	    if (!exists $$RNAiloc{$PrimerLen[$i]}{$name}){
		print NOTMAPPED "$PrimerLen[$i]\t$PrimerSeq{$PrimerLen[$i]}\tReagent could not be mapped\n";
		if (!exists $$NOTmapped{$PrimerLen[$i]}){
		    $$NOTmapped{$PrimerLen[$i]} = [ $PrimerSeq{$PrimerLen[$i]}, 'Reagent could not be mapped', ];
		}                
		else {
		    $$NOTmapped{$PrimerLen[$i]}[1].= '|Reagent could not be mapped';
		}
	    }
	    elsif ((exists $$RNAiloc{$PrimerLen[$i]}{$name}) && (scalar (keys %{ $$RNAiloc{$PrimerLen[$i]}{$name} }) eq 1)){
# exactly one match
		my @lines = keys %{ $$RNAiloc{$PrimerLen[$i]}{$name} };
		my $line = &line_with_index($outBO,$indexBO,$lines[0]);
		my ($chrom,$start,$end,$orientation) = &ParseBowtieLine($line);
		print MAPPED "$PrimerLen[$i]\tNA\tNA\tNA\t$orientation\tNA\tNA\tNA\t$orientation\t$chrom\t$start\t$end\tfull\n";
		if (exists $$NOTmapped{$PrimerLen[$i]}){
		    if (($$NOTmapped{$PrimerLen[$i]}[1]!~/times/) && ($$NOTmapped{$PrimerLen[$i]}[1]!~/products/)){
			delete $$NOTmapped{$PrimerLen[$i]};
		    }
		}
	    }
	    else {
# multiple matches
		my @lines = keys %{ $$RNAiloc{$PrimerLen[$i]}{$name} };
		my $product = scalar(@lines);
		print NOTMAPPED "$PrimerLen[$i]\t$PrimerSeq{$PrimerLen[$i]}\tReagent maps $product times\n";
		if (!exists $$NOTmapped{$PrimerLen[$i]}){
		    $$NOTmapped{$PrimerLen[$i]} = [ $PrimerSeq{$PrimerLen[$i]}, "Reagent maps $product times", ];
		}
		else {
		    $$NOTmapped{$PrimerLen[$i]}[1].= "|Reagent maps $product times";
		}
		
		for (my $j=0;$j<scalar(@lines);$j++){
		    my $line = &line_with_index($outBO,$indexBO,$lines[$j]);
		    my ($chrom,$start,$end,$orientation) = &ParseBowtieLine($line);
		    print MAPPED "$PrimerLen[$i]\tNA\tNA\tNA\t$orientation\tNA\tNA\tNA\t$orientation\t$chrom\t$start\t$end\tfull\n";
		}
	    }
	}
    }
    else {
# primer information available
# check for possible products from forward and reverse primer mapping and write results to file
        for (my $i=0;$i<scalar(@PrimerIDs);$i++){
            my $forward = $PrimerIDs[$i].'_f';
            my $reverse = $PrimerIDs[$i].'_r';
            my $product = 0;
# no amplicon, because one or both primers could not be mapped
            if ((!exists $$RNAiloc{$forward}{$name}) || (!exists $$RNAiloc{$reverse}{$name})){
                print NOTMAPPED "$PrimerIDs[$i]\t$PrimerSeq{$forward}\t$PrimerSeq{$reverse}\tOne or no primer matches the genome\n";
		if (!exists $$NOTmapped{$PrimerIDs[$i]}){
                    $$NOTmapped{$PrimerIDs[$i]} = [ "$PrimerSeq{$forward},$PrimerSeq{$reverse}", 'One or no primer matches the genome', ];
                }
                else {
                    $$NOTmapped{$PrimerIDs[$i]}[1].= '|One or no primer matches the genome';
                }
            }
            else {
                my @amplicons = ();
		my @linesFor = keys %{ $$RNAiloc{$forward}{$name} };
                my @linesRev = keys %{ $$RNAiloc{$reverse}{$name} };
		for (my $j=0;$j<scalar(@linesFor);$j++){
		    my $lineFor = &line_with_index($outBO,$indexBO,$linesFor[$j]);
                    my ($chromFor,$startFor,$endFor,$orientationFor) = &ParseBowtieLine($lineFor);
		    for (my $k=0;$k<scalar(@linesRev);$k++){
			my $lineRev = &line_with_index($outBO,$indexBO,$linesRev[$k]);
			my ($chromRev,$startRev,$endRev,$orientationRev) = &ParseBowtieLine($lineRev);
# same chromosome
                        if ($chromFor eq $chromRev){
# one primer matches '+', the other '-' orientation
                            if ($orientationFor ne $orientationRev){
                                my $pointer = 0;
# end of forward primer on '+' must be upstream of end of reverse primer
                                if ($orientationFor eq '+'){
                                    if ($endFor < $endRev){
                                        $pointer = 1;
                                    }
                                }
                                else {
# end of forward primer on '-' must be downstream of end of reverse primer
                                    if ($endFor > $endRev){
                                        $pointer = 1;
                                    }
                                }
                                if ($pointer eq 1){
                                    my $primers = $startRev - $startFor;
                                    my $diff = 0;
                                    my $print_start = 0;
                                    my $print_end = 0;
# calculation of product length
				    if ($primers < 0){
                                        $diff = $endFor - $startRev;
                                        $print_start = $startRev;
                                        $print_end = $endFor;
                                    }
                                    else {
                                        $diff = $endRev - $startFor;
                                        $print_start = $startFor;
                                        $print_end = $endRev;
                                    }
# primers must map within range of 4 kb
                                    if ($diff<=4000){
                                        push (@amplicons, [ $PrimerIDs[$i], $forward, $startFor, $endFor, $orientationFor, $reverse, $startRev, $endRev, $orientationRev, $chromFor, $print_start, $print_end ]);
                                    }
                                }
                            }
                        }
                    }
                }
# no amplicon
                if (scalar(@amplicons) eq 0){
                    print NOTMAPPED "$PrimerIDs[$i]\t$PrimerSeq{$forward}\t$PrimerSeq{$reverse}\tPrimer do not amplify a product during PCR\n";
                    if (!exists $$NOTmapped{$PrimerIDs[$i]}){
                        $$NOTmapped{$PrimerIDs[$i]} = [ "$PrimerSeq{$forward},$PrimerSeq{$reverse}", 'Primer do not amplify a product during PCR', ];
                    }
                    else {
                        $$NOTmapped{$PrimerIDs[$i]}[1].= '|Primer do not amplify a product during PCR';
                    }
                }
                elsif (scalar(@amplicons) eq 1){
# exactly one amplicon
                    print MAPPED "$amplicons[0][0]\t$amplicons[0][1]\t$amplicons[0][2]\t$amplicons[0][3]\t$amplicons[0][4]\t$amplicons[0][5]\t$amplicons[0][6]\t$amplicons[0][7]\t$amplicons[0][8]\t$amplicons[0][9]\t$amplicons[0][10]\t$amplicons[0][11]\tfull\n";
                    if (exists $$NOTmapped{$amplicons[0][0]}){
                        if (($$NOTmapped{$amplicons[0][0]}[1]!~/times/) && ($$NOTmapped{$amplicons[0][0]}[1]!~/products/)){
                            delete $$NOTmapped{$amplicons[0][0]};
                        }
                    }
		}
		else {
# multiple amplicons
		    my $product = scalar(@amplicons);
		    print NOTMAPPED "$PrimerIDs[$i]\t$PrimerSeq{$forward}\t$PrimerSeq{$reverse}\tPrimer amplify $product products during PCR\n";
		    if (!exists $$NOTmapped{$amplicons[0][0]}){
			$$NOTmapped{$amplicons[0][0]} = [ "$PrimerSeq{$forward},$PrimerSeq{$reverse}", "Primer amplify $product products during PCR", ];
		    }
		    else {
			$$NOTmapped{$amplicons[0][0]}[1].= "|Primer amplify $product products during PCR";
		    }
		    for (my $j=0;$j<scalar(@amplicons);$j++){
			print MAPPED "$amplicons[$j][0]\t$amplicons[$j][1]\t$amplicons[$j][2]\t$amplicons[$j][3]\t$amplicons[$j][4]\t$amplicons[$j][5]\t$amplicons[$j][6]\t$amplicons[$j][7]\t$amplicons[$j][8]\t$amplicons[$j][9]\t$amplicons[$j][10]\t$amplicons[$j][11]\tfull\n";
		    }
		}
	    }
	}
    }
    close MAPPED;
    close NOTMAPPED;
}
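
=begin comment

The amplicon test applied to each forward/reverse primer pair in
ParseBowtieMap can be summarised in a small, self-contained sketch. It is
documentation only (this POD block is not executed). The helper name
amplicon_for_pair and the coordinates are invented for illustration; the
maximum product length is passed as a parameter here, whereas the subroutine
above uses a fixed limit of 4000.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the amplicon (start, end) spanned by a forward
# and a reverse primer hit, or an empty list if the pair cannot form a product.
# Each hit is [ chromosome, start, end, strand ] with 1-based coordinates.
sub amplicon_for_pair {
    my ($for, $rev, $maxlen) = @_;
    my ($chromFor, $startFor, $endFor, $strandFor) = @$for;
    my ($chromRev, $startRev, $endRev, $strandRev) = @$rev;
    # both primers must hit the same sequence on opposite strands
    return () if ($chromFor ne $chromRev);
    return () if ($strandFor eq $strandRev);
    # a '+' forward primer must end upstream of the reverse primer end,
    # a '-' forward primer must end downstream of it
    if ($strandFor eq '+'){
        return () unless ($endFor < $endRev);
    }
    else {
        return () unless ($endFor > $endRev);
    }
    # the product spans from the leftmost start to the rightmost end
    my ($start, $end) = ($startRev < $startFor) ? ($startRev, $endFor) : ($startFor, $endRev);
    return () if (($end - $start) > $maxlen);
    return ($start, $end);
}

# invented primer hits on the same chromosome, 500 nt apart
my @product = amplicon_for_pair([ 'chr3R', 1001, 1020, '+' ], [ 'chr3R', 1481, 1500, '-' ], 4000);
print "amplicon: $product[0]..$product[1]\n" if (@product);    # prints amplicon: 1001..1500
```

=end comment

=cut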

##
## Parse single line from BLAT mapping
##

sub ParseBLATLine{
    my ($line,$starts,$ends,$lengths) = @_;
    $line = &cleanLine($line);
    my (@columns) = ();
    @columns = split(/\t/, $line);
    if (scalar(@columns) eq 21){
# length of fragments
	@$lengths = split(/,/, $columns[18]);
# starts in target region
	@$starts = split(/,/, $columns[20]);
# add 1 to start (BLAT property) and calculate ends
	for (my $i=0;$i<scalar(@$starts);$i++){
	    $$starts[$i]++;
	    my $end = $$starts[$i] + $$lengths[$i] - 1;
	    push (@$ends, $end);
	}
	if (($columns[0] eq $columns[10]) && ($columns[1] eq 0)){
	    return ($columns[13],$columns[8],'full');
	}
	else {
	    return ($columns[13],$columns[8],'partial');
	}
    }
}
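
=begin comment

How ParseBLATLine turns the block columns of a PSL record into 1-based block
coordinates can be shown with a short, self-contained sketch. It is
documentation only (this POD block is not executed) and assumes the standard
21-column PSL layout (matches in column 1, strand in column 9, qSize in column
11, tName in column 14, blockSizes in column 19, tStarts in column 21). The
helper name psl_blocks and the record are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring ParseBLATLine: returns target name, strand,
# mapping status and the aligned blocks as 1-based [ start, end ] pairs.
sub psl_blocks {
    my ($line) = @_;
    $line =~ s/[\r\n]+$//;
    my @columns = split(/\t/, $line);
    return unless (scalar(@columns) == 21);
    my @lengths = split(/,/, $columns[18]);    # blockSizes
    my @starts = split(/,/, $columns[20]);     # tStarts (0-based)
    my @blocks = ();
    for my $i (0 .. $#starts){
        my $start = $starts[$i] + 1;
        push (@blocks, [ $start, $start + $lengths[$i] - 1 ]);
    }
    # 'full': all query bases match (matches equals qSize, no mismatches)
    my $status = (($columns[0] == $columns[10]) && ($columns[1] == 0)) ? 'full' : 'partial';
    return ($columns[13], $columns[8], $status, \@blocks);
}

# invented PSL record: a 40 nt query aligned in two blocks on the '+' strand
my $psl = join("\t", 40, 0, 0, 0, 0, 0, 1, 60, '+', 'probe_1', 40, 0, 40,
               'chr2R', 100000, 199, 299, 2, '25,15,', '0,25,', '199,284,');
my ($chrom, $strand, $status, $blocks) = psl_blocks($psl);
print "$chrom $strand $status ", join(",", map { "$_->[0]..$_->[1]" } @$blocks), "\n";
```

For the invented record the two blocks are reported as 200..224 and 285..299
with status 'full'.

=end comment

=cut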

##
## Parse BLAT output of primer/reagent mappings
##

sub ParseBLAT{
    my ($outPrimer2,$name,$PrimerMapped,$PrimerNotMapped,$RNAiloc,$NOTmapped,$siRNAExt,$mapping,$outBLAT,$indexBLAT) = @_;
# get IDs to be remapped
    print "Mapping: $mapping\n";
    my %remap = ();
    open (OUTPRIME2, "<$outPrimer2") || die "Cannot open OUTPRIME2: $!\n";
    while (my $line = <OUTPRIME2>){
        $line = &cleanLine($line);
        my (@columns) = ();
        @columns = split(/\t/, $line);
        if (!exists $remap{$columns[0]}){
            $remap{$columns[0]} = $columns[1];
        }
    }
    close OUTPRIME2;

# check if remapping was successful; results are appended if the output files already exist ('>>' creates missing files)
    open (MAPPED, ">>$PrimerMapped") || die "Cannot open MAPPED: $!\n";
    open (NOTMAPPED, ">>$PrimerNotMapped") || die "Cannot open NOTMAPPED: $!\n";
    my %partial = ();
    my @remap = keys %remap;
    for (my $i=0;$i<scalar(@remap);$i++){
        if (exists $$RNAiloc{$remap[$i]}{$name}){
	    my @lines = keys %{ $$RNAiloc{$remap[$i]}{$name} };
	    my $full = 0;
	    my $partial = 0;
	    for (my $j=0;$j<scalar(@lines);$j++){
		my $line = &line_with_index($outBLAT,$indexBLAT,$lines[$j]);
		my @starts = ();
		my @ends = ();
		my @lengths = ();
		my ($chrom,$orientation,$status) = &ParseBLATLine($line,\@starts,\@ends,\@lengths);
# subtract the extensions in case siRNAs were mapped (alignments consisting of one or two blocks are handled)
		if ($status eq 'full'){
		    if (exists $$siRNAExt{$remap[$i]}){
			if (scalar(@starts) eq 2){
			    if ($orientation eq '+'){
				$starts[0]+= $$siRNAExt{$remap[$i]}[0];
				$ends[1]-= $$siRNAExt{$remap[$i]}[1];
			    }
			    else {
				$starts[0]+= $$siRNAExt{$remap[$i]}[1];
				$ends[1]-= $$siRNAExt{$remap[$i]}[0];
			    }
			}
			else {
			    if ($orientation eq '+'){
				$starts[0]+= $$siRNAExt{$remap[$i]}[0];
				$ends[0]-= $$siRNAExt{$remap[$i]}[1];
			    }
			    else {
				$starts[0]+= $$siRNAExt{$remap[$i]}[1];
				$ends[0]-= $$siRNAExt{$remap[$i]}[0];
			    }
			}
		    }
		    my $startjoin = join(",", @starts);
		    my $endjoin = join(",", @ends);
		    print MAPPED "$remap[$i]\tNA\tNA\tNA\t$orientation\tNA\tNA\tNA\t$orientation\t$chrom\t$startjoin\t$endjoin\tfull\n";
		    $full++;
		}
		else {
		    if ($mapping ne 'PERFECT'){
			my $startjoin = join(",", @starts);
			my $endjoin = join(",", @ends);
			print MAPPED "$remap[$i]\tNA\tNA\tNA\t$orientation\tNA\tNA\tNA\t$orientation\t$chrom\t$startjoin\t$endjoin\tpartial\n";
			$partial++;
		    }
		}
	    }
	    if ($full eq 1){
# single, complete mapping
		if (exists $$NOTmapped{$remap[$i]}){
		    delete $$NOTmapped{$remap[$i]};
		}
	    }
	    elsif ($full > 1){
# multiple, complete mappings
		print NOTMAPPED "$remap[$i]\t$remap{$remap[$i]}\tReagent maps $full times\n";
		$$NOTmapped{$remap[$i]} = [ $remap{$remap[$i]}, "Reagent maps $full times", ];
	    }
	    else {
# partial mappings
		if (($mapping ne 'PERFECT') && ($partial > 0)){
		    print NOTMAPPED "$remap[$i]\t$remap{$remap[$i]}\tReagent maps incomplete\n";
		    $$NOTmapped{$remap[$i]} = [ $remap{$remap[$i]}, "Reagent maps incomplete", ];
		}
		else {
# no mappings
		    print NOTMAPPED "$remap[$i]\t$remap{$remap[$i]}\tReagent could not be mapped\n";
		    if (!exists $$NOTmapped{$remap[$i]}){
			$$NOTmapped{$remap[$i]} = [ $remap{$remap[$i]}, 'Reagent could not be mapped', ];
		    }
		    else {
			$$NOTmapped{$remap[$i]}[1].= '|Reagent could not be mapped';
		    }
		}
	    }
	}
    }
    close MAPPED;
    close NOTMAPPED;
}

##
## Extract sub-sequence from FASTA file
##

sub getSeq {
    my ($outfolder,$identifier,$DBFile,$PrimerMapped,$IDSeq,$error) = @_;
# read source sequences in pieces of 50 nt
    my %sequences = ();
    for (my $i=0;$i<scalar(@$DBFile);$i++){
	if (-e $$DBFile[$i]){
	    &readFASTA($$DBFile[$i],\%sequences,$error,'permissive');
	}
	else {
	    print $error "$$DBFile[$i]\tGENOMEFASTA databases file was not found\n";
	    print "GENOMEFASTA databases file $$DBFile[$i] was not found.\n";
	}
    }
    my @sequences = keys %sequences;
    if (scalar(@sequences) eq 0){
	return 'No sequences found in GENOMEFASTA file(s)';
    }
    my %seq = ();
    for (my $i=0;$i<scalar(@sequences);$i++){
	$seq{$sequences[$i]} = [];
# the loop also runs for $pos equal to the sequence length, so that a (possibly empty) piece exists for every index int($end/50) used below
	my $pos = 0;
	while ($pos <= length($sequences{$sequences[$i]})){
	    push (@{ $seq{$sequences[$i]} },substr($sequences{$sequences[$i]},$pos,50));
	    $pos+= 50;
	}
	delete $sequences{$sequences[$i]};
    }
# read mapping information and extract sequences
    my %amb = ();
    open (MAPPING,"<$PrimerMapped") || die "Cannot open MAPPING: $!\n";
    my $outIDSeq = $outfolder.'NEXT-RNAi_'.$identifier.'_Input.fa';
    &fileLoc('Input',$outIDSeq,'Input1');
    open (PROBES, ">$outIDSeq") || die "Cannot open PROBES: $!\n";
    while (my $line = <MAPPING>){
	$line = &cleanLine($line);
	my (@columns) = ();
	@columns = split(/\t/, $line);
# calculate length, start, end and reference ID of sequence
	my $len = $columns[11] - $columns[10] + 1;
	my $start = $columns[10];
	my $end = $columns[11];
	my $chrom = $columns[9];
# calculate position in sequence array
	my $div_start = int($start/50);
	my $mod_start = $start % 50;
	if ($mod_start eq 0){
	    $div_start-=1;
	}
	my $div_end = int($end/50);
	my $mod_end = $end % 50;
	my $sequence = "";
# assemble sequence
	for (my $k=$div_start;$k<=$div_end;$k++){
	    if ($k eq $div_start){
		if ($div_start eq $div_end){
		    $sequence = substr($seq{$chrom}[$k],($mod_start-1),($mod_end - $mod_start + 1));
		}
		else {
		    $sequence = substr($seq{$chrom}[$k],($mod_start-1));
		}
	    }
	    elsif ($k eq $div_end){
		$sequence = $sequence.substr($seq{$chrom}[$k],0,$mod_end);
	    }
	    else {
		$sequence = $sequence.$seq{$chrom}[$k];
	    }
	}
	if (!exists $$IDSeq{$columns[0]}){
	    if (!exists $amb{$columns[0]}){
		$$IDSeq{$columns[0]} = $sequence;
	    }
	}
	else {
	    delete $$IDSeq{$columns[0]};
	    $amb{$columns[0]} = "";
	    print $error "$columns[0]\tPrimer amplify multiple products during PCR and are not further analysed\n";
	}
    }
    undef %seq;
    close MAPPING;
    my @validSeq = keys %$IDSeq;
    for (my $i=0;$i<scalar(@validSeq);$i++){
	print PROBES ">$validSeq[$i]\n$$IDSeq{$validSeq[$i]}\n";
    }
    close PROBES;
    return 'Success';
}
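
=begin comment

The piece-wise sequence storage used by getSeq can be illustrated with a
short, self-contained sketch. It is documentation only (this POD block is not
executed). Note that the sketch is a simplified variant and not the arithmetic
used above: it computes piece indices from (position - 1) instead of handling
the modulo 0 case separately. The helper names chunk_sequence and
extract_range and the sequence are invented for illustration.

```perl
use strict;
use warnings;

# split a sequence into pieces of $size nt (the last piece may be shorter)
sub chunk_sequence {
    my ($sequence, $size) = @_;
    my @chunks = ();
    for (my $pos = 0; $pos < length($sequence); $pos += $size){
        push (@chunks, substr($sequence, $pos, $size));
    }
    return \@chunks;
}

# extract the 1-based, inclusive range $start..$end from the pieces
sub extract_range {
    my ($chunks, $size, $start, $end) = @_;
    # indices of the pieces containing the first and the last position
    my $first = int(($start - 1) / $size);
    my $last = int(($end - 1) / $size);
    my $sequence = join('', @$chunks[ $first .. $last ]);
    # offset of $start within the joined pieces
    return substr($sequence, ($start - 1) - $first * $size, $end - $start + 1);
}

my $genome = 'ACGT' x 30;    # invented 120 nt sequence
my $chunks = chunk_sequence($genome, 50);
print extract_range($chunks, 50, 48, 53), "\n";    # prints TACGTA
```

Only the pieces overlapping the requested range are joined, which is the
purpose of storing the source sequences in pieces.

=end comment

=cut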

##
## Parse mapping output for ID and return location
##

sub ParseMAPPING{
    my ($RNAiloc,$ID,$mapped,$mappedIndex,$mappedFile,$chroms,$starts,$ends,$orientations) = @_;    
    if (exists $$RNAiloc{$ID}{$mappedFile}){
	my @lines = keys %{ $$RNAiloc{$ID}{$mappedFile} };
	for (my $i=0;$i<scalar(@lines);$i++){
	    my $line = &line_with_index($mapped,$mappedIndex,$lines[$i]);
	    $line = &cleanLine($line);
	    my (@columns) = split(/\t/, $line);
# collect chromosome, start(s), end(s) and orientation of every mapping ('full' and 'partial' mappings are treated alike)
	    push (@$chroms,$columns[9]);
	    push (@$starts,$columns[10]);
	    push (@$ends,$columns[11]);
	    push (@$orientations,$columns[8]);
	}
    }
}


##
## Generate GFF2 or GFF3 output file
##

sub GFFGenerator {
    my ($identifier,$DesignsOUTPUT,$RNAiloc,$outGff,$gffFormat,$mapped,$mappedIndex,$mappedFile) = @_;
    &fileLoc('Output',$outGff,'GFF');
    open (OUTGFF, ">$outGff") || die "Cannot open OUTGFF: $!\n";
    my @RNAiloc = keys %$RNAiloc;
    my %ID = ();
# get mappings of RNAi reagents
    for (my $i=0;$i<scalar(@RNAiloc);$i++){
	my $IDmain = "";
	my $IDindex = "";
	if ($RNAiloc[$i]=~/(\S+)_(\d+)/){
	    $IDmain = $1;
	    $IDindex = $2 - 1;
	}
# get target information of RNAi reagents
	if (exists $$DesignsOUTPUT{$IDmain}){
	    my @targetFBgn = split(/\&/, $$DesignsOUTPUT{$IDmain}[19][$IDindex]);
	    my $target = "";
# check for multiple mappings of RNAi reagent
	    my @chroms = ();
            my @starts = ();
            my @ends = ();
            my @orientations = ();
	    &ParseMAPPING($RNAiloc,$RNAiloc[$i],$mapped,$mappedIndex,$mappedFile,\@chroms,\@starts,\@ends,\@orientations);
	    for (my $j=0;$j<scalar(@chroms);$j++){
# check if ID was already printed
		my $IDnew = "";
		if (!exists $ID{$RNAiloc[$i]}){
		    $IDnew = $RNAiloc[$i];
		    $ID{$RNAiloc[$i]} = 2;
		}
		else {
		    $IDnew = $RNAiloc[$i].':'.$ID{$RNAiloc[$i]};
		    $ID{$RNAiloc[$i]}++;
		}
# check for mappings containing gaps
		my @starts2 = split(/\,/,$starts[$j]);
		my @ends2 = split(/\,/,$ends[$j]);
		for (my $k=0;$k<scalar(@starts2);$k++){
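		    if (scalar (@targetFBgn) eq 0){
# no target annotated: report the mapped location instead ($$RNAiloc{...} holds a hash and cannot be indexed as an array)
			$target = "$chroms[$j]\:$starts2[$k]\.\.$ends2[$k]";
		    }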
		    else {
			$target = join(",", @targetFBgn);
		    }
		    if ($gffFormat eq 'GFF3'){
# print GFF3 format
			print OUTGFF "$chroms[$j]\tNEXT-RNAi\t$identifier\t$starts2[$k]\t$ends2[$k]\t.\t$orientations[$j]\t.\tID=$IDnew;Name=$RNAiloc[$i];Note=$IDmain($target);\n";
		    }
		    else {
# print GFF2 format
			print OUTGFF "$chroms[$j]\tNEXT-RNAi\t$identifier\t$starts2[$k]\t$ends2[$k]\t.\t$orientations[$j]\t.\t$identifier $IDnew;Note \"$IDmain($target)\"\n";
		    }
		}
	    }
	}
    }
    close OUTGFF;
}

##
## Generate annotation output file for direct GBrowse upload
##

sub AFFGenerator {
    my ($identifier,$RNAiloc,$outAff,$mapped,$mappedIndex,$mappedFile) = @_;
    &fileLoc('Output',$outAff,'AFF');
    open (OUTAFF, ">$outAff") || die "Cannot open OUTAFF: $!\n";
# print TRACK settings
    print OUTAFF "\[$identifier\]\nglyph = segments\nfgcolor = grey\nbgcolor = palegoldenrod\ndescription = 0\nlink = link\n\n";
    my @RNAiloc = keys %$RNAiloc;
    my %ID = ();
# get mappings of RNAi reagents
    for (my $i=0;$i<scalar(@RNAiloc);$i++){
# check for multiple mappings of RNAi reagent
	my @chroms = ();
	my @starts = ();
	my @ends = ();
	my @orientations = ();
	&ParseMAPPING($RNAiloc,$RNAiloc[$i],$mapped,$mappedIndex,$mappedFile,\@chroms,\@starts,\@ends,\@orientations);
	for (my $j=0;$j<scalar(@chroms);$j++){
# check if ID was already printed
	    my $IDnew = "";
	    if (!exists $ID{$RNAiloc[$i]}){
		$IDnew = $RNAiloc[$i];
		$ID{$RNAiloc[$i]} = 2;
	    }
	    else {
		$IDnew = $RNAiloc[$i].':'.$ID{$RNAiloc[$i]};
		$ID{$RNAiloc[$i]}++;
	    }
# check for mappings containing gaps
	    my @starts2 = split(/\,/,$starts[$j]);
	    my @ends2 = split(/\,/,$ends[$j]);
	    my @mapped = ();
	    for (my $k=0;$k<scalar(@starts2);$k++){
		if ($orientations[$j] ne '-'){
		    push(@mapped,"$starts2[$k]..$ends2[$k]");
		}
		else {
		    push(@mapped,"$ends2[$k]..$starts2[$k]");
		}
	    }
	    my $mapped = "";
	    if ($orientations[$j] eq '-'){
		@mapped = reverse(@mapped);
	    }
	    $mapped = join(",",@mapped);
	    my $IDlink = "";
	    if ($IDnew=~/^(\S+)\_\d+$/){
		$IDlink = $1;
	    }
	    print OUTAFF "reference = $chroms[$j]\n$identifier\t$IDnew\t$mapped\t$IDlink\t\'\'\n\n";
	}
    }
    close OUTAFF;
}

##
## Calculate feature content of an RNAi reagent
##

sub FeatureContent {
    print "Parsing feature contents\n";
    my ($DesignsPRINT,$RNAiloc,$FeatureLoc,$FeatureName,$FeatureNum,$error,$mapped,$mappedIndex,$mappedFile) = @_;
    my @IDscovered = keys %$DesignsPRINT;
# get locations for all features
    my %FeatureLoc = ();
    my %FeatureDone = ();
    my %header = ();
    my $header = 0;
# collect feature positions in batches; a batch is evaluated once it holds more than 1000000 positions
    my $cut = 1000000;
    my $count = 0;
# count overlaps between the mapped RNAi reagents and all features of the current batch
    my $evaluate = sub {
	my @features = keys %FeatureLoc;
	for (my $i=0;$i<scalar(@IDscovered);$i++){
	    my $index = 1;
	    for (my $j=0;$j<scalar(@{ $$DesignsPRINT{$IDscovered[$i]}[17] });$j++){
		my $IDsub = $IDscovered[$i].'_'.$index;
		$index++;
		next unless (exists $$RNAiloc{$IDsub});
# check all mappings for contained features
		my @chroms = ();
		my @starts = ();
		my @ends = ();
		my @orientations = ();
		&ParseMAPPING($RNAiloc,$IDsub,$mapped,$mappedIndex,$mappedFile,\@chroms,\@starts,\@ends,\@orientations);
		for (my $k=0;$k<scalar(@features);$k++){
		    for (my $x=0;$x<scalar(@chroms);$x++){
			next unless (exists $FeatureLoc{$features[$k]}{$chroms[$x]});
# check for mappings containing gaps
			my @starts2 = split(/\,/,$starts[$x]);
			my @ends2 = split(/\,/,$ends[$x]);
			for (my $y=0;$y<scalar(@starts2);$y++){
			    for (my $l=$starts2[$y];$l<=$ends2[$y];$l++){
# identify overlaps between RNAi reagent mapping and feature location
				if ((exists $FeatureLoc{$features[$k]}{$chroms[$x]}{$l}) && (!exists $FeatureDone{$IDsub}{$features[$k]}{$l})){
				    $$FeatureNum{$features[$k]}{$IDsub}++;
				    $FeatureDone{$IDsub}{$features[$k]}{$l} = '';
				}
			    }
			}
		    }
		}
	    }
	}
    };
    open (FEATLOC, "<$FeatureLoc") || die "Cannot open FEATLOC: $!\n";
    &fileLoc('Input',$FeatureLoc,'Feature');
  FEATURELOC:
    while (my $line = <FEATLOC>){
	$line = &cleanLine($line);
	my (@columns) = ();
	@columns = split(/\t/, $line);
# get header information
	if ($header eq 0){
	    for (my $i=0;$i<scalar(@columns);$i++){
		if (!exists $header{$columns[$i]}){
		    $header{$columns[$i]} = $i;
		}
	    }
	    if ((!exists $header{'FeatureName'}) || (!exists $header{'FeatureStart'}) || (!exists $header{'FeatureEnd'}) || (!exists $header{'FeatureLoc'})){
		print $error "$FeatureLoc\tFeature file contains wrong header information ('FeatureName', 'FeatureStart', 'FeatureEnd' and 'FeatureLoc' headers required)\n";
		print "$FeatureLoc feature file contains wrong header information ('FeatureName', 'FeatureStart', 'FeatureEnd' and 'FeatureLoc' headers required)\n";
		last FEATURELOC;
	    }
	}
	else {
# get feature locations
	    if (!exists $$FeatureName{$columns[$header{'FeatureName'}]}){
		$$FeatureName{$columns[$header{'FeatureName'}]} = '';
	    }
# evaluate and release a full batch before the current feature is stored, so that no feature line is skipped
	    if ($count > $cut){
		$evaluate->();
		undef %FeatureLoc;
		$count = 0;
	    }
	    for (my $i=$columns[$header{'FeatureStart'}];$i<=$columns[$header{'FeatureEnd'}];$i++){
		$FeatureLoc{$columns[$header{'FeatureName'}]}{$columns[$header{'FeatureLoc'}]}{$i} = '';
		$count++;
	    }
	}
	$header++;
    }
    close FEATLOC;
# evaluate the features of the last (incomplete) batch
    if ($count > 0){
	$evaluate->();
    }
}
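
=begin comment

FeatureContent counts shared positions by looking up every mapped position in
a hash of feature positions. For a single feature interval the same number can
be obtained by interval arithmetic, as sketched below. This is a swapped-in
technique shown for comparison, not what the subroutine above does, and the
block is documentation only (this POD block is not executed). The helper name
overlap_length and the coordinates are invented for illustration.

```perl
use strict;
use warnings;

# number of positions shared by two 1-based, inclusive intervals
sub overlap_length {
    my ($startA, $endA, $startB, $endB) = @_;
    my $start = ($startA > $startB) ? $startA : $startB;
    my $end = ($endA < $endB) ? $endA : $endB;
    return ($end >= $start) ? ($end - $start + 1) : 0;
}

# invented coordinates: a reagent mapped in two blocks and one feature
my @blocks = ([ 200, 224 ], [ 285, 299 ]);
my ($featureStart, $featureEnd) = (210, 290);
my $covered = 0;
for my $block (@blocks){
    $covered += overlap_length($block->[0], $block->[1], $featureStart, $featureEnd);
}
print "feature positions covered by the reagent: $covered\n";    # 15 + 6 = 21
```

Unlike the position hash with its %FeatureDone bookkeeping, this shortcut
counts a position twice if two blocks or two mappings of the same reagent
overlap, so it is only equivalent for non-overlapping blocks.

=end comment

=cut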

##
## Identify regions of low complexity (simple nucleotide repeats) using mDust (compiled version)
##

sub LowCompRegions{
    my ($outfolder,$identifier,$IDSeqBest,$mdust,$IDSeqBestFASTA,$LowCompRegions,$LowCompPos) = @_;
#########################################################
#                                                       #
# mdust arguments to identify regions of low complexity #
#                                                       #
# w = word size for masking (default = 3)               #
# v = cutoff score for masking (default = 28)           #
# $wstart/$wend and $vstart/$vend define a range of     #
# arguments over which the subroutine iterates          #
#                                                       #
#########################################################
    my $wstart = 3;
    my $wend = 3;
    my $vstart = 28;
    my $vend = 28;
    my $output_1 = "";
    my $output_2 = "";
    my @seq = keys %$IDSeqBest;
# run mdust
    for (my $i=$wstart;$i<=$wend;$i++){
        for (my $j=$vstart;$j<=$vend;$j++){
            $output_1 = $outfolder.'NEXT-RNAi_'.$identifier.'.mdust'.'_'.$i.'_'.$j;
            $output_2 = $outfolder.'NEXT-RNAi_'.$identifier.'.mdust_report'.'_'.$i.'_'.$j;
	    &fileLoc('Unlink',$output_1);
            system ("$mdust\\mdust $IDSeqBestFASTA -w $i -v $j >$output_1") == 0 || die "Failed to run mdust: $?\n";
	    &fileLoc('Unlink',$output_2);
            system ("$mdust\\mdust $IDSeqBestFASTA -w $i -v $j -c >$output_2") == 0 || die "Failed to run mdust: $?\n";
        }
    }
# write summary file with extracted low-complexity sequences and their length
    my %LowComp = ();
    for (my $i=$wstart;$i<=$wend;$i++){
        for (my $j=$vstart;$j<=$vend;$j++){
            $output_2 = $outfolder.'NEXT-RNAi_'.$identifier.'.mdust_report'.'_'.$i.'_'.$j;
            open (OUTPUT2, "<$output_2") || die "Cannot open OUTPUT2: $!\n";
            my %ID_unique = ();
            my $low_num = 0;
            my $low_unique = 0;
            my $multiple = 0;
            while (my $line = <OUTPUT2>){
                $line = &cleanLine($line);
                my @columns = ();
                @columns = split(/\t/, $line);
                if (!exists $ID_unique{$columns[0]}){
                    $ID_unique{$columns[0]} = 1;
                    $low_unique++;
                }
                else {
                    $ID_unique{$columns[0]}++;
                }
                my $len_low = $columns[3] - $columns[2] + 1;
                my $start_low = $columns[2] - 1;
		my $seq_low = substr($$IDSeqBest{$columns[0]}, $start_low, $len_low);
                $low_num++;
                if (!exists $LowComp{$columns[0]}){
                    $LowComp{$columns[0]} = [ $len_low, ];
                }
                else {
                    push (@{ $LowComp{$columns[0]} }, $len_low);
                }
# save positions of low complexity
		for (my $k=$columns[2];$k<=$columns[3];$k++){
		    if (!exists $$LowCompPos{$columns[0]}{$k}){
			$$LowCompPos{$columns[0]}{$k} = 'low';
		    }
		    else {
			$$LowCompPos{$columns[0]}{$k}.= '|low';
		    }
		}
            }
            $multiple = $low_num - $low_unique;
            close OUTPUT2;
        }
    }
# create output format
    for (my $i=0;$i<scalar(@seq);$i++){
        my $LowComp = '0/0';
        if (exists $LowComp{$seq[$i]}){
            my $LowCompNum = @{ $LowComp{$seq[$i]} };
            my $LowCompLen = "";
            for (my $j=0;$j<scalar(@{ $LowComp{$seq[$i]} });$j++){
                if ($j eq 0){
                    $LowCompLen = $LowComp{$seq[$i]}[$j];
                }
                else {
                    $LowCompLen.= '+'.$LowComp{$seq[$i]}[$j];
                }
            }
# number of low complexity regions over their lengths, separated by '+'
            $LowComp = $LowCompNum.'/'.$LowCompLen;
        }
        if (!exists $$LowCompRegions{$seq[$i]}){
            $$LowCompRegions{$seq[$i]} = $LowComp;
        }
    }
}
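
##
## Illustrative sketch (not part of the original NEXT-RNAi code; the subroutine name
## 'sketchLowCompSummary' is hypothetical and nothing in the program calls it).
## It shows the summary format built above, i.e. the number of low-complexity regions
## over their lengths separated by '+', assuming the lengths are passed as an array reference.
## Example: sketchLowCompSummary([12,30]) returns '2/12+30', an empty list returns '0/0'.
##

sub sketchLowCompSummary {
    my ($lengths) = @_;
# no low-complexity regions at all
    if ((!defined $lengths) || (scalar(@$lengths) == 0)){
        return '0/0';
    }
# number of regions, followed by their lengths joined with '+'
    return scalar(@$lengths).'/'.join('+',@$lengths);
}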

##
## Calculate CAN repeats from sequence
##

sub CANrepeats{
    my ($IDSeqBest,$CANNum,$CANrepeats,$CANpos) = @_;
# also compute reverse complementary sequences
    my @seq = keys %$IDSeqBest;
    my %IDSeqBestCAN = ();
    for (my $i=0;$i<scalar(@seq);$i++){
        my $revseq = $$IDSeqBest{$seq[$i]};
        $revseq = reverse $revseq;
        $revseq =~ tr/ACGTacgt/TGCAtgca/;
        my $IDCAN = $seq[$i].'_revcom';
        if (!exists $IDSeqBestCAN{$IDCAN}){
            $IDSeqBestCAN{$seq[$i]} = $$IDSeqBest{$seq[$i]};
            $IDSeqBestCAN{$IDCAN} = $revseq;
        }
    }
# find CAN repeats in sequence and reverse complementary sequence
    my @seqCAN = keys %IDSeqBestCAN;
    my %CAN = ();
    my %CANNextRNAi = ();
    my %CANLoc = ();
    for (my $i=0;$i<scalar(@seqCAN);$i++){
        my $counter = 0;
        my $CANLen = 0;
        my $CANLen2 = 0;
	while ($IDSeqBestCAN{$seqCAN[$i]} =~ m/(CA[ACGT]){$CANNum,}/g){
            if ($counter eq 0){
                $CAN{$seqCAN[$i]} = [ 0, ];
            }
# $start is the end of the current CAN-repeat-free region, $end is the last position before the next CAN-repeat-free region
            my $start = $CAN{$seqCAN[$i]}[0] + length($`);
            my $end = $CAN{$seqCAN[$i]}[0] + length($`) + length($&) - 1;
# start and end of CAN repeat in actual sequence
            my $CANstart = length($`) + 1;
            my $CANend = length($`) + length($&);
            if ($counter eq 0){
		$CANLoc{$seqCAN[$i]} = "$CANstart\.\.$CANend";
		$CANLen = (length($&) / 3)."($&)";
                $CANLen2 = (length($&) / 3);
            }
            else {
		$CANLoc{$seqCAN[$i]}.= '|'."$CANstart\.\.$CANend";
		my $len = (length($&) / 3);
                $CANLen.= '|'.$len."($&)";
                $CANLen2.= '+'.$len;
            }
            push (@{ $CAN{$seqCAN[$i]} }, $start);
            push (@{ $CAN{$seqCAN[$i]} }, $end);
            $counter++;
        }
# no CAN repeats at all
        if ($counter eq 0){
            $CAN{$seqCAN[$i]} = [ 0, length($IDSeqBestCAN{$seqCAN[$i]}) ];
        }
        else {
            push (@{ $CAN{$seqCAN[$i]} }, length($IDSeqBestCAN{$seqCAN[$i]}));
# number of CAN repeats and their lengths
	    $CANNextRNAi{$seqCAN[$i]} = [ $counter, $CANLen2 ];
        }
    }
    for (my $i=0;$i<scalar(@seq);$i++){
        my $CANrep = "";
        my $IDCAN = $seq[$i].'_revcom';
# create output format (CAN repeats on '+'/lengths of repeats | CAN repeats on '-'/lengths of repeats)
# '+' sequence
        if (exists $CANNextRNAi{$seq[$i]}){
            $CANrep = $CANNextRNAi{$seq[$i]}[0].'/'.$CANNextRNAi{$seq[$i]}[1];
        }
        else {
            $CANrep = '0/0';
        }
# '-' sequence
        if (exists $CANNextRNAi{$IDCAN}){
            $CANrep.= '|'.$CANNextRNAi{$IDCAN}[0].'/'.$CANNextRNAi{$IDCAN}[1];
        }
        else {
            $CANrep.= '|0/0';
        }
        if (!exists $$CANrepeats{$seq[$i]}){
            $$CANrepeats{$seq[$i]} = $CANrep;
        }
# save positions of CAN repeats
	if (exists $CANLoc{$seq[$i]}){
	    my @CANs = split (/\|/,$CANLoc{$seq[$i]});
	    for (my $j=0;$j<scalar(@CANs);$j++){
		my @pos = split (/\.\./,$CANs[$j]);
		for (my $k=$pos[0];$k<=$pos[1];$k++){
		    if (!exists $$CANpos{$seq[$i]}{$k}){
			$$CANpos{$seq[$i]}{$k} = 'can';
		    }
		    else {
			$$CANpos{$seq[$i]}{$k}.= '|can';
		    }
		}
	    }
	}
	if (exists $CANLoc{$IDCAN}){
	    my @CANs = split (/\|/,$CANLoc{$IDCAN});
	    for (my $j=0;$j<scalar(@CANs);$j++){
		my @pos = split (/\.\./,$CANs[$j]);
		for (my $k=$pos[0];$k<=$pos[1];$k++){
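# convert the position on the reverse complement into the position on the input sequence
		    my $realpos = length($IDSeqBestCAN{$seq[$i]}) - $k + 1;
		    if (!exists $$CANpos{$seq[$i]}{$realpos}){
			$$CANpos{$seq[$i]}{$realpos} = 'can';
		    }
		    else {
			$$CANpos{$seq[$i]}{$realpos}.= '|can';
		    }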
		}
	    }
	}
    }
}
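
##
## Illustrative sketch (not part of the original NEXT-RNAi code; the subroutine name
## 'sketchCANpositions' is hypothetical and nothing in the program calls it).
## It shows, for a single sequence, how CA[ACGT] repeats can be located on both strands
## and how a position on the reverse complement translates back to the input sequence.
## It uses Perl's @- and @+ match offsets instead of $` and $&, which is an assumption
## made for brevity, not the approach taken in CANrepeats.
## Example: sketchCANpositions('GGCAACAGCATGG',3) marks positions 3 to 11.
##

sub sketchCANpositions {
    my ($sequence,$CANNum) = @_;
    my $seqlen = length($sequence);
    my $revcom = reverse $sequence;
    $revcom =~ tr/ACGTacgt/TGCAtgca/;
    my %positions = ();
# '+' strand: 1-based start and end follow directly from the match offsets
    while ($sequence =~ m/(?:CA[ACGT]){$CANNum,}/g){
        for (my $k=$-[0]+1;$k<=$+[0];$k++){
            $positions{$k} = 'can';
        }
    }
# '-' strand: position $k on the reverse complement is position (length - $k + 1) on the input
    while ($revcom =~ m/(?:CA[ACGT]){$CANNum,}/g){
        for (my $k=$-[0]+1;$k<=$+[0];$k++){
            $positions{$seqlen - $k + 1} = 'can';
        }
    }
    return \%positions;
}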

##
## Map siRNA seeds
##

sub seedMapper {
    my ($siRNAs,$identifier,$outfolder,$bowtie,$seedlen,$SCF,$UTRfile,$seedseq) = @_;
    my $seedfile = $outfolder.'NEXT-RNAi_'.$identifier.'_siRNAseeds.fa';
    my $seedbo = $outfolder.'NEXT-RNAi_'.$identifier.'_siRNAseeds.fa.bwt';
# retrieve siRNA seeds from sequences
    open (SIRNAS, $siRNAs) || die "Cannot open SIRNAS: $!\n";
    open (SEEDS, ">$seedfile") || die "Cannot open SEEDS: $!\n";
    while (my $line = <SIRNAS>){
        $line = &cleanLine($line);
        if ($line!~/>(\S+)/){
# siRNA seed is located at positions 12/13-18 on sense strand (2-7/8 on anti-sense strand)
            my $seedstart = length($line) - $seedlen - 1;
            my $seedsequence = substr($line,$seedstart,$seedlen);
# only write unique seeds to file, for hexamers this is a maximum of 4096 sequences, for heptamers it is 16384      
	    if (!exists $$seedseq{$seedsequence}){
		print SEEDS ">$seedsequence\n$seedsequence\n";
		$$seedseq{$seedsequence} = 0;
	    }
	}
    }
    close SIRNAS;
    close SEEDS;

# run Bowtie of unique seed sequences against queried sequence file (e.g. containing 3'-UTRs)
    system ("$bowtie\\bowtie -p 4 -f -v 0 -a $UTRfile $seedfile > $seedbo") == 0 || die "Failed to run bowtie: $?\n";
# calculate number of siRNA seed matches to 3'-UTRs
    my %target = ();
    open (SEEDBO, "<$seedbo") || die "Cannot open SEEDBO: $!\n";
    while (my $line = <SEEDBO>){
        $line = &cleanLine($line);
        my (@columns) = ();
        @columns = split(/\t/, $line);
        if (scalar(@columns) eq 7){
# count number of unique seed matches to a certain target (seed must be identical to UTR, not the reverse complement)
            if (($columns[1] eq "+") && (!exists $target{$columns[0]}{$columns[2]})){
                $$seedseq{$columns[0]}++;
# multiple seedmatches to the same transcript are just counted once
                $target{$columns[0]}{$columns[2]} = "";
            }
        }
    }
    close SEEDBO;
    unlink ($seedfile,$seedbo);
}

##
## Calculate siRNA seed matches (seed complement frequencies)
##

sub seedMatcher {
    my ($siRNAs,$identifier,$outfolder,$seedlen,$SCF,$seedseq,$seedNum,$seedPos) = @_;
    my $seedout = $outfolder.'NEXT-RNAi_'.$identifier.'_siRNAseeds.report';
# retrieve siRNA seeds from sequences
    open (SIRNAS, $siRNAs) || die "Cannot open SIRNAS: $!\n";
    my $siRNAID= "";
    my %siRNA = ();
    while (my $line = <SIRNAS>){
	$line = &cleanLine($line);
	if ($line=~/>(\S+)/){
	    $siRNAID = $1;
	}
	else {
# siRNA seed is located at positions 12/13-18 on sense strand (2-7/8 on anti-sense strand)
	    my $seedstart = length($line) - $seedlen - 1;
	    my $seedsequence = substr($line,$seedstart,$seedlen);
	    $siRNA{$siRNAID} = $seedsequence;
	}
    }
    close SIRNAS;

    my @siRNA = keys %siRNA;
# write seed complement frequencies to file
    &fileLoc('Unlink',$seedout);
    open (OUT, ">$seedout") || die "Cannot open OUT: $!\n";
    for (my $i=0;$i<scalar(@siRNA);$i++){
	if ($siRNA[$i]=~/(\S+)_(\d+)/){
# save positions if seed could be mapped and has seed complement frequency exceeding the defined cutoff
	    if ((exists $$seedseq{$siRNA{$siRNA[$i]}}) && ($$seedseq{$siRNA{$siRNA[$i]}} > $SCF)){
		if (!exists $$seedPos{$1}{$2}){
		    $$seedPos{$1}{$2} = 'seed';
		}
		else {
		    $$seedPos{$1}{$2}.= '|seed';
		}
	    }
	    $$seedNum{$siRNA[$i]} = $$seedseq{$siRNA{$siRNA[$i]}};
	    print OUT "$siRNA[$i]\t$$seedseq{$siRNA{$siRNA[$i]}}\n";
	}
    }
    close OUT;
    undef %siRNA;
}
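
##
## Illustrative sketch (not part of the original NEXT-RNAi code; the subroutine name
## 'sketchSeedFromSense' is hypothetical and nothing in the program calls it).
## It shows the seed extraction used above for a single sense-strand sequence: the seed
## covers the $seedlen bases that end one base before the 3' end of the sense strand,
## which corresponds to the bases starting at position 2 of the anti-sense strand.
## Returns (seed on sense strand, seed on anti-sense strand, 1-based start, 1-based end),
## or an empty list if the sequence is too short.
## Example: sketchSeedFromSense('ACGTACGTACGTACGTACG',7) returns ('TACGTAC','GTACGTA',12,18).
##

sub sketchSeedFromSense {
    my ($sense,$seedlen) = @_;
    my $seedstart = length($sense) - $seedlen - 1;
    if ($seedstart < 0){
        return;
    }
    my $seed = substr($sense,$seedstart,$seedlen);
# reverse complement gives the seed as read 5' to 3' on the anti-sense strand
    my $antisenseSeed = reverse $seed;
    $antisenseSeed =~ tr/ACGTacgt/TGCAtgca/;
    return ($seed,$antisenseSeed,$seedstart + 1,$seedstart + $seedlen);
}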

##
## Calculate content of conserved (miRNA) seed matches in siRNAs
##

sub mirSeed {
    my ($siRNAs,$seedlen,$seedfile,$mirSeed,$mirPos,$error) = @_;
# read miRNA FASTA sequences (including 'U'/'u' => 'T' conversions)
    my %miRNAs = ();
    &readFASTA($seedfile,\%miRNAs,$error,'permissive');
# get seeds from sequences read
    my %seed = ();
    my %seedSeq = ();
    for my $key (keys %miRNAs){
	my $seed = substr($miRNAs{$key},1,$seedlen);
	$seed{$key} = $seed;
	if (!exists $seedSeq{$seed}){
	    $seedSeq{$seed} = [ $key, ];
	}
	else {
	    push (@{ $seedSeq{$seed} }, $key);
	}
    }
# check siRNAs for content of conserved seeds
    my $siRNA = "";
    my $siRNAID = "";
    my $siRNANum = "";
    my %siRNA = ();
    open (SIRNAS, $siRNAs) || die "Cannot open SIRNAS: $!\n";
    while (my $line = <SIRNAS>){
        $line = &cleanLine($line);
        if ($line=~/>((\S+)_(\d+))/){
            $siRNA = $1;
	    $siRNAID = $2;
	    $siRNANum = $3;
	}
        else {
# siRNA seed is located at positions 12/13-18 on sense strand (2-7/8 on anti-sense strand)
            my $seedstart = length($line) - $seedlen - 1;
            my $seedsequence = substr($line,$seedstart,$seedlen);
            $seedsequence = reverse $seedsequence;
            $seedsequence =~ tr/ACGTacgt/TGCAtgca/;
	    if (exists $seedSeq{$seedsequence}){
		if (!exists $$mirPos{$siRNAID}{$siRNANum}){
                    $$mirPos{$siRNAID}{$siRNANum} = 'mirseed';
                }
                else {
                    $$mirPos{$siRNAID}{$siRNANum}.= '|mirseed';
                }
		$$mirSeed{$siRNAID}{$siRNANum} = $seedSeq{$seedsequence};
	    }
        }
    }
    close SIRNAS;
}


##
## Static header region with css for HTML output
##

sub HTMLhead {
    my $identifier = $_[0];
    my $HTML = <<'';
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>NEXT-RNAi HTML Output</title>
<style type="text/css">
<!--
#allcontent {
width:  750px;
text-align: center;
font-family: Verdana, Arial, sans-serif;
font-size: 11px;
margin-left: auto;
margin-right: auto;
}
.main {
    font-size: 11px;
    vertical-align: text-top;
  top: 0;
  left: 0;
}
.main2 {
    text-align: justify;
    font-size: 12px;
    vertical-align: text-top;
  top: 0px;
  left: 0px;
    line-height: 120%;
    padding-top: 30px;
    padding-right: 20px;
    padding-bottom: 30px;
    padding-left: 30px;
  margin: 0px;
}
.news {
    border-bottom-style: dotted;
    border-bottom-width: 1px;
    font-weight: bold;
}
a:link  { 
  color: #333; 
    font-size: 10px; 
    font-family: Verdana; 
    text-decoration: underline; 
}
a:visited  { 
  color: #333; 
    font-size: 10px; 
    font-family: Verdana; 
    text-decoration: underline; 
}
.style6 {
    font-family: "Courier New", Courier, monospace; font-size: 11px;
}
.style8 {
  color: #2662C3
}
-->
</style>
</head>
<body>
<div id="allcontent">
    <table width="800px" border="0" cellpadding="0" cellspacing="0">
        <tr>
            <td id="main" width="800">
                <table width="800" border="0" cellpadding="0" cellspacing="0" align="top">
                    <tr>
                        <td class="main">
                            <table width="800" border="0" cellpadding="0" cellspacing="0" class="main">
                                <tr>
                                    <td>
                                        <table width="800" border="0" cellpadding="0" cellspacing="0" style="border:1px dashed black;">
                                            <tr>
                                                <td class="main2">
<!-- begin of main part -->


    if ($identifier eq 'E-RNAi'){
	$HTML.= <<'';
<table border="0" cellspacing="0" cellpadding="0" align="center" width="800">
    <tr>
        <td align="left"><img src="http://www.dkfz.de/signaling/e-rnai3/headsection.jpg" border="0" usemap="#Map">
        </td>
    </tr>
</table>
<map name="Map">
<area shape="rect" coords="11,20,225,78" href="http://www.dkfz.de/signaling/e-rnai3/" /><area shape="rect" coords="13,89,56,107" href="http://www.dkfz.de/signaling/e-rnai3/" /><area shape="rect" coords="70,89,114,108" href="http://www.dkfz.de/signaling/e-rnai3/idseq.php" /><area shape="rect" coords="126,89,192,108" href="http://www.dkfz.de/signaling/e-rnai3/gbrowse.php" /><area shape="rect" coords="203,89,274,109" href="http://www.dkfz.de/signaling/e-rnai3/evaluation.php" /><area shape="rect" coords="305,89,399,109" href="http://www.genomernai.org/" /><area shape="rect" coords="689,89,725,109" href="http://b110-wiki.dkfz.de/signaling/wiki/display/ernai/Overview" /><area shape="rect" coords="735,90,780,109" href="http://www.dkfz.de/signaling/e-rnai3/about.php" /><area shape="rect" coords="548,22,797,57" href="http://www.dkfz.de/" />
</map>


    }
    return $HTML;
}

##
## Static footer region for HTML output
##

sub HTMLfoot {
    my $HTML = <<'';
<!-- end of main part -->
	                                            </td>
	                                        </tr>
	                                    </table>
	                                </td>
	                            </tr>
	                        </table>
	                    </td>
	                </tr>
	            </table>
	        </td>
	    </tr>
        </table>
    </div>
</body>


return $HTML;
}

##
## Links to input/report/output files in HTML format
##

sub HTMLfiles {
    my ($option,$name,$title,$text,$outHTML) = @_;
    my $link = $fileLocs{$option}{$name};
    if ($fileLocs{$option}{$name}=~/.*\/(.*\.\S+)$/){
	$link = $1;
    }
    print $outHTML <<"";
    <tr>
        <td class="main"><a href="$link" title="$title" target="_blank">$text</a></td>
    </tr>


}

##
## Calculate average and (population) standard deviation from an array of values, ignoring 'NA' entries
##

sub stats {
    my ($values) = $_[0];
    my $sum = 0;
    my $na = 0;
    for (my $i=0;$i<scalar(@$values);$i++){
        if ($$values[$i] ne 'NA'){
	    $sum+= $$values[$i];
	}
	else {
	    $na++;
	}
    }
    my $scalar = scalar(@$values) - $na;
# avoid a division by zero if no numeric values are left
    if ($scalar == 0){
	return ('NA','NA');
    }
    my $avg = $sum / $scalar;
    my $squaresum = 0;
    for (my $i=0;$i<scalar(@$values);$i++){
	if ($$values[$i] ne 'NA'){
	    $squaresum+= ($$values[$i] - $avg)**2;
	}
    }
    $avg = sprintf("%.2f", $avg);
    my $stdev = sprintf("%.2f", ($squaresum/$scalar)**0.5);
    return ($avg,$stdev);
}

##
## Interactive mode for NEXT-RNAi input
##

sub promptInput {
    print "Enter interactive input mode\n";
    my ($inputFile, $splitInput, $reagent, $databaseFile, $optionsFile, $evaluation, $identifier) = @_;
    print "Design of [n]ew reagents or [e]valuation of reagents: ";
    $$evaluation = <STDIN>;
    chomp($$evaluation);
    if (($$evaluation ne 'n') && ($$evaluation ne 'e')){
	print "\n\'$$evaluation\' option invalid! Exiting program.\n\n";
	exit;
    }
    if ($$evaluation eq 'n'){
	$$evaluation = 'NO';
	print "Design of long [d]sRNAs or [s]iRNAs: ";
	$$reagent = <STDIN>;
	chomp($$reagent);
	if (($$reagent ne 'd') && ($$reagent ne 's')){
	    print "\n\'$$reagent\' option invalid! Exiting program.\n\n";
	    exit;
	}
	print "Enter location of input file containing target sequences: ";
	$$inputFile = <STDIN>;
	chomp($$inputFile);
	unless (-e $$inputFile){
	    print "\nFile at \'$$inputFile\' does not exist! Exiting program.\n\n";
	    exit;
	}
    }
    if ($$evaluation eq 'e'){
        print "Evaluation of long [d]sRNAs or [s]iRNAs: ";
        $$reagent = <STDIN>;
        chomp($$reagent);
	if (($$reagent ne 'd') && ($$reagent ne 's')){
            print "\n\'$$reagent\' option invalid! Exiting program.\n\n";
            exit;
        }
	if ($$reagent eq 'd'){
	    my $evalmode = "";
	    print "Evaluation of long dsRNAs from [p]rimer sequences, [d]sRNA sequences or [b]oth: ";
	    $evalmode = <STDIN>;
	    chomp($evalmode);
	    if (($evalmode ne 'p') && ($evalmode ne 'd') && ($evalmode ne 'b')){
		print "\n\'$evalmode\' option invalid! Exiting program.\n\n";
		exit;
	    }
	    if ($evalmode eq 'p'){
		print "Enter location of input file containing primer sequences in FASTA format: ";
		$$inputFile = <STDIN>;
		chomp($$inputFile);
		unless (-e $$inputFile){
		    print "\nFile at \'$$inputFile\' does not exist! Exiting program.\n\n";
		    exit;
		}
		$$evaluation = 'OLIGO';
	    }
	    if ($evalmode eq 'd'){
		print "Enter location of input file containing dsRNA sequences in FASTA format: ";
		$$inputFile = <STDIN>;
		chomp($$inputFile);
		unless (-e $$inputFile){
		    print "\nFile at \'$$inputFile\' does not exist! Exiting program.\n\n";
		    exit;
		}
		$$evaluation = 'DSRNA';
            }
	    if ($evalmode eq 'b'){
                print "Enter location of input file containing primer sequences in FASTA format: ";
                $$inputFile = <STDIN>;
                chomp($$inputFile);
		unless (-e $$inputFile){
		    print "\nFile at \'$$inputFile\' does not exist! Exiting program.\n\n";
		    exit;
		}
		my $in2 = "";
		print "Enter location of input file containing dsRNA sequences in FASTA format: ";
		$in2 = <STDIN>;
		chomp($in2);
		unless (-e $in2){
		    print "\nFile at \'$in2\' does not exist! Exiting program.\n\n";
		    exit;
		}
		$$inputFile.='+'.$in2;
		$$evaluation = 'DSRNA+OLIGO';
            }
	}
	if ($$reagent eq 's'){
	    print "Enter location of input file containing siRNA sequences in FASTA format: ";
	    $$inputFile = <STDIN>;
	    chomp($$inputFile);
	    unless (-e $$inputFile){
		print "\nFile at \'$$inputFile\' does not exist! Exiting program.\n\n";
		exit;
	    }
	    $$evaluation = 'OLIGO';
	}
    }
    print "How many target sequences should be processed in one run (a maximum of 1000 sequences is recommended, depending on the computer power): ";
    $$splitInput = <STDIN>;
    chomp($$splitInput);
    if (($$splitInput !~ /^\d+$/) || ($$splitInput <= 0)){
	print "\nOnly (integer) numbers > 0 are allowed for this parameter! Exiting program\n\n";
	exit;
    }
    print "Enter location of Bowtie index file for off-target evaluation or type \'nodb\' to run NEXT-RNAi without off-target evaluation: ";
    $$databaseFile = <STDIN>;
    chomp($$databaseFile);
    if ($$databaseFile ne 'nodb'){
	unless ((-e "$$databaseFile\.1\.ebwt") && (-e "$$databaseFile\.2\.ebwt") && (-e "$$databaseFile\.3\.ebwt") && (-e "$$databaseFile\.4\.ebwt") && (-e "$$databaseFile\.rev\.1\.ebwt") && (-e "$$databaseFile\.rev\.2\.ebwt")){
	    print "\nBowtie index/database for off-target evaluation not found. The Bowtie database should consist of six files: db.1.ebwt, db.2.ebwt, db.3.ebwt, db.4.ebwt, db.rev.1.ebwt and db.rev.2.ebwt (see also the documentation for 'bowtie-build' on the Bowtie webpage, http://bowtie-bio.sourceforge.net/index.shtml). Exiting program.\n\n";
	    exit;
	}
    }
    print "Name of this run: ";
    $$identifier = <STDIN>;
    chomp($$identifier);
    if ($$identifier eq ''){
	$$identifier= 'Probe';
    }
    print "NEXT-RNAi allows the adjustment of further settings in an additional options file. Is this file already [p]repared or should it be [c]reated now: ";
    my $option = <STDIN>;
    chomp($option);
    if (($option ne 'p') && ($option ne 'c')){
	print "\n\'$option\' option invalid! Exiting program.\n\n";
	exit;
    }
    if ($option eq 'p'){
	print "Enter location of additional options file: ";
	$$optionsFile = <STDIN>;
	chomp($$optionsFile);
	unless (-e $$optionsFile){
	    print "\nFile at \'$$optionsFile\' does not exist! Exiting program.\n\n";
	    exit;
	}
    }
    if ($option eq 'c'){
	print "\nThe options file with the most common settings will be created in the following steps. Please run \'perl NEXT-RNAi.pl -h\' to get information about further options available.\n";
# the options folder is the folder in which the input file is located
	$$optionsFile = '';
	my $optionsfolder = "";
	if ($$inputFile=~/(.+[\/\\])\S+\.\S+/){
	    $optionsfolder = $1;
	}
	$$optionsFile = $optionsfolder.'NEXT-RNAi_options.txt';
	my $index = 1;
	while (-e $$optionsFile){
	    $$optionsFile = $optionsfolder.'NEXT-RNAi_options_'.$index.'.txt';
	    $index++;
	}
	open (OPTIONS, ">$$optionsFile") || die "Cannot open $$optionsFile: $!\n";
	print "\nOptions file $$optionsFile was generated\n";
	print "\nThe following options will be added to the options file. To skip an option, just hit enter.\n";
# location of output folder
	print "Enter location of the output folder (default location = $optionsfolder): ";
	my $value = <STDIN>;
	chomp($value);
	if ($value ne ''){
	    if (-d $value){
		print OPTIONS "OUTPUT=$value\n";
	    }
	    else {
		print "\nFolder at \'$value\' does not exist! Exiting program.\n\n";
		exit;
	    }
	}
# location of bowtie program
	print "Enter location of bowtie program required for mapping sequences to the off-target database (if not set to \'nodb\') or to the genome (default=/usr/bin/): ";
	$value = "";
	$value = <STDIN>;
	chomp($value);
	if ($value ne ''){
	    if (-e "$value/bowtie"){
		print OPTIONS "BOWTIE=$value\n";
	    }
	    else {
		print "\n\'bowtie\' not found in \'$value\'! Exiting program.\n\n";
                exit;
	    }
	}
# location of primer3_core program
	if ($$reagent eq 'd'){
	    print "Enter location of primer3 program (primer3_core script)(default=/usr/bin/): ";
	    $value = "";
	    $value = <STDIN>;
	    chomp($value);
	    if ($value ne ''){
		if (-e "$value/primer3_core"){
		    print OPTIONS "PRIMER3=$value\n";
		}
		else {
		    print "\n\'primer3_core\' not found in \'$value\'! Exiting program.\n\n";
		    exit;
		}
	    }
	}
# targetgroups file
	print "Location of the file connecting transcript and gene identifiers (targetgroups file), which is important for specificity calculations: ";
	$value = "";
	$value = <STDIN>;
	chomp($value);
	if ($value ne ''){
	    if (-e $value){
		print OPTIONS "TARGETGROUPS=$value\n";
	    }
	    else {
		print "\nFile at \'$value\' does not exist! Exiting program.\n\n";
		exit;
	    }
	}
# designwindow for long dsRNA designs
	if (($$reagent eq 'd') && ($$evaluation eq 'NO')){
	    print "Minimal allowed length (default=80) for long dsRNA designs: ";
	    my $min = "";
	    $min = <STDIN>;
	    chomp($min);
	    print "Maximal allowed length (default=500) for long dsRNA designs: ";
            my $max = "";
            $max = <STDIN>;
            chomp($max);
	    if ($min eq ''){
		$min = 80;
	    }
	    else {
		if (($min !~ /^\d+$/) || ($min <= 0)){
		    print "\nOnly (integer) numbers > 0 are allowed for the minimal and maximal length! Exiting program\n\n";
		    exit;
		}
	    }
	    if ($max eq ''){
		$max = 500;
            }
            else {
                if (($max !~ /^\d+$/) || ($max <= 0)){
                    print "\nOnly (integer) numbers > 0 are allowed for the minimal and maximal length! Exiting program\n\n";
                    exit;
		}
            }
	    print OPTIONS "DESIGNWINDOW=$min,$max\n";
	}
# siRNA length (for off-target evaluation)
	print "siRNA length (for off-target evaluation) in bp (default=19): ";
	$value = "";
	$value = <STDIN>;
	chomp($value);
	if ($value ne ''){
	    if (($value !~ /^\d+$/) || ($value <= 14) || ($value >= 28)){
		print "\nOnly (integer) numbers > 14 and < 28 are allowed for this parameter! Exiting program\n\n";
		exit;
	    }
	    else {
		print OPTIONS "SIRNALENGTH=$value\n";
	    }
	}
# number of designs per input
	if ($$evaluation eq 'NO'){
	    print "Number of designs per input sequence (default=1): ";
	    $value = "";
	    $value = <STDIN>;
	    chomp($value);
	    print OPTIONS "DESIGNNUM=50\n";
	    if ($value ne ''){
		if (($value !~ /^\d+$/) || ($value <= 0)){
		    print "\nOnly (integer) numbers > 0 are allowed for this parameter! Exiting program\n\n";
		    exit;
		}
		else {
		    print OPTIONS "OUTPUTNUM=$value\n";
		}
	    }
	}
# maximal allowed intron content in long dsRNA
	if (($$reagent eq 'd') && ($$evaluation eq 'NO')){
	    print "Maximal allowed intron content (in percent) for the design of long dsRNAs (default = 25): ";
	    $value = "";
	    $value = <STDIN>;
	    chomp($value);
	    if ($value ne ''){
		if (($value !~ /^\d+$/) || ($value < 0) || ($value > 100)){
		    print "\nOnly (integer) numbers >= 0 and <=100 are allowed for this parameter! Exiting program\n\n";
		    exit;
		}
		else {
		    print OPTIONS "INTRON=$value\n";
		}
	    }
	}
# efficiency calculation method
	print "Calculate efficiency according to [r]ational or [w]eighted method (default = disabled): ";
	my $eff = <STDIN>;
	chomp($eff);
	if ($eff ne ''){
	    if (($eff ne 'r') && ($eff ne 'w')){
		print "\n\'$eff\' option invalid! Exiting program.\n\n";
		exit;
	    }
	    if ($eff eq 'r'){
		print "For rational efficiency calculation the Vienna RNA package is required. Please enter the location of the RNAfold.pl script included in this package (default=/usr/bin/): ";
		$value = "";
		$value = <STDIN>;
		chomp($value);
		if ($value ne ''){
		    if (-e "$value/RNAfold.pl"){
			print OPTIONS "VIENNA=$value\n";
		    }
		    else {
			print "\n\'RNAfold.pl\' not found in \'$value\'! Exiting program.\n\n";
			exit;
		    }
		}
	    }
	    my $effCut = 0;
	    if (($eff eq 'r') || ($eff eq 'w')){
		print "Efficiency score cut-off (value between 0 and 100): ";
		$effCut = <STDIN>;
		chomp($effCut);
		if (($effCut !~ /^\d+$/) || ($effCut < 0) || ($effCut > 100)){
		    print "\nOnly (integer) numbers >= 0 and <= 100 are allowed for this parameter! Exiting program\n\n";
		    exit;
		}
	    }
	    if ($eff eq 'r'){
		print OPTIONS "EFFICIENCY=RATIONAL,$effCut\n";
	    }
	    if ($eff eq 'w'){
		print OPTIONS "EFFICIENCY=SIR,$effCut\n";
	    }
	}
# low complexity regions
	print "Evaluate input sequences for low-complexity regions with mdust filter (y/n) (default = n): ";
	$value = "";
	$value = <STDIN>;
	chomp($value);
	if ($value ne ''){
	    if (($value ne 'y') && ($value ne 'n')){
		print "\n\'$value\' option invalid! Exiting program.\n\n";
		exit;
	    }
	    if ($value eq 'y'){
		print "Enter location of the mdust program required for low-complexity evaluation: ";
		$value = "";
		$value = <STDIN>;
		chomp($value);
		if ($value ne ''){
		    if (-e "$value/mdust"){
			print OPTIONS "LOWCOMPEVAL=$value\n";
		    }
		    else {
			print "\n\'mdust\' not found in \'$value\'! Exiting program.\n\n";
			exit;
		    }
		}
	    }
	}
# CAN repeats
	print "Evaluate input sequences for CA[ACGT] (CAN) repeats (y/n) (default = n): ";
	$value = "";
	$value = <STDIN>;
	chomp($value);
	if ($value ne ''){
	    if (($value ne 'y') && ($value ne 'n')){
		print "\n\'$value\' option invalid! Exiting program.\n\n";
		exit;
	    }
	    if ($value eq 'y'){
		print "Minimal allowed number of contiguous CAN-repeats (e.g. 6): ";
		$value = "";
		$value = <STDIN>;
		chomp($value);
		if ($value ne ''){
		    if (($value !~ /^\d+$/) || ($value < 1)){
			print "\nOnly (integer) numbers > 0 are allowed for this parameter! Exiting program\n\n";
			exit;
		    }
		    else {
			print OPTIONS "CANEVAL=$value\n";
		    }
		}
		else {
		    print OPTIONS "CANEVAL=6\n";
		}
	    }
	}
# seed complement frequency
	print "Evaluate input siRNAs for seed matches in a defined FASTA database or Bowtie database/index (y/n) (default = n): ";
        $value = "";
        $value = <STDIN>;
        chomp($value);
	if ($value ne ''){
	    if (($value ne 'y') && ($value ne 'n')){
		print "\n\'$value\' option invalid! Exiting program.\n\n";
		exit;
	    }
	    if ($value eq 'y'){
		print "Location of FASTA database file or Bowtie database/index to be searched for seed matches: ";
		my $seedDB = "";
		$seedDB = <STDIN>;
		chomp($seedDB);
		unless ((-e "$seedDB\.1\.ebwt") && (-e "$seedDB\.2\.ebwt") && (-e "$seedDB\.3\.ebwt") && (-e "$seedDB\.4\.ebwt") && (-e "$seedDB\.rev\.1\.ebwt") && (-e "$seedDB\.rev\.2\.ebwt")){
		    unless (-e $seedDB){
			print "\nNo valid FASTA file or Bowtie database/index for seed match evaluation found in $seedDB. Please either provide a valid FASTA database file or a Bowtie database/index that should consist of six files: db.1.ebwt, db.2.ebwt, db.3.ebwt, db.4.ebwt, db.rev.1.ebwt and db.rev.2.ebwt (see also the documentation for 'bowtie-build' on the Bowtie webpage, http://bowtie-bio.sourceforge.net/index.shtml). Exiting program.\n\n";
			exit;
		    }
		}
		print "Length of seed in siRNA to be searched in database (6, 7 or 8): ";
		my $seed = "";
		$seed = <STDIN>;
		chomp($seed);
# only the values 6, 7 and 8 are valid seed lengths
		if ($seed !~ /^[678]$/){
		    print "\nValues \'6\', \'7\' or \'8\' are allowed for this parameter only! Exiting program.\n\n";
		    exit;
		}
		print "Maximal allowed seed matches in database: ";
		my $seedCut = "";
		$seedCut = <STDIN>;
		chomp($seedCut);
		if ($seedCut !~ /^\d+$/){
		    print "\nValues >= 0 are allowed for this parameter only! Exiting program.\n\n";
		    exit;
		}
		print OPTIONS "SEEDMATCH=$seed,$seedCut,$seedDB\n";
	    }
	}
# miRNA seed analysis
	print "Evaluate input siRNAs for the appearance of certain miRNA seeds (y/n) (default = n): ";
        $value = "";
        $value = <STDIN>;
        chomp($value);
	if ($value ne ''){
	    if (($value ne 'y') && ($value ne 'n')){
		print "\n\'$value\' option invalid! Exiting program.\n\n";
		exit;
	    }
	    if ($value eq 'y'){
		print "FASTA file containing miRNAs with seeds to be searched in designs: ";
		my $miRNA = "";
		$miRNA = <STDIN>;
		chomp($miRNA);
		unless (-e $miRNA){
		    print "\nFile at \'$miRNA\' does not exist! Exiting program.\n\n";
		    exit;
		}
		print "Length of seed in miRNA to be searched in designs (6, 7 or 8): ";
		my $len = "";
		$len = <STDIN>;
		chomp($len);
		if ($len !~ /^[678]$/){
		    print "\nValues \'6\', \'7\' or \'8\' are allowed for this parameter only! Exiting program.\n\n";
		    exit;
		}
		print OPTIONS "MIRSEED=$len,$miRNA\n";
	    }
	}
# redesign
	if (($$reagent eq 'd') && ($$evaluation eq 'NO')){
	    print "Do you want to allow the re-design of long dsRNAs in case input sequences do not meet the criteria defined above (specificity, efficiency, low-complexity (including CAN repeats), seed matches and miRNA seeds) (y/n) (default = n): ";
	    $value = "";
	    $value = <STDIN>;
	    chomp($value);
	    if ($value ne ''){
		if (($value ne 'y') && ($value ne 'n')){
		    print "\n\'$value\' option invalid! Exiting program.\n\n";
		    exit;
		}
		if ($value eq 'y'){
		    print OPTIONS "REDESIGN=ON\n";
		}
	    }
	}
# exclude sequences, e.g. to calculate independent designs
	if ($$evaluation eq 'NO'){
	    print "If certain sequences should be excluded from new designs (e.g. to calculate independent designs), please provide the location of a valid FASTA file or Bowtie database/index with sequences to be excluded: ";
	    $value = "";
	    $value = <STDIN>;
	    chomp($value);
	    if ($value ne ''){
		unless ((-e "$value\.1\.ebwt") && (-e "$value\.2\.ebwt") && (-e "$value\.3\.ebwt") && (-e "$value\.4\.ebwt") && (-e "$value\.rev\.1\.ebwt") && (-e "$value\.rev\.2\.ebwt")){
		    unless (-e $value){
			print "\nNo valid FASTA file or Bowtie database/index with sequences to be excluded found in $value. Please either provide a valid FASTA database file or a Bowtie database/index that should consist of six files: db.1.ebwt, db.2.ebwt, db.3.ebwt, db.4.ebwt, db.rev.1.ebwt and db.rev.2.ebwt (see also documentation for 'bowtie-build' on the Bowtie webpage (http://bowtie-bio.sourceforge.net/index.shtml)). Exiting program.\n\n";
			exit;
		    }
		}
		print OPTIONS "INDEPENDENT=$value\n";
	    }
	}
# summarize evaluation of siRNAs for pools
	if (($$reagent eq 's') && ($$evaluation eq 'OLIGO')){
	    print 'For the evaluation of siRNAs results can be summarized for siRNA pools. Please provide a tab-delimited file (with headers \'siRNAID\' and \'POOLID\') containing the siRNA identifiers used in the input FASTA file and a pool identifier for the siRNA pool it belongs to: ';
	    $value = "";
	    $value = <STDIN>;
	    chomp($value);
	    if ($value ne ''){
		unless (-e $value){
		    print "\nFile at \'$value\' does not exist! Exiting program.\n\n";
		    exit;
		}
		print OPTIONS "POOL=$value\n";
	    }
	}
# mapping of reagents using bowtie
	print "If reagents should be mapped to the genome (e.g. to chromosomes or contigs), please provide the location of a valid bowtie index: ";
	$value = "";
	$value = <STDIN>;
	chomp($value);
	my $mapping = 0;
	if ($value ne ''){
	    unless ((-e "$value\.1\.ebwt") && (-e "$value\.2\.ebwt") && (-e "$value\.3\.ebwt") && (-e "$value\.4\.ebwt") && (-e "$value\.rev\.1\.ebwt") && (-e "$value\.rev\.2\.ebwt")){
		print "\nBowtie index/database for genome mapping not found. The Bowtie database should consist of six files: db.1.ebwt, db.2.ebwt, db.3.ebwt, db.4.ebwt, db.rev.1.ebwt and db.rev.2.ebwt (see also documentation for 'bowtie-build' on the Bowtie webpage (http://bowtie-bio.sourceforge.net/index.shtml)). Exiting program.\n\n";
		exit;
	    }
	    print OPTIONS "GENOMEBOWTIE=$value\n";
	    $mapping = 1;
	}
# evaluate reagents for homology using blast
	print "Evaluate designs for homology to a given FASTA database (y/n) (default = n): ";
	$value = "";
	$value = <STDIN>;
	chomp($value);
	if ($value ne ''){
	    if (($value ne 'y') && ($value ne 'n')){
		print "\n\'$value\' option invalid! Exiting program.\n\n";
		exit;
	    }
	    if ($value eq 'y'){
		print "Location of \'blastall\' program for homology evaluation (e.g. /usr/bin/): ";
		my $blast = "";
		$blast = <STDIN>;
		chomp($blast);
		unless (-e "$blast/blastall"){
                    print "\n\'blastall\' not found in \'$blast\'! Exiting program.\n\n";
                    exit;
                }
		print "Location of a valid FASTA database (already formatted with \'formatdb\') for homology evaluation: ";
                my $blastDB = "";
                $blastDB = <STDIN>;
                chomp($blastDB);
		unless ((-e "$blastDB\.nhr") && (-e "$blastDB\.nin") && (-e "$blastDB\.nsd") && (-e "$blastDB\.nsi") && (-e "$blastDB\.nsq") && (-e $blastDB)){
		    print "\nBlast database $blastDB not found or incomplete. A Blast database consists of a FASTA file, *.nhr file, *.nin file, *.nsd file, *.nsi file and *.nsq file and can be obtained by the use of the 'formatdb' program coming with blast. Exiting program.\n\n";
		    exit;
		}
		print "E-value cut-off for blast homology (integer, floating point or scientific numbers >= 0 are allowed): ";
		my $evalue = "";
		$evalue = <STDIN>;
                chomp($evalue);
		if ((($evalue=~/^\d+$/) || ($evalue=~/^\d+\.\d+$/) || ($evalue=~/^\d+e-\d+$/)) && ($evalue >= 0)){
		    print OPTIONS "HOMOLOGY=$blast,$blastDB,$evalue\n";
		}
		else {
		    print "\nOnly integer numbers, floating point numbers and scientific numbers >= 0 are allowed as homology cut-off. Exiting program.\n\n";
		    exit;
		}
	    }
	}
# GFF output
	if ($mapping eq 1){
	    print "Produce GFF (generic feature format) output file (y/n) (default = n): ";
	    $value = "";
	    $value = <STDIN>;
	    chomp($value);
	    if ($value ne ''){
		if (($value ne 'y') && ($value ne 'n')){
		    print "\n\'$value\' option invalid! Exiting program.\n\n";
		    exit;
		}
		if ($value eq 'y'){
		    print "Output in [GFF2] or [GFF3] format: ";
		    $value = "";
		    $value = <STDIN>;
		    chomp($value);
		    if (($value eq "GFF2") || ($value eq "GFF3")){
			print OPTIONS "GFF=$value\n";
		    }
		    else {
			print "\n\'$value\' option invalid! Exiting program.\n\n";
			exit;
		    }
		}
	    }
# AFF output
	    print "Produce AFF (annotation file format) output file (y/n) (default = n): ";
	    $value = "";
	    $value = <STDIN>;
	    chomp($value);
	    if ($value ne ''){
		if (($value ne 'y') && ($value ne 'n')){
		    print "\n\'$value\' option invalid! Exiting program.\n\n";
		    exit;
		}
		if ($value eq 'y'){
		    print OPTIONS "AFF=YES\n";
		}
	    }
# GBrowse visualization
	    print "Set URL for 'gbrowse_img' and the corresponding organism if designs should be visualized in GBrowse (default = disabled): ";
	    $value = "";
	    $value = <STDIN>;
	    chomp($value);
	    if ($value ne ''){
		print OPTIONS "GBROWSEBASE=$value\n";
	    }
	    print "Set tracks to be shown with the designs in GBrowse (e.g. tracks for genes or transcripts), concatenate multiple tracks with \'+\' (e.g. \'GENE+TRANSCRIPTS\'): ";
	    $value = "";
	    $value = <STDIN>;
	    chomp($value);
	    if ($value ne ''){
		print OPTIONS "GBROWSETRACK=$value\n";
	    }
	}
    }
#    print "$$inputFile, $$splitInput, $$reagent, $$databaseFile, $$optionsFile, $$evaluation, $$identifier\n";
}
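
# ---------------------------------------------------------------------------
# Illustrative sketch only (not called anywhere in NEXT-RNAi).
# The prompts above repeat the same six-file test for a Bowtie index several
# times. Assuming the index was built with 'bowtie-build' and therefore uses
# the six suffixes tested above, a helper such as the hypothetical
# 'sketchBowtieIndexComplete' could express that test once. The sub name and
# its return convention (1 = all six files found, 0 = otherwise) are
# assumptions made for this example.
# ---------------------------------------------------------------------------
sub sketchBowtieIndexComplete {
    my ($prefix) = @_;
# an empty or undefined prefix can never point to an index
    return 0 if ((!defined $prefix) || ($prefix eq ''));
    foreach my $suffix ('1', '2', '3', '4', 'rev.1', 'rev.2'){
# every single index file has to exist
	return 0 unless (-e "$prefix.$suffix.ebwt");
    }
    return 1;
}
# Possible usage in a prompt (sketch):
#   unless (&sketchBowtieIndexComplete($value) || (-e $value)){ ... exit; }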

####################
####################
####################
###              ###
### Main program ###
###              ###
####################
####################
####################

##
## Input handling with the help of Pod and Getopt
##

#changes made by chenchen
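#'pass_through' leaves unknown options in @ARGV, so that option flags (e.g. -DESIGNWINDOW) can be parsed one at a time later on (see flag_options)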
Getopt::Long::Configure("pass_through");

my ($inputFile, $splitInput, $reagent, $databaseFile, $optionsFile, $evaluation, $identifier, $help, $prompt);
GetOptions("inputFile:s"=>\$inputFile,         # file(s) with input target sequences (string)
	   "splitInput:i"=>\$splitInput,       # input file split option (integer)
	   "reagent:s"=>\$reagent,             # reagent type (string)
	   "databaseFile:s"=>\$databaseFile,   # database file(s) (string)
	   "optionsFile:s"=>\$optionsFile,     # option file (string), optional
	   "evaluation:s"=>\$evaluation,       # de novo design or evaluation mode (string)
	   "name:s"=>\$identifier,             # design name (string), optional
           "help"=>\$help,                     # open help
	   "prompt"=>\$prompt);                 # open interactive (prompt) mode
#changes made by Chenchen
#turn off pod2usage otherwise other flags won't work
#|| pod2usage(-verbose=>1);
    
# if help is queried, full Pod documentation is shown
pod2usage(-verbose=>2) if $help;
&promptInput(\$inputFile,\$splitInput,\$reagent,\$databaseFile,\$optionsFile,\$evaluation,\$identifier) if $prompt;

# split $inputFile if defined
my @inputfiles = ();
if ((defined $inputFile) && ($inputFile ne '')){
    @inputfiles = split(/\+/,$inputFile);
}
else {
    print "No sequence input file (option '-i') defined. Start NEXT-RNAi with '-h' for help.\n";
    exit;
}

# set split option for input file to 4000 if not defined
if ((!defined $splitInput) || ($splitInput eq '')){
    $splitInput = 4000;
}

# check for existence of input file(s)
my $inputFile1 = "";
my $inputFile2 = "";
for (my $i=0;$i<scalar(@inputfiles);$i++){
    if (-e $inputfiles[$i]){
	if ($i eq 0){
	    $inputFile1 = $inputfiles[0];
	}
	elsif ($i eq 1){
	    $inputFile2 = $inputfiles[1];
	}
    }
    else {
	print "$inputfiles[$i] was not found (option '-i'). Start NEXT-RNAi with '-h' for help.\n";
	exit;
    }
}

# run without off-target evaluation
if ((defined $databaseFile) && ($databaseFile eq 'nodb')){
    print "NEXT-RNAi is run without off-target evaluation\n";
}
else {
# check for existence of database file (all six Bowtie index files must be present)
    unless ((defined $databaseFile) && ($databaseFile ne '') && (-e "$databaseFile\.1\.ebwt") && (-e "$databaseFile\.2\.ebwt") && (-e "$databaseFile\.3\.ebwt") && (-e "$databaseFile\.4\.ebwt") && (-e "$databaseFile\.rev\.1\.ebwt") && (-e "$databaseFile\.rev\.2\.ebwt")){
	print "Bowtie index/databases for off-target evaluation not found (option '-d'). The Bowtie database should consist of six files: db.1.ebwt, db.2.ebwt, db.3.ebwt, db.4.ebwt, db.rev.1.ebwt and db.rev.2.ebwt (see also documentation for 'bowtie-build' on the Bowtie webpage (http://bowtie-bio.sourceforge.net/index.shtml)). NEXT-RNAi requires the 'db' name to run Bowtie. E.g. in case the six index files are located in /home/user/Desktop/, '-d' option should be set to '/home/user/Desktop/db'. Start NEXT-RNAi with '-h' for help.\n";
	exit;
    }
}
&fileLoc('Input',$databaseFile,'OTEDatabase');

# check for existence of reagent information

if ((!defined $reagent) || ($reagent eq '')){ 
    print "No reagent information (option '-r') found. Start NEXT-RNAi with '-h' for help.\n";
    exit;
}
else {
    if (($reagent ne 'd') && ($reagent ne 's')){
	print "$reagent is not a valid reagent (option '-r'). Only 'd' (for long dsRNA) and 's' (for siRNA) are allowed. Start NEXT-RNAi with '-h' for help.\n";
	exit;
    }
}

# check for existence of evaluation information

if ((!defined $evaluation) || ($evaluation eq '')){
    print "No evaluation information (option '-e') found. Start NEXT-RNAi with '-h' for help.\n";
    exit;
}
else {
    if (($evaluation ne 'NO') && ($evaluation ne 'OLIGO') && ($evaluation ne 'DSRNA') && ($evaluation ne 'DSRNA+OLIGO')){
        print "$evaluation is not a valid evaluation option (option '-e'). Only 'NO' (de novo design), 'DSRNA' (evaluation of long dsRNAs), 'OLIGO' (evaluation of long dsRNAs starting with primers or evaluation of siRNAs) and 'DSRNA+OLIGO' (evaluation of long dsRNAs with underlying primers) are allowed. Start NEXT-RNAi with '-h' for help.\n";
	exit;
    }
}

# check for design identifier, otherwise set to default value
if ((!defined $identifier) || ($identifier eq '')){ $identifier = "Probe" };

# get options (only single calls possible, except for TARGETGROUPS, GENOMEBOWTIE, GENOMEFASTA, EXCLUDED, INTENDED, INDEPENDENT and OTEEVAL) from $optionsFile if defined, or use default values 
# from here on also documentation in ERROR logfile
my %options = ();
# definition of arguments see pod above
%options = ( OUTPUT => [ "", ],
	     DESIGNWINDOW => [ "80,500", ],
	     DESIGNNUM => [ 50, ],
	     OUTPUTNUM => [ 1, ],
	     SIRNALENGTH => [ 19, ],
	     EFFICIENCY => [ "empty,empty", ],
	     VIENNA => [ "/usr/bin/", ],
	     TARGETGROUPS => [ "", ],
	     EXCLUDED => [ "", ],
	     INTENDED => [ "", ],
	     TARGETSEQ => [ "CALC", ],
	     PRIMER3 => [ "/usr/bin/", ],
	     PRIMER3OPT => [ "empty", ],
	     PRIMERTAG => [ "none", ],
	     REDESIGN => [ "OFF", ],
	     SOURCE => [ "GENOMIC", ],
	     BOWTIE => ["/usr/bin/", ],
	     GENOMEBOWTIE => [ "empty", ],
	     GENOMEFASTA => [ "empty", ],
	     BLAT => [ "empty", ],
	     BLATALIGN => [ "PERFECT", ],
	     BLATSPLIT => [ "0", ],
	     BLATPROGRAM => [ "blat", ],
	     BLATHOST => [ "", ],
	     BLATPORT => [ "", ],
	     TXNFASTA => [ "empty", ],
	     GFF => [ "NO", ],
	     GBROWSEBASE => [ "", ],
	     GBROWSETRACK => [ "", ],
	     AFF => [ "NO", ],
	     FEATURE => [ "empty", ],
	     OTEEVAL => [ "empty", ],
	     LOWCOMPEVAL => [ "empty", ],
	     CANEVAL => [ "empty", ],
	     SEEDMATCH => [ "empty", ],
	     MIRSEED => [ "empty", ],
	     HOMOLOGY => [ "empty", ],
	     POOL => [ "empty", ],
	     INDEPENDENT => [ "empty", ],
	     R => [ "", ],
	     INTRON => [ 25, ],
	     RANKD => [ 'EFF', ],
	     TARGETTYPE => [ "ANNO", ],
    );
my $errorfolder = "";
if ($inputFile1=~/(.+[\/\\])\S+\.\S+/){
    $errorfolder = $1;
}

# open ERROR logfile => this file will contain error messages only from after the input evaluation (which is done by Getopt)
# ERROR file located in same folder as input file
my $outError = $errorfolder.'NEXT-RNAi_'.$identifier.'_error.txt';
my $indexError = 1;
while (-e $outError){
    $outError = $errorfolder.'NEXT-RNAi_'.$identifier.'_error_'.$indexError.'.txt';
    $indexError++;
}
&fileLoc('Output',$outError,'Error');
open (ERROR, ">$outError") || die "Cannot open ERROR: $!\n";
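
# ---------------------------------------------------------------------------
# Illustrative sketch only (not called anywhere in NEXT-RNAi).
# The ERROR logfile name above is made unique by appending '_1', '_2', ... as
# long as a file of that name already exists; the output folders further below
# follow the same idea. The hypothetical helper 'sketchUniquePath' shows this
# naming scheme in isolation. The sub name and the split into a stem and an
# extension are assumptions made for this example.
# ---------------------------------------------------------------------------
sub sketchUniquePath {
    my ($stem, $ext) = @_;
# first candidate carries no index
    my $path = $stem.$ext;
    my $index = 1;
# increase the index until an unused name is found
    while (-e $path){
	$path = $stem.'_'.$index.$ext;
	$index++;
    }
    return $path;
}
# Possible usage (sketch):
#   my $log = &sketchUniquePath($errorfolder.'NEXT-RNAi_'.$identifier.'_error', '.txt');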


# replace parameters with values from optionsfile, if queried
my %UserOptions = ();
if (defined $optionsFile){
    if (-e $optionsFile){ 
	&fileLoc('Input',$optionsFile,'Options');
	open (OPTIONS, "<$optionsFile") || die "Cannot open OPTIONS: $!\n";
	while (my $line = <OPTIONS>){
	    $line = &cleanLine($line);
	    if ($line=~/([^=\s]+)=(\S+)/){
		&options($1,$2,\%options,\%UserOptions,\*ERROR);	
	    }    
	}
	close OPTIONS;
    }
    else {
	print "Additional options file $optionsFile not found, default settings are used. Start NEXT-RNAi with '-h' for help.\n";
	print ERROR "$optionsFile\tAdditional options file not found, default settings are used\n";
    }
}
else {
    print "No additional options file defined, default settings are used. Start NEXT-RNAi with '-h' for help.\n";
}
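
# ---------------------------------------------------------------------------
# Illustrative sketch only (not called anywhere in NEXT-RNAi).
# Lines of the options file have the form KEY=VALUE. Values may themselves
# contain '=' (e.g. a URL with a query string), so a parser should split at
# the first '=' only. The hypothetical helper 'sketchParseOptionLine' shows
# one way to do this; how the real '&options' sub validates the pair is not
# shown here and is not assumed.
# ---------------------------------------------------------------------------
sub sketchParseOptionLine {
    my ($line) = @_;
# key: everything up to the first '=', value: the rest without whitespace
    if ($line =~ /^\s*([^=\s]+)=(\S+)/){
	return ($1, $2);
    }
# no KEY=VALUE pair on this line
    return ();
}
# Possible usage (sketch):
#   my ($key, $val) = &sketchParseOptionLine('DESIGNWINDOW=80,500');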

#changes made by Chenchen
#define the flags
my @option_flags = ( "OUTPUT","DESIGNWINDOW","DESIGNNUM","OUTPUTNUM","SIRNALENGTH","EFFICIENCY","VIENNA","TARGETGROUPS","EXCLUDED","INTENDED","TARGETSEQ","PRIMER3","PRIMER3OPT","PRIMERTAG","REDESIGN","SOURCE","BOWTIE","GENOMEBOWTIE","GENOMEFASTA","BLAT","BLATALIGN","BLATSPLIT","BLATPROGRAM","BLATHOST","BLATPORT","TXNFASTA","GFF","GBROWSEBASE","GBROWSETRACK","AFF","FEATURE","OTEEVAL","LOWCOMPEVAL","CANEVAL","SEEDMATCH","MIRSEED","HOMOLOGY","POOL","INDEPENDENT","R","INTRON","RANKD","TARGETTYPE" );


&flag_options(\%options,\@option_flags);
#if a flag is set to "NONE"/"None" or is not defined, the current value is kept
#a flag may be given multiple times (values are collected as a list), e.g. for TARGETGROUPS and some other parameters
sub flag_options{
    my ($ref_options,$ref_flags) = @_;
    foreach my $flag (@{$ref_flags}){
        my $tmp;
        GetOptions("$flag=s@" => \$tmp,);
	if( defined($tmp)){
	    unless((${$tmp}[0] eq "NONE") || (${$tmp}[0] eq "None")){
		$ref_options->{$flag} = $tmp;
	    }
	}
	

    }

}
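
# ---------------------------------------------------------------------------
# Illustrative sketch only (not called anywhere in NEXT-RNAi).
# 'flag_options' relies on two Getopt::Long features: 'pass_through' keeps
# options that are not asked for in the argument list, and the '=s@'
# specifier collects every occurrence of a flag into an array reference.
# The hypothetical helper 'sketchCollectFlag' shows both on an explicit
# argument list instead of @ARGV. It assumes that Getopt::Long is already
# loaded (as it is in this program) and that the installed version provides
# 'GetOptionsFromArray'.
# ---------------------------------------------------------------------------
sub sketchCollectFlag {
    my ($flag, $argv) = @_;    # $argv: reference to the argument list, modified in place
    my $values;
    Getopt::Long::Configure("pass_through");
# matching options are removed from @{$argv}, everything else stays
    Getopt::Long::GetOptionsFromArray($argv, "$flag=s@" => \$values);
# undefined if the flag was not given at all
    return $values;
}
# Possible usage (sketch):
#   my @args = ('-TARGETGROUPS','a.txt','-TARGETGROUPS','b.txt');
#   my $groups = &sketchCollectFlag('TARGETGROUPS',\@args);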
########
# create output folder if it is not defined in options file                            
#changes made by Chenchen                                                              
my $outfolder;
if ($options{"OUTPUT"}[0] eq ""){
    $outfolder = $errorfolder.'out/';
    my $index = 1;
    while (-d $outfolder){
        $outfolder = $errorfolder.'out_'.$index.'/';
        $index++;
    }
}
else{
    $outfolder = $options{"OUTPUT"}[0]."/";
    my $index=1;
    while (-d $outfolder){
        $outfolder = $options{"OUTPUT"}[0]."_".$index.'/';
        $index++;
    }
}
# make directory                        
system ('mkdir', $outfolder) == 0 || die "Failed to make output folder $outfolder (check permissions): $?\n";
$options{"OUTPUT"}[0] = $outfolder;


# make HTML directory
my $HTMLoutfolder = $options{"OUTPUT"}[0].'HTML/';
my $index = 1;
while (-d $HTMLoutfolder){
    $HTMLoutfolder = $options{"OUTPUT"}[0].'HTML_'.$index.'/';
    $index++;
}
system ('mkdir', $HTMLoutfolder) == 0 || die "Failed to make HTML output folder $HTMLoutfolder (check permissions): $?\n";

# output files
# tab delimited output file (main output)
my $outTab = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'.txt';
# gff output file (if it was queried) for visualization of reagents in a genome browser
my $outGff = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'.gff';
# annotations output file (if it was queried) for direct upload of designed reagents to GBrowse
my $outAff = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'.aff';
# file containing miRNA seeds
my $outmiRNASeed = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'.mirseed';
# HTML output file
my $outHTML = $options{"OUTPUT"}[0].'index.html';
# report on the progress of the program
my $outReport = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_report.txt';
# list of failed designs
my $outFailed = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_failed.txt';
&fileLoc('Output',$outReport,'Report');
open (REPORT, ">$outReport") || die "Cannot open REPORT: $!\n";
&fileLoc('Output',$outFailed,'Failed');
open (FAILED, ">$outFailed") || die "Cannot open FAILED: $!\n";

##
## Get input headers (as unique identifiers!) and sequences from FASTA input file
##

my %IDSeq = ();
my %IDSeqPrimer = ();
my @IDSeqKeys = ();
my %IDfail = ();

# read first input file 
if (($evaluation eq 'NO') || (($reagent eq 'd') && ($evaluation eq 'DSRNA')) || (($reagent eq 'd') && ($evaluation eq 'DSRNA+OLIGO')) || (($reagent eq 's') && ($evaluation eq 'OLIGO'))){
# read from FASTA input files
    &fileLoc('Input',$inputFile1,'Input1');
    &readFASTA($inputFile1,\%IDSeq,\*ERROR,'strict');
    @IDSeqKeys = keys %IDSeq;

# delete sequences too short for primer design (< 40 bp, long dsRNAs only), write validated input file
    my $IDSeqInput = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_Input.fa';
    &fileLoc('Input',$IDSeqInput,'Input1validatedFASTA');
    open (INPUT, ">$IDSeqInput") || die "Cannot open INPUT: $!\n";
    my $discard = 0;
    for (my $i=0;$i<scalar(@IDSeqKeys);$i++){
	if (length($IDSeq{$IDSeqKeys[$i]}) < 40){
	    if ($reagent eq 's'){
		print INPUT ">$IDSeqKeys[$i]\n$IDSeq{$IDSeqKeys[$i]}\n";
	    }
	    else {
		my $len = length($IDSeq{$IDSeqKeys[$i]});
		if (!exists $IDfail{$IDSeqKeys[$i]}){
		    $IDfail{$IDSeqKeys[$i]} = "Sequence too short for design ($len)";
		    print FAILED "$IDSeqKeys[$i]\tSequence too short for design ($len)\n";
		}
		print ERROR "$IDSeqKeys[$i]\tSequence too short ($len), primer design is not possible\n";
		delete $IDSeq{$IDSeqKeys[$i]};
		$discard++;
	    }
	}
	else {
	    print INPUT ">$IDSeqKeys[$i]\n$IDSeq{$IDSeqKeys[$i]}\n";
	}
    }
    close INPUT;

    undef @IDSeqKeys;
    @IDSeqKeys = keys %IDSeq;

    my $ScalarIDSeqKeys = scalar(@IDSeqKeys); 
    print REPORT "$ScalarIDSeqKeys sequences were read from input file ($discard sequences were discarded)\n";
    print "\n$ScalarIDSeqKeys sequences were read from input file ($discard sequences were discarded)\n";
# exit program in case no valid input sequence was found
    if ($ScalarIDSeqKeys eq 0){
	print ERROR "$inputFile1\tNo valid sequences found, check input format (FASTA required)\n";
	print "No valid sequences found in $inputFile1. Check input format (FASTA required).\n";
	close ERROR;
	close REPORT;
	close FAILED;
	exit;
    }
}
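
# ---------------------------------------------------------------------------
# Illustrative sketch only (not called anywhere in NEXT-RNAi).
# The block above keeps every input sequence for siRNAs, but discards
# sequences shorter than 40 bp for long dsRNAs because no primers can be
# designed for them. The hypothetical helper 'sketchSplitByLength' shows this
# rule in isolation; the sub name, its arguments and the sorted return lists
# are assumptions made for this example.
# ---------------------------------------------------------------------------
sub sketchSplitByLength {
    my ($seqs, $minLength, $reagentType) = @_;    # $seqs: reference to hash ID => sequence
    my @kept = ();
    my @tooShort = ();
    foreach my $id (sort keys %{$seqs}){
# short sequences are only rejected for long dsRNAs
	if ((length($seqs->{$id}) < $minLength) && ($reagentType ne 's')){
	    push(@tooShort, $id);
	}
	else {
	    push(@kept, $id);
	}
    }
    return (\@kept, \@tooShort);
}
# Possible usage (sketch):
#   my ($kept, $tooShort) = &sketchSplitByLength(\%IDSeq, 40, $reagent);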

# parse primer input file
my $outPrimer = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_Primer.fa';
my $outPrimer2 = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_Primer.txt';
if ((($reagent eq 'd') && ($evaluation eq 'OLIGO')) || (($reagent eq 'd') && ($evaluation eq 'DSRNA+OLIGO'))){
    my $inputFilePrimer = "";
    if ($evaluation eq 'DSRNA+OLIGO'){
	$inputFilePrimer = $inputFile2;
	&fileLoc('Input',$inputFile2,'Input2');
    }
    else {
	$inputFilePrimer = $inputFile1;
	&fileLoc('Input',$inputFile1,'Input2');
    }
    &readFASTA($inputFilePrimer,\%IDSeqPrimer,\*ERROR,'strict');

    my @IDSeqPrimerKeys = ();
    my %IDPrimerUnique = ();
    
# delete IDs with no or only one primer available
    my $discard = 0;
    if ($evaluation eq 'DSRNA+OLIGO'){
	for (my $i=0;$i<scalar(@IDSeqKeys);$i++){
	    my $ID1 = $IDSeqKeys[$i].'_f';
	    my $ID2 = $IDSeqKeys[$i].'_r';
	    if ((!exists $IDSeqPrimer{$ID1}) || (!exists $IDSeqPrimer{$ID2})){
		delete $IDSeq{$IDSeqKeys[$i]};
		if (exists $IDSeqPrimer{$ID1}){
		    delete $IDSeqPrimer{$ID1};
		}
		if (exists $IDSeqPrimer{$ID2}){
		    delete $IDSeqPrimer{$ID2};
		}
		if (!exists $IDfail{$IDSeqKeys[$i]}){
                    $IDfail{$IDSeqKeys[$i]} = "Primer input incomplete";
                    print FAILED "$IDSeqKeys[$i]\tPrimer input incomplete\n";
                }
		print ERROR "$IDSeqKeys[$i]\tOne or both primers are missing, primers and dsRNA are not considered for further calculations\n";
		$discard++;
	    }
	}
	undef @IDSeqPrimerKeys;
	@IDSeqPrimerKeys = keys %IDSeqPrimer;
	for (my $i=0;$i<scalar(@IDSeqPrimerKeys);$i++){
	    my $ID = "";
	    if ($IDSeqPrimerKeys[$i]=~/(\S+)_\w+/){
		$ID = $1;
	    }
	    if (!exists $IDPrimerUnique{$ID}){
		$IDPrimerUnique{$ID} = "";
	    }
	}
	my @IDPrimerUnique = keys %IDPrimerUnique;
	for (my $i=0;$i<scalar(@IDPrimerUnique);$i++){
            if (!exists $IDSeq{$IDPrimerUnique[$i]}){
		my $ID1 = $IDPrimerUnique[$i].'_f';
		my $ID2 = $IDPrimerUnique[$i].'_r';
		delete $IDSeqPrimer{$ID1};
		delete $IDSeqPrimer{$ID2};
		if (!exists $IDfail{$IDPrimerUnique[$i]}){
                    $IDfail{$IDPrimerUnique[$i]} = "Amplicon sequence missing";
                    print FAILED "$IDPrimerUnique[$i]\tAmplicon sequence missing\n";
                }
                print ERROR "$IDPrimerUnique[$i]\tLong dsRNA sequence belonging to input primers was not found, primers are not considered for further calculations\n";
		$discard++;
	    }
        }
    }
    else {
# generate file for primer-mapping input
	&fileLoc('Input',$outPrimer,'Input1validatedFASTA');
	open (OUTPRIME, ">$outPrimer") || die "Cannot open OUTPRIME: $!\n";
	&fileLoc('Input',$outPrimer2,'Input1validatedTAB');
	open (OUTPRIME2, ">$outPrimer2") || die "Cannot open OUTPRIME2: $!\n";

	@IDSeqPrimerKeys = keys %IDSeqPrimer;
        for (my $i=0;$i<scalar(@IDSeqPrimerKeys);$i++){
            my $ID = "";
            if ($IDSeqPrimerKeys[$i]=~/(\S+)_\w+/){
                $ID = $1;
            }
            if (!exists $IDPrimerUnique{$ID}){
                $IDPrimerUnique{$ID} = "";
            }
        }
        my @IDPrimerUnique = keys %IDPrimerUnique;
	for (my $i=0;$i<scalar(@IDPrimerUnique);$i++){
	    my $ID1 = $IDPrimerUnique[$i].'_f';
            my $ID2 = $IDPrimerUnique[$i].'_r';
# modify identifier for later analysis
	    if ((!exists $IDSeqPrimer{$ID1}) || (!exists $IDSeqPrimer{$ID2})){
                if (exists $IDSeqPrimer{$ID1}){
                    delete $IDSeqPrimer{$ID1};
                }
                if (exists $IDSeqPrimer{$ID2}){
                    delete $IDSeqPrimer{$ID2};
                }
		delete $IDPrimerUnique{$IDPrimerUnique[$i]};
		if (!exists $IDfail{$IDPrimerUnique[$i]}){
                    $IDfail{$IDPrimerUnique[$i]} = "Primer input incomplete";
                    print FAILED "$IDPrimerUnique[$i]\tPrimer input incomplete\n";
                }
                print ERROR "$IDPrimerUnique[$i]\tOne primer is missing, ID is not considered for further calculations\n";
		$discard++;
            }
	    else {
		print OUTPRIME ">$ID1\n$IDSeqPrimer{$ID1}\n>$ID2\n$IDSeqPrimer{$ID2}\n";
		print OUTPRIME2 "$ID1\t$IDSeqPrimer{$ID1}\n$ID2\t$IDSeqPrimer{$ID2}\n";
	    }
        }
	close OUTPRIME;
	close OUTPRIME2;
    }
    undef @IDSeqKeys;
    @IDSeqKeys = keys %IDSeq;
    undef @IDSeqPrimerKeys;
    @IDSeqPrimerKeys = keys %IDSeqPrimer;
    
    if ($evaluation eq 'DSRNA+OLIGO'){
	my $ScalarIDSeqKeys = scalar(@IDSeqKeys); 
	print REPORT "$ScalarIDSeqKeys sequences were read in from input file (after checking availability of corresponding primer), $discard pairs were discarded\n";
	print "\n$ScalarIDSeqKeys sequences were read in from input file (after checking availability of corresponding primer), $discard pairs were discarded\n";
# exit program in case no valid input sequence was found
	if ($ScalarIDSeqKeys eq 0){
	    print ERROR "$inputFile1, $inputFile2\tNo valid input found, check input format (FASTA required)\n";
	    print "No valid input found in $inputFile1, $inputFile2. Check input format (FASTA required).\n";
	    close ERROR;
	    close REPORT;
	    close FAILED;
	    exit;
	}
    }
    else {
	my $ScalarIDSeqKeys = scalar(keys %IDPrimerUnique);
	print REPORT "$ScalarIDSeqKeys primer pairs were read in from input file (after checking availability of both primers), $discard pairs were discarded\n";
        print "\n$ScalarIDSeqKeys primer pairs were read in from input file (after checking availability of both primers), $discard pairs were discarded\n";
# exit program in case no valid input sequence was found
        if ($ScalarIDSeqKeys eq 0){
	    print ERROR "$inputFile1\tNo valid input found, check input format (FASTA required)\n";
            print "No valid input found in $inputFile1. Check input format (FASTA required).\n";
	    close ERROR;
	    close REPORT;
	    close FAILED;
            exit;
        }
    }
}
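
# ---------------------------------------------------------------------------
# Illustrative sketch only (not called anywhere in NEXT-RNAi).
# Primer input is expected as pairs named '<ID>_f' and '<ID>_r'; IDs with only
# one primer are discarded above. The hypothetical helper
# 'sketchCompletePrimerPairs' shows this pairing check in isolation. Note that
# it uses a stricter pattern ('_f' or '_r' at the end of the name) than the
# '(\S+)_\w+' match used above; this is an assumption made for this example.
# ---------------------------------------------------------------------------
sub sketchCompletePrimerPairs {
    my ($primers) = @_;    # reference to hash primer name => primer sequence
    my %stems = ();
# collect the unique IDs the primers belong to
    foreach my $name (keys %{$primers}){
	if ($name =~ /^(\S+)_[fr]$/){
	    $stems{$1} = 1;
	}
    }
    my @complete = ();
    my @incomplete = ();
    foreach my $stem (sort keys %stems){
# a pair is complete only if forward and reverse primer exist
	if ((exists $primers->{$stem.'_f'}) && (exists $primers->{$stem.'_r'})){
	    push(@complete, $stem);
	}
	else {
	    push(@incomplete, $stem);
	}
    }
    return (\@complete, \@incomplete);
}
# Possible usage (sketch):
#   my ($complete, $incomplete) = &sketchCompletePrimerPairs(\%IDSeqPrimer);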

##
## Mapping of primers and calculation of amplified long dsRNA sequence
##

my %RNAiloc = ();
my %NOTmapped = ();
my $mapping = 0;
my $Mapped = "";
if (($evaluation eq 'OLIGO') && ($reagent eq 'd')){
    my $discard = 0;
    if (($options{"GENOMEBOWTIE"}[0] eq "empty") || ($options{"GENOMEFASTA"}[0] eq "empty")){
# if either the Bowtie mapping index or the corresponding FASTA database is missing, abort program
        print ERROR "Error in mapping primers\tThe evaluation of long dsRNAs from primer sequences requires a Bowtie index/database for sequences from which long dsRNAs were amplified (e.g. genome sequence) by defining the 'GENOMEBOWTIE' option in the additional options file and the corresponding FASTA file for sequence extraction by defining the GENOMEFASTA option in the additional options file\n";
        print "Error in mapping primers! The evaluation of long dsRNAs from primer sequences requires a Bowtie index/database for sequences from which long dsRNAs were amplified (e.g. genome sequence) by defining the 'GENOMEBOWTIE' option in the additional options file and the corresponding FASTA file for sequence extraction by defining the GENOMEFASTA option in the additional options file. Start NEXT-RNAi with '-h' for help.\n";
        close ERROR;
        close REPORT;
        close FAILED;
        exit;
    }
    else {
	$Mapped = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_Primer.Mapped';
	my $PrimerNotMapped = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_Primer.NotMapped';
# Mapping with Bowtie
	my $outBowtie = $outPrimer.'.bwt';
	my $BowtieMapping = "";
	if (-e "$options{\"BOWTIE\"}[0]bowtie"){
	    $BowtieMapping = &BowtieMapping(\@{ $options{"GENOMEBOWTIE"} },$options{"BOWTIE"}[0],$outPrimer,$outBowtie,\*ERROR);
	}
	else {
	    print ERROR "$options{\"BOWTIE\"}[0]\t'bowtie' program not found at this location, primers cannot be mapped\n";
            print "'bowtie' program not found in $options{\"BOWTIE\"}[0], primers cannot be mapped. Start NEXT-RNAi with '-h' for help.\n";
	    close ERROR;
	    close REPORT;
	    close FAILED;
	    exit;
	}
	if ($BowtieMapping eq 'Success'){
# parse primer Bowtie search
	    &fileLoc('Unlink',$Mapped);
	    &fileLoc('Output',$PrimerNotMapped,'PrimerNotMapped');
# generate index of bowtie output
            open(OUTBO,"<$outBowtie") or die "Cannot open $outBowtie for reading: $!\n";
            open(INDEX, "+>$outBowtie.idx") or die "Cannot open $outBowtie.idx for read/write: $!\n";
            &build_index(\*OUTBO,\*INDEX,'bowtie',$outBowtie,\%RNAiloc);
# parse mapping via index
            &ParseBowtieMap($outPrimer2,$outBowtie,$Mapped,$PrimerNotMapped,\%RNAiloc,\%NOTmapped,\*OUTBO,\*INDEX);
            close OUTBO;
            close INDEX;
# build index of primer MAPPED file
	    open(MAPPED,"<$Mapped") or die "Cannot open $Mapped for reading: $!\n";
	    open(MAPPEDINDEX, "+>$Mapped.idx") or die "Cannot open $Mapped.idx for read/write: $!\n";
	    &fileLoc('Output',$Mapped,'Mapped');
	    &fileLoc('Unlink',"$Mapped.idx");
	    &build_index(\*MAPPED,\*MAPPEDINDEX,'primer-mapped',$Mapped,\%RNAiloc);
# discard primer pairs that could not be mapped, for the others calculate sequences
	    my @IDSeqPrimerKeys = keys %IDSeqPrimer;
	    for (my $i=0;$i<scalar(@IDSeqPrimerKeys);$i++){
		if ($IDSeqPrimerKeys[$i]=~/(\S+)_\w+/){
		    my $ID = $1.'_1';
		    if (!exists $RNAiloc{$ID}{$Mapped}){
			delete $IDSeqPrimer{$IDSeqPrimerKeys[$i]};
			if (!exists $IDfail{$1}){
			    $discard++;
			    $IDfail{$1} = "Primer pair could not be mapped";
			    print FAILED "$1\tPrimer pair could not be mapped\n";
			}
		    }
		}
	    }
	    my $getSeq = &getSeq($options{"OUTPUT"}[0],$identifier,\@{ $options{"GENOMEFASTA"} },$Mapped,\%IDSeq,\*ERROR);
	    @IDSeqKeys = keys %IDSeq;
	    if ($getSeq eq 'Success'){
		if (scalar(keys(%IDSeq)) ne 0){
		    for (my $i=0;$i<scalar(@IDSeqPrimerKeys);$i++){
			if ($IDSeqPrimerKeys[$i]=~/(\S+)_\w+/){
			    my $ID = $1.'_1';
			    if (!exists $IDSeq{$1}){
				delete $IDSeqPrimer{$IDSeqPrimerKeys[$i]};
				if (exists $RNAiloc{$ID}){
				    delete $RNAiloc{$ID};
				}
				if (!exists $IDfail{$1}){
				    $discard++;
				    $IDfail{$1} = "Sequence could not be extracted after primer mapping";
				    print FAILED "$1\tSequence could not be extracted after primer mapping\n";
				}
			    }
			}
		    }
		    print REPORT "Mapping of primer pairs and calculation of amplified products done ($discard primer pairs could not be mapped or sequence could not be extracted)\n";
		    print "Mapping of primer pairs and calculation of amplified products done ($discard primer pairs could not be mapped or sequence could not be extracted)\n";
		    $mapping = 1;
		}
		else {
		    print ERROR "No sequence could be extracted from GENOMEFASTA file after mapping of primers\tCheck validity of GENOMEFASTA file (must be in correspondence to 'GENOMEBOWTIE' index/database)\n";
		    print "No sequence could be extracted from GENOMEFASTA file after mapping of primers. Check validity of GENOMEFASTA file (must be in correspondence to 'GENOMEBOWTIE' index/database). Start NEXT-RNAi with '-h' for help.\n";
		    close ERROR;
		    close REPORT;
		    close FAILED;
		    exit;
		}
	    }
	    else {
		print ERROR "No valid FASTA database for extraction of long dsRNA sequence found\tFASTA database location can be set with 'GENOMEFASTA' option in the additional options file (must be in correspondence to 'GENOMEBOWTIE' index/database)\n";
		print "No valid FASTA database for extraction of long dsRNA sequence found. FASTA database location can be set with 'GENOMEFASTA' option in the additional options file (must be in correspondence to 'GENOMEBOWTIE' index/database). Start NEXT-RNAi with '-h' for help.\n";
		close ERROR;
		close REPORT;
		close FAILED;
		exit;
	    }
	}
	else {
	    print ERROR "No valid Bowtie index/database for primer mapping found\tThe evaluation of long dsRNAs from primer sequences requires a Bowtie index/database containing the sequences from which long dsRNAs were amplified (e.g. genome sequence), defined with the 'GENOMEBOWTIE' option in the additional options file\n";
	    print "No valid Bowtie index/database for primer mapping defined in additional options file ('GENOMEBOWTIE' parameter). The evaluation of long dsRNAs from primer sequences requires a Bowtie index/database containing the sequences from which long dsRNAs were amplified (e.g. genome sequence). Start NEXT-RNAi with '-h' for help.\n";
	    close ERROR;
	    close REPORT;
	    close FAILED;
	    exit;
	}
    }
}

##
## Parse target group information (to avoid wrong off-target annotation, which is ID dependent)
## and map each target ID to the group it belongs to (and each group to its target IDs)
##

my %targetGroups = ();
my %Groupstarget = ();
my $tgcount = 0;
if ($options{"TARGETGROUPS"}[0] ne ""){
    for (my $i=0;$i<scalar(@{ $options{"TARGETGROUPS"} });$i++){
	if (-e $options{"TARGETGROUPS"}[$i]){
	    my %header = ();
	    my $header = 0;
	    open (TG, $options{"TARGETGROUPS"}[$i]) || die "Cannot open TARGETGROUPS file $options{TARGETGROUPS}[$i]: $!\n";
	  TARGETGROUP:
	    while (my $line = <TG>){
		$line = &cleanLine($line);
		my @columns = ();
		@columns = split(/\t/, $line);
# get file headers
		if ($header eq 0){
		    for (my $j=0;$j<scalar(@columns);$j++){
			if (!exists $header{$columns[$j]}){
			    $header{$columns[$j]} = $j;
			}
		    }
		    if ((!exists $header{'Target'}) || (!exists $header{'TargetGroup'})){
			print ERROR "$options{TARGETGROUPS}[$i]\tTargetgroups file contains wrong header information ('Target' and 'TargetGroup' headers required)\n";
			print "$options{TARGETGROUPS}[$i] TARGETGROUPS file contains wrong header information ('Target' and 'TargetGroup' headers required) and is not considered. Start NEXT-RNAi with '-h' for help.\n";
			last TARGETGROUP;
		    }
		    else {
			$tgcount++;
			&fileLoc('Input',$options{"TARGETGROUPS"}[$i],"Targetgroups_$tgcount");
		    }
		}
		else {
		    if (!exists $targetGroups{$columns[$header{'Target'}]}){
			$targetGroups{$columns[$header{'Target'}]} = $columns[$header{'TargetGroup'}];
		    }
		    else {
			print ERROR "$columns[$header{Target}]\tAmbiguous ID in TARGETGROUPS file $options{TARGETGROUPS}[$i]\n";
		    }
		    if (!exists $Groupstarget{$columns[$header{'TargetGroup'}]}){
			$Groupstarget{$columns[$header{'TargetGroup'}]} = [ $columns[$header{'Target'}], ];
		    }
		    else {
			push (@{ $Groupstarget{$columns[$header{'TargetGroup'}]} }, $columns[$header{'Target'}]);
		    }
		}
		$header++;
	    }
	    close TG;
	}
	else {
	    print ERROR "$options{TARGETGROUPS}[$i]\tTARGETGROUPS file was not found\n";
	    print "$options{TARGETGROUPS}[$i] TARGETGROUPS file was not found. Start NEXT-RNAi with '-h' for help.\n";
	}
    }
}
else {
    print REPORT "No TARGETGROUPS file defined ('TARGETGROUPS' parameter in additional options file), each target (from off-target mapping) is considered as its own target group\n";
    print "No TARGETGROUPS file defined ('TARGETGROUPS' parameter in additional options file), each target (from off-target mapping) is considered as its own target group. Start NEXT-RNAi with '-h' for help.\n";
}
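
##
## Illustrative sketch (never called by NEXT-RNAi, not part of the original program flow):
## the TARGETGROUPS parser above, like the EXCLUDED and INTENDED parsers, reads a tab-delimited
## file whose first line names the columns. The hypothetical helper 'sketchReadHeaderTable'
## shows that pattern in one place. Its name, its arguments (file name plus required column
## names) and its return value (reference to a list of row hashes, undef on a bad header)
## are assumptions made for this example only.
##

sub sketchReadHeaderTable {
    my ($file, @required) = @_;
    open (my $fh, '<', $file) || die "Cannot open $file: $!\n";
    my $first = <$fh>;
    if (!defined $first){
	close $fh;
	return;
    }
    $first =~ s/[\r\n]+$//;
    my @names = split(/\t/, $first);
# map each column name to its index (first occurrence wins)
    my %header = ();
    for (my $j=0;$j<scalar(@names);$j++){
	if (!exists $header{$names[$j]}){
	    $header{$names[$j]} = $j;
	}
    }
# reject the file if a required column is missing
    foreach my $name (@required){
	if (!exists $header{$name}){
	    close $fh;
	    return;
	}
    }
# collect the required columns of every data line
    my @rows = ();
    while (my $line = <$fh>){
	$line =~ s/[\r\n]+$//;
	my @columns = split(/\t/, $line);
	my %row = ();
	foreach my $name (@required){
	    $row{$name} = $columns[$header{$name}];
	}
	push (@rows, \%row);
    }
    close $fh;
    return \@rows;
}
# usage (hypothetical): my $rows = &sketchReadHeaderTable($file,'Target','TargetGroup');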

##
## Retrieve sequence identifiers from off-target database to be excluded as off-target but counted as unfavorable region
##

my %TargetExclude = ();
my $excludecount = 0;
if ($options{"EXCLUDED"}[0] ne ""){
    for (my $i=0;$i<scalar@{ $options{"EXCLUDED"} };$i++){
	if (-e $options{"EXCLUDED"}[$i]){
	    my %header = ();
	    my $header = 0;
	    open (EX, $options{"EXCLUDED"}[$i]) || die "Cannot open EXCLUDED file $options{EXCLUDED}[$i]: $!\n";
	  EXCLUDED:
	    while (my $line = <EX>){
		$line = &cleanLine($line);
		my @columns = ();
		@columns = split(/\t/, $line);
# get file headers
		if ($header eq 0){
		    for (my $j=0;$j<scalar(@columns);$j++){
			if (!exists $header{$columns[$j]}){
			    $header{$columns[$j]} = $j;
			}
		    }
		    if (!exists $header{'Exclude'}){
			print ERROR "$options{EXCLUDED}[$i]\tFile with targets not to be considered as off-targets contains wrong header information ('Exclude' header required)\n";
			print "$options{EXCLUDED}[$i] EXCLUDED file contains wrong header information ('Exclude' header required) and is not considered. Start NEXT-RNAi with '-h' for help.\n";
			last EXCLUDED;
		    }
		    else {
			$excludecount++;
			&fileLoc('Input',$options{"EXCLUDED"}[$i],"Excluded_$excludecount");
		    }
		}
		else {
		    if (!exists $TargetExclude{$columns[$header{'Exclude'}]}){
			$TargetExclude{$columns[$header{'Exclude'}]} = "";
		    }
		    else {
			print ERROR "$columns[$header{Exclude}]\tAmbiguous ID in EXCLUDED file $options{EXCLUDED}[$i]\n";
		    }
		}
		$header++;
	    }
	    close EX;
	}
	else {
	    print ERROR "$options{EXCLUDED}[$i]\tEXCLUDED file was not found ('EXCLUDED' parameter in additional options file)\n";
	    print "$options{EXCLUDED}[$i] EXCLUDED file was not found ('EXCLUDED' parameter in additional options file). Start NEXT-RNAi with '-h' for help.\n";
	}
    }
}

##
## Retrieve sequence identifiers from off-target database to be counted as intended region
##

my %IntendedTarget = ();
my $intendcount = 0;
if ($options{"INTENDED"}[0] ne ""){
    for (my $i=0;$i<scalar@{ $options{"INTENDED"} };$i++){
	if (-e $options{"INTENDED"}[$i]){
	    my %header = ();
	    my $header = 0;
	    open (INT, $options{"INTENDED"}[$i]) || die "Cannot open INTENDED file $options{INTENDED}[$i]: $!\n";
	  INTENDED:
	    while (my $line = <INT>){
		$line = &cleanLine($line);
		my @columns = ();
		@columns = split(/\t/, $line);
# get file headers
		if ($header eq 0){
		    for (my $j=0;$j<scalar(@columns);$j++){
			if (!exists $header{$columns[$j]}){
			    $header{$columns[$j]} = $j;
			}
		    }
		    if ((!exists $header{'Query'}) || (!exists $header{'Intended'})){
			print ERROR "$options{INTENDED}[$i]\tIntended target file contains wrong header information ('Query' and 'Intended' headers required)\n";
			print "$options{INTENDED}[$i] INTENDED file contains wrong header information ('Query' and 'Intended' headers required) and is not considered. Start NEXT-RNAi with '-h' for help.\n";
			last INTENDED;
		    }
		    else {
			$intendcount++;
			&fileLoc('Input',$options{"INTENDED"}[$i],"Intended_$intendcount");
		    }
		}
		else {
		    if (!exists $IntendedTarget{$columns[$header{'Query'}]}){
			$IntendedTarget{$columns[$header{'Query'}]} = $columns[$header{'Intended'}];
		    }
		    else {
			print ERROR "$columns[$header{Query}]\tAmbiguous ID in INTENDED file $options{INTENDED}[$i]\n";
		    }
		}
		$header++;
	    }
	    close INT;
	}
	else {
	    print ERROR "$options{INTENDED}[$i]\tINTENDED file was not found ('INTENDED' parameter in additional options file)\n";
	    print "$options{INTENDED}[$i] INTENDED file was not found ('INTENDED' parameter in additional options file). Start NEXT-RNAi with '-h' for help.\n";
	}
    }
}

##
## Split input for high number of queries
##

my @IDSeqKeysALL = @IDSeqKeys;
my $splitin = int((scalar(@IDSeqKeys)/$splitInput));
my $splitinmodulo = scalar(@IDSeqKeys) % $splitInput;
my %splitin = ();
my $splitstart = 0;
my $splitend = $splitInput - 1;
for (my $i=1;$i<=$splitin;$i++){
    if ($splitstart eq $splitend){
	$splitin{$i} = [ $IDSeqKeys[$splitstart], ];
    }
    else {
	my @subarray = @IDSeqKeys[ $splitstart .. $splitend ];
	$splitin{$i} = [ @subarray ];
    }
    $splitstart+= $splitInput;
    $splitend+= $splitInput;
}
if ($splitinmodulo ne 0){
    $splitin++;
    $splitend = scalar(@IDSeqKeys) - 1;
    if ($splitstart eq $splitend){
	$splitin{$splitin} = [ $IDSeqKeys[$splitstart], ];
    }
    else {
	my @subarray = @IDSeqKeys[ $splitstart .. $splitend ];
	$splitin{$splitin} = [ @subarray ];
    }
}
print REPORT "Split input into $splitin part(s) (by $splitInput feature(s))\n";
print "Split input into $splitin part(s) (by $splitInput feature(s))\n";
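
##
## Illustrative sketch (never called by NEXT-RNAi): the splitting above yields the same parts as
## repeatedly taking up to $size identifiers from the front of a copy of the key list.
## The helper name 'sketchSplitKeys' and its return value (a list of array references instead
## of the %splitin hash) are assumptions made for this example only.
##

sub sketchSplitKeys {
    my ($size, @keys) = @_;
    my @parts = ();
# a part size below 1 would never consume the list, so return no parts in that case
    if ($size < 1){
	return @parts;
    }
    while (scalar(@keys) > 0){
# splice removes and returns at most $size elements, so the last part holds the remainder
	push (@parts, [ splice(@keys, 0, $size) ]);
    }
    return @parts;
}
# e.g. &sketchSplitKeys(2,'a','b','c') returns (['a','b'],['c'])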

##
## Start looping over split input
##

#
# Define data structures calculated for each input part
# 

# low complexity
my $lowcomp = 0;
my $canrepeats = 0;
my %FilterPos = ();
my %LowCompRegions = ();
my %CANrepeats = ();

# seed matches
my $seedmatch = 0;
my %seedNum = ();
my %seedseq = ();
my $mirseed = 0;
my %mirSeed = ();

# efficiency
my %InputEffsiRNA = ();

# specificity
my %siRNATargetExclude = ();
my %InputTarget = ();
my %InputsiRNATarget = ();
my %InputtargetGroups = ();
my %InputSpecRegion = ();
my %InputSpecsiRNA = ();
my %InputSpecPrimer = ();
my %siRNAPos = ();

# assembly of results
my %Designs = ();
my %DesignsBest = ();
my %DesignsBad = ();
my %DesignsLeftover = ();
my %DesignsFailed = ();
my %DesignsOUTPUT = ();
my %notprinted = ();
my %DesignsPrint = ();
my %IDscovered = ();

for (my $z=1;$z<=$splitin;$z++){
    print REPORT "Processing part $z\n";
    print "Processing part $z\n";
    @IDSeqKeys = @{ $splitin{$z} };

##
## DICE sequences into all possible SIRNALENGTH fragments and write results in FASTA format to file
##

# for evaluation of siRNAs (oligos) DICER function is not required
    my $outEdicer = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'.dice';
    if (($evaluation eq 'OLIGO') && ($reagent eq 's')){
	&fileLoc('Looped',$outEdicer);
	open (SIRNAS, ">$outEdicer") || die "Cannot open SIRNAS: $!\n";
# write siRNAs as 'DICER products'
	for (my $i=0;$i<scalar(@IDSeqKeys);$i++){
	    print SIRNAS ">$IDSeqKeys[$i]\_1\n$IDSeq{$IDSeqKeys[$i]}\n";
	}
	close SIRNAS;
	print REPORT "Stored siRNAs to file for off-target evaluation\n";
	print "Stored siRNAs to file for off-target evaluation\n";
    }
    else {
# for all other inputs 'dice' input target sequences into siRNAs
	&fileLoc('Looped',$outEdicer);
	&edicer($outEdicer,$options{"SIRNALENGTH"}[0],\%IDSeq,\@IDSeqKeys);
	print REPORT "Input target sequences were cut into $options{\"SIRNALENGTH\"}[0] nt siRNAs\n";
	print "Input target sequences were cut into $options{\"SIRNALENGTH\"}[0] nt siRNAs\n";
    }
    
##
## Scan target sequences for regions of low complexity
##

# identify different kinds of low complexity regions with mDust
    if ($options{"LOWCOMPEVAL"}[0] ne "empty"){
	my $IDSeqInput = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_Input.fa';
	if (-e "$options{\"LOWCOMPEVAL\"}[0]mdust"){
	    &LowCompRegions($options{"OUTPUT"}[0],$identifier,\%IDSeq,$options{"LOWCOMPEVAL"}[0],$IDSeqInput,\%LowCompRegions,\%FilterPos);
	    print REPORT "Identification of low-complexity regions in RNAi reagents done\n";
	    print "Identification of low-complexity regions in RNAi reagents done\n";
	    $lowcomp = 1;
	}
	else {
	    print ERROR "$options{\"LOWCOMPEVAL\"}[0]\t'mdust' program not found at this location ('LOWCOMPEVAL' parameter in additional options file), sequences were not evaluated for low-complexity regions\n";
	    print "'mdust' program not found in $options{\"LOWCOMPEVAL\"}[0] ('LOWCOMPEVAL' parameter in additional options file), sequences were not evaluated for low-complexity regions. Start NEXT-RNAi with '-h' for help.\n";
	}
    }
# identify CAN repeats
    if ($options{"CANEVAL"}[0] ne "empty"){
	if (($options{"CANEVAL"}[0]=~/^\d+$/) && ($options{"CANEVAL"}[0] > 0)){
	    &CANrepeats(\%IDSeq,$options{"CANEVAL"}[0],\%CANrepeats,\%FilterPos);
	    print REPORT "Identification of $options{\"CANEVAL\"}[0]xCAN repeats in target regions done\n";
	    print "Identification of $options{\"CANEVAL\"}[0]xCAN repeats in target regions done\n";
	    $canrepeats = 1;
	}
	else {
	    print ERROR "CANEVAL\tOnly integers >0 are allowed for this option ($options{\"CANEVAL\"}[0] is not valid) ('CANEVAL' parameter in additional options file)\n";
	    print "Only integers >0 are allowed for CANEVAL option ($options{\"CANEVAL\"}[0] is not valid, 'CANEVAL' parameter in additional options file). Start NEXT-RNAi with '-h' for help.\n";
	}
    }

##
## Calculate number of siRNA seed matches to a sequence database
##

    if ($options{"SEEDMATCH"}[0] ne "empty"){
	my @seedOptions = split(/\,/,$options{"SEEDMATCH"}[0]);
	if (($seedOptions[0] >= 6) && ($seedOptions[0] <= 8) && ($seedOptions[1]=~/^\d+$/) && ($seedOptions[1] > 0)){
# if no bowtie index was provided, run bowtie-build
	    unless ((-e "$seedOptions[2]\.1\.ebwt") && (-e "$seedOptions[2]\.2\.ebwt") && (-e "$seedOptions[2]\.3\.ebwt") && (-e "$seedOptions[2]\.4\.ebwt") && (-e "$seedOptions[2]\.rev.1\.ebwt") && (-e "$seedOptions[2]\.rev.2\.ebwt")){
		unless (-e $seedOptions[2]){
		    print ERROR "SEEDMATCH\tInvalid input for seedmatch evaluation ('SEEDMATCH' parameter in additional options file)\n";
		    print "Invalid input for SEEDMATCH option ('SEEDMATCH' parameter in additional options file). Running seed match analysis requires either a valid FASTA database or a valid Bowtie index for mapping of seeds ($seedOptions[2] in options file). Start NEXT-RNAi with '-h' for help.\n";
		    exit;
		}
		if (-e "$options{\"BOWTIE\"}[0]bowtie-build"){
		    system ("$options{\"BOWTIE\"}[0]bowtie-build $seedOptions[2] $seedOptions[2]") eq 0 || die "Failed to open bowtie-build: $?\n";
		}
		else {
		    print ERROR "$options{\"BOWTIE\"}[0]\t'bowtie-build' was not found in this location ('BOWTIE' parameter in additional options file), for 'SEEDMATCH' analysis either provide a valid FASTA sequence file that requires 'bowtie-build' (and 'bowtie') or a valid Bowtie index/database file that only requires 'bowtie' to run\n";
		    print "'bowtie-build' was not found in $options{\"BOWTIE\"}[0] ('BOWTIE' parameter in additional options file). For 'SEEDMATCH' analysis either provide a valid FASTA sequence file that requires 'bowtie-build' (and 'bowtie') or a valid Bowtie index/database file that only requires 'bowtie' to run. Start NEXT-RNAi with '-h' for help.\n";
		}
	    }
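# skip seedMapper once %seedseq holds 4^n entries, i.e. one per possible seed of length n (4096, 16384 or 65536 for n = 6, 7 or 8)
	    my @done = keys %seedseq;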
	    if ((($seedOptions[0] eq 6) && (scalar(@done) < 4096)) || (($seedOptions[0] eq 7)&& (scalar(@done) < 16384)) || (($seedOptions[0] eq 8)&& (scalar(@done) < 65536))){
		&seedMapper($outEdicer,$identifier,$options{"OUTPUT"}[0],$options{"BOWTIE"}[0],$seedOptions[0],$seedOptions[1],$seedOptions[2],\%seedseq);
	    }
	    &seedMatcher($outEdicer,$identifier,$options{"OUTPUT"}[0],$seedOptions[0],$seedOptions[1],\%seedseq,\%seedNum,\%FilterPos);
	    print REPORT "Calculation of seed complement frequencies done\n";
	    print "Calculation of seed complement frequencies done\n";
	    $seedmatch = 1;
	}
	else {
	    print ERROR "SEEDMATCH\tInvalid input for seedmatch evaluation ('SEEDMATCH' parameter in additional options file)\n";
	    print "Invalid input for SEEDMATCH option ('SEEDMATCH' parameter in additional options file). Length of the analysed seed sequence must be a number between 6 and 8 [bp] ($seedOptions[0] in options file), maximum allowed seed complement frequency must be >0 ($seedOptions[1] in options file). Start NEXT-RNAi with '-h' for help.\n";
	}
    }

##
## Find conserved miRNA seeds in siRNAs
##

    if ($options{"MIRSEED"}[0] ne "empty"){
	my @mirOptions = split(/\,/,$options{"MIRSEED"}[0]);
	if (($mirOptions[0] >= 6) && ($mirOptions[0] <= 8) && (-e $mirOptions[1])){
	    &fileLoc('Output',$outmiRNASeed,'miRNASeeds');
	    &mirSeed($outEdicer,$mirOptions[0],$mirOptions[1],\%mirSeed,\%FilterPos,\*ERROR);
	    print REPORT "Calculation of conserved seed sequences done\n";
            print "Calculation of conserved seed sequences done.\n";
            $mirseed = 1;
	}
	else {
	    print ERROR "MIRSEED\tInvalid input for evaluation of conserved seeds in siRNA sequences ('MIRSEED' parameter in additional options file)\n";
            print "Invalid input for MIRSEED option ('MIRSEED' parameter in additional options file). Length of the analysed seed sequence must be a number between 6 and 8 [bp] ($mirOptions[0] in options file) and an existing FASTA file containing miRNA sequences must be provided ($mirOptions[1] in options file). Start NEXT-RNAi with '-h' for help.\n";
        }
    }

##
## Calculate efficiency for every siRNA 'diced' from input sequences
##
    
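    my @effOptions = split(/\,/,$options{"EFFICIENCY"}[0]);
# set the default method before the comparison below to avoid an 'uninitialized value' warning under -w
    if (!defined $effOptions[0]){
	$effOptions[0] = 'SIR';
    }
    if ($effOptions[0] ne "empty"){
	if (!defined $effOptions[1]){
	    $effOptions[1] = 0;
	}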
	for (my $i=0;$i<scalar(@IDSeqKeys);$i++){
# Efficiency calculation according to rational design from Reynolds et al. (2004)
	    if ($effOptions[0] eq 'RATIONAL'){
		if (-e "$options{\"VIENNA\"}[0]RNAfold.pl"){
		    my @sirnas = ();
		    my @hairpins = ();
# Calculate local foldings of siRNAs using the ViennaRNA package
		    &ViennaRNA($identifier,$IDSeq{$IDSeqKeys[$i]},$options{"SIRNALENGTH"}[0],$options{"OUTPUT"}[0],$options{"VIENNA"}[0],\@sirnas,\@hairpins);
		    for (my $j=0;$j<scalar(@sirnas);$j++){
			my $siRNANum = $j + 1;
# Efficiency calculations for each single siRNA
			my $scoresiRNA = &siRNAEfficiency($sirnas[$j],$hairpins[$j],$IDSeqKeys[$i],$siRNANum,$effOptions[1],\%FilterPos);
			if (!exists $InputEffsiRNA{$IDSeqKeys[$i]}){
			    $InputEffsiRNA{$IDSeqKeys[$i]} = [ $scoresiRNA, ];
			}
			else {
			    push (@{ $InputEffsiRNA{$IDSeqKeys[$i]} }, $scoresiRNA);
			}
		    }
		}
		else {
		    print ERROR "$options{\"VIENNA\"}[0]\t'RNAfold.pl' program not found at this location ('VIENNA' parameter in additional options file), sequences could not be evaluated for efficiency\n";
		    print "'RNAfold.pl' program not found in $options{\"VIENNA\"}[0] ('VIENNA' parameter in additional options file), sequences could not be evaluated for efficiency. Start NEXT-RNAi with '-h' for help.\n";
		    $options{"EFFICIENCY"}[0] = 'empty,empty';
		    exit;
		}
	    }
	    else {
# Efficiency calculation according to weighted scoring system from Shah et al. (2007)
		for (my $j=0;$j+($options{"SIRNALENGTH"}[0]-1)<length($IDSeq{$IDSeqKeys[$i]});$j++){
		    my $siRNA = substr($IDSeq{$IDSeqKeys[$i]},$j,$options{"SIRNALENGTH"}[0]);
		    my $siRNANum = $j + 1;
		    my $FinalScore = &siR($siRNA,$IDSeqKeys[$i],$siRNANum,$effOptions[1],\%FilterPos);
		    if (!exists $InputEffsiRNA{$IDSeqKeys[$i]}){
			$InputEffsiRNA{$IDSeqKeys[$i]} = [ $FinalScore, ];
		    }
		    else {
			push (@{ $InputEffsiRNA{$IDSeqKeys[$i]} }, $FinalScore);
		    }
		}
	    }
	}
	print REPORT "Efficiency calculations done ($effOptions[0])\n";
	print "Efficiency calculations done ($effOptions[0])\n";
    }

##
## Run Bowtie of *.dice file on database file and parse Bowtie output for specificity
##

    my $outBowtie = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'.dice.bwt';
    if ($databaseFile ne 'nodb'){
	if (-e "$options{\"BOWTIE\"}[0]bowtie"){
	    system ("$options{\"BOWTIE\"}[0]bowtie -p 4 -f -v 0 -a $databaseFile $outEdicer > $outBowtie") eq 0 || die "Failed to open bowtie: $?\n";
	}
	else {
	    print ERROR "$options{\"BOWTIE\"}[0]\t'bowtie' program not found at this location ('BOWTIE' parameter in additional options file), siRNAs cannot be mapped to the off-target database\n";
	    print "'bowtie' program not found in $options{\"BOWTIE\"}[0] ('BOWTIE' parameter in additional options file), siRNAs cannot be mapped to the off-target database. Start NEXT-RNAi with '-h' for help.\n";
	    close ERROR;
	    close REPORT;
	    close FAILED;
	    exit;
	}
	&fileLoc('Looped',$outBowtie);
	print REPORT "Bowtie has finished running siRNAs against the off-target database\n";
	print "Bowtie has finished running all siRNAs against the off-target database\n";
    }
# Run Bowtie of *.dice file on regions that should be avoided
    if ($options{"INDEPENDENT"}[0] ne 'empty'){
	for (my $i=0;$i<scalar(@{ $options{"INDEPENDENT"} });$i++){
# if bowtie index was provided, run bowtie
	    if ((-e "$options{\"INDEPENDENT\"}[$i]\.1\.ebwt") && (-e "$options{\"INDEPENDENT\"}[$i]\.2\.ebwt") && (-e "$options{\"INDEPENDENT\"}[$i]\.3\.ebwt") && (-e "$options{\"INDEPENDENT\"}[$i]\.4\.ebwt") && (-e "$options{\"INDEPENDENT\"}[$i]\.rev\.1\.ebwt") && (-e "$options{\"INDEPENDENT\"}[$i]\.rev\.2\.ebwt")){
		system ("$options{\"BOWTIE\"}[0]bowtie -p 4 -f -v 0 -a $options{\"INDEPENDENT\"}[$i] $outEdicer >> $outBowtie") eq 0 || die "Failed to open bowtie: $?\n";
	    }
	    else {
# if sequence file was provided, build bowtie index first and then run bowtie
		if (-e $options{"INDEPENDENT"}[$i]){
		    if (-e "$options{\"BOWTIE\"}[0]bowtie-build"){
			system ("$options{\"BOWTIE\"}[0]bowtie-build $options{\"INDEPENDENT\"}[$i] $options{\"INDEPENDENT\"}[$i]") eq 0 || die "Failed to open bowtie-build: $?\n";
			system ("$options{\"BOWTIE\"}[0]bowtie -p 4 -f -v 0 -a $options{\"INDEPENDENT\"}[$i] $outEdicer >> $outBowtie") eq 0 || die "Failed to open bowtie: $?\n";
		    }
		    else {
			print ERROR "$options{\"BOWTIE\"}[0]\t'bowtie-build' was not found in this location ('BOWTIE' parameter in additional options file), for 'INDEPENDENT' designs either provide a valid FASTA sequence file that requires 'bowtie-build' (and 'bowtie') or a valid Bowtie index/database file that only requires 'bowtie' to run\n";
			print "'bowtie-build' was not found in $options{\"BOWTIE\"}[0] ('BOWTIE' parameter in additional options file). For 'INDEPENDENT' designs either provide a valid FASTA sequence file that requires 'bowtie-build' (and 'bowtie') or a valid Bowtie index/database file that only requires 'bowtie' to run. Start NEXT-RNAi with '-h' for help.\n";
		    }
		}
		else {
		    print ERROR "$options{\"INDEPENDENT\"}[$i]\tFASTA file for 'INDEPENDENT' designs (defined in additional options file) was not found\n";
		    print "FASTA file $options{\"INDEPENDENT\"}[$i] for 'INDEPENDENT' designs (defined in additional options file) was not found. Start NEXT-RNAi with '-h' for help.\n";
		}
	    }
	}
    }

# if no off-target database was provided set InputSpecRegions to full sequence for all identifiers

# identification of target for every input sequence
    &BowtieTarget(\@IDSeqKeys,$outBowtie,\%TargetExclude,\%InputTarget,\%InputsiRNATarget,\%siRNATargetExclude,$reagent,$evaluation,\%siRNAPos);
    
# annotate target groups for identified targets
    &targetGroups(\@IDSeqKeys,\%targetGroups,\%InputTarget,\%InputtargetGroups,\*ERROR);

# distinguish on- from off-target hits and record specific regions of every input sequence
# also considering regions of low complexity, low efficiency and high seed complement frequencies
    &SpecRegion(\%IDSeq,\@IDSeqKeys,\%InputsiRNATarget,\%targetGroups,\%InputtargetGroups,\%siRNATargetExclude,\%InputSpecRegion,\%InputSpecsiRNA,$options{"SIRNALENGTH"}[0],$evaluation,$options{"TARGETSEQ"}[0],\%FilterPos);
    
    print REPORT "Bowtie parsing and specificity calculations done\n";
    print "Bowtie parsing and specificity calculations done\n";
    
##
## Design primers on each specific region using primer3 software
##

# go through all identified specific regions of all input target sequences and design primers on them
    my @designwindow = split(/,/, $options{"DESIGNWINDOW"}[0]);
# no primer designs for siRNA design and evaluation options
    if ($reagent eq 's'){
	print REPORT "Primer design omitted for siRNA design/evaluation\n";
	print "Primer design omitted for siRNA design/evaluation\n";
    }
    else {
	if (-e "$options{\"PRIMER3\"}[0]primer3_core"){
	    &primer3($identifier,$options{"OUTPUT"}[0],$options{"PRIMER3"}[0],$options{"SIRNALENGTH"}[0],\%IDSeq,\@IDSeqKeys,\%InputSpecRegion,$designwindow[0],$designwindow[1],\%InputSpecPrimer,\*ERROR,$evaluation,\%IDSeqPrimer,$options{"DESIGNNUM"}[0],$options{"PRIMER3OPT"}[0]);
	    print REPORT "Primer calculations done\n";
	    print "Primer calculations done\n";
	}
	else {
	    print ERROR "$options{\"PRIMER3\"}[0]\t'primer3_core' not found in this location ('PRIMER3' parameter in additional options file), no primer design possible\n";
	    print "'primer3_core' not found in $options{\"PRIMER3\"}[0] ('PRIMER3' parameter in additional options file). Primer design is not possible.\n";
	    close ERROR;
	    close REPORT;
	    close FAILED;
	    exit;
	}
    }

##
## Identification of best designs for every input target sequence and assembly of all results
##

    if ($reagent eq "s"){
	&assembleResultsiRNA($options{"SIRNALENGTH"}[0],$options{"DESIGNNUM"}[0],\@designwindow,\%IDSeq,\@IDSeqKeys,\%InputSpecRegion,\%InputSpecsiRNA,\%InputEffsiRNA,$effOptions[0],\%seedNum,\%Designs,\%DesignsBest,\%targetGroups,\%Groupstarget,\%InputsiRNATarget,\*ERROR,$evaluation,"Design",$databaseFile,$options{"TARGETTYPE"}[0]);
    }
    else {
	&assembleResults($options{"SIRNALENGTH"}[0],$options{"DESIGNNUM"}[0],\@designwindow,\%IDSeq,\@IDSeqKeys,\%InputSpecRegion,\%InputSpecsiRNA,\%InputEffsiRNA,\@effOptions,\%InputSpecPrimer,\%Designs,\%DesignsBest,\%DesignsBad,\%DesignsLeftover,\%DesignsFailed,\%targetGroups,\%Groupstarget,\%InputsiRNATarget,\*ERROR,$evaluation,$options{"INTRON"}[0],"Design",$databaseFile,$options{"RANKD"}[0],$options{"TARGETTYPE"}[0]);
    }

    print REPORT "Results for successful designs assembled\n";
    print "Results for successful designs assembled\n";

##
## Redesigns for input target sequences where no design was possible in first place
##

# redesign only for de novo design option and for long dsRNA designs (-r d)
    if (($evaluation eq "NO") && ($options{"REDESIGN"}[0] eq "ON")){
	my @DesignsLeftoverKeys = keys %DesignsLeftover;
	print ERROR "Redesign\tStart redesign\n";
	while (scalar(@DesignsLeftoverKeys) ne 0){
	    my %LeftoverSpecRegion = ();
	    my %LeftoverSpecPrimer = ();
	    &redesign(\%IDSeq,\%InputSpecRegion,\%DesignsLeftover,\%LeftoverSpecRegion,$options{"SIRNALENGTH"}[0]);
	    &primer3($identifier,$options{"OUTPUT"}[0],$options{"PRIMER3"}[0],$options{"SIRNALENGTH"}[0],\%DesignsLeftover,\@DesignsLeftoverKeys,\%LeftoverSpecRegion,$designwindow[0],$designwindow[1],\%LeftoverSpecPrimer,\*ERROR,$evaluation,\%IDSeqPrimer,$options{"DESIGNNUM"}[0],$options{"PRIMER3OPT"}[0]);
	    &assembleResults($options{"SIRNALENGTH"}[0],$options{"DESIGNNUM"}[0],\@designwindow,\%IDSeq,\@IDSeqKeys,\%LeftoverSpecRegion,\%InputSpecsiRNA,\%InputEffsiRNA,\@effOptions,\%LeftoverSpecPrimer,\%Designs,\%DesignsBest,\%DesignsBad,\%DesignsLeftover,\%DesignsFailed,\%targetGroups,\%Groupstarget,\%InputsiRNATarget,\*ERROR,$evaluation,$options{"INTRON"}[0],"Redesign",$databaseFile,$options{"RANKD"}[0],$options{"TARGETTYPE"}[0]);
	    @DesignsLeftoverKeys = keys %DesignsLeftover;
	    my $scalar = @DesignsLeftoverKeys;
	    print ERROR "Status of Redesigns\t$scalar designs leftover\n";
	}
	print ERROR "Redesign\tRedesign finished\n";
    }
    
    print REPORT "Redesigns done\n";
    print "Redesigns done\n";

##
## Assemble output designs (according to 'DESIGNNUM' option)
##
    my @IDscovered = ();
    my @notprinted = ();
    for (my $i=0;$i<scalar(@IDSeqKeys);$i++){
# successful design
	if (defined $DesignsBest{$IDSeqKeys[$i]}[0][0]){
	    $DesignsOUTPUT{$IDSeqKeys[$i]} = $DesignsBest{$IDSeqKeys[$i]};
	    push (@IDscovered, $IDSeqKeys[$i]);
	    if (!exists $IDscovered{$IDSeqKeys[$i]}){
		$IDscovered{$IDSeqKeys[$i]} = "";
	    }
	}
	else {
# no design available
	    $notprinted{$IDSeqKeys[$i]} = 1;
	    push(@notprinted, $IDSeqKeys[$i]);
	}
    }
    for (my $i=0;$i<scalar(@notprinted);$i++){
# check for availability of suboptimal designs
	if ((exists $DesignsBad{$notprinted[$i]}) && ($options{"REDESIGN"}[0] eq "ON")){
	    for (my $j=0;$j<scalar(@{ $DesignsBad{$notprinted[$i]} });$j++){
		for (my $k=0;$k<$options{"DESIGNNUM"}[0];$k++){
		    if (defined $DesignsBad{$notprinted[$i]}[$j][$k]){
			if (!exists $DesignsOUTPUT{$notprinted[$i]}){
			    $DesignsOUTPUT{$notprinted[$i]} = [ [],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[] ];
			    push (@{ $DesignsOUTPUT{$notprinted[$i]}[$j] }, $DesignsBad{$notprinted[$i]}[$j][$k]);
			}
			else {
			    push (@{ $DesignsOUTPUT{$notprinted[$i]}[$j] }, $DesignsBad{$notprinted[$i]}[$j][$k]);
			}
		    }
		}
	    }
	    if (exists $DesignsOUTPUT{$notprinted[$i]}){
		print ERROR "$notprinted[$i]\tThere were only \"bad\" designs possible for that target\n";
		push (@IDscovered, $notprinted[$i]);
		if (!exists $IDscovered{$notprinted[$i]}){
		    $IDscovered{$notprinted[$i]} = "";
		}
	    }
	    else {
		print ERROR "$notprinted[$i]\tThere was absolutely no design possible for that target\n";
		if (!exists $IDfail{$notprinted[$i]}){
		    $IDfail{$notprinted[$i]} = "Design failed for that region";
		    print FAILED "$notprinted[$i]\tDesign failed for that region\n";
		}
	    }
	}
	else {
	    print ERROR "$notprinted[$i]\tThere was no design possible for that target\n";
	    if (!exists $IDfail{$notprinted[$i]}){
		$IDfail{$notprinted[$i]} = "Design failed for that region";
		print FAILED "$notprinted[$i]\tDesign failed for that region\n";
	    }
	}

    }
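
# Illustrative sketch (never called by NEXT-RNAi): the sorting below reorders the columns of a
# design by an index permutation computed from copies of the key columns. The hypothetical
# helper 'sketchSortColumns' shows that idiom for a primary key sorted in descending order and
# a secondary key sorted in ascending order. It assumes that all columns have the same length
# as the key columns; its name and arguments are assumptions made for this example only.
    sub sketchSortColumns {
	my ($columns, $primary, $secondary) = @_;
# copy the keys first, so sorting a column that also serves as key cannot disturb the order
	my @desc = @{ $primary };
	my @asc = @{ $secondary };
# compute the permutation once, then apply it to every column through an array slice
	my @order = sort { $desc[$b] <=> $desc[$a] || $asc[$a] <=> $asc[$b] } 0 .. $#desc;
	foreach my $column (@{ $columns }){
	    @{ $column } = @{ $column }[ @order ];
	}
	return;
    }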
# sort all designs for all input target sequences (and over all identified specific regions) for output
    for (my $i=0;$i<scalar(@IDscovered);$i++){
	my @PrintThat = ();
	if (($reagent eq "s") && ($evaluation eq "NO")){
	    my $seedNumScal = scalar( keys %seedNum );
	    @PrintThat = @{ $DesignsOUTPUT{$IDscovered[$i]} };
	    if (($seedNumScal ne 0) && ($effOptions[0] ne "empty")){
# sorting siRNAs
# 1. Efficiency
# 2. Seed complement frequency
		my @effSort = @{ $DesignsOUTPUT{$IDscovered[$i]}[15] };
		my @SCFSort = @{ $DesignsOUTPUT{$IDscovered[$i]}[18] };
		for (my $k=0;$k<scalar(@PrintThat);$k++){
		    my $PrintThatlen = scalar(@{ $PrintThat[$k] }) - 1;
		    @{ $PrintThat[$k] } = @{ $PrintThat[$k] } [ sort {
			$effSort[$b] <=> $effSort[$a]
			    ||
			    $SCFSort[$a] <=> $SCFSort[$b]
								} 0 .. $PrintThatlen ];
		}
	    }
	    elsif (($seedNumScal eq 0) && ($effOptions[0] ne "empty")){
# sorting siRNA for efficiency
		my @effSort = @{ $DesignsOUTPUT{$IDscovered[$i]}[15] };
		for (my $k=0;$k<scalar(@PrintThat);$k++){
		    my $PrintThatlen = scalar(@{ $PrintThat[$k] }) - 1;
		    @{ $PrintThat[$k] } = @{ $PrintThat[$k] } [ sort {
			$effSort[$b] <=> $effSort[$a] } 0 .. $PrintThatlen ];
		}
	    }
	    elsif (($seedNumScal ne 0) && ($effOptions[0] eq "empty")){
# sorting siRNA for seed complement frequency
		my @SCFSort = @{ $DesignsOUTPUT{$IDscovered[$i]}[18] };
		for (my $k=0;$k<scalar(@PrintThat);$k++){
		    my $PrintThatlen = scalar(@{ $PrintThat[$k] }) - 1;
		    @{ $PrintThat[$k] } = @{ $PrintThat[$k] } [ sort {
			$SCFSort[$a] <=> $SCFSort[$b] } 0 .. $PrintThatlen ];
		}
	    }
	}
	else {
# sorting long dsRNA designs
# 1. Percent specificity
# 2. Number of efficient siRNAs (or average efficiency if cutoff 0)
# or (if efficiency is not available or specificity was selected)
# 2. Absolute specificity
	    my @relSort = @{ $DesignsOUTPUT{$IDscovered[$i]}[18] };
	    my @specSort = @{ $DesignsOUTPUT{$IDscovered[$i]}[17] };
	    my @specSort2 = @{ $DesignsOUTPUT{$IDscovered[$i]}[17] };
	    for (my $k=0;$k<scalar(@specSort);$k++){
		my @specSplit = split(/\//,$specSort[$k]);
		$specSort[$k] = $specSplit[1];
		$specSort2[$k] = $specSplit[2];
	    }
	    if ($effOptions[0] eq "empty"){
		@PrintThat = @{ $DesignsOUTPUT{$IDscovered[$i]} };
		for (my $k=0;$k<scalar(@PrintThat);$k++){
		    my $PrintThatlen = scalar(@{ $PrintThat[$k] }) - 1;
		    if ($options{"TARGETTYPE"}[0] eq 'NA'){
			@{ $PrintThat[$k] } = @{ $PrintThat[$k] } [ sort {
			    $specSort[$a] <=> $specSort[$b]
				||
				$specSort2[$a] <=> $specSort2[$b]
								    } 0 .. $PrintThatlen ];
		    }
		    else {
			@{ $PrintThat[$k] } = @{ $PrintThat[$k] } [ sort {
                            $relSort[$b] <=> $relSort[$a]
                                ||
                                $specSort[$b] <=> $specSort[$a]
                                                                    } 0 .. $PrintThatlen ];
		    }
		}
	    }
	    else {
		my @effSort = @{ $DesignsOUTPUT{$IDscovered[$i]}[15] };
		for (my $k=0;$k<scalar(@effSort);$k++){
		    my @effSplit = split(/\|/,$effSort[$k]);
# if efficiency cutoff is 0, sort according to percent efficiency
		    if ($effOptions[1] eq 0){
			$effSort[$k] = $effSplit[1];
		    }
		    else {
			$effSort[$k] = $effSplit[0];
		    }
		}
		@PrintThat = @{ $DesignsOUTPUT{$IDscovered[$i]} };
		if ($options{"RANKD"}[0] eq 'SPEC'){
		    for (my $k=0;$k<scalar(@PrintThat);$k++){
			my $PrintThatlen = scalar(@{ $PrintThat[$k] }) - 1;
			if ($options{"TARGETTYPE"}[0] eq 'NA'){
			    @{ $PrintThat[$k] } = @{ $PrintThat[$k] } [ sort {
				$specSort[$a] <=> $specSort[$b]
				    ||
				    $specSort2[$a] <=> $specSort2[$b]
				    ||
				    $effSort[$b] <=> $effSort[$a]
									} 0 .. $PrintThatlen ];
			}
			else {
			    @{ $PrintThat[$k] } = @{ $PrintThat[$k] } [ sort {
                                $relSort[$b] <=> $relSort[$a]
                                    ||
                                    $specSort[$b] <=> $specSort[$a]
                                    ||
                                    $effSort[$b] <=> $effSort[$a]
                                                                        } 0 .. $PrintThatlen ];
			}
		    }
		}
		elsif ($options{"RANKD"}[0] eq 'EFF'){
		    for (my $k=0;$k<scalar(@PrintThat);$k++){
                        my $PrintThatlen = scalar(@{ $PrintThat[$k] }) - 1;
                        if ($options{"TARGETTYPE"}[0] eq 'NA'){
			    @{ $PrintThat[$k] } = @{ $PrintThat[$k] } [ sort {
				$specSort[$a] <=> $specSort[$b]
				    ||
				    $specSort2[$a] <=> $specSort2[$b]
				    ||
				    $effSort[$b] <=> $effSort[$a]
									} 0 .. $PrintThatlen ];
			}
			else {
			    @{ $PrintThat[$k] } = @{ $PrintThat[$k] } [ sort {
                                $relSort[$b] <=> $relSort[$a]
                                    ||
                                    $effSort[$b] <=> $effSort[$a]
                                    ||
                                    $specSort[$b] <=> $specSort[$a]
                                                                        } 0 .. $PrintThatlen ];
			}
		    }
		}
	    }
	}
	my $outputnum = 0;
	if ($options{"OUTPUTNUM"}[0] <= scalar(@{ $PrintThat[17] })){
	    $outputnum = $options{"OUTPUTNUM"}[0];
	}
	else {
	    $outputnum = scalar(@{ $PrintThat[17] });
	}
	for (my $j=0;$j<scalar(@PrintThat);$j++){
	    for (my $k=0;$k<$outputnum;$k++){
		if (!exists $DesignsPrint{$IDscovered[$i]}){
		    $DesignsPrint{$IDscovered[$i]} = [ [],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[] ];
		}
		push (@{ $DesignsPrint{$IDscovered[$i]}[$j] }, $PrintThat[$j][$k]);
	    }
	}
    }
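
=begin comment

The ranking code above sorts a list of indices and then applies that one
permutation to every column of the design table. A minimal sketch of the
idea follows; the helper name sort_columns_by and the demo data are
assumptions for illustration and are not part of NEXT-RNAi.

```perl
use strict;
use warnings;

# Hypothetical helper: reorder every column of a column-oriented table
# with one shared index permutation.
sub sort_columns_by {
    my ($columns, $cmp) = @_;              # $columns: ref to array of array refs
    my $last  = $#{ $columns->[0] };
    my @order = sort { $cmp->($a, $b) } 0 .. $last;
    return [ map { [ @{$_}[@order] ] } @{$columns} ];
}

my @names = ('x', 'y', 'z');
my @eff   = (40, 90, 65);

# descending efficiency, analogous to "$effSort[$b] <=> $effSort[$a]"
my $sorted = sort_columns_by([ \@names, \@eff ],
                             sub { $eff[$_[1]] <=> $eff[$_[0]] });
print join(',', @{ $sorted->[0] }), "\n";
```

Because the permutation is computed once from the key arrays, all columns
stay aligned row by row after sorting.

=end comment

=cut
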

# empty data structures no longer used
    undef %DesignsOUTPUT;
    undef %Designs;
    undef %DesignsBest;
    undef %DesignsBad;
    undef %DesignsLeftover;
    undef %DesignsFailed;
    undef %InputSpecRegion;
    undef %InputSpecsiRNA;
    undef %InputEffsiRNA;
    undef %seedNum;
    undef %FilterPos;
    undef %siRNATargetExclude;
    undef %InputTarget;
    undef %InputsiRNATarget;
    undef %InputSpecPrimer;

# unlink files that are no longer required
    for (my $i=0;$i<scalar(@{ $fileLocs{'Looped'} });$i++){
	unlink ($fileLocs{'Looped'}[$i]);
    }
    delete $fileLocs{'Looped'};
}

###
### End looping over split input
###


# empty data structures no longer used
#undef %targetGroups;
undef %Groupstarget;
undef %TargetExclude;
undef %seedseq;

# write FASTA file for best designs
my @IDscovered = keys %IDscovered;
my %IDSeqBest = ();
my $IDSeqBestFASTA = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_Probe.fa';
&fileLoc('Output',$IDSeqBestFASTA,'DesignsFASTA');
open (PROBES, ">$IDSeqBestFASTA") || die "Cannot open PROBES: $!\n";
for (my $i=0;$i<scalar(@IDscovered);$i++){
    my $index = 1;
    for (my $j=0;$j<scalar(@{ $DesignsPrint{$IDscovered[$i]}[0] });$j++){
	my $IDsub = $IDscovered[$i].'_'.$index;
	if (!exists $IDSeqBest{$IDsub}){
	    $IDSeqBest{$IDsub} = $DesignsPrint{$IDscovered[$i]}[12][$j];
	    for (my $k=0;$k<length($DesignsPrint{$IDscovered[$i]}[12][$j]);$k+=50){
		my $seqpart = substr($DesignsPrint{$IDscovered[$i]}[12][$j],$k,50);
		if ($k == 0){
		    print PROBES ">$IDsub\n$seqpart\n";
		}
		else {
		    print PROBES "$seqpart\n";
		}
	    }
	}
	$index++;
    }
}
close PROBES;
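
=begin comment

The loop above writes each probe sequence in 50-character lines under a
FASTA header. A minimal sketch of that wrapping step as a standalone
function; the name fasta_record and the demo identifier are assumptions
for illustration only. Unlike the loop above, this sketch also emits a
header for an empty sequence.

```perl
use strict;
use warnings;

# Hypothetical helper: format one FASTA record with fixed-width sequence lines.
sub fasta_record {
    my ($id, $seq, $width) = @_;
    $width ||= 50;
    my @chunks;
    for (my $k = 0; $k < length($seq); $k += $width) {
        push @chunks, substr($seq, $k, $width);
    }
    return join("\n", ">$id", @chunks) . "\n";
}

my $record = fasta_record('CG1234_1', 'ACGT' x 30);   # 120 nt demo sequence
print $record;
```

=end comment

=cut
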

print REPORT "Scoring and ranking of RNAi reagents done\n";
print "Scoring and ranking of RNAi reagents done\n";

##
## Additional quality evaluation and outputs
##

print "Enter additional quality evaluation now\n";

my $FeatureCalc = 0;
my %FeatureNum = ();
my %FeatureName = ();
my %OTE = ();

# mapping of designed reagents (e.g. to genomic sources)
if ((($options{"GENOMEBOWTIE"}[0] ne "empty") || ($options{"GENOMEFASTA"}[0] ne "empty")) && ($mapping eq 0)){
# write primer FASTA and tab file needed for MUMmer and mapping
    &fileLoc('Unlink',$outPrimer);
    open (OUTPRIME, ">$outPrimer") || die "Cannot open OUTPRIME: $!\n";
    &fileLoc('Unlink',$outPrimer2);
    open (OUTPRIME2, ">$outPrimer2") || die "Cannot open OUTPRIME2: $!\n";
    my $outPrimer3 = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_PrimerLong.fa';
    &fileLoc('Unlink',$outPrimer3);
    open (OUTPRIME3, ">$outPrimer3") || die "Cannot open OUTPRIME3: $!\n";
    my $outPrimer4 = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_PrimerLong.txt';
    &fileLoc('Unlink',$outPrimer4);
    open (OUTPRIME4, ">$outPrimer4") || die "Cannot open OUTPRIME4: $!\n";
    my $lenMax = 0;
    for (my $i=0;$i<scalar(@IDscovered);$i++){
	my $index = 1;
	if (($reagent eq "d") && ($evaluation ne "DSRNA") && ($options{"SOURCE"}[0] eq "GENOMIC")){
	    for (my $j=0;$j<scalar(@{ $DesignsPrint{$IDscovered[$i]}[0] });$j++){
		print OUTPRIME ">$IDscovered[$i]\_$index\_f\n$DesignsPrint{$IDscovered[$i]}[1][$j]\n>$IDscovered[$i]\_$index\_r\n$DesignsPrint{$IDscovered[$i]}[2][$j]\n";
		print OUTPRIME2 "$IDscovered[$i]\_$index\_f\t$DesignsPrint{$IDscovered[$i]}[1][$j]\n$IDscovered[$i]\_$index\_r\t$DesignsPrint{$IDscovered[$i]}[2][$j]\n";
		$index++;
	    }
	}
	else {
	    for (my $j=0;$j<scalar(@{ $DesignsPrint{$IDscovered[$i]}[0] });$j++){
		if (length($DesignsPrint{$IDscovered[$i]}[12][$j]) < 1024){
		    print OUTPRIME ">$IDscovered[$i]\_$index\n$DesignsPrint{$IDscovered[$i]}[12][$j]\n";
		    print OUTPRIME2 "$IDscovered[$i]\_$index\t$DesignsPrint{$IDscovered[$i]}[12][$j]\n";
		}
		else {
# Bowtie cannot map sequences >= 1024 bp, BLAT needs to be used for mapping then		    
		    if (length($DesignsPrint{$IDscovered[$i]}[12][$j]) > $lenMax){
			$lenMax = length($DesignsPrint{$IDscovered[$i]}[12][$j]);
		    }
		    print OUTPRIME3 ">$IDscovered[$i]\_$index\n$DesignsPrint{$IDscovered[$i]}[12][$j]\n";
                    print OUTPRIME4 "$IDscovered[$i]\_$index\t$DesignsPrint{$IDscovered[$i]}[12][$j]\n";
		}
		$index++;
            }
	}
    }
    close OUTPRIME;
    close OUTPRIME2;
    close OUTPRIME3;
    close OUTPRIME4;
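
=begin comment

The block above routes sequences shorter than 1024 bp to the Bowtie input
files and longer ones to the BLAT input files, while tracking the longest
sequence seen. A minimal sketch of that partitioning, assuming sequences
are held in a hash keyed by reagent ID; the helper name
partition_by_length and the demo data are hypothetical.

```perl
use strict;
use warnings;

my $BOWTIE_MAX = 1024;   # threshold taken from the comment in the code above

# Hypothetical helper: split id => sequence pairs into a short and a long set.
sub partition_by_length {
    my ($seqs, $limit) = @_;
    my (%short, %long);
    my $longest = 0;
    for my $id (sort keys %{$seqs}) {
        my $len = length($seqs->{$id});
        if ($len < $limit) {
            $short{$id} = $seqs->{$id};
        }
        else {
            $long{$id} = $seqs->{$id};
            $longest = $len if $len > $longest;
        }
    }
    return (\%short, \%long, $longest);
}

my %demo = (a_1 => 'A' x 500, b_1 => 'C' x 1024, c_1 => 'G' x 2000);
my ($short, $long, $longest) = partition_by_length(\%demo, $BOWTIE_MAX);
print scalar(keys %{$short}), " short, ", scalar(keys %{$long}),
      " long, longest $longest\n";
```

=end comment

=cut
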
    
# map primers/reagents; first check whether the database is already compiled and whether multiple database files were queried
    my $mappingDB = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_MAPDB.fasta';
    my $PrimerMapped = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_Primer.Mapped';
    my $PrimerNotMapped = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_Primer.NotMapped';
    my $dsRNAMapped = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_dsRNA.Mapped';
    my $dsRNANotMapped = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_dsRNA.NotMapped';
    my %siRNAExt = ();
# for genomic sources and sequences < 1024 bp use Bowtie, for CDS sources use BLAT
    if ($options{"SOURCE"}[0] eq "GENOMIC"){
# Mapping with Bowtie
	my $outBowtie = $outPrimer.'.bwt';
	my $BowtieMapping = &BowtieMapping(\@{ $options{"GENOMEBOWTIE"} },$options{"BOWTIE"}[0],$outPrimer,$outBowtie,\*ERROR);
# parse primer Bowtie search
	if ($BowtieMapping eq 'Success'){
	    &fileLoc('Unlink',$PrimerMapped);
	    &fileLoc('Output',$PrimerNotMapped,'PrimerNotMapped');
# generate index of bowtie output
	    open(OUTBO,"<$outBowtie") or die "Cannot open $outBowtie for reading: $!\n";
	    open(INDEX, "+>$outBowtie.idx") or die "Cannot open $outBowtie.idx for read/write: $!\n";
	    &fileLoc('Unlink',"$outBowtie.idx");
	    &build_index(\*OUTBO,\*INDEX,'bowtie',$outBowtie,\%RNAiloc);
# parse mapping via index
	    &ParseBowtieMap($outPrimer2,$outBowtie,$PrimerMapped,$PrimerNotMapped,\%RNAiloc,\%NOTmapped,\*OUTBO,\*INDEX);
	    close OUTBO;
            close INDEX;
	}
	else {
	    print ERROR "No valid Bowtie index/database for reagent mappings found\tThe mapping of long dsRNAs or siRNAs requires a Bowtie index/database (with e.g. genomic sequences) by defining the 'GENOMEBOWTIE' option in the additional options file\n";
	    print "No valid Bowtie index/database for reagent mappings found. The mapping of long dsRNAs or siRNAs requires a Bowtie index/database (with e.g. genomic sequences) by defining the 'GENOMEBOWTIE' option in the additional options file. Start NEXT-RNAi with '-h' for help.\n";
	}
	if ($lenMax >= 1024){
# mapping with BLAT
	    if (-e "$options{\"BLAT\"}[0]$options{\"BLATPROGRAM\"}[0]"){
		my $outBLAT = $outPrimer3.'.psl';
		my $BLATMapping = &BLATMapping($options{"BLAT"}[0],\@{ $options{"GENOMEFASTA"} },$mappingDB,$options{"BLATSPLIT"}[0],$outPrimer3,$outBLAT,\*ERROR,$options{"BLATPROGRAM"}[0],$options{"BLATHOST"}[0],$options{"BLATPORT"}[0]);
# parse BLAT search
		if ($BLATMapping eq 'Success'){
# generate index of blat output
		    open(OUTBLAT,"<$outBLAT") or die "Cannot open $outBLAT for reading: $!\n";
		    open(INDEX, "+>$outBLAT.idx") or die "Cannot open $outBLAT.idx for read/write: $!\n";
		    &fileLoc('Unlink',"$outBLAT.idx");
		    &build_index(\*OUTBLAT,\*INDEX,'blat',$outBLAT,\%RNAiloc);
# parse mapping via index
		    &ParseBLAT($outPrimer4,$outBLAT,$PrimerMapped,$PrimerNotMapped,\%RNAiloc,\%NOTmapped,\%siRNAExt,$options{"BLATALIGN"}[0],\*OUTBLAT,\*INDEX);
		    close OUTBLAT;
		    close INDEX;
		}
		else {
		    print ERROR "No valid FASTA database for reagent mappings with 'blat' found\tThe mapping of long dsRNAs or siRNAs requires a FASTA database (with e.g. genomic sequences) by defining the 'GENOMEFASTA' option in the additional options file\n";
		    print "No valid FASTA database for reagent mappings with 'blat' found. The mapping of long dsRNAs or siRNAs requires a FASTA database (with e.g. genomic sequences) by defining the 'GENOMEFASTA' option in the additional options file. Start NEXT-RNAi with '-h' for help.\n";
		}
	    }
	    else {
		print ERROR "$options{\"BLAT\"}[0]\t$options{\"BLATPROGRAM\"}[0] program not found at this location ('BLAT' and 'BLATPROGRAM' parameters in additional options file), reagents cannot be mapped to the database(s) defined in 'GENOMEFASTA'\n";
		print "$options{\"BLATPROGRAM\"}[0] program not found in $options{\"BLAT\"}[0] ('BLAT' and 'BLATPROGRAM' parameters in additional options file), reagents cannot be mapped to the database(s) defined in 'GENOMEFASTA'. Start NEXT-RNAi with '-h' for help.\n";
	    }
	}
    }
    else {
# mapping with BLAT
	if (-e "$options{\"BLAT\"}[0]$options{\"BLATPROGRAM\"}[0]"){
	    my $outBLAT = $outPrimer.'.psl';
	    my $BLATMapping = &BLATMapping($options{"BLAT"}[0],\@{ $options{"GENOMEFASTA"} },$mappingDB,$options{"BLATSPLIT"}[0],$outPrimer,$outBLAT,\*ERROR,$options{"BLATPROGRAM"}[0],$options{"BLATHOST"}[0],$options{"BLATPORT"}[0]);
# parse BLAT search
	    if ($BLATMapping eq 'Success'){
		&fileLoc('Unlink',$PrimerMapped);
		&fileLoc('Output',$PrimerNotMapped,'PrimerNotMapped');
# generate index of blat output
		open(OUTBLAT,"<$outBLAT") or die "Cannot open $outBLAT for reading: $!\n";
		open(INDEX, "+>$outBLAT.idx") or die "Cannot open $outBLAT.idx for read/write: $!\n";
		&fileLoc('Unlink',"$outBLAT.idx");
		&build_index(\*OUTBLAT,\*INDEX,'blat',$outBLAT,\%RNAiloc);
# parse mapping via index
		&ParseBLAT($outPrimer2,$outBLAT,$PrimerMapped,$PrimerNotMapped,\%RNAiloc,\%NOTmapped,\%siRNAExt,$options{"BLATALIGN"}[0],\*OUTBLAT,\*INDEX);
		close OUTBLAT;
		close INDEX;
	    }
	    else {
		print ERROR "No valid FASTA database for reagent mappings with 'blat' found\tThe mapping of long dsRNAs or siRNAs requires a FASTA database (with e.g. genomic sequences) by defining the 'GENOMEFASTA' option in the additional options file\n";
		print "No valid FASTA database for reagent mappings with 'blat' found. The mapping of long dsRNAs or siRNAs requires a FASTA database (with e.g. genomic sequences) by defining the 'GENOMEFASTA' option in the additional options file. Start NEXT-RNAi with '-h' for help.\n";
	    }
	}
	else {
	    print ERROR "$options{\"BLAT\"}[0]\t$options{\"BLATPROGRAM\"}[0] program not found at this location ('BLAT' and 'BLATPROGRAM' parameters in additional options file), reagents cannot be mapped to the database(s) defined in 'GENOMEFASTA'\n";
	    print "$options{\"BLATPROGRAM\"}[0] program not found in $options{\"BLAT\"}[0] ('BLAT' and 'BLATPROGRAM' parameters in additional options file), reagents cannot be mapped to the database(s) defined in 'GENOMEFASTA'. Start NEXT-RNAi with '-h' for help.\n";
	}
    }
    print REPORT "Mapping of primers/oligos done\n";
    print "Mapping of primers/oligos done\n";

# build index of primer MAPPED file
    open(PRIMERMAPPED,"<$PrimerMapped") or die "Cannot open $PrimerMapped for reading: $!\n";
    open(PRIMERINDEX, "+>$PrimerMapped.idx") or die "Cannot open $PrimerMapped.idx for read/write: $!\n";
    &fileLoc('Unlink',"$PrimerMapped.idx");
    &build_index(\*PRIMERMAPPED,\*PRIMERINDEX,'mapped',$PrimerMapped,\%RNAiloc);

# designs that could not be mapped using primers are remapped using the dsRNA sequence
    my $dsRNAFASTA = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_dsRNA.fa';
    my $dsRNATAB = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_dsRNA.txt';
    my $dsRNAFASTA2 = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_dsRNALong.fa';
    my $dsRNATAB2 = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_dsRNALong.txt';
    my @NOTmapped = keys %NOTmapped;
    my $remap = scalar(@NOTmapped);
    $lenMax = 0;
# in case of long dsRNAs full sequences are used for remapping
    if ($reagent eq 'd'){
	if (($evaluation ne "DSRNA") && ($options{"SOURCE"}[0] eq "GENOMIC")){
	    open (NOTMAPPED,"<$PrimerNotMapped") || die "Cannot open NOTMAPPED: $!\n";
	    &fileLoc('Unlink',$dsRNAFASTA);
	    &fileLoc('Unlink',$dsRNATAB);
	    &fileLoc('Unlink',$dsRNAFASTA2);
            &fileLoc('Unlink',$dsRNATAB2);
	    open (DSRNAFASTA,">$dsRNAFASTA") || die "Cannot open DSRNAFASTA: $!\n";
	    open (DSRNATAB,">$dsRNATAB") || die "Cannot open DSRNATAB: $!\n";
	    open (DSRNAFASTA2,">$dsRNAFASTA2") || die "Cannot open DSRNAFASTA2: $!\n";
            open (DSRNATAB2,">$dsRNATAB2") || die "Cannot open DSRNATAB2: $!\n";
	    for (my $i=0;$i<scalar(@NOTmapped);$i++){
		if (length($IDSeqBest{$NOTmapped[$i]}) < 1024){
		    print DSRNAFASTA ">$NOTmapped[$i]\n$IDSeqBest{$NOTmapped[$i]}\n";
		    print DSRNATAB "$NOTmapped[$i]\t$IDSeqBest{$NOTmapped[$i]}\n";
		}
		else {
# Bowtie cannot map sequences >= 1024 bp, BLAT needs to be used for mapping then
		    if (length($IDSeqBest{$NOTmapped[$i]}) > $lenMax){
			$lenMax = length($IDSeqBest{$NOTmapped[$i]});
		    }
		    print DSRNAFASTA2 ">$NOTmapped[$i]\n$IDSeqBest{$NOTmapped[$i]}\n";
                    print DSRNATAB2 "$NOTmapped[$i]\t$IDSeqBest{$NOTmapped[$i]}\n";
		}
	    }
	    close NOTMAPPED;
	    close DSRNAFASTA;
	    close DSRNATAB;
	    close DSRNAFASTA2;
            close DSRNATAB2;
	}
    }
    else {
# siRNAs are extended and then remapped using BLAT
	if ($options{"SOURCE"}[0] eq "CDS"){
	    if ($evaluation eq 'NO'){
# siRNAs are extended from query sequences
		&fileLoc('Unlink',$dsRNAFASTA);
		&fileLoc('Unlink',$dsRNATAB);
		open (DSRNAFASTA,">$dsRNAFASTA") || die "Cannot open DSRNAFASTA: $!\n";
		open (DSRNATAB,">$dsRNATAB") || die "Cannot open DSRNATAB: $!\n";
		for (my $i=0;$i<scalar(@NOTmapped);$i++){
# exclude reagents with multiple mappings for re-mapping
		    if (($NOTmapped{$NOTmapped[$i]}[1]!~/times/) && ($NOTmapped{$NOTmapped[$i]}[1]!~/products/)){
			my $seq = $NOTmapped{$NOTmapped[$i]}[0];
			if ($NOTmapped[$i]=~/^(\S+)_(\d+)$/){
			    my $index = $2 - 1;
			    my $start = 0;
# length is siRNA length + overall extension by 30 nt
			    my $length = $DesignsPrint{$1}[11][$index] + 30;
			    if (!exists $siRNAExt{$NOTmapped[$i]}){
				$siRNAExt{$NOTmapped[$i]} = [ 15, 15, ];
			    }
# check, if 15 nt are available on the left end
			    if (($DesignsPrint{$1}[10][$index] - 1 - 15) > 0){
				$start = $DesignsPrint{$1}[10][$index] - 1 - 15;
			    }
			    else {
				$length+= $DesignsPrint{$1}[10][$index] - 1 - 15;
				$start = 0;
				my $startNew = 15;
				$startNew+= $DesignsPrint{$1}[10][$index] - 1 - 15;
				$siRNAExt{$NOTmapped[$i]}[0] = $startNew;
			    }
# check, if 15 nt are available on the right end
			    if (length($IDSeq{$1}) < ($start + $length)){
# compute the (negative) shortfall before $length is adjusted
				my $shortfall = length($IDSeq{$1}) - $start - $length;
				$length+= $shortfall;
				$siRNAExt{$NOTmapped[$i]}[1] = 15 + $shortfall;
			    }
# final extended sequence
			    $seq = substr($IDSeq{$1},$start,$length);		    
			}
			print DSRNAFASTA ">$NOTmapped[$i]\n$seq\n";
			print DSRNATAB "$NOTmapped[$i]\t$seq\n";
		    }
		}
		close DSRNAFASTA;
		close DSRNATAB;
	    }
	    else {
# siRNAs are extended from target sequences
		if (($options{"TXNFASTA"}[0] ne 'empty') && (-e $options{"TXNFASTA"}[0])){
		    &fileLoc('Unlink',$dsRNAFASTA);
		    &fileLoc('Unlink',$dsRNATAB);
		    open (DSRNAFASTA,">$dsRNAFASTA") || die "Cannot open DSRNAFASTA: $!\n";
		    open (DSRNATAB,">$dsRNATAB") || die "Cannot open DSRNATAB: $!\n";
		    my %remap = ();
		    my %remapTarget = ();
		    for (my $i=0;$i<scalar(@NOTmapped);$i++){
			if (($NOTmapped{$NOTmapped[$i]}[1]!~/times/) && ($NOTmapped{$NOTmapped[$i]}[1]!~/products/)){
			    if (!exists $remap{$NOTmapped[$i]}){
				$remap{$NOTmapped[$i]} = "";
				if (exists $siRNAPos{$NOTmapped[$i]}){
				    my @targets = keys %{ $siRNAPos{$NOTmapped[$i]}};
				    for (my $t=0;$t<scalar(@targets);$t++){
					if (!exists $remapTarget{$targets[$t]}){
					    $remapTarget{$targets[$t]} = "";
					}
				    }
				}
			    }
			}
		    }
# iterate over FASTA off-target database and save sequences for targets where siRNA could not be mapped to genome
		    open (OTEDB,"<$options{\"TXNFASTA\"}[0]") || die "Cannot open OTEDB: $!\n";
		    my $fastHead = "";
		    while (my $line = <OTEDB>){
			$line = &cleanLine($line);
			if ($line=~/^>(\S+)$/){
			    if (exists $remapTarget{$1}){
				$fastHead = $1;
			    }
			    else {
				$fastHead = "";
			    }
			}
			else {
			    if ($fastHead ne ""){
				$remapTarget{$fastHead}.= $line;
			    }
			}
		    }
		    close OTEDB;
		
		    my @remap = keys %remap;
		    for (my $i=0;$i<scalar(@remap);$i++){
			my @targets = keys %{ $siRNAPos{$remap[$i]}};
			my %seqDone = ();
			for (my $j=0;$j<scalar(@targets);$j++){
			    my @pos = @{ $siRNAPos{$remap[$i]}{$targets[$j]} };
			    for (my $k=0;$k<scalar(@pos);$k++){
				my $start = 0;
# length is siRNA length + overall extension by 30 nt
				my $length = $options{"SIRNALENGTH"}[0] + 30;
				if (!exists $siRNAExt{$remap[$i]}){
				    $siRNAExt{$remap[$i]} = [ 15, 15, ];
				}
# check, if 15 nt are available on the left end
				if (($pos[$k] - 1 - 15) > 0){
				    $start = $pos[$k] - 1 - 15;
				}
				else {
				    $length+= $pos[$k] - 1 - 15;
				    $start = 0;
				    my $startNew = 15;
				    $startNew+= $pos[$k] - 1 - 15;
				    $siRNAExt{$remap[$i]}[0] = $startNew;
				}
# check, if 15 nt are available on the right end
				if (length($remapTarget{$targets[$j]}) < ($start + $length)){
# compute the (negative) shortfall before $length is adjusted
				    my $shortfall = length($remapTarget{$targets[$j]}) - $start - $length;
				    $length+= $shortfall;
				    $siRNAExt{$remap[$i]}[1] = 15 + $shortfall;
				}
# final extended sequence			    
				my $seq = substr($remapTarget{$targets[$j]},$start,$length);
# only take unique sequences
				if (!exists $seqDone{$seq}){
				    print DSRNAFASTA ">$remap[$i]\n$seq\n";
				    print DSRNATAB "$remap[$i]\t$seq\n";
				    $seqDone{$seq} = "";
				}
			    }
			}
			undef %seqDone;
		    }
		    close DSRNAFASTA;
		    close DSRNATAB;
		}
		else {
		    print ERROR "TXNFASTA\tNo valid 'TXNFASTA' file defined in additional options file. siRNAs that could not be mapped by their full sequence can be extended with flanking sequences to allow their mapping. To do so the 'TXNFASTA' file must be provided. It must be the same file (but in FASTA format) as used for off-target evaluation ('-d' options during start of NEXT-RNAi)\n";
		    print "No valid 'TXNFASTA' file defined in additional options file. siRNAs that could not be mapped by their full sequence can be extended with flanking sequences to allow their mapping. To do so the 'TXNFASTA' file must be provided. It must be the same file (but in FASTA format) as used for off-target evaluation ('-d' options during start of NEXT-RNAi)\n";
		}
	    }
	}
    }
    print REPORT "$remap design(s) could not be mapped or multiple mappings were found\n";
    print "$remap design(s) could not be mapped or multiple mappings were found\n";
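
=begin comment

For remapping, siRNAs are extended by up to 15 nt on each side, and the
extension is shortened where the siRNA lies close to either end of its
source sequence. A minimal sketch of that clipping arithmetic, assuming a
1-based start position and a window that lies fully inside the sequence;
the helper name extend_window and the demo sequence are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: extend a window of $len nt starting at 1-based $pos
# by up to $flank nt on each side, clipped to the ends of $seq.
# Returns the extended sequence and the flank lengths actually used.
sub extend_window {
    my ($seq, $pos, $len, $flank) = @_;
    my $left_avail  = $pos - 1;
    my $right_avail = length($seq) - ($pos - 1) - $len;
    my $left  = $left_avail  < $flank ? $left_avail  : $flank;
    my $right = $right_avail < $flank ? $right_avail : $flank;
    my $start = $pos - 1 - $left;
    return (substr($seq, $start, $left + $len + $right), $left, $right);
}

my $target = join('', map { ('A','C','G','T')[$_ % 4] } 0 .. 59);   # 60 nt
my ($ext, $left, $right) = extend_window($target, 6, 19, 15);
print length($ext), " nt, left $left, right $right\n";
```

The two returned flank lengths correspond to the pair stored per reagent
in %siRNAExt above.

=end comment

=cut
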
# remapping for long dsRNAs and siRNAs
    if ($remap ne 0){
	if ($options{"SOURCE"}[0] eq "GENOMIC"){
	    if (($reagent eq 'd') && ($evaluation ne "DSRNA")){
# check, whether re-mapping is required
		my $map = 0;
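		open (DSRNAFASTA, "<$dsRNAFASTA") || die "Cannot open DSRNAFASTA: $!\n";
# an empty file yields undef here; test definedness to avoid a warning under -w
		if (defined(my $firstLine = <DSRNAFASTA>)){
		    $map = 1;
		}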
		close DSRNAFASTA;
		if ($map eq 1){
# Mapping with Bowtie
		    my $dsRNABO = $dsRNAFASTA.'.bwt';
		    my $BowtieMapping = &BowtieMapping(\@{ $options{"GENOMEBOWTIE"} },$options{"BOWTIE"}[0],$dsRNAFASTA,$dsRNABO,\*ERROR);
		    if ($BowtieMapping eq 'Success'){
# parse Bowtie search
			&fileLoc('Unlink',$dsRNAMapped);
			&fileLoc('Output',$dsRNANotMapped,'dsRNANotMapped');
# generate index of bowtie output
			open(OUTBO,"<$dsRNABO") or die "Cannot open $dsRNABO for reading: $!\n";
			open(INDEX, "+>$dsRNABO.idx") or die "Cannot open $dsRNABO.idx for read/write: $!\n";
			&fileLoc('Unlink',"$dsRNABO.idx");
			&build_index(\*OUTBO,\*INDEX,'bowtie',$dsRNABO,\%RNAiloc);
# parse mapping via index
			&ParseBowtieMap($dsRNATAB,$dsRNABO,$dsRNAMapped,$dsRNANotMapped,\%RNAiloc,\%NOTmapped,\*OUTBO,\*INDEX);
			close OUTBO;
			close INDEX;
		    }
		    else {
			print ERROR "No valid Bowtie index/database for reagent mappings found\tThe mapping of long dsRNAs or siRNAs requires a Bowtie index/database (with e.g. genomic sequences) by defining the 'GENOMEBOWTIE' option in the additional options file\n";
			print "No valid Bowtie index/database for reagent mappings found. The mapping of long dsRNAs or siRNAs requires a Bowtie index/database (with e.g. genomic sequences) by defining the 'GENOMEBOWTIE' option in the additional options file. Start NEXT-RNAi with '-h' for help.\n";
		    }
		}
		if ($lenMax >= 1024){
# mapping with BLAT
		    if (-e "$options{\"BLAT\"}[0]$options{\"BLATPROGRAM\"}[0]"){
			my $dsRNABLAT = $dsRNAFASTA.'.psl';
			my $BLATMapping = &BLATMapping($options{"BLAT"}[0],\@{ $options{"GENOMEFASTA"} },$mappingDB,$options{"BLATSPLIT"}[0],$dsRNAFASTA,$dsRNABLAT,\*ERROR,$options{"BLATPROGRAM"}[0],$options{"BLATHOST"}[0],$options{"BLATPORT"}[0]);
# parse BLAT search
			if ($BLATMapping eq 'Success'){
# generate index of blat output
			    open(OUTBLAT,"<$dsRNABLAT") or die "Cannot open $dsRNABLAT for reading: $!\n";
			    open(INDEX, "+>$dsRNABLAT.idx") or die "Cannot open $dsRNABLAT.idx for read/write: $!\n";
			    &fileLoc('Unlink',"$dsRNABLAT.idx");
			    &build_index(\*OUTBLAT,\*INDEX,'blat',$dsRNABLAT,\%RNAiloc);
# parse mapping via index
			    &ParseBLAT($dsRNATAB,$dsRNABLAT,$dsRNAMapped,$dsRNANotMapped,\%RNAiloc,\%NOTmapped,\%siRNAExt,$options{"BLATALIGN"}[0],\*OUTBLAT,\*INDEX);
			    close OUTBLAT;
			    close INDEX;
			}
			else {
			    print ERROR "No valid FASTA database for reagent mappings with 'blat' found\tThe mapping of long dsRNAs or siRNAs requires a FASTA database (with e.g. genomic sequences) by defining the 'GENOMEFASTA' option in the additional options file\n";
			    print "No valid FASTA database for reagent mappings with 'blat' found. The mapping of long dsRNAs or siRNAs requires a FASTA database (with e.g. genomic sequences) by defining the 'GENOMEFASTA' option in the additional options file. Start NEXT-RNAi with '-h' for help.\n";
			}
		    }
		    else {
			print ERROR "$options{\"BLAT\"}[0]\t$options{\"BLATPROGRAM\"}[0] program not found at this location ('BLAT' and 'BLATPROGRAM' parameters in additional options file), reagents cannot be mapped to the database(s) defined in 'GENOMEFASTA'\n";
			print "$options{\"BLATPROGRAM\"}[0] program not found in $options{\"BLAT\"}[0] ('BLAT' and 'BLATPROGRAM' parameters in additional options file), reagents cannot be mapped to the database(s) defined in 'GENOMEFASTA'. Start NEXT-RNAi with '-h' for help.\n";
		    }
		}
		undef @NOTmapped;
		@NOTmapped = keys %NOTmapped;
		$remap = scalar(@NOTmapped);
		print REPORT "Remapping done, $remap designs could not be mapped or multiple mappings were found\n";
		print "Remapping done, $remap designs could not be mapped or multiple mappings were found\n";
# build index of dsRNA MAPPED file
		open(DSRNAMAPPED,"<$dsRNAMapped") or die "Cannot open $dsRNAMapped for reading: $!\n";
		open(DSRNAINDEX, "+>$dsRNAMapped.idx") or die "Cannot open $dsRNAMapped.idx for read/write: $!\n";
		&fileLoc('Unlink',"$dsRNAMapped.idx");
		&build_index(\*DSRNAMAPPED,\*DSRNAINDEX,'mapped',$dsRNAMapped,\%RNAiloc);
	    }
	}
	else {
	    if (($evaluation ne "DSRNA") && ($evaluation ne "DSRNA+OLIGO") && ($reagent ne 'd')){
		my $map = 0;
# check, whether re-mapping is required
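		open (DSRNAFASTA, "<$dsRNAFASTA") || die "Cannot open DSRNAFASTA: $!\n";
# an empty file yields undef here; test definedness to avoid a warning under -w
		if (defined(my $firstLine = <DSRNAFASTA>)){
		    $map = 1;
		}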
		close DSRNAFASTA;
# mapping with BLAT
		if ($map eq 1){
		    if (-e "$options{\"BLAT\"}[0]$options{\"BLATPROGRAM\"}[0]"){
			my $dsRNABLAT = $dsRNAFASTA.'.psl';
			my $BLATMapping = &BLATMapping($options{"BLAT"}[0],\@{ $options{"GENOMEFASTA"} },$mappingDB,$options{"BLATSPLIT"}[0],$dsRNAFASTA,$dsRNABLAT,\*ERROR,$options{"BLATPROGRAM"}[0],$options{"BLATHOST"}[0],$options{"BLATPORT"}[0]);
# parse BLAT search
			if ($BLATMapping eq 'Success'){
			    &fileLoc('Unlink',$dsRNAMapped);
			    &fileLoc('Output',$dsRNANotMapped,'dsRNANotMapped');
# generate index of blat output
			    open(OUTBLAT,"<$dsRNABLAT") or die "Cannot open $dsRNABLAT for reading: $!\n";
			    open(INDEX, "+>$dsRNABLAT.idx") or die "Cannot open $dsRNABLAT.idx for read/write: $!\n";
			    &fileLoc('Unlink',"$dsRNABLAT.idx");
			    &build_index(\*OUTBLAT,\*INDEX,'blat',$dsRNABLAT,\%RNAiloc);
# parse mapping via index
			    &ParseBLAT($dsRNATAB,$dsRNABLAT,$dsRNAMapped,$dsRNANotMapped,\%RNAiloc,\%NOTmapped,\%siRNAExt,$options{"BLATALIGN"}[0],\*OUTBLAT,\*INDEX);
			    close OUTBLAT;
			    close INDEX;
			}
			else {
			    print ERROR "No valid FASTA database for reagent mappings with 'blat' found\tThe mapping of long dsRNAs or siRNAs requires a FASTA database (with e.g. genomic sequences) by defining the 'GENOMEFASTA' option in the additional options file\n";
			    print "No valid FASTA database for reagent mappings with 'blat' found. The mapping of long dsRNAs or siRNAs requires a FASTA database (with e.g. genomic sequences) by defining the 'GENOMEFASTA' option in the additional options file. Start NEXT-RNAi with '-h' for help.\n";
			}
		    }
		    else {
			print ERROR "$options{\"BLAT\"}[0]\t$options{\"BLATPROGRAM\"}[0] program not found at this location ('BLAT' and 'BLATPROGRAM' parameters in additional options file), reagents cannot be mapped to the database(s) defined in 'GENOMEFASTA'\n";
			print "$options{\"BLATPROGRAM\"}[0] program not found in $options{\"BLAT\"}[0] ('BLAT' and 'BLATPROGRAM' parameters in additional options file), reagents cannot be mapped to the database(s) defined in 'GENOMEFASTA'. Start NEXT-RNAi with '-h' for help.\n";
		    }
		}
		undef @NOTmapped;
		@NOTmapped = keys %NOTmapped;
		$remap = scalar(@NOTmapped);
		print REPORT "Remapping done, $remap designs could not be mapped or multiple mappings were found\n";
		print "Remapping done, $remap designs could not be mapped or multiple mappings were found\n";
# build index of dsRNA MAPPED file
		open(DSRNAMAPPED,"<$dsRNAMapped") or die "Cannot open $dsRNAMapped for reading: $!\n";
		open(DSRNAINDEX, "+>$dsRNAMapped.idx") or die "Cannot open $dsRNAMapped.idx for read/write: $!\n";
		&fileLoc('Unlink',"$dsRNAMapped.idx");
		&build_index(\*DSRNAMAPPED,\*DSRNAINDEX,'mapped',$dsRNAMapped,\%RNAiloc);
	    }
	}
    }
    $mapping = 1;
# consolidate mappings in one file
    $Mapped = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'.Mapped';
    open(MAPPED,">$Mapped") or die "Cannot open $Mapped for writing: $!\n";
    my @RNAiloc = keys %RNAiloc;
    for (my $i=0;$i<scalar(@RNAiloc);$i++){
	my %mapping = ();
# first check for success in second mapping
	if (exists $RNAiloc{$RNAiloc[$i]}{$dsRNAMapped}){
	    my @lines = keys %{ $RNAiloc{$RNAiloc[$i]}{$dsRNAMapped} };
# only 3 mappings are allowed per reagent
	    if (scalar(@lines) <= 3){
		for (my $j=0;$j<scalar(@lines);$j++){
		    my $line = &line_with_index(\*DSRNAMAPPED,\*DSRNAINDEX,$lines[$j]);
		    $line = &cleanLine($line);
		    my @columns = split(/\t/, $line);
# differentiate between 'full' and 'partial' mappings
		    if (!exists $mapping{$columns[12]}){
			$mapping{$columns[12]} = [ $line, ];
		    }
		    else {
			push (@{ $mapping{$columns[12]} },$line);
		    }
		    delete $RNAiloc{$RNAiloc[$i]}{$dsRNAMapped}{$lines[$j]};
		}
		if (exists $mapping{'full'}){
		    for (my $j=0;$j<scalar(@{ $mapping{'full'} });$j++){
			print MAPPED "$mapping{\"full\"}[$j]\n";
		    }
		}
		else {
		    if (exists $mapping{'partial'}){
			for (my $j=0;$j<scalar(@{ $mapping{'partial'} });$j++){
			    print MAPPED "$mapping{\"partial\"}[$j]\n";
			}
		    }
		}
	    }
	    else {
		delete $RNAiloc{$RNAiloc[$i]}{$dsRNAMapped};
	    }
	}
	else {
	    if (exists $RNAiloc{$RNAiloc[$i]}{$PrimerMapped}){
		my @lines = keys %{ $RNAiloc{$RNAiloc[$i]}{$PrimerMapped} };
# only 3 mappings are allowed per reagent
		if (scalar(@lines) <= 3){
		    for (my $j=0;$j<scalar(@lines);$j++){
			my $line = &line_with_index(\*PRIMERMAPPED,\*PRIMERINDEX,$lines[$j]);
			$line = &cleanLine($line);
			my @columns = split(/\t/, $line);
# differentiate between 'full' and 'partial' mappings
			if (!exists $mapping{$columns[12]}){
			    $mapping{$columns[12]} = [ $line, ];
			}
			else {
			    push (@{ $mapping{$columns[12]} },$line);
			}
			delete $RNAiloc{$RNAiloc[$i]}{$PrimerMapped}{$lines[$j]};
		    }
		    if (exists $mapping{'full'}){
			for (my $j=0;$j<scalar(@{ $mapping{'full'} });$j++){
			    print MAPPED "$mapping{\"full\"}[$j]\n";
			}
		    }
		    else {
			if (exists $mapping{'partial'}){
			    for (my $j=0;$j<scalar(@{ $mapping{'partial'} });$j++){
				print MAPPED "$mapping{\"partial\"}[$j]\n";
			    }
			}
		    }
		}
		else {
		    delete $RNAiloc{$RNAiloc[$i]}{$PrimerMapped};
		}
	    }
	}
    }
    close PRIMERMAPPED;
    close PRIMERINDEX;
    close DSRNAMAPPED;
    close DSRNAINDEX;
    close MAPPED;
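
=begin comment

The consolidation above discards reagents with more than three mapping
lines and otherwise prefers 'full' mappings over 'partial' ones, keyed on
the 13th tab-separated column. A minimal sketch of that selection rule;
the helper name select_mappings and the demo lines are assumptions for
illustration and do not reproduce the real mapping file layout beyond
column 13.

```perl
use strict;
use warnings;

# Hypothetical helper: given tab-separated mapping lines whose 13th column
# (index 12) is 'full' or 'partial', return the lines to keep.
sub select_mappings {
    my ($lines, $max) = @_;
    return () if scalar(@{$lines}) > $max;      # too many hits: treat as ambiguous
    my %by_type;
    for my $line (@{$lines}) {
        my @columns = split(/\t/, $line);
        push @{ $by_type{ $columns[12] } }, $line;
    }
    return @{ $by_type{'full'} }    if exists $by_type{'full'};
    return @{ $by_type{'partial'} } if exists $by_type{'partial'};
    return ();
}

my @hits = (
    join("\t", 'r1', ('.') x 11, 'partial'),
    join("\t", 'r1', ('.') x 11, 'full'),
);
my @kept = select_mappings(\@hits, 3);
print scalar(@kept), " kept\n";
```

=end comment

=cut
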
    open(MAPPED,"<$Mapped") or die "Cannot open $Mapped for reading: $!\n";
    open(MAPPEDINDEX, "+>$Mapped.idx") or die "Cannot open $Mapped.idx for read/write: $!\n";
    &fileLoc('Output',$Mapped,'Mapped');
    &fileLoc('Unlink',"$Mapped.idx");
    &build_index(\*MAPPED,\*MAPPEDINDEX,'mapped',$Mapped,\%RNAiloc);
}

# Generation of GFF file for visualization of designs
if (($mapping eq 1) && ($options{"GFF"}[0] ne "NO")){
    if (($options{"GFF"}[0] eq 'GFF2') || ($options{"GFF"}[0] eq 'GFF3')){
	&GFFGenerator($identifier,\%DesignsPrint,\%RNAiloc,$outGff,$options{"GFF"}[0],\*MAPPED,\*MAPPEDINDEX,$Mapped);
	print REPORT "GFF file generation for RNAi reagents done\n";
	print "GFF file generation for RNAi reagents done\n";
    }
    else {
	print ERROR "GFF\tValid options for 'GFF' file output in additional options file are 'GFF2' and 'GFF3' (not $options{\"GFF\"}[0])\n";
	print "Valid options for 'GFF' file output in additional options file are 'GFF2' and 'GFF3' (not $options{\"GFF\"}[0]). Start NEXT-RNAi with '-h' for help.\n";
    }
}

# Generation of an annotation file format for direct uploads to GBrowse
if (($mapping eq 1) && ($options{"AFF"}[0] ne "NO")){
    &AFFGenerator($identifier,\%RNAiloc,$outAff,\*MAPPED,\*MAPPEDINDEX,$Mapped);
    print REPORT "Annotation file generation for RNAi reagents done\n";
    print "Annotation file generation for RNAi reagents done\n";
}

# Evaluates for target features (UTRs, SNPs etc.)
if (($mapping eq 1) && ($options{"FEATURE"}[0] ne "empty")){
    if (-e $options{"FEATURE"}[0]){
	&FeatureContent(\%DesignsPrint,\%RNAiloc,$options{"FEATURE"}[0],\%FeatureName,\%FeatureNum,\*ERROR,\*MAPPED,\*MAPPEDINDEX,$Mapped);
	print REPORT "Calculation of feature contents done\n";
	print "Calculation of feature contents done\n";
	$FeatureCalc = 1;
    }
    else {
	print ERROR "$options{\"FEATURE\"}[0]\t'FEATURE' file defined in additional options file was not found\n";
	print "'FEATURE' file $options{\"FEATURE\"}[0] defined in additional options file was not found. Start NEXT-RNAi with '-h' for help.\n";
    }
}

# off-target evaluation by location of RNAi reagent
my $oteeval = 0;
my @OTEHTML = ();
if (($mapping eq 1) && ($options{"OTEEVAL"}[0] ne "empty")){
# for each queried off-target evaluation go through three steps: dicing, Bowtie, Bowtie parsing
    my @IDsub = keys %IDSeqBest;
    for (my $i=0;$i<scalar(@{ $options{"OTEEVAL"} });$i++){
	my @OTEparam = split(/\,/,$options{"OTEEVAL"}[$i]);
	if ($OTEparam[2] eq "pos"){
	    if ((-e "$OTEparam[0]\.1\.ebwt") && (-e "$OTEparam[0]\.2\.ebwt") && (-e "$OTEparam[0]\.3\.ebwt") && (-e "$OTEparam[0]\.4\.ebwt") && (-e "$OTEparam[0]\.rev\.1\.ebwt") && (-e "$OTEparam[0]\.rev\.2\.ebwt")){
		if (($OTEparam[1]=~/^\d+$/) && ($OTEparam[1] > 0)){
		    push (@OTEHTML, $OTEparam[0]);
		    print "Perform off-target evaluation by comparing absolute mapping positions\n";
		    my $outEdicer = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.$OTEparam[1].'_'.$i.'.OTEdicePos';
		    my @IDSeqBestKeys = keys %IDSeqBest;
		    &edicer($outEdicer,$OTEparam[1],\%IDSeqBest,\@IDSeqBestKeys);
		    my $outBowtie = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.$OTEparam[1].'_'.$i.'.OTEdicePos.bwt';
		    system ("$options{\"BOWTIE\"}[0]bowtie -p 4 -f -v 0 -a $OTEparam[0] $outEdicer > $outBowtie") == 0 || die "Failed to run bowtie: $?\n";
		    &fileLoc('Unlink',$outBowtie);
		    &BowtieParsePos(\@IDsub,$outBowtie,\%RNAiloc,\%OTE,\*MAPPED,\*MAPPEDINDEX,$Mapped);
		    print REPORT "Off-target evaluation $i for RNAi reagents done\n";
		    print "Off-target evaluation $i for RNAi reagents done\n";
		    $oteeval++;
		    &fileLoc('Input',$OTEparam[0],"OTEEVAL_$oteeval");
		}
		else {
		    print ERROR "OTEEVAL\tsiRNA length selected for additional off-target evaluation in 'OTEEVAL' option must be a number > 0 (not $OTEparam[1]), reagents cannot be evaluated for additional off-target effects\n";
		    print "siRNA length selected for additional off-target evaluation in 'OTEEVAL' option must be a number > 0 (not $OTEparam[1]). Reagents cannot be evaluated for additional off-target effects. Start NEXT-RNAi with '-h' for help.\n";
		}
	    }
	    else {
		print ERROR "OTEEVAL\tBowtie index/database $OTEparam[0] for additional off-target evaluation not found in 'OTEEVAL', reagents cannot be evaluated for additional off-target effects\n";
		print "Bowtie index/database $OTEparam[0] for additional off-target evaluation in 'OTEEVAL' not found. Reagents cannot be evaluated for additional off-target effects. Start NEXT-RNAi with '-h' for help.\n";
	    }
	}
    }
}

# off-target evaluation by target of RNAi reagent
if ($options{"OTEEVAL"}[0] ne "empty"){
# for each queried off-target evaluation go through three steps: dicing, Bowtie, Bowtie parsing
    my @IDsub = keys %IDSeqBest;
    for (my $i=0;$i<scalar(@{ $options{"OTEEVAL"} });$i++){
	my @OTEparam = split(/\,/,$options{"OTEEVAL"}[$i]);
	my %probeTarget = ();
	my %siRNATarget = ();
	my %probeTargetGroups = ();
	if ($OTEparam[2] eq "target"){
	    if ((-e "$OTEparam[0]\.1\.ebwt") && (-e "$OTEparam[0]\.2\.ebwt") && (-e "$OTEparam[0]\.3\.ebwt") && (-e "$OTEparam[0]\.4\.ebwt") && (-e "$OTEparam[0]\.rev\.1\.ebwt") && (-e "$OTEparam[0]\.rev\.2\.ebwt")){
                if (($OTEparam[1]=~/^\d+$/) && ($OTEparam[1] > 0)){
		    push (@OTEHTML, $OTEparam[0]);
		    print "Perform off-target evaluation by comparing targets\n";
		    my $outEdicer = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.$OTEparam[1].'_'.$i.'.OTEdiceTarget';
		    my @IDSeqBestKeys =keys %IDSeqBest;
		    &edicer($outEdicer,$OTEparam[1],\%IDSeqBest,\@IDSeqBestKeys);
# run Bowtie
		    my $outBowtie = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.$OTEparam[1].'_'.$i.'.OTEdiceTarget.bwt';
		    system ("$options{\"BOWTIE\"}[0]bowtie -p 4 -f -v 0 -a $OTEparam[0] $outEdicer > $outBowtie") == 0 || die "Failed to run bowtie: $?\n";
		    &fileLoc('Unlink',$outBowtie);
		    &BowtieTarget(\@IDsub,$outBowtie,"",\%probeTarget,\%siRNATarget,"","","","");
		    &targetGroups(\@IDsub,\%targetGroups,\%probeTarget,\%probeTargetGroups,\*ERROR);
		    &BowtieParseTarget(\@IDsub,$outBowtie,\%targetGroups,\%probeTargetGroups,\%OTE);
		    print REPORT "Off-target evaluation $i for RNAi reagents done\n";
		    print "Off-target evaluation $i for RNAi reagents done\n";
		    $oteeval++;
		    &fileLoc('Input',$OTEparam[0],"OTEEVAL_$oteeval");
		}
		else {
		    print ERROR "OTEEVAL\tsiRNA length selected for additional off-target evaluation in 'OTEEVAL' option must be a number > 0 (not $OTEparam[1]), reagents cannot be evaluated for additional off-target effects\n";
		    print "siRNA length selected for additional off-target evaluation in 'OTEEVAL' option must be a number > 0 (not $OTEparam[1]). Reagents cannot be evaluated for additional off-target effects. Start NEXT-RNAi with '-h' for help.\n";
		}
	    }
	    else {
		print ERROR "OTEEVAL\tBowtie index/database $OTEparam[0] for additional off-target evaluation not found in 'OTEEVAL', reagents cannot be evaluated for additional off-target effects\n";
		print "Bowtie index/database $OTEparam[0] for additional off-target evaluation in 'OTEEVAL' not found. Reagents cannot be evaluated for additional off-target effects. Start NEXT-RNAi with '-h' for help.\n";
	    }
	}
    }
}

# evaluation of homology of designed RNAi reagents 
my %Homology = ();
my $homology = 0;
if ($options{"HOMOLOGY"}[0] ne "empty"){
    my $outBLAST = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_Homology.bl';
    my @BLASTparam = split(/\,/,$options{"HOMOLOGY"}[0]);
    if (-e "$BLASTparam[0]blastall"){
	if ((-e "$BLASTparam[1]\.nhr") && (-e "$BLASTparam[1]\.nin") && (-e "$BLASTparam[1]\.nsd") && (-e "$BLASTparam[1]\.nsi") && (-e "$BLASTparam[1]\.nsq") && (-e $BLASTparam[1])){
	    if ((($BLASTparam[2]=~/^\d+$/) || ($BLASTparam[2]=~/^\d+\.\d+$/) || ($BLASTparam[2]=~/^\d+e-\d+$/)) && ($BLASTparam[2] >= 0)){
		&BLASTHomology($BLASTparam[0],$BLASTparam[1],$BLASTparam[2],$outBLAST,$IDSeqBestFASTA,\%Homology,\%targetGroups,\*ERROR);
		&fileLoc('Output',$outBLAST,'Homology');
		$homology = 1;
		print REPORT "Homology evaluation done\n";
		print "Homology evaluation done\n";
	    }
	    else {
		print ERROR "HOMOLOGY\tInteger numbers, floating point numbers and scientific numbers >= 0 are allowed as homology cut-off in the 'HOMOLOGY' option (not $BLASTparam[2])\n";
		print "Integer numbers, floating point numbers and scientific numbers >= 0 are allowed as homology cut-off in the 'HOMOLOGY' option only (not $BLASTparam[2]). Start NEXT-RNAi with '-h' for help.\n";
	    }
	}
	else {
	    print ERROR "$BLASTparam[1]\tBlast database not found or incomplete ('HOMOLOGY' parameter in additional options file), reagents cannot be evaluated for homology, Blast database consists of FASTA file, *.nhr file, *.nin file, *.nsd file, *.nsi file and *.nsq file and can be obtained by the use of the 'formatdb' program coming with blast\n";
	    print "Blast database $BLASTparam[1] not found or incomplete ('HOMOLOGY' parameter in additional options file). Reagents cannot be evaluated for homology. A Blast database consists of FASTA file, *.nhr file, *.nin file, *.nsd file, *.nsi file and *.nsq file and can be obtained by the use of the 'formatdb' program coming with Blast. Start NEXT-RNAi with '-h' for help.\n";
	}
    }
    else {
	print ERROR "$BLASTparam[0]\t'blastall' program not found at this location ('HOMOLOGY' parameter in additional options file), reagents cannot be evaluated for homology\n";
	print "'blastall' program not found in $BLASTparam[0] ('HOMOLOGY' parameter in additional options file), reagents cannot be evaluated for homology. Start NEXT-RNAi with '-h' for help.\n";
    }
}

# evaluation of miRNA seeds of designed RNAi reagents

if ($mirseed eq 1){
    &fileLoc('Output',$outmiRNASeed,'miRNASeed');
    open (MIRSEED, ">$outmiRNASeed") || die "Cannot open MIRSEED: $!\n";
    for (my $i=0;$i<scalar(@IDscovered);$i++){
	if (exists $mirSeed{$IDscovered[$i]}){
	    my $index = 1;
	    for (my $j=0;$j<scalar(@{ $DesignsPrint{$IDscovered[$i]}[3] });$j++){
		my %seed = ();
# for long dsRNAs
		if ($reagent eq 'd'){
		    my $start = $DesignsPrint{$IDscovered[$i]}[3][$j];
		    my $end = $DesignsPrint{$IDscovered[$i]}[5][$j] - $options{"SIRNALENGTH"}[0] + 1;
		    for (my $k=$start;$k<=$end;$k++){
			if (exists $mirSeed{$IDscovered[$i]}{$k}){
			    for (my $l=0;$l<scalar(@{ $mirSeed{$IDscovered[$i]}{$k} });$l++){
				if (!exists $seed{$mirSeed{$IDscovered[$i]}{$k}[$l]}){
				    $seed{$mirSeed{$IDscovered[$i]}{$k}[$l]} = 1;			    
				}
				else {
				    $seed{$mirSeed{$IDscovered[$i]}{$k}[$l]}++;
				}
			    }
			}
		    }
		}
		else {
# for siRNAs
		    if (exists $mirSeed{$IDscovered[$i]}{$DesignsPrint{$IDscovered[$i]}[10][$j]}){
			for (my $l=0;$l<scalar(@{ $mirSeed{$IDscovered[$i]}{$DesignsPrint{$IDscovered[$i]}[10][$j]} });$l++){
			    if (!exists $seed{$mirSeed{$IDscovered[$i]}{$DesignsPrint{$IDscovered[$i]}[10][$j]}[$l]}){
				$seed{$mirSeed{$IDscovered[$i]}{$DesignsPrint{$IDscovered[$i]}[10][$j]}[$l]} = 1;
			    }
			    else {
				$seed{$mirSeed{$IDscovered[$i]}{$DesignsPrint{$IDscovered[$i]}[10][$j]}[$l]}++;
			    }
			}
		    }
		}
		my @seeds = keys %seed;
		my $seedOut = '';
		for (my $k=0;$k<scalar(@seeds);$k++){
		    if ($k eq 0){
			$seedOut.= "$seeds[$k]\($seed{$seeds[$k]}\)";
		    }
		    else {
			$seedOut.= "\,$seeds[$k]\($seed{$seeds[$k]}\)";
		    }
		}
		if ($seedOut ne ''){
		    print MIRSEED "$IDscovered[$i]\_$index\t$seedOut\n";
		}
		$index++;
	    }
	}
    }
    close MIRSEED;
}

##
## POOL summarization option for siRNA evaluation
##

my %poolsiRNA = ();
my %poolResults = ();
my %undefsiRNA = ();
my $poolsiRNA = 0;
if (($reagent eq 's') && ($evaluation eq 'OLIGO') && ($options{"POOL"}[0] ne 'empty')){
    if (-e $options{"POOL"}[0]){
	my %header = ();
	my $header = 0;
	&fileLoc('Input',$options{"POOL"}[0],'siRNAPOOLS');
	open (POOL, $options{"POOL"}[0]) || die "Cannot open POOL: $!\n";
      POOLS: 
	while (my $line = <POOL>){
	    $line = &cleanLine($line);
	    my @columns = ();
	    @columns = split(/\t/, $line);
# get file headers
	    if ($header eq 0){
		for (my $i=0;$i<scalar(@columns);$i++){
		    if (!exists $header{$columns[$i]}){
			$header{$columns[$i]} = $i;
		    }
		}
		if ((!exists $header{'POOLID'}) || (!exists $header{'siRNAID'})){
		    print ERROR "$options{\"POOL\"}[0]\tsiRNA 'POOL' file contains wrong header information ('siRNAID' and 'POOLID' headers required)\n";
		    print "siRNA 'POOL' file $options{\"POOL\"}[0] contains wrong header information ('siRNAID' and 'POOLID' headers required). Start NEXT-RNAi with '-h' for help.\n";
		    last POOLS;
		}
	    }
	    else {
		if (!exists $poolsiRNA{$columns[$header{'POOLID'}]}){
		    $poolsiRNA{$columns[$header{'POOLID'}]} = [ $columns[$header{'siRNAID'}], ];
		}
		else {
		    push (@{ $poolsiRNA{$columns[$header{'POOLID'}]} }, $columns[$header{'siRNAID'}]);
		}
	    }
	    $header++;
	}
	close POOL;
	my @poolsiRNA = keys %poolsiRNA;
	if (scalar(@poolsiRNA) > 0){
	    $poolsiRNA = 1;
	}
    }
    else {
	print ERROR "$options{\"POOL\"}[0]\tsiRNA 'POOL' file does not exist, summary for siRNA pools cannot be generated\n";
        print "siRNA 'POOL' file $options{\"POOL\"}[0] does not exist, summary for siRNA pools cannot be generated. Start NEXT-RNAi with '-h' for help.\n";
    }
}
if ($poolsiRNA eq 1){
    my @poolsiRNA = keys %poolsiRNA;
    for (my $i=0;$i<scalar(@poolsiRNA);$i++){
	my %eff = ();
	my %spec = ();
	my %target = ();
	for (my $j=0;$j<scalar(@{ $poolsiRNA{$poolsiRNA[$i]} });$j++){
	    if (exists $DesignsPrint{$poolsiRNA{$poolsiRNA[$i]}[$j]}){
		if (!exists $poolResults{$poolsiRNA[$i]}){
# collect siRNA results: position, length, sequence, efficiency method, efficiency (placeholder for averaging),
# specificity (placeholder), seed complement frequencies, targetgroups (placeholder), target (placeholder), target hits (placeholder)
		    $poolResults{$poolsiRNA[$i]} = [ $poolsiRNA{$poolsiRNA[$i]}[$j], $DesignsPrint{$poolsiRNA{$poolsiRNA[$i]}[$j]}[10][0], $DesignsPrint{$poolsiRNA{$poolsiRNA[$i]}[$j]}[11][0], $DesignsPrint{$poolsiRNA{$poolsiRNA[$i]}[$j]}[12][0], $DesignsPrint{$poolsiRNA{$poolsiRNA[$i]}[$j]}[14][0], '', '', $DesignsPrint{$poolsiRNA{$poolsiRNA[$i]}[$j]}[18][0], '', '', '', ];
		    $undefsiRNA{$poolsiRNA{$poolsiRNA[$i]}[$j]} = $poolsiRNA[$i];
# count efficiencies
		    if ($options{"EFFICIENCY"}[0] ne 'empty,empty'){
			$eff{$poolsiRNA[$i]} = [ $DesignsPrint{$poolsiRNA{$poolsiRNA[$i]}[$j]}[15][0], 1, ];
		    }
# separate different specificity parameters
		    my @unspec = split(/\//, $DesignsPrint{$poolsiRNA{$poolsiRNA[$i]}[$j]}[17][0]);
		    $spec{$poolsiRNA[$i]} = [ @unspec, ];
# collect targetgroups, targets, hits
		    my @gene = split(/\&/, $DesignsPrint{$poolsiRNA{$poolsiRNA[$i]}[$j]}[19][0]);
		    my @txn = split(/\&/, $DesignsPrint{$poolsiRNA{$poolsiRNA[$i]}[$j]}[20][0]);
		    my @hits = split(/\&/, $DesignsPrint{$poolsiRNA{$poolsiRNA[$i]}[$j]}[21][0]);
		    for (my $k=0;$k<scalar(@gene);$k++){
			my @txn2 = split(/\+/, $txn[$k]);
			my @hits2 = split(/\+/, $hits[$k]);
			for (my $l=0;$l<scalar(@txn2);$l++){
			    if ($hits2[$l] ne 'NA'){
				$target{$poolsiRNA[$i]}{$gene[$k]}{$txn2[$l]} = $hits2[$l];
			    }
			}
		    }
		}
		else {
# siRNA ID concatenated by '&'
		    $poolResults{$poolsiRNA[$i]}[0].= '&'.$poolsiRNA{$poolsiRNA[$i]}[$j];
# position concatenated by '&'
		    $poolResults{$poolsiRNA[$i]}[1].= '&'.$DesignsPrint{$poolsiRNA{$poolsiRNA[$i]}[$j]}[10][0];
# length concatenated by '&'
		    $poolResults{$poolsiRNA[$i]}[2].= '&'.$DesignsPrint{$poolsiRNA{$poolsiRNA[$i]}[$j]}[11][0];
# sequence concatenated by '&'
		    $poolResults{$poolsiRNA[$i]}[3].= '&'.$DesignsPrint{$poolsiRNA{$poolsiRNA[$i]}[$j]}[12][0];
# efficiencies concatenated by '&'
		    if ($options{"EFFICIENCY"}[0] ne 'empty,empty'){
			$eff{$poolsiRNA[$i]}[0].= '&'.$DesignsPrint{$poolsiRNA{$poolsiRNA[$i]}[$j]}[15][0];
			$eff{$poolsiRNA[$i]}[1]++;
		    }
# seed complement frequencies concatenated by '&'
		    $poolResults{$poolsiRNA[$i]}[7].= '&'.$DesignsPrint{$poolsiRNA{$poolsiRNA[$i]}[$j]}[18][0];
# addition of specificity parameters
		    my @unspec = split(/\//, $DesignsPrint{$poolsiRNA{$poolsiRNA[$i]}[$j]}[17][0]);
		    for (my $k=0;$k<scalar(@unspec);$k++){
# parameters at indices 5-7 are kept per siRNA (concatenated by '&'), all others are summed up
			if (($k eq 5) || ($k eq 6) || ($k eq 7)){
			    $spec{$poolsiRNA[$i]}[$k].= '&'.$unspec[$k];
			}
			else {
			    $spec{$poolsiRNA[$i]}[$k]+= $unspec[$k];
			}
		    }
# collect targetgroups, targets, hits
		    my @gene = split(/\&/, $DesignsPrint{$poolsiRNA{$poolsiRNA[$i]}[$j]}[19][0]);
		    my @txn = split(/\&/, $DesignsPrint{$poolsiRNA{$poolsiRNA[$i]}[$j]}[20][0]);
		    my @hits = split(/\&/, $DesignsPrint{$poolsiRNA{$poolsiRNA[$i]}[$j]}[21][0]);
		    for (my $k=0;$k<scalar(@gene);$k++){
			my @txn2 = split(/\+/, $txn[$k]);
			my @hits2 = split(/\+/, $hits[$k]);
			for (my $l=0;$l<scalar(@txn2);$l++){
			    if ($hits2[$l] ne 'NA'){
				if (!exists $target{$poolsiRNA[$i]}{$gene[$k]}{$txn2[$l]}){
				    $target{$poolsiRNA[$i]}{$gene[$k]}{$txn2[$l]} = $hits2[$l];
				}
				else {
				    $target{$poolsiRNA[$i]}{$gene[$k]}{$txn2[$l]}+= $hits2[$l];
				}
			    }
			}
		    }
		    $undefsiRNA{$poolsiRNA{$poolsiRNA[$i]}[$j]} = $poolsiRNA[$i];
		}
	    }
	}
# add efficiency, specificity and target information to pool results
	if (exists $poolResults{$poolsiRNA[$i]}){
	    $poolResults{$poolsiRNA[$i]}[5] = $eff{$poolsiRNA[$i]}[0];
	    my $spec = join('/', @{ $spec{$poolsiRNA[$i]}});
	    $poolResults{$poolsiRNA[$i]}[6] = $spec;
	    
# sort targets according to number of hits
	    my @gene = keys %{ $target{$poolsiRNA[$i]} };
	    my @bestgene = ();
	    for (my $j=0;$j<scalar(@gene);$j++){
		my @txn = keys %{ $target{$poolsiRNA[$i]}{$gene[$j]} };
		my @hits = ();
		for (my $k=0;$k<scalar(@txn);$k++){
		    push (@hits,$target{$poolsiRNA[$i]}{$gene[$j]}{$txn[$k]});
		}
# sort targets within a certain group
		@txn = @txn[ sort {$hits[$b] <=> $hits[$a]} 0 .. $#txn ];
		@hits = sort {$b<=>$a}(@hits);
		push (@bestgene, $hits[0]);
	    }
# sort groups according to number of hits to best target within a group, to identify primary/intended target
	    @gene = @gene[ sort {$bestgene[$b] <=> $bestgene[$a]} 0 .. $#gene ];
	    my $groupTargets = "";
	    my $Target = "";
	    my $TargetHits = "";
	    for (my $j=0;$j<scalar(@gene);$j++){
		if ($j eq 0){
		    $groupTargets = $gene[$j];
		}
		else {
		    $groupTargets.= '&'.$gene[$j];
		}
		my @txn = keys %{ $target{$poolsiRNA[$i]}{$gene[$j]} };
		my @hits = ();
		for (my $k=0;$k<scalar(@txn);$k++){
		    push (@hits,$target{$poolsiRNA[$i]}{$gene[$j]}{$txn[$k]});
		}
# sort targets of target group according to number of hits
		@txn = @txn[ sort {$hits[$b] <=> $hits[$a]} 0 .. $#txn ];
		@hits = sort {$b<=>$a}(@hits);
		for (my $k=0;$k<scalar(@txn);$k++){
		    if ($j eq 0){
			if ($k eq 0){
			    $Target = $txn[$k];
			    $TargetHits = $hits[$k];
			}
			else {
			    $Target.= '+'.$txn[$k];
			    $TargetHits.= '+'.$hits[$k];
			}
		    }
		    else {
			if ($k eq 0){
			    $Target.= '&'.$txn[$k];
			    $TargetHits.= '&'.$hits[$k];
			}
			else {
			    $Target.= '+'.$txn[$k];
			    $TargetHits.= '+'.$hits[$k];
			}
		    }
		}
	    }
	    $poolResults{$poolsiRNA[$i]}[8] = $groupTargets;
	    $poolResults{$poolsiRNA[$i]}[9] = $Target;
	    $poolResults{$poolsiRNA[$i]}[10] = $TargetHits;
	}
    }
# delete entries in design hash and add the POOL calculations instead
    my @delete = keys %undefsiRNA;
    for (my $i=0;$i<scalar(@delete);$i++){
	delete $DesignsPrint{$delete[$i]};
    }
    my @POOLS = keys %poolResults;
    for (my $i=0;$i<scalar(@POOLS);$i++){
	if (!exists $DesignsPrint{$POOLS[$i]}){
	    $DesignsPrint{$POOLS[$i]} = [ [ 'POOL', ],[$poolResults{$POOLS[$i]}[0]],[],[],[],[],[],[],[],[],[ $poolResults{$POOLS[$i]}[1], ],[ $poolResults{$POOLS[$i]}[2], ],[ $poolResults{$POOLS[$i]}[3], ],[],[ $poolResults{$POOLS[$i]}[4], ],[ $poolResults{$POOLS[$i]}[5], ],[],[ $poolResults{$POOLS[$i]}[6], ],[ $poolResults{$POOLS[$i]}[7], ],[ $poolResults{$POOLS[$i]}[8], ],[ $poolResults{$POOLS[$i]}[9], ],[ $poolResults{$POOLS[$i]}[10], ] ];
	}
    }
}

##
## OUTPUT of results as tab delimited file and HTML report
##

&fileLoc('Output',$outTab,'DesignsTAB');
open (OUTTAB, ">$outTab") || die "Cannot open OUTTAB: $!\n";
# index HTML file
open (OUTHTML, ">$outHTML") || die "Cannot open OUTHTML: $!\n";
print OUTTAB "QueryID\tQuerySubID";
# statistics output file
my $statoutTAB = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'.stats';
&fileLoc('Output',$statoutTAB,'STATS');

# get HTML header and footer, print header to index file
my $header = &HTMLhead($identifier);
my $footer = &HTMLfoot();
print OUTHTML "$header\n";
# print HTML page heading, some statistics on successful and failed designs
my $designopt = "";
if ($evaluation eq 'NO'){
    $designopt = "design(s)";
}
else {
    $designopt = "evaluation(s)";
}
my $success = scalar(@IDscovered);
my $failure = scalar(keys %IDfail);
my $query = $success + $failure;
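# guard against division by zero if no queries were processed
my $succesperc = ($query > 0) ? sprintf("%.2f", (($success / $query)*100)) : '0.00';
my $failureperc = ($query > 0) ? sprintf("%.2f", (($failure / $query)*100)) : '0.00';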

# update IDscovered if POOL analysis was queried
if (($reagent eq 's') && ($evaluation eq 'OLIGO') && ($options{"POOL"}[0] ne 'empty')){
    @IDscovered = keys %DesignsPrint;
}

# add exception for E-RNAi
if ($identifier ne 'E-RNAi'){
    print OUTHTML <<"";
<table width="800" border="0" cellpadding="0" cellspacing="5" class="news">
    <tr>
        <td align="center" valign="middle"><strong>NEXT-RNAi results for $identifier $designopt</strong></td>
    </tr>
</table>


}
else {
    print OUTHTML <<"";
<table width="800" border="0" cellpadding="0" cellspacing="5" class="news">
    <tr>
        <td align="center" valign="middle"><strong>Output for $identifier $designopt <a href="http://b110-wiki.dkfz.de/signaling/wiki/display/ernai/E-RNAi+output" target="_blank" title="E-RNAi Wiki" style="text-decoration:none">[Help]</a></strong></td>
    </tr>
</table>


}

print OUTHTML <<"";
<table width="800" border="0" cellpadding="0" cellspacing="5" class="main">
    <tr>
        <td class="main">&nbsp;</td>
    </tr>
    <tr>
        <td class="main"><strong>Number of queries: $query<br>Queries covered by $designopt: $success ($succesperc %)<br>Queries not covered by $designopt: $failure ($failureperc %)<br></strong></td>
    </tr>
    <tr>
        <td class="main">More statistics on designs are <a href="#stats">here</a></td>
    </tr>
    <tr>
        <td class="main">&nbsp;</td>
    </tr>


# list designs
print OUTHTML <<'';
    <tr>
        <td class="style8"><strong>Links to HTML results</strong></td>
    </tr>


# siRNA design/evaluation header output
if ($reagent eq "s"){
    print OUTTAB "\tPosition\tLength[nt]\tSequence\tSpecificity[Abs]";
    if ($seedmatch eq 1){
	print OUTTAB "\tSCF";
    }
    if ($mirseed eq 1){
	print OUTTAB "\tmirSeed";
    }
    if ($options{"EFFICIENCY"}[0] ne 'empty,empty'){
	print OUTTAB "\tEfficiencyScore";
    }
    print OUTTAB "\tIntendedGene\tIntendedTxn\tIntendedTxnHits\tOtherGene\tOtherTxn\tOtherTxnHits";
    if ($mapping eq 1){
	print OUTTAB "\tLocation";
    }
    if ($FeatureCalc eq 1){
	my @features = keys %FeatureName;
	@features = sort {$a cmp $b} (@features);
	for (my $i=0;$i<scalar(@features);$i++){
	    print OUTTAB "\t$features[$i]";
	}
    }
    if ($lowcomp eq 1){
	print OUTTAB "\tLowComplexRegions";
    }
    if ($canrepeats eq 1){
	print OUTTAB "\tCANRepeats";
    }
    for (my $i=0;$i<$oteeval;$i++){
	my $num = $i + 1;
	print OUTTAB "\tOTEEVAL_$num";
    }
    if ($homology eq 1){
	print OUTTAB "\tHomology";
    }
    print OUTTAB "\n";
}
else {
# long dsRNA design/evaluation header output
    print OUTTAB "\tLength[nt]\tSeqFor\tPosFor\tLenFor\tGCFor[%]\tTmFor[*C]\tSeqRev\tPosRev\tLenRev\tGCRev[%]\tTmRev[*C]\tForRevPenalty\tSpecificity[Abs]\tSpecificity[%]";
    if ($options{"EFFICIENCY"}[0] ne 'empty,empty'){
	print OUTTAB "\tEfficientsiRNAs\tAvgEfficiencyScore";
    }
    print OUTTAB "\tSequence\tIntendedGene\tIntendedTxn\tIntendedTxnHits\tOtherGene\tOtherTxn\tOtherTxnHits";
    if ($mapping eq 1){
	print OUTTAB "\tLocation";
    }
    if ($FeatureCalc eq 1){
	my @features = keys %FeatureName;
	@features = sort {$a cmp $b} (@features);
	for (my $i=0;$i<scalar(@features);$i++){
            print OUTTAB "\t$features[$i]";
        }
    }
    if ($seedmatch eq 1){
	print OUTTAB "\thighSCF";
    }
    if ($mirseed eq 1){
	print OUTTAB "\tmirSeed";
    }
    if ($lowcomp eq 1){
	print OUTTAB "\tLowComplexRegions";
    }
    if ($canrepeats eq 1){
	print OUTTAB "\tCANRepeats";
    }
    for (my $i=0;$i<$oteeval;$i++){
	my $num = $i + 1;
	print OUTTAB "\tOTEEVAL_$num";
    }
    if ($homology eq 1){
	print OUTTAB "\tHomology";
    }
    print OUTTAB "\n";
}

my $HTMLindex = 0;
for (my $i=0;$i<scalar(@IDscovered);$i++){
    my $helpLink = "";
    if ($identifier eq 'E-RNAi'){
	$helpLink = '<a href="http://b110-wiki.dkfz.de/signaling/wiki/display/ernai/E-RNAi+output#E-RNAioutput-LinkstoHTMLresults" target="_blank" title="E-RNAi Wiki" style="text-decoration:none">[Help]</a>';
    }
    my $index = 1;
# HTML output for each design
    my $designHTML = $HTMLoutfolder.$IDscovered[$i].'.html';
    open (DESIGNHTML, ">$designHTML") || die "Cannot open DESIGNHTML ($designHTML): $!\n";
    print DESIGNHTML "$header\n";
    print DESIGNHTML <<"";
<table width="800" border="0" cellpadding="0" cellspacing="5" class="news">
    <tr>
	<td align="center" valign="middle"><strong>Query ID: $IDscovered[$i] $helpLink</strong></td>
    </tr>
</table>
<table border="0" cellpadding="0" cellspacing="0">
    <tr>
        <td>&nbsp;</td>
    </tr>
</table>


    for (my $j=0;$j<scalar(@{ $DesignsPrint{$IDscovered[$i]}[17] });$j++){
# check if output is POOL output
	my @IDsub = ();
	if ($DesignsPrint{$IDscovered[$i]}[0][$j] eq 'POOL'){
	    @IDsub = split(/\&/,$DesignsPrint{$IDscovered[$i]}[1][$j]);
	    for (my $k=0;$k<scalar(@IDsub);$k++){
                $IDsub[$k].= '_'.$index;
	    }
	}
# prepare specificity output
	my @unspec = split(/\//,$DesignsPrint{$IDscovered[$i]}[17][$j]);
	$DesignsPrint{$IDscovered[$i]}[17][$j] = $unspec[0].'/'.$unspec[1].'/'.$unspec[2].'/'.$unspec[3];
# ID output
	my $IDsub = $IDscovered[$i].'_'.$index;
	if ($DesignsPrint{$IDscovered[$i]}[0][$j] eq 'POOL'){
	    print OUTTAB "$IDscovered[$i]\t$DesignsPrint{$IDscovered[$i]}[1][$j]";
	}
	else {
	    print OUTTAB "$IDscovered[$i]\t$IDsub";
	}
# prepare RNAiloc output
	my $RNAiloc = "";
# for siRNA POOLs
	if (scalar(@IDsub) ne 0){
	    for (my $k=0;$k<scalar(@IDsub);$k++){
		if ($k eq 0){
		    if (exists $RNAiloc{$IDsub[$k]}{$Mapped}){
# check for multiple mappings of RNAi reagent
			my @chroms = ();
			my @starts = ();
			my @ends = ();
			my @orientations = ();
			&ParseMAPPING(\%RNAiloc,$IDsub[$k],\*MAPPED,\*MAPPEDINDEX,$Mapped,\@chroms,\@starts,\@ends,\@orientations);
			for (my $l=0;$l<scalar(@chroms);$l++){
			    if ($l eq 0){
				$RNAiloc = $chroms[$l].':'.$starts[$l].'..'.$ends[$l].'('.$orientations[$l].')';
			    }
			    else {
				$RNAiloc.= '|'.$chroms[$l].':'.$starts[$l].'..'.$ends[$l].'('.$orientations[$l].')';
			    }
			}
		    }
		    else {
			$RNAiloc = 'NA';
		    }
		}
		else {
# check for multiple mappings of RNAi reagent
		    if (exists $RNAiloc{$IDsub[$k]}{$Mapped}){
			my @chroms = ();
                        my @starts = ();
                        my @ends = ();
                        my @orientations = ();
                        &ParseMAPPING(\%RNAiloc,$IDsub[$k],\*MAPPED,\*MAPPEDINDEX,$Mapped,\@chroms,\@starts,\@ends,\@orientations);
			for (my $l=0;$l<scalar(@chroms);$l++){
			    if ($l eq 0){
				$RNAiloc.= '&'.$chroms[$l].':'.$starts[$l].'..'.$ends[$l].'('.$orientations[$l].')';
			    }
			    else {
				$RNAiloc.= '|'.$chroms[$l].':'.$starts[$l].'..'.$ends[$l].'('.$orientations[$l].')';
			    }
			}
		    }
		    else {
			$RNAiloc.= '&NA';
		    }
		}
	    }
	}
	else {
# for single sequences
	    if (exists $RNAiloc{$IDsub}{$Mapped}){
# check for multiple mappings of RNAi reagent
		my @chroms = ();
		my @starts = ();
		my @ends = ();
		my @orientations = ();
		&ParseMAPPING(\%RNAiloc,$IDsub,\*MAPPED,\*MAPPEDINDEX,$Mapped,\@chroms,\@starts,\@ends,\@orientations);
		for (my $k=0;$k<scalar(@chroms);$k++){
		    if ($k eq 0){
			$RNAiloc = $chroms[$k].':'.$starts[$k].'..'.$ends[$k].'('.$orientations[$k].')';
		    }
		    else {
			$RNAiloc.= '|'.$chroms[$k].':'.$starts[$k].'..'.$ends[$k].'('.$orientations[$k].')';
		    }
		}
	    }
	    else {
		$RNAiloc = 'NA';
	    }
	}
# prepare off-target effect output
	my $OTE = "";
	my @OTE = ();
	my @OTE1 = ();
	my @OTE2 = ();
# for siRNA POOLs
	if (scalar(@IDsub) ne 0){
	    for (my $k=0;$k<scalar(@IDsub);$k++){
		if (exists $OTE{$IDsub[$k]}){
		    for (my $l=0;$l<scalar(@{ $OTE{$IDsub[$k]} });$l++){
			my @OTEsplit = split(/\//,$OTE{$IDsub[$k]}[$l]);
			if ($k eq 0){
			    $OTE[$l][0] = $OTEsplit[0];
			    $OTE[$l][1] = $OTEsplit[1];
			}
			else {
			    $OTE[$l][0]+= $OTEsplit[0];
                            $OTE[$l][1]+= $OTEsplit[1];
			}
		    }
		}
	    }
# no off-target information found
	    if (scalar (@OTE) eq 0){
		$OTE = 'NA';
		push (@OTE1,'NA');
		push (@OTE2,'NA');
	    }
	    else {
# off-target information found
		for (my $k=0;$k<scalar(@OTE);$k++){
		    push (@OTE1,$OTE[$k][0]);
                    push (@OTE2,$OTE[$k][1]);
		    if ($k eq 0){
                        $OTE = $OTE[$k][0].'/'.$OTE[$k][1];
                    }
                    else {
                        $OTE.= "\t$OTE[$k][0]/$OTE[$k][1]";
                    }
		}
	    }
	}
	else {
# for single sequences
	    if (exists $OTE{$IDsub}){
		my @OTE = @{ $OTE{$IDsub} };
		for (my $k=0;$k<scalar(@{ $OTE{$IDsub} });$k++){
		    my @OTEsplit = split(/\//,$OTE{$IDsub}[$k]);
		    if ($k eq 0){
			$OTE = $OTE{$IDsub}[$k];
		    }
		    else {
			$OTE.= "\t$OTE{$IDsub}[$k]";
		    }
		    push (@OTE1,$OTEsplit[0]);
		    push (@OTE2,$OTEsplit[1]);
		}
	    }
	    else {
		$OTE = 'NA';
		push (@OTE1,'NA');
		push (@OTE2,'NA');
	    }
	}
# prepare feature output
	my $FeatureNum = "";
	my @features = keys %FeatureName;
        @features = sort {$a cmp $b} (@features);
	my @featuresHTML = ();
	for (my $k=0;$k<scalar(@features);$k++){
# for siRNA pools
	    if (scalar (@IDsub) ne 0){
		for (my $l=0;$l<scalar(@IDsub);$l++){
		    if (exists $FeatureNum{$features[$k]}{$IDsub[$l]}){
			if ($FeatureNum eq ""){
			    $FeatureNum = $FeatureNum{$features[$k]}{$IDsub[$l]};
			    push (@featuresHTML,$FeatureNum{$features[$k]}{$IDsub[$l]});
			}
			else {
			    if ($l eq 0){
				$FeatureNum.= "\t".$FeatureNum{$features[$k]}{$IDsub[$l]};
				push (@featuresHTML,$FeatureNum{$features[$k]}{$IDsub[$l]});
			    }
			    else {
				$FeatureNum.= '&'.$FeatureNum{$features[$k]}{$IDsub[$l]};
				$featuresHTML[$k].= '&'.$FeatureNum{$features[$k]}{$IDsub[$l]};
			    }
			}
		    }
		    else {
			if ($FeatureNum eq ""){
			    $FeatureNum = 0;
			    push (@featuresHTML,0);
			}
			else {
			    if ($l eq 0){
				$FeatureNum.= "\t0";
				push (@featuresHTML,0);
			    }
			    else {
				$FeatureNum.= "&0";
				$featuresHTML[$k].= '&0';
			    }
			}
		    }
		}
	    }
	    else {
# for single sequences
		if (exists $FeatureNum{$features[$k]}{$IDsub}){
		    if ($FeatureNum eq ""){
			$FeatureNum = $FeatureNum{$features[$k]}{$IDsub};
		    }
		    else {
			$FeatureNum.= "\t".$FeatureNum{$features[$k]}{$IDsub};
		    }
		    push (@featuresHTML,$FeatureNum{$features[$k]}{$IDsub});
		}
		else {
		    if ($FeatureNum eq ""){
			$FeatureNum = "0";
		    }
		    else {
			$FeatureNum.= "\t0";
		    }
		    push (@featuresHTML,0);
		}
	    }
	}

# prepare homology output
	my $hom = "";
	my $homHTML = "";
# for siRNA pools
	if (scalar (@IDsub) ne 0){
	    for (my $l=0;$l<scalar(@IDsub);$l++){
		if (exists $Homology{$IDsub[$l]}){
		    if ($l eq 0){
			$hom = join('&', @{ $Homology{$IDsub[$l]} });
		    }
		    else {
			$hom.= '|'.join('&', @{ $Homology{$IDsub[$l]} });
		    }
		}
		else {
		    if ($l eq 0){
			$hom = 'NA';
		    }
		    else {
			$hom.= '|NA';
		    }
		}
	    }
	}
	else {
# for single sequences
	    if (exists $Homology{$IDsub}){
		$hom = join('&', @{ $Homology{$IDsub} });
	    }
	    else {
		$hom = 'NA';
	    }
	}
	$homHTML = $hom;
	$homHTML =~ s/\|/\<br\>/g;
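# Worked example of the homology string built above (the IDs and the entry
# format are hypothetical, chosen for illustration only):
#   @IDsub    = ('si1','si2')
#   %Homology = (si1 => ['hitA','hitB'])   (si2 has no homology entry)
#   $hom      = 'hitA&hitB|NA'             ('&' joins the entries of one siRNA, '|' separates pool members)
#   $homHTML  = 'hitA&hitB<br>NA'          (one pool member per line in the HTML report)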

# output of R plot for efficiencies of long dsRNAs
#	if (($reagent eq 'd') && ($options{"EFFICIENCY"}[0] ne 'empty,empty')){
#	    my @effOptions = split(/\,/,$options{"EFFICIENCY"}[0]);
#	    my @effValues = @{ $DesignsPrint{$IDscovered[$i]}[13][$j] };
#	    my $scalar = scalar(@effValues);
#	    my $rscript = $options{"OUTPUT"}[0].'NEXT-RNAi_'.$identifier.'_eff.r';
#	    my $routput = $HTMLoutfolder.$IDsub.'_eff.png';
# generate efficiency vector for R
#	    my $effVector = join(',',@effValues);
#	    open (RSCRIPT, ">$rscript") || die "Cannot open RSCRIPT ($rscript): $!\n";
#	    print RSCRIPT "y = c($effVector)\n";
#	    print RSCRIPT "x = c(1:$scalar)\n";
#	    print RSCRIPT "png(\"$routput\",width=700,height=300)\n";
#	    print RSCRIPT "plot(x,y,xlab=\"Position [bp]\",ylab=\"Efficiency [%]\",type=\"n\")\n";
#	    print RSCRIPT "lines(x,y,type=\"l\")\n";
#	    print RSCRIPT "abline(a=$effOptions[1],b=0,col=\"red\")";
#	    close RSCRIPT;
#	    system ("$options{\"R\"}[0]R --vanilla --slave < $rscript") eq 0 || die "Failed to run R script $rscript: $?\n";
#	}

# distinguish intended from other targets (for tab-delimited and HTML output)
	my $intgene = 'NA';
	my $intgeneHTML = 'NA';
	my $inttxn = 'NA';
	my $inttxnHTML = 'NA';
	my $inthits = 'NA';
	my $othergene = 'NA';
	my $othergeneHTML = 'NA';
	my $othertxn = 'NA';
	my $othertxnHTML = 'NA';
	my $otherhits = 'NA';
	my @targets = split(/\&/,$DesignsPrint{$IDscovered[$i]}[19][$j]);
	my @txn = split(/\&/,$DesignsPrint{$IDscovered[$i]}[20][$j]);
	my @hits = split(/\&/,$DesignsPrint{$IDscovered[$i]}[21][$j]);
# get best targets (intended targets): every target whose first hits value
# equals the first hits value of the first listed target
	my %inttargets = ();
	my $best = 0;
	for (my $k=0;$k<scalar(@targets);$k++){
	    my @hits2 = split(/\+/,$hits[$k]);
	    next if (scalar(@hits2) == 0);
	    if ($k == 0){
		$best = $hits2[0];
	    }
	    if ($hits2[0] eq $best){
		$inttargets{$targets[$k]} = "";
	    }
	}
	my $inttarget = "";
	if (exists $IntendedTarget{$IDscovered[$i]}){
	    $inttarget = $IntendedTarget{$IDscovered[$i]};
	}
	if ((scalar(@targets) eq 1) && ((exists $inttargets{$inttarget}) || ($inttarget eq ""))){
	    $intgene = $DesignsPrint{$IDscovered[$i]}[19][$j];
	    $inttxn = $DesignsPrint{$IDscovered[$i]}[20][$j];
	    $inthits = $DesignsPrint{$IDscovered[$i]}[21][$j];
	    $intgeneHTML = $DesignsPrint{$IDscovered[$i]}[19][$j];
	    my @txn2 = split(/\+/,$DesignsPrint{$IDscovered[$i]}[20][$j]);
	    my @hits2 = split(/\+/,$DesignsPrint{$IDscovered[$i]}[21][$j]);
	    for (my $k=0;$k<scalar(@txn2);$k++){
		if ($k eq 0){
		    $inttxnHTML = $txn2[$k].' ('.$hits2[$k].')';
		}
		else {
		    $inttxnHTML.= ', '.$txn2[$k].' ('.$hits2[$k].')';
		}
	    }
	}
	else {
	    my $int = 0;
	    for (my $k=0;$k<scalar(@targets);$k++){
		my @txn2 = split(/\+/,$txn[$k]);
		my @hits2 = split(/\+/,$hits[$k]);
		if (scalar(@txn2) > 1){ 
		    @txn2 = @txn2[ sort {$hits2[$b] <=> $hits2[$a]} 0 .. $#txn2];
		    @hits2 = sort {$b <=> $a} (@hits2);
		}
		if (($k eq 0) && ((exists $inttargets{$inttarget}) || ($inttarget eq ""))){
		    $int = $hits2[0];
		    $intgene = $targets[$k];
		    $inttxn = $txn[$k];
		    $inthits = $hits[$k];
		    $intgeneHTML = $targets[$k];
		    for (my $l=0;$l<scalar(@txn2);$l++){
			if ($l eq 0){
			    $inttxnHTML = $txn2[$l].' ('.$hits2[$l].')';
			}
			else {
			    $inttxnHTML.= ', '.$txn2[$l].' ('.$hits2[$l].')';
			}
		    }
		}
		else {
		    if (($int eq $hits2[0]) && ((exists $inttargets{$inttarget}) || ($inttarget eq ""))){
			$intgene.= '&'.$targets[$k];
			$inttxn.= '&'.$txn[$k];
			$inthits.= '&'.$hits[$k];
			$intgeneHTML.= '; '.$targets[$k];
                        for (my $l=0;$l<scalar(@txn2);$l++){
			    if ($l eq 0){
				$inttxnHTML.='; '.$txn2[$l].' ('.$hits2[$l].')';
			    }
			    else {
				$inttxnHTML.= ', '.$txn2[$l].' ('.$hits2[$l].')';
			    }
			}
		    }
		    else {
			if ($othergene eq "NA"){
			    $othergene = $targets[$k];
			    $othertxn = $txn[$k];
			    $otherhits = $hits[$k];
			    $othergeneHTML = $targets[$k];
			    for (my $l=0;$l<scalar(@txn2);$l++){
				if ($l eq 0){
				    $othertxnHTML = $txn2[$l].' ('.$hits2[$l].')';
				}
				else {
				    $othertxnHTML.= ', '.$txn2[$l].' ('.$hits2[$l].')';
				}
			    }
			}
			else {
			    $othergene.= '&'.$targets[$k];
                            $othertxn.= '&'.$txn[$k];
                            $otherhits.= '&'.$hits[$k];
			    $othergeneHTML.= '; '.$targets[$k];
			    for (my $l=0;$l<scalar(@txn2);$l++){
                                if ($l eq 0){
                                    $othertxnHTML.= '; '.$txn2[$l].' ('.$hits2[$l].')';
                                }
                                else {
                                    $othertxnHTML.= ', '.$txn2[$l].' ('.$hits2[$l].')';
                                }
                            }
			}
		    }
		}
	    }
	}
# siRNA design/evaluation output
	if ($reagent eq "s"){
# HTML reagent info table for siRNAs
	    my $IDsubprint = $IDsub;
	    my $positions = $DesignsPrint{$IDscovered[$i]}[10][$j];
	    my $lengths = $DesignsPrint{$IDscovered[$i]}[11][$j];
	    my $sequences = $DesignsPrint{$IDscovered[$i]}[12][$j];
# siRNA pools: list all pool members and print one value per line
	    if (scalar(@IDsub) ne 0){
		$IDsubprint = join(', ',@IDsub);
		$positions =~s/\&/<br>/g;
		$lengths =~s/\&/<br>/g;
		$sequences =~s/\&/<br>/g;
	    }
	    print DESIGNHTML <<"";
<table border="0" cellpadding="0" cellspacing="0" class="main">
    <tr>
        <td><h3>Design $index: $IDsubprint</h3></td>
    </tr>
</table>
<table width="800" border="0" cellpadding="0" cellspacing="0" class="main">
    <tr>
        <td><h3 class="style8">siRNA information</h3></td>
    </tr>
</table>
<table width="800" border="0" cellpadding="0" cellspacing="0" class="main" style="border:1px dotted black;">
    <tr valign="top">
        <td width="800">
            <table border="0" cellpadding="0" cellspacing="5" class="main">
                <tr>
                    <th scope="row" width="250"><div align="left">siRNA sequence</div></th>
                    <td width="550" class="style6">$sequences</td>
                </tr>
                <tr>
                    <th scope="row"><div align="left">Position in queried target</div></th>
                    <td>$positions</td>
                </tr>
                <tr>
                    <th scope="row" width="80"><div align="left">Length [nt]</div></th>
                    <td>$lengths</td>
                </tr>


            if ($mapping eq 1){
		my $locations = "";
		if (scalar(@IDsub) ne 0){
		    my @locs = split(/\&/,$RNAiloc);
		    for (my $k=0;$k<scalar(@IDsub);$k++){
			if ($k eq 0){
			    $locations = $locs[$k];
			    if (exists $NOTmapped{$IDsub[$k]}){
				$locations.= ', <b>'.$NOTmapped{$IDsub[$k]}[1].'</b>';
			    }
			}
			else {
			    $locations.= '<br>'.$locs[$k];
			    if (exists $NOTmapped{$IDsub[$k]}){
				$locations.= ', <b>'.$NOTmapped{$IDsub[$k]}[1].'</b>';
                            }
			}
		    }
		}
		else {
		    $locations = $RNAiloc;
		    if (exists $NOTmapped{$IDsub}){
			$locations.= ', <b>'.$NOTmapped{$IDsub}[1].'</b>';
		    }
		}
                print DESIGNHTML <<"";
                <tr>
                    <th scope="row"><div align="left">siRNA location(s)</div></th>
                    <td>$locations</td>
                </tr>


	    }

	    print DESIGNHTML <<"";
            </table>
        </td>
    </tr>
</table>


            print OUTTAB "\t$DesignsPrint{$IDscovered[$i]}[10][$j]\t$DesignsPrint{$IDscovered[$i]}[11][$j]\t$DesignsPrint{$IDscovered[$i]}[12][$j]\t$DesignsPrint{$IDscovered[$i]}[17][$j]";
	    if ($seedmatch eq 1){
		print OUTTAB "\t$DesignsPrint{$IDscovered[$i]}[18][$j]";
	    }
	    if ($mirseed eq 1){
		print OUTTAB "\t$unspec[7]";
	    }
	    if ($options{"EFFICIENCY"}[0] ne 'empty,empty'){
		print OUTTAB "\t$DesignsPrint{$IDscovered[$i]}[15][$j]";
	    }
	    print OUTTAB "\t$intgene\t$inttxn\t$inthits\t$othergene\t$othertxn\t$otherhits";
	    if ($mapping eq 1){
		print OUTTAB "\t$RNAiloc";
	    }
	    if ($FeatureCalc eq 1){
		print OUTTAB "\t$FeatureNum";
	    }
	    if ($lowcomp eq 1){
		print OUTTAB "\t$unspec[5]";
	    }
	    if ($canrepeats eq 1){
		print OUTTAB "\t$unspec[6]";
	    }
	    if ($oteeval ne 0){
		print OUTTAB "\t$OTE";
	    }
	    if ($homology ne 0){
		print OUTTAB "\t$hom";
	    }
	    print OUTTAB "\n";
	}
	else {
# HTML reagent info table for long dsRNAs
	    my $primerF = "";
            my $primerR = "";
            if ($options{"PRIMERTAG"}[0] ne 'none'){
                $primerF = $options{"PRIMERTAG"}[0].$DesignsPrint{$IDscovered[$i]}[1][$j];
                $primerR = $options{"PRIMERTAG"}[0].$DesignsPrint{$IDscovered[$i]}[2][$j];
            }
            else {
                $primerF = $DesignsPrint{$IDscovered[$i]}[1][$j];
                $primerR = $DesignsPrint{$IDscovered[$i]}[2][$j];
            }
	    print DESIGNHTML <<"";
<table border="0" cellpadding="0" cellspacing="0">
    <tr>
        <td><h5>Design $index: $IDsub</h5></td>
    </tr>
</table>
<table width="800" border="0" cellpadding="0" cellspacing="0" class="main">
    <tr>
        <td><h3 class="style8">dsRNA information</h3></td>
    </tr>
</table>
<table width="800" border="0" cellpadding="0" cellspacing="0" class="main" style="border:1px dotted black;">
    <tr valign="top">
        <td width="365">
            <table border="0" cellpadding="0" cellspacing="5" class="main">
                <tr>
                    <th><div align="left">Primer forward</div></th>
                </tr>
            </table>
            <table border="0" cellpadding="0" cellspacing="5" class="main">
                <tr>
                    <th scope="row" width="80"><div align="left">Sequence</div></th>
                    <td width="220" class="style6">$primerF</td>
                </tr>
                <tr>
                    <th scope="row"><div align="left">Length [nt]</div></th>
                    <td>$DesignsPrint{$IDscovered[$i]}[4][$j]</td>
                </tr>
                <tr>
                    <th scope="row"><div align="left">Tm[&deg;C]</div></th>
                    <td>$DesignsPrint{$IDscovered[$i]}[7][$j]</td>
                </tr>
                <tr>
                    <th scope="row"><div align="left">GC[%]</div></th>
                    <td>$DesignsPrint{$IDscovered[$i]}[9][$j]</td>
                </tr>
            </table>
            <table border="0" cellpadding="0" cellspacing="5" class="main">
                <tr>
                    <th><div align="left">Primer reverse</div></th>
                </tr>
            </table>
            <table border="0" cellpadding="0" cellspacing="5" class="main">
                <tr>
                    <th scope="row" width="80"><div align="left">Sequence</div></th>
                    <td width="220" class="style6">$primerR</td>
                </tr>
                <tr>
                    <th scope="row"><div align="left">Length [nt]</div></th>
                    <td>$DesignsPrint{$IDscovered[$i]}[6][$j]</td>
                </tr>
                <tr>
                    <th scope="row"><div align="left">Tm[&deg;C]</div></th>
                    <td>$DesignsPrint{$IDscovered[$i]}[8][$j]</td>
                </tr>
                <tr>
                    <th scope="row"><div align="left">GC[%]</div></th>
                    <td>$DesignsPrint{$IDscovered[$i]}[10][$j]</td>
                </tr>
            </table>
            <table border="0" cellpadding="0" cellspacing="5" class="main">
                <tr>
                    <th scope="row"><div align="left">Primer pair penalty</div></th>
                    <td><div align="left">$DesignsPrint{$IDscovered[$i]}[0][$j]</div></td>
                </tr>
            </table>
        </td>
        <td width="5">&nbsp;</td>
        <td width="430">
            <table border="0" cellpadding="0" cellspacing="5" class="main">
                <tr>
                    <th><div align="left">Amplicon sequence</div></th>
                </tr>
                <tr>
                    <td class="style6">


	    for (my $k=0;$k<length($DesignsPrint{$IDscovered[$i]}[12][$j]);$k+=50){
		my $seq = substr($DesignsPrint{$IDscovered[$i]}[12][$j],$k,50);
		print DESIGNHTML <<"";
		$seq<br>


	    }

	    print DESIGNHTML <<"";
                    </td>
                </tr>
                <tr>
                    <th><div align="left">Amplicon length [nt]</div></th>
                </tr>
                <tr>
                    <td>$DesignsPrint{$IDscovered[$i]}[11][$j]</td>
                </tr>


	    if ($mapping eq 1){
		my $locations = $RNAiloc;
		if (exists $NOTmapped{$IDsub}){
		    $locations.= '<br><b>'.$NOTmapped{$IDsub}[1].'</b>';
		}
	        print DESIGNHTML <<"";
		<tr>
                    <th><div align="left">Amplicon location</div></th>
                </tr>
                <tr>
                    <td>$locations</td>
                </tr>


	    }

            print DESIGNHTML <<"";
            </table>
        </td>
    </tr>
</table>


# long dsRNA design/evaluation output
	    print OUTTAB "\t$DesignsPrint{$IDscovered[$i]}[11][$j]\t$primerF\t$DesignsPrint{$IDscovered[$i]}[3][$j]\t$DesignsPrint{$IDscovered[$i]}[4][$j]\t$DesignsPrint{$IDscovered[$i]}[9][$j]\t$DesignsPrint{$IDscovered[$i]}[7][$j]\t$primerR\t$DesignsPrint{$IDscovered[$i]}[5][$j]\t$DesignsPrint{$IDscovered[$i]}[6][$j]\t$DesignsPrint{$IDscovered[$i]}[10][$j]\t$DesignsPrint{$IDscovered[$i]}[8][$j]\t$DesignsPrint{$IDscovered[$i]}[0][$j]\t$DesignsPrint{$IDscovered[$i]}[17][$j]\t$DesignsPrint{$IDscovered[$i]}[18][$j]";
	    if ($options{"EFFICIENCY"}[0] ne 'empty,empty'){
		my @effPrint = split(/\|/,$DesignsPrint{$IDscovered[$i]}[15][$j]);
		print OUTTAB "\t$effPrint[0]\t$effPrint[1]";
	    }
##
## re-evaluate intended targets: shift them to the other targets if the siRNAs
## hitting the best intended transcript cover less than 50% of the dsRNA
##
	    if ($intgene ne "NA"){
		my @intGene = split(/&/,$intgene);
		my @intTxn= split(/&/,$inttxn);
		my @intHits = split(/&/,$inthits);
# consider only best intended overlap (because they are sorted)
		my @intHits2 = split(/\+/,$intHits[0]);
# calculate siRNA length: a sequence of length L contains (L - len + 1) windows
# of length len, so len = L - windows + 1
		my @siRNAs = split(/\//, $DesignsPrint{$IDscovered[$i]}[17][$j]);
		my $siRNALen = $DesignsPrint{$IDscovered[$i]}[11][$j] - $siRNAs[0] + 1;
		my $overlap = ($intHits2[0] + $siRNALen - 1) / $DesignsPrint{$IDscovered[$i]}[11][$j];
		if ($overlap < 0.5){
		    if ($othergene eq "NA"){
			$othergene = $intgene;
			$othertxn = $inttxn;
			$otherhits = $inthits;
			$othergeneHTML = $intgeneHTML;
			$othertxnHTML = $inttxnHTML;
		    }
		    else {
			$othergene = $intgene.'&'.$othergene;
			$othertxn = $inttxn.'&'.$othertxn;
			$otherhits = $inthits.'&'.$otherhits;
			$othergeneHTML = $intgeneHTML.'; '.$othergeneHTML;
			$othertxnHTML = $inttxnHTML.'; '.$othertxnHTML;
		    }
		    $intgene = "NA";
		    $inttxn = "NA";
		    $inthits = "NA";
		    $intgeneHTML = "NA";
		    $inttxnHTML = "NA";
		}
	    }
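# Worked example of the re-evaluation above (hypothetical numbers; assumes the
# first '/'-separated value of field 17 is the total number of siRNAs):
#   amplicon length = 500, total siRNAs = 482   ->  $siRNALen = 500 - 482 + 1 = 19
#   best intended transcript hit by 200 siRNAs  ->  $overlap  = (200 + 19 - 1) / 500 = 0.436
#   0.436 < 0.5, so the intended target is moved to the other targets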
	    print OUTTAB "\t$DesignsPrint{$IDscovered[$i]}[12][$j]\t$intgene\t$inttxn\t$inthits\t$othergene\t$othertxn\t$otherhits";
	    if ($mapping eq 1){
		print OUTTAB "\t$RNAiloc";
	    }
	    if ($FeatureCalc eq 1){
		print OUTTAB "\t$FeatureNum";
	    }
	    if ($seedmatch eq 1){
		print OUTTAB "\t$unspec[4]";
	    }
	    if ($mirseed eq 1){
		print OUTTAB "\t$unspec[7]";
	    }
	    if ($lowcomp eq 1){
		print OUTTAB "\t$unspec[5]";
	    }
	    if ($canrepeats eq 1){
		print OUTTAB "\t$unspec[6]";
	    }
	    if ($oteeval ne 0){
                print OUTTAB "\t$OTE";
            }
	    if ($homology ne 0){
		print OUTTAB "\t$hom";
            }
	    print OUTTAB "\n";
	}

# HTML target information table
	print DESIGNHTML <<"";
<table width="800" border="0" cellpadding="0" cellspacing="0" class="main">
    <tr>
    <td>&nbsp;</td>
    </tr>
    <tr>
        <td><h3 class="style8">Target information</h3></td>
    </tr>
    <tr>
        <td style="border:1px dotted black;">
            <table border="0" cellpadding="0" cellspacing="5" class="main">
                <tr>
                    <th scope="row" width="250" valign="top"><div align="left">Intended target gene</div></th>
                    <td><div align="left">$intgeneHTML</div></td>
                </tr>
                <tr>
                    <th scope="row" width="250"><div align="left">Intended target transcripts (hits)</div></th>
                    <td><div align="left">$inttxnHTML</div></td>
                </tr>
                <tr>
                    <th scope="row" width="250" valign="top"><div align="left">Other targeted gene(s)</div></th>
                    <td><div align="left">$othergeneHTML</div></td>
                </tr>
                <tr>
                    <th scope="row" width="250" valign="top"><div align="left">Other targeted transcripts (hits)</div></th>
                    <td><div align="left">$othertxnHTML</div></td>
                </tr>
            </table>
        </td>
    </tr>
</table>


# HTML reagent quality table
        print DESIGNHTML <<"";
<table width="800" border="0" cellpadding="0" cellspacing="0" class="main">
    <tr>
    <td>&nbsp;</td>
    </tr>
    <tr>
    <td><h3 class="style8">Reagent quality</h3></td>
    </tr>
</table>
<table width="800" border="0" cellpadding="0" cellspacing="5" class="main" style="border:1px dotted black;">
    <tr>
        <th scope="col">siRNAs [$options{SIRNALENGTH}[0] nt]</th>
        <th scope="col">On-target</th>
        <th scope="col">Off-target</th>
        <th scope="col">No-target</th>


	if ($seedmatch eq 1){
	    if ($reagent eq "s"){
		print DESIGNHTML <<"";    
        <th scope="col">SCF</th>


	    }
	    elsif ($reagent eq "d"){
		print DESIGNHTML <<"";
        <th scope="col">highSCF</th>


	    }
        }
	if ($mirseed eq 1){
	    print DESIGNHTML <<"";
        <th scope="col">mirSeed</th>
	    

	}
	if (($reagent eq "s") && ($options{"EFFICIENCY"}[0] ne 'empty,empty')){
	    print DESIGNHTML <<"";
        <th scope="col">Efficiency score</th>


	}
        elsif (($reagent eq "d") && ($options{"EFFICIENCY"}[0] ne 'empty,empty')){
	    print DESIGNHTML <<"";
        <th scope="col">Efficient siRNAs</th>
	<th scope="col">Avg efficiency score</th>


	}
        if ($lowcomp eq 1){
           print DESIGNHTML <<"";
        <th scope="col">LowComplexRegions</th>


	}
	if ($canrepeats eq 1){
	    print DESIGNHTML <<"";
        <th scope="col">CAN</th>

        
	}


	print DESIGNHTML <<"";
    </tr>
    <tr>
        <td><div align="center">$unspec[0]</div></td>
        <td><div align="center">$unspec[1]</div></td>
        <td><div align="center">$unspec[2]</div></td>
        <td><div align="center">$unspec[3]</div></td>


	if ($seedmatch eq 1){
	    if ($reagent eq "s"){
		my $SCFs = $DesignsPrint{$IDscovered[$i]}[18][$j];
# siRNA pools: one value per line
		if (scalar(@IDsub) ne 0){
		    $SCFs =~s/\&/<br>/g;
		}
		print DESIGNHTML <<"";
	    <td><div align="center">$SCFs</div></td>


	    }
	    elsif ($reagent eq "d"){
		print DESIGNHTML <<"";
            <td><div align="center">$unspec[4]</div></td>


	    }
        }
	if ($mirseed eq 1){
	    my $mirlink = "";
	    if ($fileLocs{'Output'}{'miRNASeed'}=~/.*\/(.*\.\S+)$/){
		$mirlink = $1;
	    }
	    my $mirseeds = $unspec[7];
# siRNA pools: one value per line
	    if (scalar(@IDsub) ne 0){
		$mirseeds =~s/\&/<br>/g;
	    }
	    if ($mirseeds ne 0){
            print DESIGNHTML <<"";
        <td><div align="center"><a href="../$mirlink" title="miRNA seeds" target="_blank">$mirseeds</a></div></td>


	    }
	    else {
		print DESIGNHTML <<"";
        <td><div align="center">$mirseeds</div></td>


	    }
	}
	if (($reagent eq "s") && ($options{"EFFICIENCY"}[0] ne 'empty,empty')){
	    my $effs = $DesignsPrint{$IDscovered[$i]}[15][$j];
# siRNA pools: one value per line
	    if (scalar(@IDsub) ne 0){
		$effs =~s/\&/<br>/g;
	    }
	    print DESIGNHTML <<"";
	<td><div align="center">$effs</div></td>


	}
	elsif (($reagent eq "d") && ($options{"EFFICIENCY"}[0] ne 'empty,empty')){
	    my @effPrint = split(/\|/,$DesignsPrint{$IDscovered[$i]}[15][$j]);
            print DESIGNHTML <<"";
        <td><div align="center">$effPrint[0]</div></td>
	<td><div align="center">$effPrint[1]</div></td>


        }
        if ($lowcomp eq 1){
	    my $lowcomps = $unspec[5];
# siRNA pools: one value per line
	    if (scalar(@IDsub) ne 0){
		$lowcomps =~s/\&/<br>/g;
	    }
            print DESIGNHTML <<"";
        <td><div align="center">$lowcomps</div></td>


	}

	if ($canrepeats eq 1){
	    my $cans = $unspec[6];
# siRNA pools: one value per line
	    if (scalar(@IDsub) ne 0){
		$cans =~s/\&/<br>/g;
	    }
	    print DESIGNHTML <<"";
        <td><div align="center">$cans</div></td>


	}

	print DESIGNHTML <<"";
    </tr>
</table>
<table width="800" border="0" cellpadding="0" cellspacing="5">
    <tr>
        <td>&nbsp;</td>
    </tr>
</table>


# HTML reagent additional quality table
        if ((scalar(@features) ne 0) || ($oteeval ne 0) || ($homology ne 0)){
	    print DESIGNHTML <<"";
<table width="800" border="0" cellpadding="0" cellspacing="0" class="main">
    <tr>
        <td><h3 class="style8">Additional quality evaluation</h3></td>
    </tr>
    <tr>
        <td width="800">
            <table border="0" cellpadding="0" cellspacing="5" class="main" style="border:1px dotted black;">

            
	    for (my $k=0;$k<scalar(@features);$k++){
		my $feats = $featuresHTML[$k];
# siRNA pools: one value per line
		if (scalar(@IDsub) ne 0){
		    $feats =~s/\&/<br>/g;
		}
		print DESIGNHTML <<"";
                <tr>
                    <th scope="row" width="250"><div align="left">$features[$k]</div></th>
                    <td width="550">$feats</td>
                </tr>


	    }

	    for (my $k=0;$k<$oteeval;$k++){
		print DESIGNHTML <<"";
                <tr>
                    <th scope="row" width="250"><div align="left">Off target database<br>$OTEHTML[$k]</div></th>
                    <td width="550" class="style6">$OTE1[$k] siRNAs have $OTE2[$k] other targets</td>
                </tr>


	    }
	    if ($homology ne 0){
		my $homlink = "";
		if ($fileLocs{'Output'}{'Homology'}=~/.*\/(.*\.\S+)$/){
		    $homlink = $1;
		}
		if ($homHTML ne 'NA'){
		    print DESIGNHTML <<"";
                <tr>
                    <th scope="row" width="250"><div align="left">Sequence homology (e-value)</div></th>
                    <td width="550"><a href="../$homlink" title="Homology" target="_blank">$homHTML</a></td>
                </tr>


		}
		else {
		    print DESIGNHTML <<"";
                <tr>
                    <th scope="row" width="250"><div align="left">Sequence homology (e-value)</div></th>
                    <td width="550">$homHTML</td>
                </tr>


		}


	    }
	    print DESIGNHTML <<"";
            </table>
        </td>
    </tr>
</table>


        }
# HTML link to GBrowse image
	my @GBrowse = @IDsub;
        if (scalar(@GBrowse) eq 0){
	    push (@GBrowse, $IDsub);
	}	    
	my @chrom = ();
	my @start = ();
	my @end = ();
	my @add = ();
	my $refChrom = "";
	my $refStart = 0;
	my $refEnd = 0;
	for (my $k=0;$k<scalar(@GBrowse);$k++){
	    if ((exists $RNAiloc{$GBrowse[$k]}{$Mapped}) && ($options{"GBROWSEBASE"}[0] ne "")){
# check for multiple mappings of RNAi reagent
		my @chroms = ();
		my @starts = ();
		my @ends = ();
		my @orientations = ();
		&ParseMAPPING(\%RNAiloc,$GBrowse[$k],\*MAPPED,\*MAPPEDINDEX,$Mapped,\@chroms,\@starts,\@ends,\@orientations);
		if ($k eq 0){
# first mapping of the first reagent is the reference location ($k is 0 here)
		    $refChrom = $chroms[0];
		    $refStart = 0;
		    $refEnd = 0;
		    for (my $l=0;$l<scalar(@chroms);$l++){
# check for mappings containing gaps
			my @starts2 = split(/\,/,$starts[$l]);
			my @ends2 = split(/\,/,$ends[$l]);
			my $loc = "";
			for (my $m=0;$m<scalar(@starts2);$m++){
			    if ($m eq 0){
				$loc = $starts2[$m].'..'.$ends2[$m];
			    }
			    else {
				$loc.= ','.$starts2[$m].'..'.$ends2[$m];
			    }
			}
			if ($l eq 0){
			    $refStart = $starts2[0];
			    $refEnd = $ends2[-1];
			    my $add = 'add='.$chroms[$l].'+RNAi+'.$GBrowse[$k].'+'.$loc;
			    push (@add, $add);
			    push (@chrom, $chroms[$l]);
			    push (@start, $starts2[0] - 2500);
			    push (@end, $ends2[-1] + 2500);
			}
			else {
			    if (($chroms[$l] eq $refChrom) && (abs($starts2[0] - $refStart) < 2500)){
				$add[0].= ';add='.$chroms[$l].'+RNAi+'.$GBrowse[$k].'+'.$loc;
			    }
			    else {
				my $add = 'add='.$chroms[$l].'+RNAi+'.$GBrowse[$k].'+'.$loc;
				push (@add, $add);
				push (@chrom, $chroms[$l]);
				push (@start, $starts2[0] - 2500);
				push (@end, $ends2[-1] + 2500);
			    }
			}
		    }
		}
		else {
		    for (my $l=0;$l<scalar(@chroms);$l++){
# check for mappings containing gaps
			my @starts2 = split(/\,/,$starts[$l]);
			my @ends2 = split(/\,/,$ends[$l]);
			my $loc = "";
			for (my $m=0;$m<scalar(@starts2);$m++){
			    if ($m eq 0){
				$loc = $starts2[$m].'..'.$ends2[$m];
			    }
			    else {
				$loc.= ','.$starts2[$m].'..'.$ends2[$m];
			    }
			}
			if (($chroms[$l] eq $refChrom) && (abs($starts2[0] - $refStart) < 2500)){
			    $add[0].= ';add='.$chroms[$l].'+RNAi+'.$GBrowse[$k].'+'.$loc;
			}
			else {
			    my $add = 'add='.$chroms[$l].'+RNAi+'.$GBrowse[$k].'+'.$loc;
			    push (@add, $add);
			    push (@chrom, $chroms[$l]);
			    push (@start, $starts2[0] - 2500);
			    push (@end, $ends2[-1] + 2500);
			}
		    }
		}
	    }
	}
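# Worked example of the GBrowse parameters collected above (hypothetical
# reagent 'R1' mapped once to chromosome '2L' with one gap, assuming that
# ParseMAPPING returns gapped mappings as comma-separated coordinates):
#   $starts[0] = '5000,5300', $ends[0] = '5200,5400'
#   $loc       = '5000..5200,5300..5400'
#   $add[0]    = 'add=2L+RNAi+R1+5000..5200,5300..5400'
#   $chrom[0]  = '2L', $start[0] = 5000 - 2500 = 2500, $end[0] = 5400 + 2500 = 7900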
# only if mapping was successful
	for (my $k=0;$k<scalar(@add);$k++){
	    if ($k eq 0){
	    print DESIGNHTML <<"";
<table width="800" border="0" cellpadding="0" cellspacing="0" class="main">
    <tr>
        <td>&nbsp;</td>
    </tr>
    <tr>
        <td><h3 class="style8">Genome Browser</h3></td>
    </tr>
</table>


	    }
#		my $routput = $IDsub.'_eff.png';
#		my $routputLink = $routput;
#		$routputLink=~s/\:/\%3A/g;
	    print DESIGNHTML <<"";
<table width="800" border="0" cellpadding="0" cellspacing="0" class="main">
    <tr>
        <td style="border:1px dotted black;" align="center"><img src="$options{GBROWSEBASE}[0]?name=$chrom[$k]:$start[$k]..$end[$k];type=$options{GBROWSETRACK}[0];width=460;keystyle=between;grid=on;$add[$k];style=RNAi+glyph=segments+fgcolor=black+bgcolor=plum;" alt="GBrowse Image" name="GBrowse" align="center" title="GBrowse Image"/></td>
    </tr>
</table>


#    <tr>
#        <td style="border:1px dotted black;" align="center"><img src="$routputLink" alt="$routput" name="Efficiency" align="center" title="Efficiency"/></td>
#    </tr>
#</table>


	    print DESIGNHTML <<"";
<table width="800" border="0" cellpadding="0" cellspacing="5" class="main">
    <tr>
        <td>&nbsp;</td>
    </tr>
</table>


	}
	$index++;
    }    
# links to HTML reports on designs
    my $designnum = $index - 1;

    if ($HTMLindex eq 0){
	print OUTHTML <<"";
    <tr>
        <td class="main">


    }

    if ($designnum eq 1){
	print OUTHTML <<"";
            <a href="HTML/$IDscovered[$i].html" title="$IDscovered[$i]" target="_parent">$IDscovered[$i]</a>&nbsp;


    }
    else {
	print OUTHTML <<"";
	    <a href="HTML/$IDscovered[$i].html" title="$IDscovered[$i]" target="_parent">$IDscovered[$i] ($designnum result(s))</a>&nbsp;


    }

    $HTMLindex++;
# close the table row after five links
    if ($HTMLindex eq 5){
	print OUTHTML <<"";
        </td>
    </tr>


        $HTMLindex = 0;
    }

    print DESIGNHTML <<"";
<table width="800" border="0" cellpadding="0" cellspacing="5" class="news">
    <tr>
        <td>&nbsp;</td>
    </tr>
    <tr>
        <td align="center"><a href="../index.html">Back to overview</a></td>
    </tr>
    <tr>
        <td>&nbsp;</td>
    </tr>
</table>


    print DESIGNHTML "$footer\n";
    close DESIGNHTML;
}


# print error message if design was not successful
if (scalar(@IDscovered) eq 0){
    print OUTHTML <<"";
    <tr>
        <td class="main"><p style="color:red"><strong>No design possible with the current input / settings</strong></p></td>
    </tr>


}
else {
    if ($HTMLindex ne 0){
        print OUTHTML <<"";
        </td>
    </tr>


    }
}

# copy input files into output folder
# (the list form of system() bypasses the shell, so paths containing spaces or shell metacharacters are passed to cp unchanged)
system ('cp', $outError, $options{"OUTPUT"}[0]) == 0 || die "Failed to copy error output file $outError to output folder $options{OUTPUT}[0]: $?\n";
if ((defined $optionsFile) && (-e $optionsFile)){
    system ('cp', $optionsFile, $options{"OUTPUT"}[0]) == 0 || die "Failed to copy options output file $optionsFile to output folder $options{OUTPUT}[0]: $?\n";
}
if ((defined $inputFile1) && (-e $inputFile1)){
    system ('cp', $inputFile1, $options{"OUTPUT"}[0]) == 0 || die "Failed to copy input file $inputFile1 to output folder $options{OUTPUT}[0]: $?\n";
}
if ((defined $inputFile2) && (-e $inputFile2)){
    system ('cp', $inputFile2, $options{"OUTPUT"}[0]) == 0 || die "Failed to copy input file $inputFile2 to output folder $options{OUTPUT}[0]: $?\n";
}
if ((defined $options{"POOL"}[0]) && (-e $options{"POOL"}[0])){
    system ('cp', $options{"POOL"}[0], $options{"OUTPUT"}[0]) == 0 || die "Failed to copy input file $options{POOL}[0] to output folder $options{OUTPUT}[0]: $?\n";
}
for (my $i=0;$i<scalar(@{ $options{"TARGETGROUPS"} });$i++){
    if ((defined $options{"TARGETGROUPS"}[$i]) && (-e $options{"TARGETGROUPS"}[$i])){
	system ('cp', $options{"TARGETGROUPS"}[$i], $options{"OUTPUT"}[0]) == 0 || die "Failed to copy targetgroups file $options{TARGETGROUPS}[$i] to output folder $options{OUTPUT}[0]: $?\n";
    }
}
for (my $i=0;$i<scalar(@{ $options{"EXCLUDED"} });$i++){
    if ((defined $options{"EXCLUDED"}[$i]) && (-e $options{"EXCLUDED"}[$i])){
	system ('cp', $options{"EXCLUDED"}[$i], $options{"OUTPUT"}[0]) == 0 || die "Failed to copy excluded file $options{EXCLUDED}[$i] to output folder $options{OUTPUT}[0]: $?\n";
    }
}
for (my $i=0;$i<scalar(@{ $options{"INTENDED"} });$i++){
    if ((defined $options{"INTENDED"}[$i]) && (-e $options{"INTENDED"}[$i])){
	system ('cp', $options{"INTENDED"}[$i], $options{"OUTPUT"}[0]) == 0 || die "Failed to copy intended file $options{INTENDED}[$i] to output folder $options{OUTPUT}[0]: $?\n";
    }
}
close OUTTAB;
close MAPPED;
close MAPPEDINDEX;

# input/report/output files
# output
print OUTHTML <<'';
    <tr>
    <td class="main">&nbsp;</td>
    </tr>
    <tr>
        <td class="style8"><strong>Links to result files</strong></td>
    </tr>


if (exists $fileLocs{'Output'}{'DesignsTAB'}){
    &HTMLfiles('Output','DesignsTAB','Designs (tab-delimited)','<strong>Tab-delimited result file</strong>',\*OUTHTML);
}
if (exists $fileLocs{'Output'}{'STATS'}){
    &HTMLfiles('Output','STATS','Statistics','Statistics result file',\*OUTHTML);
}
if (exists $fileLocs{'Output'}{'DesignsFASTA'}){
    &HTMLfiles('Output','DesignsFASTA','Designs (FASTA)','FASTA result file',\*OUTHTML);
}
if (exists $fileLocs{'Output'}{'GFF'}){
    &HTMLfiles('Output','GFF','Designs (GFF)','GFF result file',\*OUTHTML);
}
if (exists $fileLocs{'Output'}{'AFF'}){
    &HTMLfiles('Output','AFF','Designs (AFF)','Annotations result file',\*OUTHTML);
}
if (exists $fileLocs{'Output'}{'Mapped'}){
    &HTMLfiles('Output','Mapped','Mapped reagents','Location(s) of mapped reagents',\*OUTHTML);
}
if (exists $fileLocs{'Output'}{'PrimerNotMapped'}){
    &HTMLfiles('Output','PrimerNotMapped','Primer not mapped','Oligo(s) that could not be mapped',\*OUTHTML);
}
if (exists $fileLocs{'Output'}{'dsRNANotMapped'}){
    &HTMLfiles('Output','dsRNANotMapped','dsRNA not mapped','dsRNAs that could not be mapped',\*OUTHTML);
}
if (exists $fileLocs{'Output'}{'Homology'}){
    &HTMLfiles('Output','Homology','Homology','Homology of RNAi reagents',\*OUTHTML);
}
if (exists $fileLocs{'Output'}{'miRNASeed'}){
    &HTMLfiles('Output','miRNASeed','miRNA seeds','miRNA seeds in RNAi reagents',\*OUTHTML);
}

# input
print OUTHTML <<'';
    <tr>
        <td class="main">&nbsp;</td>
    </tr>
    <tr>
        <td class="style8"><strong>Links to input text files</strong></td>
    </tr>


if ($identifier ne 'E-RNAi'){
    print OUTHTML <<"";
    <tr>
        <td class="main">Database file used for off-target evaluation: $databaseFile</td>
    </tr>


    if ($mapping eq 1){
        my $mappingFile = join(', ', @{ $options{"GENOMEBOWTIE"} });
        print OUTHTML <<"";
    <tr>
        <td class="main">Database file used for mapping of reagents: $mappingFile</td>
    </tr>


    }
}

if (exists $fileLocs{'Input'}{'Input1'}){
    &HTMLfiles('Input','Input1','Reagent sequences','Reagent sequence input file (FASTA)',\*OUTHTML);
}
if ((exists $fileLocs{'Input'}{'Input1validatedFASTA'}) && (!exists $fileLocs{'Input'}{'Input1validatedTAB'})){
    &HTMLfiles('Input','Input1validatedFASTA','Validated reagent sequences','Validated reagent sequence input file (FASTA)',\*OUTHTML);
}
if (exists $fileLocs{'Input'}{'Input2'}){
    &HTMLfiles('Input','Input2','Primer sequences','Primer sequence input file (FASTA)',\*OUTHTML);
}
if ((exists $fileLocs{'Input'}{'Input1validatedFASTA'})  && (exists $fileLocs{'Input'}{'Input1validatedTAB'})){
    &HTMLfiles('Input','Input1validatedFASTA','Validated primer sequences','Validated primer sequence input file (FASTA)',\*OUTHTML);
}
if ((exists $fileLocs{'Input'}{'Input1validatedTAB'})  && (exists $fileLocs{'Input'}{'Input1validatedFASTA'})){
    &HTMLfiles('Input','Input1validatedTAB','Validated primer sequences','Validated primer sequence input file (tab-delimited)',\*OUTHTML);
}
if (exists $fileLocs{'Input'}{'Options'}){
    &HTMLfiles('Input','Options','Options','Options input file',\*OUTHTML);
}
if (exists $fileLocs{'Input'}{'siRNAPOOLS'}){
    &HTMLfiles('Input','siRNAPOOLS','siRNAPOOLS','File assigning siRNAs to POOLs',\*OUTHTML);
}
if ($identifier ne 'E-RNAi'){
    for (my $i=1;$i<=$tgcount;$i++){
        my $tgname = 'Targetgroups_'.$i;
        &HTMLfiles('Input',$tgname,'Targetgroups','Targetgroups input file',\*OUTHTML);
    }
}
for (my $i=1;$i<=$excludecount;$i++){
    my $exname = 'Excluded_'.$i;
    &HTMLfiles('Input',$exname,'Excluded off-targets','Input file listing off-target database IDs that are not counted as real off-targets',\*OUTHTML);
}
for (my $i=1;$i<=$intendcount;$i++){
    my $intname = 'Intended_'.$i;
    &HTMLfiles('Input',$intname,'Intended targets','Input file for intended targets',\*OUTHTML);
}

if (($FeatureCalc eq 1) && ($identifier ne 'E-RNAi')){
    print OUTHTML <<"";
    <tr>
        <td class="main">File used for calculation of feature contents: $options{"FEATURE"}[0]</td>
    </tr>


}
if (($oteeval > 0) && ($identifier ne 'E-RNAi')){
    for (my $i=0;$i<$oteeval;$i++){
        my $j = $i + 1;
        print OUTHTML <<"";
    <tr>
        <td class="main">File $j for additional off-target evaluation: $OTEHTML[$i]</td>
    </tr>


    }
}
if (($homology ne 0) && ($identifier ne 'E-RNAi')){
    my @hom = split(/\,/,$options{"HOMOLOGY"}[0]);
    print OUTHTML <<"";
    <tr>
        <td class="main">Database file for homology evaluation: $hom[1]</td>
    </tr>


}

# report
print OUTHTML <<'';
    <tr>
    <td class="main">&nbsp;</td>
    </tr>
    <tr>
        <td class="style8"><strong>Links to output report files</strong></td>
    </tr>


if ((exists $fileLocs{'Output'}{'Error'}) && ($identifier ne 'E-RNAi')){
    &HTMLfiles('Output','Error','Error','Error log file',\*OUTHTML);
}
if ((exists $fileLocs{'Output'}{'Report'}) && ($identifier ne 'E-RNAi')){
    &HTMLfiles('Output','Report','Report','NEXT-RNAi report file',\*OUTHTML);
}
if (exists $fileLocs{'Output'}{'Failed'}){
    &HTMLfiles('Output','Failed','Failed',"Failed $designopt",\*OUTHTML);
}


##
## Compute summary statistics from the tab-delimited output file
##

my %stats = ();
open (OUTTAB, "<$outTab") || die "Cannot open OUTTAB: $!\n";
my $headout = 0;
my %header = ();
my @length = ();
my @lenFor = ();
my @lenRev = ();
my @GCFor = ();
my @GCRev = ();
my @TmFor = ();
my @TmRev = ();
my @primerpen = ();
my @effstat = ();
my @specstat = (0, 0);
my $noint = 0;
my $singleint = 0;
my $multint = 0;
my $intother = 0;
my $other = 0;
my $nointother = 0;
my $locstat = 0;
my $lowcompstat = 0;
my $canstat = 0;
my %featurestat = ();
my %otestat = ();
my $designsall = 0;
# output of IDs for statistics
my $otefile = 'NEXT-RNAi_'.$identifier.'_OTE.txt';
my $notargetfile = 'NEXT-RNAi_'.$identifier.'_notarget.txt';
my $lowcompfile = 'NEXT-RNAi_'.$identifier.'_lowcomp.txt';
my $CANfile = 'NEXT-RNAi_'.$identifier.'_CAN.txt';
my $singlefile = 'NEXT-RNAi_'.$identifier.'_singletarget.txt';
my $multifile = 'NEXT-RNAi_'.$identifier.'_multitarget.txt';
my $intotherfile = 'NEXT-RNAi_'.$identifier.'_intother.txt';
my $otherfile = 'NEXT-RNAi_'.$identifier.'_othertarget.txt';
my $nointotherfile = 'NEXT-RNAi_'.$identifier.'_nointother.txt';
my $featurefile = 'NEXT-RNAi_'.$identifier.'_feature.txt';
my $oteevalfile = 'NEXT-RNAi_'.$identifier.'_OTEEVAL.txt';
open (OTEFILE, ">$options{OUTPUT}[0]$otefile") || die "Cannot open OTEFILE: $!\n";
open (NO, ">$options{OUTPUT}[0]$notargetfile") || die "Cannot open NO: $!\n";
open (SINGLE, ">$options{OUTPUT}[0]$singlefile") || die "Cannot open SINGLE: $!\n";
open (MULTI, ">$options{OUTPUT}[0]$multifile") || die "Cannot open MULTI: $!\n";
open (INTOTHER, ">$options{OUTPUT}[0]$intotherfile") || die "Cannot open INTOTHER: $!\n";
open (OTHER, ">$options{OUTPUT}[0]$otherfile") || die "Cannot open OTHER: $!\n";
open (NOINTOTHER, ">$options{OUTPUT}[0]$nointotherfile") || die "Cannot open NOINTOTHER: $!\n";
if ($lowcomp eq 1){
    open (LOW, ">$options{OUTPUT}[0]$lowcompfile") || die "Cannot open LOW: $!\n";    
}
if ($canrepeats eq 1){
    open (CAN, ">$options{OUTPUT}[0]$CANfile") || die "Cannot open CAN: $!\n";
}
if ($FeatureCalc eq 1){
    open (FEAT, ">$options{OUTPUT}[0]$featurefile") || die "Cannot open FEAT: $!\n";
}
if ($oteeval ne 0){
    open (OTEEVAL, ">$options{OUTPUT}[0]$oteevalfile") || die "Cannot open OTEEVAL: $!\n";
}

while (my $line=<OUTTAB>){
    $line = &cleanLine($line);
    my @columns = split(/\t/,$line);
# get available headers
    if ($headout eq 0){
	for (my $i=0;$i<scalar(@columns);$i++){
	    if (!exists $header{$columns[$i]}){
		$header{$columns[$i]} = $i;
	    }
	}
    }
    else {
	$designsall++;
# parameters specific for design of long dsRNAs
	if (exists $header{'LenFor'}){
            push (@lenFor, $columns[$header{'LenFor'}]);
	    if (!exists $stats{'LenFor'}){
		$stats{'LenFor'} = 1;
	    }
        }
	if (exists $header{'LenRev'}){
            push (@lenRev, $columns[$header{'LenRev'}]);
	    if (!exists $stats{'LenRev'}){
		$stats{'LenRev'} = 1;
            }
	}
	if (exists $header{'GCFor[%]'}){
            push (@GCFor, $columns[$header{'GCFor[%]'}]);
	    if (!exists $stats{'GCFor[%]'}){
		$stats{'GCFor[%]'} = 1;
            }
	}
	if (exists $header{'GCRev[%]'}){
            push (@GCRev, $columns[$header{'GCRev[%]'}]);
	    if (!exists $stats{'GCRev[%]'}){
		$stats{'GCRev[%]'} = 1;
            }
	}
	if (exists $header{'TmFor[*C]'}){
            push (@TmFor, $columns[$header{'TmFor[*C]'}]);
	    if (!exists $stats{'TmFor[*C]'}){
		$stats{'TmFor[*C]'} = 1;
            }
	}
	if (exists $header{'TmRev[*C]'}){
            push (@TmRev, $columns[$header{'TmRev[*C]'}]);
	    if (!exists $stats{'TmRev[*C]'}){
		$stats{'TmRev[*C]'} = 1;
            }
	}
	if (exists $header{'ForRevPenalty'}){
            push (@primerpen, $columns[$header{'ForRevPenalty'}]);
	    if (!exists $stats{'ForRevPenalty'}){
		$stats{'ForRevPenalty'} = 1;
            }
        }

# general parameters
	if (exists $header{'Length[nt]'}){
            push (@length, $columns[$header{'Length[nt]'}]);
	    if (!exists $stats{'Length[nt]'}){
		$stats{'Length[nt]'} = 1;
            }
        }
	if (exists $header{'EfficiencyScore'}){
# consider siRNA pools
            if ($options{"POOL"}[0] ne "empty"){
		my @siRNAs = split(/\&/,$columns[$header{'EfficiencyScore'}]);
		for (my $i=0;$i<scalar(@siRNAs);$i++){
		    push (@effstat, $siRNAs[$i]);
                }
	    }
	    else {
		push (@effstat, $columns[$header{'EfficiencyScore'}]);
	    }		
	    if (!exists $stats{'EfficiencyScore'}){
		$stats{'EfficiencyScore'} = 1;
	    }
        }
	elsif (exists $header{'EfficientsiRNAs'}){
            push (@effstat, $columns[$header{'EfficientsiRNAs'}]);
            if (!exists $stats{'EfficientsiRNAs'}){
                $stats{'EfficientsiRNAs'} = 1;
            }
        }
	if (exists $header{'Location'}){
# consider siRNA pools
	    if ($options{"POOL"}[0] ne "empty"){
		my @siRNAs = split(/\&/,$columns[$header{'Location'}]);
		my $mapped = 1;
		for (my $i=0;$i<scalar(@siRNAs);$i++){
		    if ($siRNAs[$i] eq 'NA'){
			$mapped = 0;
		    }
		}
		if ($mapped eq 1){
		    $locstat++;
		}
	    }
	    else {
# single sequences
		if ($columns[$header{'Location'}] ne 'NA'){
		    $locstat++;
		}
	    }
	    if (!exists $stats{'Location'}){
		$stats{'Location'} = 1;
	    }
	}
	if (exists $header{'LowComplexRegions'}){
# consider siRNA pools
            if ($options{"POOL"}[0] ne "empty"){
                my @siRNAs = split(/\&/,$columns[$header{'LowComplexRegions'}]);
                my $mapped = 1;
		for (my $i=0;$i<scalar(@siRNAs);$i++){
                    if ($siRNAs[$i] > 0){
                        $mapped = 0;
                    }
                }
                if ($mapped eq 0){
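                    $lowcompstat++;
# list pooled reagents in the ID file as well (as done for single sequences)
                    print LOW "$columns[$header{QuerySubID}]\n";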
                }
            }
	    else {
# single sequences
		if ($columns[$header{'LowComplexRegions'}] > 0){
		    $lowcompstat++;
		    print LOW "$columns[$header{QuerySubID}]\n";
		}
	    }
	    if (!exists $stats{'LowComplexRegions'}){
		$stats{'LowComplexRegions'} = 1;
            }
	}
	if (exists $header{'CANRepeats'}){
# consider siRNA pools
            if ($options{"POOL"}[0] ne "empty"){
                my @siRNAs = split(/\&/,$columns[$header{'CANRepeats'}]);
                my $mapped = 1;
                for (my $i=0;$i<scalar(@siRNAs);$i++){
                    if ($siRNAs[$i] > 0){
                        $mapped = 0;
                    }
                }
                if ($mapped eq 0){
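                    $canstat++;
# list pooled reagents in the ID file as well (as done for single sequences)
                    print CAN "$columns[$header{QuerySubID}]\n";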
                }
            }
            else {
# single sequences
		if ($columns[$header{'CANRepeats'}] > 0){
		    $canstat++;
		    print CAN "$columns[$header{QuerySubID}]\n";
		}
	    }
	    if (!exists $stats{'CANRepeats'}){
		$stats{'CANRepeats'} = 1;
            }
	}
	if (exists $header{'Specificity[Abs]'}){
	    my @unspec = split(/\//,$columns[$header{'Specificity[Abs]'}]);
# unspecific designs
	    if ($unspec[2] > 0){
		$specstat[0]++;
		print OTEFILE "$columns[$header{QuerySubID}]\n";
	    }
# designs with no target at all
	    if ($unspec[3] eq $unspec[0]){
		$specstat[1]++;
		print NO "$columns[$header{QuerySubID}]\n";
	    }
	    if (!exists $stats{'Specificity[Abs]'}){
		$stats{'Specificity[Abs]'} = 1;
            }
        }
# multiple intended target genes (with same overlap)
	if (exists $header{'IntendedGene'}){
            if ($columns[$header{'IntendedGene'}]=~/\&/){
		if ($columns[$header{'OtherGene'}] eq 'NA'){
		    $multint++;
		    print MULTI "$columns[$header{QuerySubID}]\n";
		}
		else {
		    $intother++;
		    print INTOTHER "$columns[$header{QuerySubID}]\n";
		}
	    }
	    else {
		if ($columns[$header{'IntendedGene'}] ne 'NA'){
		    if ($columns[$header{'OtherGene'}] eq 'NA'){
			$singleint++;
			print SINGLE "$columns[$header{QuerySubID}]\n";
		    }
		    else {
			$other++;
			print OTHER "$columns[$header{QuerySubID}]\n";
		    }
		}
		else {
		    if ($columns[$header{'OtherGene'}] ne 'NA'){
                        $nointother++;
                        print NOINTOTHER "$columns[$header{QuerySubID}]\n";
                    }
		    else {
			$noint++;
		    }
		}
	    }
	    if (!exists $stats{'IntendedGene'}){
		$stats{'IntendedGene'} = 1;
            }
        }
# feature content
	if ($FeatureCalc eq 1){
	    my @features = keys %FeatureName;
	    for (my $i=0;$i<scalar(@features);$i++){
		if (exists $header{$features[$i]}){
# consider siRNA pools
		    if ($options{"POOL"}[0] ne "empty"){
			my @siRNAs = split(/\&/,$columns[$header{$features[$i]}]);
			my $mapped = 1;
			for (my $j=0;$j<scalar(@siRNAs);$j++){
			    if ($siRNAs[$j] > 0){
				$mapped = 0;
			    }
			}
			if ($mapped eq 0){
			    if (!exists $featurestat{$features[$i]}){
				$featurestat{$features[$i]} = 1;
			    }
			    else {
				$featurestat{$features[$i]}++;
			    }
			    print FEAT "$features[$i]\t$columns[$header{QuerySubID}]\n";
			}
		    }
		    else {
# single sequences
			if ($columns[$header{$features[$i]}] > 0){
			    if (!exists $featurestat{$features[$i]}){
				$featurestat{$features[$i]} = 1;
			    }
			    else {
				$featurestat{$features[$i]}++;
			    }
			    print FEAT "$features[$i]\t$columns[$header{QuerySubID}]\n";
			}
		    }
		}
	    }
	    if (!exists $stats{'Feature'}){
		$stats{'Feature'} = 1;
            }
	}
# hits to other off-target databases
	if ($oteeval ne 0){
	    for (my $i=0;$i<$oteeval;$i++){
		my $num = $i + 1;
		my $head = 'OTEEVAL_'.$num;
		if (exists $header{$head}){
# consider siRNA pools
                    if ($options{"POOL"}[0] ne "empty"){
                        my @siRNAs = split(/\&/,$columns[$header{$head}]);
                        my $mapped = 1;
                        for (my $j=0;$j<scalar(@siRNAs);$j++){
                            if ($siRNAs[$j] > 0){
                                $mapped = 0;
                            }
                        }
                        if ($mapped eq 0){
                            if (!exists $otestat{$head}){
				$otestat{$head} = [ $OTEHTML[$i], 1 ];
                            }
                            else {
				$otestat{$head}[1]++;
                            }
			    print OTEEVAL "$head\t$columns[$header{QuerySubID}]\n";
                        }
                    }
                    else {
# single sequences
			if ($columns[$header{$head}] > 0){
			    if (!exists $otestat{$head}){
				$otestat{$head} = [ $OTEHTML[$i], 1 ];
			    }
			    else {
				$otestat{$head}[1]++;
			    }
# write every matching ID, not only the first one per database
			    print OTEEVAL "$head\t$columns[$header{QuerySubID}]\n";
			}
		    }
		}
	    }
	    if (!exists $stats{'oteeval'}){
		$stats{'oteeval'} = 1;
            }
	}
    }
    $headout++;
}
close OUTTAB;
close OTEFILE;
close NO;
close SINGLE;
close MULTI;
close INTOTHER;
close OTHER;
close NOINTOTHER;
if ($lowcomp eq 1){
    close LOW;
}
if ($canrepeats eq 1){
    close CAN;
}
if ($FeatureCalc eq 1){
    close FEAT;
}
if ($oteeval ne 0){
    close OTEEVAL;
}

open (STATS, ">$statoutTAB") || die "Cannot open STATS: $!\n";
print STATS "Reagent statistics\n";
print STATS "Feature\tAverage\tStandardDeviation\n";
print OUTHTML <<"";
    <tr>
        <td class="main">&nbsp;</td>
    </tr>
    <tr>
        <td class="main" id="stats"><h3>Statistics on overall $designsall $designopt</h3></td>
    </tr>
    <tr>
        <td class="style8"><strong>Reagent statistics</strong></td>
    </tr>


if (exists $stats{'LenFor'}){
    my ($avg,$stdev) = &stats(\@lenFor);
    print STATS "Length forward primer [nt]\t$avg\t$stdev\n";
    print OUTHTML <<"";
    <tr>
        <td class="main">Length forward primer [nt]: <strong>$avg +/- $stdev</strong></td>
    </tr>


}
if (exists $stats{'LenRev'}){
    my ($avg,$stdev) = &stats(\@lenRev);
    print STATS "Length reverse primer [nt]\t$avg\t$stdev\n";
    print OUTHTML <<"";
    <tr>
        <td class="main">Length reverse primer [nt]: <strong>$avg +/- $stdev</strong></td>
    </tr>


}
if (exists $stats{'GCFor[%]'}){
    my ($avg,$stdev) = &stats(\@GCFor);
    print STATS "GC content forward primer [%]\t$avg\t$stdev\n";
    print OUTHTML <<"";
    <tr>
        <td class="main">GC content forward primer [%]: <strong>$avg +/- $stdev</strong></td>
    </tr>


}
if (exists $stats{'GCRev[%]'}){
    my ($avg,$stdev) = &stats(\@GCRev);
    print STATS "GC content reverse primer [%]\t$avg\t$stdev\n";
    print OUTHTML <<"";
    <tr>
        <td class="main">GC content reverse primer [%]: <strong>$avg +/- $stdev</strong></td>
    </tr>


}
if (exists $stats{'TmFor[*C]'}){
    my ($avg,$stdev) = &stats(\@TmFor);
    print STATS "Melting temperature forward primer [*C]\t$avg\t$stdev\n";
    print OUTHTML <<"";
    <tr>
        <td class="main">Melting temperature forward primer [&deg;C]: <strong>$avg +/- $stdev</strong></td>
    </tr>


}
if (exists $stats{'TmRev[*C]'}){
    my ($avg,$stdev) = &stats(\@TmRev);
    print STATS "Melting temperature reverse primer [*C]\t$avg\t$stdev\n";
    print OUTHTML <<"";
    <tr>
        <td class="main">Melting temperature reverse primer [&deg;C]: <strong>$avg +/- $stdev</strong></td>
    </tr>


}
if (exists $stats{'ForRevPenalty'}){
    my ($avg,$stdev) = &stats(\@primerpen);
    print STATS "Primer penalty\t$avg\t$stdev\n";
    print OUTHTML <<"";
    <tr>
        <td class="main">Primer penalty: <strong>$avg +/- $stdev</strong></td>
    </tr>


}
if (exists $stats{'Length[nt]'}){
    my ($avg,$stdev) = &stats(\@length);
    print STATS "Reagent length [nt]\t$avg\t$stdev\n";
    print OUTHTML <<"";
    <tr>
        <td class="main">Reagent length [nt]: <strong>$avg +/- $stdev</strong></td>
    </tr>


}
if (exists $stats{'EfficiencyScore'}){
    my ($avg,$stdev) = &stats(\@effstat);
    print STATS "Percent efficiency\t$avg\t$stdev\n";
    print OUTHTML <<"";
    <tr>
        <td class="main">Percent efficiency: <strong>$avg +/- $stdev</strong></td>
    </tr>


}
elsif (exists $stats{'EfficientsiRNAs'}){
    my ($avg,$stdev) = &stats(\@effstat);
    print STATS "Number of efficient siRNAs\t$avg\t$stdev\n";
    print OUTHTML <<"";
    <tr>
        <td class="main">Number of efficient siRNAs: <strong>$avg +/- $stdev</strong></td>
    </tr>


}
print STATS "\nReagent specificity\n";
print OUTHTML <<"";
    <tr>
        <td class="main">&nbsp;</td>
    </tr>
    <tr>
        <td class="style8"><strong>Reagent specificity</strong></td>
    </tr>


if (exists $stats{'Specificity[Abs]'}){
    print STATS "$specstat[0]\t$designopt with $options{SIRNALENGTH}[0] nt matches to other targets\n$specstat[1]\t$designopt with no target at all\n";
    print OUTHTML <<"";
    <tr>
        <td class="main"><strong>$specstat[0]</strong> <a href="$otefile" title="OTE" target="_blank">$designopt</a> with <strong>$options{SIRNALENGTH}[0] nt</strong> off-target effect(s)<br><strong>$specstat[1]</strong> <a href="$notargetfile" title="NoTarget" target="_blank">$designopt</a> have no target at all</td>
    </tr>


}
if (exists $stats{'LowComplexRegions'}){
    print STATS "$lowcompstat\t$designopt with at least one region of low complexity\n";
    print OUTHTML <<"";
    <tr>
        <td class="main"><strong>$lowcompstat</strong> <a href="$lowcompfile" title="LowComplexity" target="_blank">$designopt</a> with at least one region of low complexity</td>
    </tr>


}
if (exists $stats{'CANRepeats'}){
    print STATS "$canstat\t$designopt with at least one $options{CANEVAL}[0]x CA[ATGC] repeat\n";
    print OUTHTML <<"";
    <tr>
        <td class="main"><strong>$canstat</strong> <a href="$CANfile" title="CAN" target="_blank">$designopt</a> with at least one <strong>$options{CANEVAL}[0]x</strong> CA[ATGC] repeat</td>
    </tr>


}
if (exists $stats{'IntendedGene'}){
    print STATS "$singleint\t$designopt with hits to single intended target\n";
    print STATS "$multint\t$designopt with hits to multiple intended targets\n";
    print STATS "$other\t$designopt with hits to single intended target and other target(s)\n";
    print STATS "$intother\t$designopt with hits to multiple intended targets and other target(s)\n";
    print STATS "$nointother\t$designopt with no hits to intended target but to other target(s)\n";
    print STATS "$noint\t$designopt with no target at all\n";
    print OUTHTML <<"";
    <tr>
        <td class="main"><strong>$singleint</strong> <a href="$singlefile" title="SingleTarget" target="_blank">$designopt</a> with hits to single intended target<br><strong>$multint</strong> <a href="$multifile" title="MultiTarget" target="_blank">$designopt</a> with hits to multiple intended targets<br><strong>$other</strong> <a href="$otherfile" title="OtherTarget" target="_blank">$designopt</a> with hits to single intended target and other targets<br><strong>$intother</strong> <a href="$intotherfile" title="IntendedAndOtherTarget" target="_blank">$designopt</a> with hits to multiple intended targets and other targets<br><strong>$nointother</strong> <a href="$nointotherfile" title="NoIntOther" target="_blank">$designopt</a> with no hits to intended target but to other target(s)<br><strong>$noint</strong> <a href="$notargetfile" title="NoTarget" target="_blank">$designopt</a> with no target at all<br></td>
    </tr>


}
if (exists $stats{'oteeval'}){
    my @ote = keys %otestat;
    for (my $i=0;$i<scalar(@ote);$i++){
	print STATS "$otestat{$ote[$i]}[1]\t$designopt with matches in $otestat{$ote[$i]}[0] database\n";
	print OUTHTML <<"";
    <tr>
        <td class="main"><strong>$otestat{$ote[$i]}[1]</strong> <a href="$oteevalfile" title="OTEEVAL" target="_blank">$designopt</a> with matches in $otestat{$ote[$i]}[0] database</td>
    </tr>


    }
}
if (exists $stats{'Feature'}){
    my @feature = keys %featurestat;
    for (my $i=0;$i<scalar(@feature);$i++){
	print STATS "$featurestat{$feature[$i]}\t$designopt contain at least one $feature[$i] feature\n";
	print OUTHTML <<"";
    <tr>
        <td class="main"><strong>$featurestat{$feature[$i]}</strong> <a href="$featurefile" title="Feature" target="_blank">$designopt</a> contain at least one $feature[$i] feature</td>
    </tr>


    }
}
if (exists $stats{'Location'}){
    print STATS "\nMapping status\n";
    print STATS "$locstat\t$designopt located in mapping database\n";
    print OUTHTML <<"";
    <tr>
        <td class="main">&nbsp;</td>
    </tr>
    <tr>
        <td class="style8"><strong>Mapping status</strong></td>
    </tr>
    <tr>
        <td class="main"><strong>$locstat</strong> $designopt located in mapping database</td>
    </tr>


}
my $packfile = $identifier.'.tar.gz';
print OUTHTML <<"";
    <tr>
        <td class="news">&nbsp;</td>
    </tr>
    <tr>
        <td class="main">&nbsp;</td>
    </tr>
    <tr>
        <td class="main"><strong><a href="$packfile" title="Report">Download</a></strong> complete HTML report as *.tar.gz archive</td>
    </tr>
</table>


print OUTHTML "$footer\n";
close OUTHTML;
close FAILED;
close ERROR;
close REPORT;

# remove temporary files that are no longer required
if (defined $fileLocs{'Unlink'}){
    for (my $i=0;$i<scalar(@{ $fileLocs{'Unlink'} });$i++){
	unlink ($fileLocs{'Unlink'}[$i]);
    }
}

# create *.tar.gz for output folder
system ("tar cfvz $options{OUTPUT}[0]$packfile --exclude=$packfile -C $options{OUTPUT}[0] .") == 0 || print "Failed to pack output folder (is the 'tar' program installed?): $?\n";

print "\nResults written to output file\n\nDesign finished!!!\n";
exit;