doc update
ens-ftricomi committed Oct 28, 2024
1 parent 88a5a5c commit 1db7b8d
Showing 3 changed files with 17 additions and 5 deletions.
9 changes: 9 additions & 0 deletions README.md
@@ -14,7 +14,10 @@ This pipeline processes transcriptomic data for various taxon IDs, performing a

4. **Run STAR Alignment**: Align the subsampled FASTQ files to the provided genome assembly using the STAR aligner, then store the results into the database.

## Batching

The batching option is available for species with a very large amount of RNA-seq data. Batches can be created with `src/python/ensembl/genes/metadata/transcriptomic/check_for_transcriptomic_batch.py`: given a taxon ID and a batch size, the script retrieves the list of run accessions and splits it into multiple txt files according to the batch size.
The pipeline uses the date of the most recent processing as the last checked date for future updates.
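
The splitting step described above can be sketched as follows. This is a minimal illustration, not the code of `check_for_transcriptomic_batch.py`: the function name `write_batches`, the `batch_<n>.txt` file naming, and the one-accession-per-line layout are assumptions made for the example, and retrieving the run accessions for a taxon ID is left out.

```python
from pathlib import Path
from typing import List


def write_batches(
    run_accessions: List[str],
    batch_size: int,
    out_dir: Path,
    prefix: str = "batch",  # hypothetical file name prefix
) -> List[Path]:
    """Split run accessions into consecutive chunks, one txt file per chunk."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    out_dir.mkdir(parents=True, exist_ok=True)

    batch_files: List[Path] = []
    # Step through the list in strides of batch_size; the last chunk may be shorter.
    for index, start in enumerate(range(0, len(run_accessions), batch_size), start=1):
        chunk = run_accessions[start:start + batch_size]
        batch_file = out_dir / f"{prefix}_{index}.txt"
        # Assumed layout: one run accession per line.
        batch_file.write_text("\n".join(chunk) + "\n")
        batch_files.append(batch_file)
    return batch_files
```

With five accessions and a batch size of 2, this sketch writes three files, the last one holding a single accession.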

### Mandatory arguments

@@ -25,6 +28,12 @@ The structure of the file can change according to the running options
| taxon_id,gca (header) |
| <taxon_id>,<gca> |

In case of batching:
| csv file format |
|-----------------|
| taxon_id,gca,runs_file (header) |
| <taxon_id>,<gca>,<path to the batch file> |
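
As an illustration of the batching format above, the sketch below builds the CSV text with one row per batch file. Whether the pipeline accepts several rows for the same taxon ID in a single CSV is an assumption here, and the helper name `build_batch_csv` and the sample values are hypothetical.

```python
import csv
import io
from typing import Iterable


def build_batch_csv(taxon_id: int, gca: str, batch_files: Iterable[str]) -> str:
    """Return CSV text with the header taxon_id,gca,runs_file and one row per batch file."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["taxon_id", "gca", "runs_file"])
    for path in batch_files:
        # Assumption: each batch file gets its own row, repeating taxon_id and gca.
        writer.writerow([taxon_id, gca, path])
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_batch_csv(9606, "GCA_000001405.29", ["batches/batch_1.txt", "batches/batch_2.txt"]))
```

The resulting text can be saved to a file and passed with `--csvFile`.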


#### `--outDir`
Path to the directory where the pipeline results are stored.
13 changes: 8 additions & 5 deletions pipelines/nextflow/workflows/short_read.nf
@@ -73,11 +73,14 @@ if (params.help) {
log.info ' --transcriptomic_dbhost STR Db host server '
log.info ' --transcriptomic_dbport INT Db port '
log.info ' --transcriptomic_dbuser STR Db user '
log.info ' --transcriptomic_dbpassword STR Db password '
log.info ' --user_r STR Db user read_only'
log.info ' --enscode STR Enscode path '
log.info ' --outDir STR Output directory. Default is workDir'
log.info ' --csvFile STR Path for the csv containing the db name'
log.info ' --transcriptomic_dbpassword STR Db password '
log.info ' --enscode STR Enscode path '
log.info ' --outDir STR Output directory. Default is workDir'
log.info ' --csvFile STR Path for the csv containing the db name'
log.info ' --cacheDir Path to the directory to use as cache for the intermediate files'
log.info ' --files_latency Sleep time (in seconds) after the genome and proteins have been fetched (default, 60 seconds)'
log.info ' --backupDB bool Dump the db and save it in a zipped file '
log.info ' --cleanOutputDir bool Remove all files present in the output directory except the db dump file'
exit 1
}

Binary file modified plot.jpeg
