Sfitz by readgroup and add FastQC #62

Merged Jul 19, 2024
46 commits
d0d3a7b
process by readgroup in progress
sorelfitzgibbon Apr 19, 2024
adb0591
standardize algorithms to algorithm
sorelfitzgibbon Jun 10, 2024
ad00e21
samtools stats by readgroup
sorelfitzgibbon Jun 11, 2024
046ce02
update changelog
sorelfitzgibbon Jun 12, 2024
07b5458
fix mislabel
sorelfitzgibbon Jun 12, 2024
eb422cc
revert unintentional resource changes
sorelfitzgibbon Jun 12, 2024
ae43ccd
add fastqc
sorelfitzgibbon Jun 5, 2024
68cf4b4
add fastqc module
sorelfitzgibbon Jun 5, 2024
b738e51
use process_afterscript
sorelfitzgibbon Jun 5, 2024
27f1581
update changelog
sorelfitzgibbon Jun 5, 2024
1c922f9
update nftest for fastqc
sorelfitzgibbon Jun 5, 2024
dff3a7a
fix nftest path
sorelfitzgibbon Jun 5, 2024
dbf3a01
merge with sfitz-by-readgroup complete and tested
sorelfitzgibbon Jun 12, 2024
9fd01e9
use fastqc docker with samtools
sorelfitzgibbon Jun 13, 2024
c5407f8
fastqc by readgroup
sorelfitzgibbon Jun 13, 2024
8653bbc
nftest paths updated
sorelfitzgibbon Jun 13, 2024
20085c3
refactor channels
sorelfitzgibbon Jun 14, 2024
b34451e
add hg003 to NFTest
sorelfitzgibbon Jun 14, 2024
d049c14
update samtools
sorelfitzgibbon May 29, 2024
53632de
pull main - samtools update
sorelfitzgibbon Jun 5, 2024
19b68a2
use fastqc docker with samtools
sorelfitzgibbon Jun 13, 2024
0e2c3df
add fastqc resource allocations
sorelfitzgibbon Jun 14, 2024
f95d47e
add fastqc threading and adjust resources
sorelfitzgibbon Jun 16, 2024
da059b8
add fastqc
sorelfitzgibbon Jun 5, 2024
8f92e36
add fastqc to metadata.yaml
sorelfitzgibbon Jun 17, 2024
cc38b86
Bump the pipeline-submodules group with 2 updates
dependabot[bot] Jun 15, 2024
9c9e207
Merge branch 'main' into sfitz-by-readgroup
sorelfitzgibbon Jun 20, 2024
198b40c
add final newline
sorelfitzgibbon Jun 20, 2024
91bf337
update readme
sorelfitzgibbon Jun 24, 2024
4e2b724
update nftest.yml
sorelfitzgibbon Jun 24, 2024
01b50bf
parameterize fastqc and add max gps to stats
sorelfitzgibbon Jun 25, 2024
e61b1ac
sanitize library ID
sorelfitzgibbon Jun 25, 2024
e6d0200
update test configs
sorelfitzgibbon Jun 25, 2024
91f47d3
change process input variable names
sorelfitzgibbon Jun 25, 2024
a2f0489
add slurm logs and extra test files to .gitignore
sorelfitzgibbon Jun 25, 2024
fae512d
update readme
sorelfitzgibbon Jun 26, 2024
ecf44dd
update comments
sorelfitzgibbon Jun 26, 2024
206c7e9
adjust run level triggers
sorelfitzgibbon Jun 26, 2024
8587520
adjust log filename
sorelfitzgibbon Jun 26, 2024
137874f
update nftests
sorelfitzgibbon Jun 26, 2024
6ad240e
fix test name
sorelfitzgibbon Jun 26, 2024
86cc4b2
remove out of date comments
sorelfitzgibbon Jun 27, 2024
8fd7119
change process names
sorelfitzgibbon Jun 27, 2024
9bb4d2f
rename bamqc_outformat to bamqc_output_format
sorelfitzgibbon Jun 27, 2024
2a63bb4
rename bamqc_outformat to bamqc_output_format
sorelfitzgibbon Jun 27, 2024
4e1e2fd
remove fastqc as default
sorelfitzgibbon Jun 27, 2024
5 changes: 5 additions & 0 deletions .gitignore
@@ -80,3 +80,8 @@ work/
*.gz
*.tar
*.zip
+
+# Other
+test/*
+test/*/*
+slurm-*.out
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -10,10 +10,14 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.htm
## [Unreleased]

### Added
- Add FastQC workflow
- Add per readgroup and per library functionality
- Add `process_afterscript`
- Add Nextflow version requirement to `README`

### Changed
- Update SAMtools 1.18 to 1.20
- Update NFTest for FastQC and new test sample
- Update repository/pipeline description
- Update Nextflow configuration test workflows

24 changes: 16 additions & 8 deletions README.md
@@ -71,7 +71,7 @@ input:

| Field | Type | Required | Description |
| ----- | ---- | ------------ | ------------------------ |
-| `algorithms` | list | no | List of tools to be run: ['stats', 'collectwgsmetrics', 'bamqc'], default = ['stats', 'collectwgsmetrics'] |
+| `algorithm` | list | no | List of tools to be run: ['stats', 'collectwgsmetrics', 'bamqc'], default = ['stats', 'collectwgsmetrics'] |
| `reference` | path | yes/no | Reference fasta is required only for `CollectWgsMetrics` |
| `output_dir` | path | yes | Not required if `blcds_registered_dataset` = `true` |
| `blcds_registered_dataset` | boolean | no | Default is `false`. Only `uclahs_cds` users should change this. When `true`, BLCDS folder structure is used |
@@ -80,8 +80,16 @@ input:
#### SAMtools specific configuration
| Field | Type | Required | Description |
| ----- | ---- | ------------ | ------------------------ |
-| remove_duplicates | boolean | no | Ignore reads marked as duplicate. default = `false` |
-| samtools_stats_additional_options | string | no | Any additional options recognized by `samtools stats` |
+| stats_max_rgs_per_sample | integer | no | If a sample has more than this number of readgroups, `SAMtools stats` will not run per readgroup analysis. Default = 20 |
+| stats_max_libs_per_sample | integer | no | If a sample has more than this number of libraries, `SAMtools stats` will not run per library analysis. Default = 20 |
+| stats_remove_duplicates | boolean | no | Ignore reads marked as duplicate. default = `false` |
+| stats_additional_options | string | no | Any additional options recognized by `samtools stats` |
+
+#### FastQC specific configuration
+| Field | Type | Required | Description |
+| ----- | ---- | ------------ | ------------------------ |
+| fastqc_level | string | yes | 'readgroup', 'library' or 'sample' |
+| fastqc_additional_options | string | no | Any additional options recognized by `FastQC` |
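
The SAMtools and FastQC options in the tables above could be combined in a pipeline configuration roughly as follows. This is a minimal sketch, not the pipeline's shipped template: the enclosing `params` block and the empty-string values for the additional-options fields are assumptions, and this diff does not show how FastQC itself is enabled through `algorithm`. Only the field names, the allowed `fastqc_level` values, and the documented defaults come from the tables.

```Nextflow
// Hypothetical config excerpt; the `params` wrapper is an assumption.
params {
    // Tools to run (documented default)
    algorithm = ['stats', 'collectwgsmetrics']

    // SAMtools stats: per-readgroup / per-library analysis is skipped
    // for samples exceeding these counts (documented default = 20)
    stats_max_rgs_per_sample = 20
    stats_max_libs_per_sample = 20
    stats_remove_duplicates = false
    stats_additional_options = ''   // assumed placeholder value

    // FastQC: one of 'readgroup', 'library' or 'sample'
    fastqc_level = 'readgroup'
    fastqc_additional_options = ''  // assumed placeholder value
}
```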

#### Picard specific configuration
| Field | Type | Required | Description |
@@ -95,7 +103,7 @@ input:
#### Qualimap specific configuration
| Field | Type | Required | Description |
| ----- | ---- | ------------ | ------------------------ |
-| bamqc_outformat | string | no | Choice of 'pdf' or 'html', default = 'pdf' |
+| bamqc_output_format | string | no | Choice of 'pdf' or 'html', default = 'pdf' |
| bamqc_additional_options | string | no | Any additional options recognized by `bamqc` |

#### Base resource allocation updaters
@@ -124,23 +132,23 @@ base_resource_update {
]
}
```
-- To double memory for `run_CollectWgsMetrics_Picard` and triple memory for `run_stats_SAMtools` and `run_bamqc_Qualimap`:
+- To double memory for `run_CollectWgsMetrics_Picard` and triple memory for `run_statsSamples_SAMtools` and `run_bamqc_Qualimap`:
```Nextflow
base_resource_update {
memory = [
['run_CollectWgsMetrics_Picard', 2],
-[['run_stats_SAMtools', 'run_bamqc_Qualimap'], 3]
+[['run_statsSamples_SAMtools', 'run_bamqc_Qualimap'], 3]
]
}
```
-- To double CPUs and memory for `run_CollectWgsMetrics_Picard` and double memory for `run_stats_SAMtools`:
+- To double CPUs and memory for `run_CollectWgsMetrics_Picard` and double memory for `run_statsSamples_SAMtools`:
```Nextflow
base_resource_update {
cpus = [
['run_CollectWgsMetrics_Picard', 2]
]
memory = [
-[['run_CollectWgsMetrics_Picard', 'run_stats_SAMtools'], 2]
+[['run_CollectWgsMetrics_Picard', 'run_statsSamples_SAMtools'], 2]
]
}
```
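
The same `base_resource_update` syntax would presumably extend to the processes this PR adds. The sketch below assumes the updater matches on the process names that appear in the `config/F*.config` changes of this PR (`assess_ReadQuality*_FastQC`, `run_statsReadgroups_SAMtools`); the multipliers are arbitrary illustrative values.

```Nextflow
// Illustrative only: double memory for the three FastQC processes and
// triple memory for per-readgroup SAMtools stats.
base_resource_update {
    memory = [
        [['assess_ReadQualityReadgroups_FastQC', 'assess_ReadQualityLibraries_FastQC', 'assess_ReadQualitySamples_FastQC'], 2],
        ['run_statsReadgroups_SAMtools', 3]
    ]
}
```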
52 changes: 51 additions & 1 deletion config/F16.config
@@ -3,7 +3,57 @@ process {
         cpus = 1
         memory = 250.MB
     }
-    withName: run_stats_SAMtools {
+    withName: assess_ReadQualityReadgroups_FastQC {
+        cpus = 2
+        memory = 1.GB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 4.GB
+            }
+        }
+    }
+    withName: assess_ReadQualityLibraries_FastQC {
+        cpus = 2
+        memory = 1.GB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 4.GB
+            }
+        }
+    }
+    withName: assess_ReadQualitySamples_FastQC {
+        cpus = 2
+        memory = 1.GB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 4.GB
+            }
+        }
+    }
+    withName: run_statsReadgroups_SAMtools {
+        cpus = 1
+        memory = 1.GB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 4.GB
+            }
+        }
+    }
+    withName: run_statsLibraries_SAMtools {
+        cpus = 1
+        memory = 1.GB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 4.GB
+            }
+        }
+    }
+    withName: run_statsSamples_SAMtools {
         cpus = 1
         memory = 1.GB
         retry_strategy {
52 changes: 51 additions & 1 deletion config/F2.config
@@ -3,7 +3,57 @@ process {
         cpus = 1
         memory = 250.MB
     }
-    withName: run_stats_SAMtools {
+    withName: assess_ReadQualityReadgroups_FastQC {
+        cpus = 1
+        memory = 1500.MB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 2000.MB
+            }
+        }
+    }
+    withName: assess_ReadQualityLibraries_FastQC {
+        cpus = 1
+        memory = 1500.MB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 2000.MB
+            }
+        }
+    }
+    withName: assess_ReadQualitySamples_FastQC {
+        cpus = 1
+        memory = 1500.MB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 2000.MB
+            }
+        }
+    }
+    withName: run_statsReadgroups_SAMtools {
+        cpus = 1
+        memory = 1500.MB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 2000.MB
+            }
+        }
+    }
+    withName: run_statsLibraries_SAMtools {
+        cpus = 1
+        memory = 1500.MB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 2000.MB
+            }
+        }
+    }
+    withName: run_statsSamples_SAMtools {
         cpus = 1
         memory = 1500.MB
         retry_strategy {
52 changes: 51 additions & 1 deletion config/F32.config
@@ -3,7 +3,57 @@ process {
         cpus = 1
         memory = 250.MB
     }
-    withName: run_stats_SAMtools {
+    withName: assess_ReadQualityReadgroups_FastQC {
+        cpus = 2
+        memory = 1.GB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 4.GB
+            }
+        }
+    }
+    withName: assess_ReadQualityLibraries_FastQC {
+        cpus = 2
+        memory = 1.GB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 4.GB
+            }
+        }
+    }
+    withName: assess_ReadQualitySamples_FastQC {
+        cpus = 2
+        memory = 1.GB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 4.GB
+            }
+        }
+    }
+    withName: run_statsReadgroups_SAMtools {
+        cpus = 1
+        memory = 1.GB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 4.GB
+            }
+        }
+    }
+    withName: run_statsLibraries_SAMtools {
+        cpus = 1
+        memory = 1.GB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 4.GB
+            }
+        }
+    }
+    withName: run_statsSamples_SAMtools {
         cpus = 1
         memory = 1.GB
         retry_strategy {
52 changes: 51 additions & 1 deletion config/F4.config
@@ -3,7 +3,57 @@ process {
         cpus = 1
         memory = 250.MB
     }
-    withName: run_stats_SAMtools {
+    withName: assess_ReadQualityReadgroups_FastQC {
+        cpus = 2
+        memory = 1.GB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 4.GB
+            }
+        }
+    }
+    withName: assess_ReadQualityLibraries_FastQC {
+        cpus = 2
+        memory = 1.GB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 4.GB
+            }
+        }
+    }
+    withName: assess_ReadQualitySamples_FastQC {
+        cpus = 2
+        memory = 1.GB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 4.GB
+            }
+        }
+    }
+    withName: run_statsReadgroups_SAMtools {
+        cpus = 1
+        memory = 1.GB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 3.GB
+            }
+        }
+    }
+    withName: run_statsLibraries_SAMtools {
+        cpus = 1
+        memory = 1.GB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 3.GB
+            }
+        }
+    }
+    withName: run_statsSamples_SAMtools {
         cpus = 1
         memory = 1.GB
         retry_strategy {
52 changes: 51 additions & 1 deletion config/F72.config
@@ -3,7 +3,57 @@ process {
         cpus = 1
         memory = 250.MB
     }
-    withName: run_stats_SAMtools {
+    withName: assess_ReadQualityReadgroups_FastQC {
+        cpus = 2
+        memory = 1.GB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 4.GB
+            }
+        }
+    }
+    withName: assess_ReadQualityLibraries_FastQC {
+        cpus = 2
+        memory = 1.GB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 4.GB
+            }
+        }
+    }
+    withName: assess_ReadQualitySamples_FastQC {
+        cpus = 2
+        memory = 1.GB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 4.GB
+            }
+        }
+    }
+    withName: run_statsReadgroups_SAMtools {
+        cpus = 1
+        memory = 1.GB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 4.GB
+            }
+        }
+    }
+    withName: run_statsLibraries_SAMtools {
+        cpus = 1
+        memory = 1.GB
+        retry_strategy {
+            memory {
+                strategy = 'add'
+                operand = 4.GB
+            }
+        }
+    }
+    withName: run_statsSamples_SAMtools {
         cpus = 1
         memory = 1.GB
         retry_strategy {