-
Notifications
You must be signed in to change notification settings - Fork 1
/
answers.txt
209 lines (172 loc) · 6.7 KB
/
answers.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
Name: Chris Edwards
Evergreen Login: edwchr30
Computer Science Foundations
Programming as a Way of Life
Fall 2013
Homework 3
For each problem that requires a written answer, write (or copy-and-paste)
your answers in this file. When you are done, you should have replaced all
the ellipses (the three dots) with your answers.
At the end, you will turn in this file along with your modified
dna_analysis.py program.
Problem 1:
(No answers in this file.)
Problem 2:
python dna_analysis.py data/sample_1.fastq
GC-content: 0.43029262963
python dna_analysis.py data/sample_2.fastq
GC-content: 0.451051333333
python dna_analysis.py data/sample_3.fastq
GC-content: 0.646864112524
python dna_analysis.py data/sample_4.fastq
GC-content: 0.347852341166
python dna_analysis.py data/sample_5.fastq
GC-content: 0.263157894737
python dna_analysis.py data/sample_6.fastq
GC-content: 0.491518518519
Problem 3:
The variable (linenum) is not defined
If I comment out
gc_count = 0
the script cannot print gc_content, which is now broken due to missing variable definitions.
Problem 4:
python dna_analysis.py data/test-small.fastq
GC-content: 0.3
AT-content: 0.7
Problem 5:
Number of G nucleotides: 1
Number of C nucleotides: 2
Number of A nucleotides: 5
Number of T nucleotides: 2
Problem 6:
Chriss-iMac:homework3 edwards$ python dna_analysis.py data/sample_1.fastq
Number of G nucleotides: 5738773
Number of C nucleotides: 5879128
Number of A nucleotides: 7701287
Number of T nucleotides: 7661547
GC-content: 0.43029262963
AT-content: 0.568993851852
Sum of individual counts: 26980735.0
The total number of bps: 27000000
Length of the Sequence: 27000000
Chriss-iMac:homework3 edwards$ python dna_analysis.py data/sample_2.fastq
Number of G nucleotides: 6103168
Number of C nucleotides: 6075218
Number of A nucleotides: 7467696
Number of T nucleotides: 7331353
GC-content: 0.451051333333
AT-content: 0.548112925926
Sum of individual counts: 26977435.0
The total number of bps: 27000000
Length of the Sequence: 27000000
Chriss-iMac:homework3 edwards$ python dna_analysis.py data/sample_3.fastq
Number of G nucleotides: 3008039
Number of C nucleotides: 3144239
Number of A nucleotides: 1561973
Number of T nucleotides: 1796632
GC-content: 0.646864112524
AT-content: 0.353131156076
Sum of individual counts: 9510883.0
The total number of bps: 9510928
Length of the Sequence: 9510928
Chriss-iMac:homework3 edwards$ python dna_analysis.py data/sample_4.fastq
Number of G nucleotides: 1620779
Number of C nucleotides: 1851138
Number of A nucleotides: 3304273
Number of T nucleotides: 3204771
GC-content: 0.347852341166
AT-content: 0.65214294989
Sum of individual counts: 9980961.0
The total number of bps: 9981008
Length of the Sequence: 9981008
Chriss-iMac:homework3 edwards$ python dna_analysis.py data/sample_5.fastq
Number of G nucleotides: 11
Number of C nucleotides: 9
Number of A nucleotides: 27
Number of T nucleotides: 28
GC-content: 0.263157894737
AT-content: 0.723684210526
Sum of individual counts: 75.0
The total number of bps: 76
Length of the Sequence: 76
Chriss-iMac:homework3 edwards$ python dna_analysis.py data/sample_6.fastq
Number of G nucleotides: 7304
Number of C nucleotides: 5967
Number of A nucleotides: 6867
Number of T nucleotides: 6852
GC-content: 0.491518518519
AT-content: 0.508111111111
Sum of individual counts: 26990.0
The total number of bps: 27000
Length of the Sequence: 27000
Chriss-iMac:homework3 edwards$ python dna_analysis.py data/test-high-gc-1.fastq
Number of G nucleotides: 12176
Number of C nucleotides: 11800
Number of A nucleotides: 7970
Number of T nucleotides: 7894
GC-content: 0.5994
AT-content: 0.3966
Sum of individual counts: 39840.0
The total number of bps: 40000
Length of the Sequence: 40000
Chriss-iMac:homework3 edwards$ python dna_analysis.py data/test-high-gc-2.fastq
Number of G nucleotides: 15955
Number of C nucleotides: 15955
Number of A nucleotides: 3908
Number of T nucleotides: 4023
GC-content: 0.79775
AT-content: 0.198275
Sum of individual counts: 39841.0
The total number of bps: 40000
Length of the Sequence: 40000
Chriss-iMac:homework3 edwards$ python dna_analysis.py data/test-moderate-gc-1.fastq
Number of G nucleotides: 9818
Number of C nucleotides: 9983
Number of A nucleotides: 10077
Number of T nucleotides: 9963
GC-content: 0.495025
AT-content: 0.501
Sum of individual counts: 39841.0
The total number of bps: 40000
Length of the Sequence: 40000
Chriss-iMac:homework3 edwards$ python dna_analysis.py data/test-moderate-gc-2.fastq
Number of G nucleotides: 7972
Number of C nucleotides: 7999
Number of A nucleotides: 11813
Number of T nucleotides: 12027
GC-content: 0.399275
AT-content: 0.596
Sum of individual counts: 39811.0
The total number of bps: 40000
Length of the Sequence: 40000
Chriss-iMac:homework3 edwards$ python dna_analysis.py data/test-small.fastq
Number of G nucleotides: 1
Number of C nucleotides: 2
Number of A nucleotides: 5
Number of T nucleotides: 2
GC-content: 0.3
AT-content: 0.7
Sum of individual counts: 10.0
The total number of bps: 10
Length of the Sequence: 10
The differences account for extra sequencing characters
Problem 7:
AT/GC ratio: 1.32234161747
Problem 8:
Chriss-iMac:homework3 edwards$ python dna_analysis.py data/sample_1.fastq
Number of G nucleotides: 5738773
Number of C nucleotides: 5879128
Number of A nucleotides: 7701287
Number of T nucleotides: 7661547
GC-content: 0.43029262963
AT-content: 0.568993851852
AT/GC ratio: 1.32234161747
Sum of individual counts: 26980735.0
The total number of bps: 27000000
Length of the Sequence: 27000000
This sample is considered to have a moderate GC content
Collaboration:
http://ada.evergreen.edu/csf/python13f/web/hw3.html
Reflection:
I think that my answer to number 6 is probably more verbose than you were looking for.
I left it in though, because I used it to evaluate my output and compare the data manually.