@@ -31,7 +31,7 @@ Global Class
** Parameters**
| name | type | Description |
| --- | --- | --- |
- | string | text | raw text to be analyzed |
+ | text | string | raw text to be analyzed |

### Static Members
#### ` Punctuations `
@@ -50,6 +50,7 @@ em-dash, period, comma, semicolon, colon, bang, question mark, interrobang, Span
### Static Methods
#### ` hasPunctuation(string) `
determines if string contains punctuation
+
** Parameters**
| name | type | Description |
| --- | --- | --- |
@@ -60,6 +61,7 @@ em-dash, period, comma, semicolon, colon, bang, question mark, interrobang, Span

#### ` hasSpace(string) `
determines if a string has a space
+
** Parameters**
| name | type | Description |
| --- | --- | --- |
@@ -92,6 +94,7 @@ em-dash, period, comma, semicolon, colon, bang, question mark, interrobang, Span

#### ` getNGrams(text, gramSize) `
gets ngrams from text
+
** Parameters**
| name | type | Description |
| --- | --- | --- |
@@ -103,8 +106,10 @@ em-dash, period, comma, semicolon, colon, bang, question mark, interrobang, Span


#### ` getWordNGrams(text) `
- Gets 2-word pairs from text.
- This doesn't use sentence punctuation as a boundary. Should it?
+ Gets 2-word pairs from text.
+
+ Note: This doesn't use sentence punctuation as a boundary. Should it?
+
** Parameters**
| name | type | Description |
| --- | --- | --- |
@@ -115,6 +120,8 @@ Gets 2-word pairs from text.
` Array<string> `
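
To make the note about sentence boundaries concrete, here is a minimal sketch of 2-word pairing that ignores punctuation. `wordPairsSketch` is a hypothetical helper, not this class's implementation. The tokenization (split on whitespace, strip non-letter characters) and the space-joined output format are assumptions.

```javascript
// Hypothetical helper, not part of the documented class.
// Builds adjacent word pairs without treating sentence punctuation as a boundary.
function wordPairsSketch(text) {
  const words = text
    .toLowerCase()
    .split(/\s+/)
    // Assumption: drop everything except letters, digits, and apostrophes.
    .map((word) => word.replace(/[^\p{L}\p{N}']/gu, ""))
    .filter((word) => word.length > 0);

  const pairs = [];
  for (let i = 0; i < words.length - 1; i++) {
    // Assumption: a pair is the two words joined by a single space.
    pairs.push(`${words[i]} ${words[i + 1]}`);
  }
  return pairs;
}

console.log(wordPairsSketch("It rains. We stay."));
// [ 'it rains', 'rains we', 'we stay' ]
```

The pair `rains we` straddles the period, which is the behavior the note is asking about.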

#### ` getFrequencyMap(frequencyMap) `
+ converts an array of strings into a map of those strings and number of occurrences
+
** Parameters**
| name | type | Description |
| --- | --- | --- |
@@ -124,6 +131,8 @@ Gets 2-word pairs from text.
` Map<string, number> `
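
As a rough illustration of the array-to-map conversion described for `getFrequencyMap`, the sketch below counts occurrences with a `Map`. `frequencyMapSketch` is a hypothetical name, and the class's actual implementation may differ.

```javascript
// Hypothetical helper, not part of the documented class.
// Counts how many times each string appears in the input array.
function frequencyMapSketch(items) {
  const counts = new Map();
  for (const item of items) {
    counts.set(item, (counts.get(item) ?? 0) + 1);
  }
  return counts;
}

console.log(frequencyMapSketch(["th", "he", "th"]));
// Map(2) { 'th' => 2, 'he' => 1 }
```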

#### ` getPercentMap(frequencyMap) `
+ converts a frequency map into a map of percentages
+
** Parameters**
| name | type | Description |
| --- | --- | --- |
@@ -133,7 +142,8 @@ Gets 2-word pairs from text.
` Map<string, number> `
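
One way to turn a frequency map into percentages is to divide each count by the total, sketched below. `percentMapSketch` is a hypothetical name. Expressing each share as a fraction between 0 and 1 is an assumption, since the docs do not say whether the class uses a 0 to 1 or a 0 to 100 scale.

```javascript
// Hypothetical helper, not part of the documented class.
// Assumption: each value is the key's share of the total count, in the range 0..1.
function percentMapSketch(frequencyMap) {
  let total = 0;
  for (const count of frequencyMap.values()) {
    total += count;
  }

  const percents = new Map();
  for (const [key, count] of frequencyMap) {
    // Guard against dividing by zero when every count is 0.
    percents.set(key, total === 0 ? 0 : count / total);
  }
  return percents;
}

console.log(percentMapSketch(new Map([["e", 3], ["t", 1]])));
// Map(2) { 'e' => 0.75, 't' => 0.25 }
```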

#### ` getTopGrams(frequencyMap) `
-
+ filters a frequency map into only a small subset of the most frequent ones
+
** Parameters**
| name | type | Description |
| --- | --- | --- |
@@ -145,61 +155,79 @@ Gets 2-word pairs from text.

### Instance Members
#### ` sanitizedText `
- lowercased text with diacritics removed
+ lowercased text with diacritics removed
+
` string `
#### ` letters `
- an array of letters in the text
+ an array of letters in the text
+
` Array<string> `
#### ` words `
an array of words in the text
+
` Array<string> `
#### ` bigrams `
- an array of letter bigrams in the text
+ an array of letter bigrams in the text
+
` Array<string> `
#### ` trigrams `
- an array of letter trigrams in the text
+ an array of letter trigrams in the text
+
` Array<string> `
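
Letter bigrams and trigrams can be produced with a sliding window over an array of letters, as in the sketch below. `letterNGramsSketch` is a hypothetical helper. Whether the class's own n-grams cross word boundaries is not stated in these docs, so this sketch simply slides over whatever array it is given.

```javascript
// Hypothetical helper, not part of the documented class.
// Slides a window of `gramSize` letters across the array and joins each window.
function letterNGramsSketch(letters, gramSize) {
  if (gramSize < 1) {
    return [];
  }
  const grams = [];
  for (let i = 0; i + gramSize <= letters.length; i++) {
    grams.push(letters.slice(i, i + gramSize).join(""));
  }
  return grams;
}

console.log(letterNGramsSketch(["t", "e", "x", "t"], 2)); // [ 'te', 'ex', 'xt' ]
console.log(letterNGramsSketch(["t", "e", "x", "t"], 3)); // [ 'tex', 'ext' ]
```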
#### ` uniqueLetters `
- an array of unique letters in the text
+ an array of unique letters in the text
+
` Array<string> `
#### ` uniqueBigrams `
an array of unique bigrams in the text
+
` Array<string> `
#### ` uniqueTrigrams `
an array of unique trigrams in the text
+
` Array<string> `
#### ` uniqueWords `
an array of unique words in the text
+
` Array<string> `
#### ` letterFrequencies `
- a map of letter frequencies in the sanitized text
+ a map of letter frequencies in the sanitized text
+
` Map<string, number> `
#### ` bigramFrequencies `
- a map of bigram frequencies in the sanitized text
+ a map of bigram frequencies in the sanitized text
+
` Map<string, number> `
#### ` trigramFrequencies `
- a map of trigram frequencies in the sanitized text
+ a map of trigram frequencies in the sanitized text
+
` Map<string, number> `
#### ` wordFrequencies `
- a map of word frequencies in the sanitized text
+ a map of word frequencies in the sanitized text
+
` Map<string, number> `
#### ` letterPercentages `
- a map of letter percentages in the sanitized text
+ a map of letter percentages in the sanitized text
+
` Map<string, number> `
#### ` bigramPercentages `
- a map of bigram percentages in the sanitized text
+ a map of bigram percentages in the sanitized text
+
` Map<string, number> `
#### ` trigramPercentages `
- a map of trigram percentages in the sanitized text
+ a map of trigram percentages in the sanitized text
+
` Map<string, number> `
#### ` wordPercentages `
- a map of word percentages in the sanitized text
+ a map of word percentages in the sanitized text
+
` Map<string, number> `

### Instance Methods

#### ` getTopLetters(topCount) `
- a map of the most used letters in the text
+ a map of the most used letters in the text
+
** Parameters**
| name | type | Description |
| --- | --- | --- |
@@ -209,7 +237,8 @@ lowercased text with diacritics removed
` Map<string, number> `
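
The `getTop*` methods all return a map limited to the most used entries. A minimal sketch of that kind of selection follows. `topEntriesSketch` is a hypothetical helper, and both the descending sort by count and the tie handling (ties keep their original insertion order) are assumptions rather than documented behavior.

```javascript
// Hypothetical helper, not part of the documented class.
// Returns a new Map holding the `topCount` entries with the highest counts.
function topEntriesSketch(frequencyMap, topCount) {
  // Array.prototype.sort is stable, so equal counts keep insertion order.
  const sorted = [...frequencyMap.entries()].sort((a, b) => b[1] - a[1]);
  return new Map(sorted.slice(0, topCount));
}

const letterCounts = new Map([["e", 5], ["t", 3], ["a", 4]]);
console.log(topEntriesSketch(letterCounts, 2));
// Map(2) { 'e' => 5, 'a' => 4 }
```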

#### ` getTopBigrams(topCount) `
- a map of the most used bigrams in the text
+ a map of the most used bigrams in the text
+
** Parameters**
| name | type | Description |
| --- | --- | --- |
@@ -219,7 +248,8 @@ lowercased text with diacritics removed
` Map<string, number> `

#### ` getTopTrigrams(topCount) `
- a map of the most used trigrams in the text
+ a map of the most used trigrams in the text
+
** Parameters**
| name | type | Description |
| --- | --- | --- |
@@ -229,7 +259,8 @@ lowercased text with diacritics removed
` Map<string, number> `

#### ` getTopWords(topCount) `
- a map of the most used words in the text
+ a map of the most used words in the text
+
** Parameters**
| name | type | Description |
| --- | --- | --- |