source/widgets/data/applydomain.md
+1 -3 (1 addition & 3 deletions)
@@ -18,9 +18,7 @@ Given dataset and template transforms the dataset.
The widget receives a dataset and a template dataset used to transform the dataset.
-#### Side note
-
-Domain transformation works by using information from the template data. For example, for PCA, Components are not enough. Transformation requires information on the center of each column, variance (if the data is normalized), and if and how the data was preprocessed (continuized, imputed, etc.).
+**Side note.** Domain transformation works by using information from the template data. For example, for PCA, Components are not enough. Transformation requires information on the center of each column, variance (if the data is normalized), and if and how the data was preprocessed (continuized, imputed, etc.).
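To illustrate why the components alone are not enough, here is a minimal sketch in plain Python. It is not Orange's implementation: `PcaTemplate` and `apply_template` are hypothetical names, and the numbers are made up. It only shows that projecting new data needs the stored column centers (and the variances, if the template was normalized) in addition to the components.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PcaTemplate:
    """Hypothetical container for what a PCA template has to remember."""
    center: List[float]              # per-column mean of the template data
    variance: Optional[List[float]]  # per-column variance, or None if not normalized
    components: List[List[float]]    # one row per principal component


def apply_template(row, template):
    """Project one new row using statistics stored from the template data."""
    # Center with the template's means, not the means of the new data.
    values = [x - c for x, c in zip(row, template.center)]
    if template.variance is not None:
        # Normalize with the template's variances if normalization was used.
        values = [x / math.sqrt(v) for x, v in zip(values, template.variance)]
    # Each output value is the dot product with one component.
    return [sum(w * x for w, x in zip(comp, values)) for comp in template.components]


template = PcaTemplate(center=[2.0, 4.0], variance=None,
                       components=[[1.0, 0.0], [0.0, 1.0]])
projected = apply_template([3.0, 1.0], template)
print(projected)
```

Without `center`, the same row would project to a different point, which is why the template has to carry more than the component matrix.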
source/widgets/data/csvfileimport.md
+3 -3 (3 additions & 3 deletions)
@@ -12,7 +12,7 @@ The **CSV File Import** widget reads comma-separated files and sends the dataset
*Data Frame* output can be used in the [Python Script](../data/pythonscript.md) widget by connecting it to the `in_object` input (e.g. `df = in_object`). It can then be used as a regular DataFrame.
-###Import Options
+## Import Options
The import window where the user sets the import parameters. Can be re-opened by pressing *Import Options* in the widget.
@@ -41,7 +41,7 @@ Right click on the column name to set the column type. Right click on the row in
-*Ignore*: do not output the column.
4. Pressing *Reset* will return the settings to the previously set state (saved by pressing OK in the Import Options dialogue). *Restore Defaults* will set the settings to their default values. *Cancel* aborts the import, while *OK* imports the data and saves the settings.
-###Widget
+## Widget
The widget once the data is successfully imported.
@@ -51,7 +51,7 @@ The widget once the data is successfully imported.
2. Information on the imported data set. Reports on the number of instances (rows), variables (features or columns) and meta variables (special columns).
3. *Import Options* re-opens the import dialogue where the user can set delimiters, encodings, text fields and so on. *Cancel* aborts data import. *Reload* imports the file again, picking up any changes made to the original file.
-###Encoding
+## Encoding
The dialogue for customizing the list of encodings shown in the Import Options - Encoding dropdown. Select *Customize Encodings List...* to change which encodings appear in the list. To save the changes, simply close the dialogue. Closing and reopening Orange (even with Reset widget settings) will not reset the list. To reset it, press *Restore Defaults*. To show all the available encodings in the list, press *Select all*.
source/widgets/data/impute.md
+1 -1 (1 addition & 1 deletion)
@@ -34,6 +34,6 @@ Example
To demonstrate how the **Impute** widget works, we selected the *heart_disease* dataset, which contains some missing values in the *major vessels colored* feature. We show those four attributes in [Data Table](../data/datatable.md).
-We used the **Impute** widget and selected the *Fixed valuer* to impute the missing values only for the *major vessels colored* attribute. In Data Table (imputed), we see how the question marks turned into 0, which was our choice for a fixed value.
+We used the **Impute** widget and selected the *Fixed values* to impute the missing values only for the *major vessels colored* attribute. In Data Table (imputed), we see how the question marks turned into 0, which was our choice for a fixed value.
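The fixed-value imputation described here can be sketched in plain Python. This is only an illustration, not the widget's code: `impute_fixed` is a hypothetical helper, the rows are made-up values, and `None` stands in for the question marks shown in Data Table.

```python
def impute_fixed(rows, column, fixed_value):
    """Return copies of the rows with None in `column` replaced by `fixed_value`."""
    return [
        {**row, column: fixed_value} if row.get(column) is None else dict(row)
        for row in rows
    ]


rows = [
    {"age": 63, "major vessels colored": 0},
    {"age": 67, "major vessels colored": None},  # None plays the role of "?"
]
imputed = impute_fixed(rows, "major vessels colored", 0)
print(imputed)
```

Only the chosen column is touched, and the input rows are left unchanged.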
source/widgets/data/mergedata.md
+6 -6 (6 additions & 6 deletions)
@@ -30,7 +30,7 @@ Depending upon the merge types, selected features may be required to have unique
Merging Types
-------------
-##### Append Columns from Extra Data (left join)
+**Append Columns from Extra Data (left join)**
Columns from the Extra Data are added to the Data. Instances with no matching rows will have missing values added.
@@ -42,7 +42,7 @@ For this type of merge, the values on the left (e.g. cities) may repeat (e.g. th
-##### Find matching pairs of rows (inner join)
+**Find matching pairs of rows (inner join)**
Only those rows that are matched will be present on the output, with the Extra Data columns appended. Rows without matches are removed.
@@ -52,7 +52,7 @@ For this type of merge, combinations of features on the left and on the right mu
-##### Concatenate tables (outer join)
+**Concatenate tables (outer join)**
The rows from both the Data and the Extra Data will be present on the output. Where rows cannot be matched, missing values will appear.
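The three merge types in this file (left, inner, and outer join) can be compared with a small plain-Python sketch. It is an assumption-laden illustration rather than the widget's logic: `merge` is a hypothetical helper, rows are dictionaries, `None` marks a missing value, and key values in the extra table are assumed to be unique.

```python
def merge(data, extra, key, how="left"):
    """Join two lists of dicts on `key`; `how` is 'left', 'inner' or 'outer'."""
    extra_by_key = {row[key]: row for row in extra}
    # Collect every column name once, preserving first-seen order.
    all_cols = list(dict.fromkeys(c for row in data + extra for c in row))

    result, matched = [], set()
    for row in data:
        match = extra_by_key.get(row[key])
        if match is None and how == "inner":
            continue  # inner join: rows without a match are removed
        if match is not None:
            matched.add(row[key])
        merged = dict.fromkeys(all_cols)  # start with all values missing
        merged.update(row)
        merged.update(match or {})
        result.append(merged)

    if how == "outer":
        # outer join: unmatched rows from the extra table are kept as well
        for k, row in extra_by_key.items():
            if k not in matched:
                merged = dict.fromkeys(all_cols)
                merged.update(row)
                result.append(merged)
    return result


data = [{"id": "a", "x": 1}, {"id": "b", "x": 2}]
extra = [{"id": "b", "y": 20}, {"id": "c", "y": 30}]

left = merge(data, extra, "id", how="left")
inner = merge(data, extra, "id", how="inner")
outer = merge(data, extra, "id", how="outer")
print(left, inner, outer, sep="\n")
```

With this toy input, the left join keeps rows `a` and `b`, the inner join keeps only `b`, and the outer join also keeps `c` with a missing `x`.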
@@ -62,17 +62,17 @@ For this type of merge, combinations of features on the left and on the right mu
-#####Row index
+### Row index
Data will be merged in the same order as they appear in the table. Row number 1 from the Data input will be joined with row number 1 from the Extra Data input. Row numbers are assigned by Orange based on the original order of the data instances.
-#####Instance ID
+### Instance ID
This is a more complex option. Sometimes, data is transformed in the analysis and the domain is no longer the same. Nevertheless, the original row indices are still present in the background (Orange remembers them). In this case, one can merge on instance ID. For example, you transformed the data with [PCA](../unsupervised/PCA.md), visualized it in the [Scatter Plot](../visualize/scatterplot.md), selected some data instances and now you wish to see the original information of the selected subset. Connect the output of Scatter Plot to Merge Data, add the original data set as *Extra Data* and merge by *Instance ID*.
-#####Merge by two or more attributes
+### Merge by two or more attributes
Sometimes our data instances are unique with respect to a combination of columns, not a single column. To merge by more than a single column, add the *Row matching* condition by pressing plus next to the matching condition. To remove it, press the x.
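Matching on a combination of columns amounts to using a tuple of values as the key. The sketch below assumes rows are dictionaries; `composite_key` and `left_join_on` are hypothetical helpers and the data is made up. Unmatched rows are simply left without the extra columns.

```python
def composite_key(row, columns):
    """Build a hashable key from several column values."""
    return tuple(row[c] for c in columns)


def left_join_on(data, extra, columns):
    """Append columns from `extra` to rows of `data` that match on all `columns`."""
    lookup = {composite_key(row, columns): row for row in extra}
    return [{**row, **lookup.get(composite_key(row, columns), {})} for row in data]


data = [
    {"name": "Ann", "year": 2020, "score": 1},
    {"name": "Ann", "year": 2021, "score": 2},
]
extra = [{"name": "Ann", "year": 2021, "grade": "B"}]

joined = left_join_on(data, extra, ["name", "year"])
print(joined)
```

Neither `name` nor `year` identifies a row on its own here, but the pair does.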
source/widgets/data/pythonscript.md
+4 -4 (4 additions & 4 deletions)
@@ -45,7 +45,7 @@ Python Script widget is intended to extend functionalities for advanced users. C
Documentation and additional scripts are available in the [orange-scripts](https://github.com/biolab/orange-scripts) repository.
-####Batch filtering
+### Batch filtering
One can, for example, do batch filtering by attributes. We used zoo.tab for the example and we filtered out all the attributes that have more than 5 discrete values. This in our case removed only 'leg' attribute, but imagine an example where one would have many such attributes.
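The filtering idea can be sketched without Orange. This is a rough stand-in, not the script from the documentation: `drop_high_cardinality` is a hypothetical helper, rows are dictionaries, every column is treated as discrete, and the toy values are made up.

```python
def drop_high_cardinality(rows, max_values=5):
    """Keep only columns that take at most `max_values` distinct values."""
    if not rows:
        return []
    keep = [
        col for col in rows[0]
        if len({row[col] for row in rows}) <= max_values
    ]
    return [{col: row[col] for col in keep} for row in rows]


# Toy data: "hair" has 2 distinct values, "legs" has 6.
rows = [{"hair": i % 2, "legs": legs} for i, legs in enumerate([0, 2, 4, 5, 6, 8])]
filtered = drop_high_cardinality(rows, max_values=5)
print(filtered)
```

The same loop applies unchanged when many columns exceed the threshold, which is where batch filtering pays off.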
@@ -58,7 +58,7 @@ One can, for example, do batch filtering by attributes. We used zoo.tab for the
-####Rounding values
+### Rounding values
The second example shows how to round all the values in a few lines of code. This time we used wine.tab and rounded all the values to whole numbers.
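As a rough illustration of the rounding step, here is the same idea on plain nested lists, which stand in for the data table. `round_values` is a hypothetical helper and the numbers are made up.

```python
def round_values(rows):
    """Round every value to a whole number, returning new lists."""
    # Note: Python's round() rounds exact halves to the nearest even integer.
    return [[round(value) for value in row] for row in rows]


rounded = round_values([[14.23, 1.71], [13.2, 2.5]])
print(rounded)
```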
@@ -69,7 +69,7 @@ The second example shows how to round all the values in a few lines of code. Thi
-####Gaussian noise
+### Gaussian noise
The third example introduces some Gaussian noise to the data. Again we make a copy of the input data, then walk through all the values with a double for loop and add random noise.
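The copy-then-double-loop approach described here looks roughly like this on plain nested lists. This is a sketch using only the standard library, not the script from the documentation: `add_gaussian_noise`, `sigma` and `seed` are illustrative names.

```python
import copy
import random


def add_gaussian_noise(rows, sigma=0.1, seed=None):
    """Return a copy of `rows` with zero-mean Gaussian noise added to each value."""
    rng = random.Random(seed)    # a seed makes the result reproducible
    noisy = copy.deepcopy(rows)  # copy first so the input stays untouched
    for i in range(len(noisy)):
        for j in range(len(noisy[i])):
            noisy[i][j] += rng.gauss(0, sigma)
    return noisy


original = [[1.0, 2.0], [3.0, 4.0]]
noisy = add_gaussian_noise(original, sigma=0.1, seed=0)
print(noisy)
```

Working on a copy matters because other widgets may still be using the original data.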
@@ -83,7 +83,7 @@ The third example introduces some Gaussian noise to the data. Again we make a co
-####Custom text preprocessing
+### Custom text preprocessing
The final example uses Orange3-Text add-on. **Python Script** is very useful for custom preprocessing in text mining, extracting new features from strings, or utilizing advanced *nltk* or *gensim* functions. Below, we tokenized our input data from *deerwester.tab* by splitting them by whitespace.
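Splitting by whitespace needs nothing beyond Python's built-in `str.split`. The sketch below works on plain strings rather than an Orange3-Text corpus; `tokenize` is a hypothetical helper and the sample documents are made up.

```python
def tokenize(documents):
    """Split each document into tokens on runs of whitespace."""
    # str.split() with no arguments also drops leading and trailing whitespace.
    return [doc.split() for doc in documents]


tokens = tokenize(["Human machine interface", "  graph   minors survey "])
print(tokens)
```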