Commit 5cae678

Merge pull request #12 from ajdapretnar/split
[DOC] Split
2 parents bfb01cc + 3468f74 commit 5cae678

5 files changed: +437 -0 lines changed

source/widgets/data/split.md

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
Split
=====

Split text or categorical variables into indicator variables.

**Inputs**

- Data: data table

**Outputs**

- Data: data table with added columns

The widget splits the chosen text or categorical variable into separate variables. The newly created variables can be categorical indicators (Yes or No), numeric indicators (0 or 1), or appearance counts. The widget is typically used with survey data.

![](images/Split-stamped.png){width=40%}

1. Select the variable to split and define the delimiter.
2. Set the type of output variables. *Categorical* produces a categorical indicator with values Yes and No, *Numerical* produces a numeric dummy variable with values 0 and 1, and *Counts* produces a variable containing the number of times each term appears. The latter two options are equivalent if each term can appear only once.
3. If *Apply Automatically* is ticked, changes are communicated automatically. Alternatively, press *Apply*.

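The three output types can be illustrated with a short Python sketch. This is not the widget's implementation; the function name `split_values`, its parameters, and the sample values are hypothetical and only mirror the options described above.

```python
from collections import Counter


def split_values(values, delimiter=";", output="categorical"):
    """Split delimited strings into one column per distinct term."""
    # Count how often each term appears in every row, ignoring empty pieces.
    rows = [
        Counter(part.strip() for part in value.split(delimiter) if part.strip())
        for value in values
    ]
    # Collect the distinct terms in order of first appearance.
    terms = list(dict.fromkeys(term for row in rows for term in row))

    def encode(count):
        if output == "categorical":
            return "Yes" if count else "No"
        if output == "numerical":
            return 1 if count else 0
        if output == "counts":
            return count
        raise ValueError(f"unknown output type: {output}")

    # One list of encoded values per term, aligned with the input rows.
    return {term: [encode(row[term]) for row in rows] for term in terms}


answers = ["a;b", "b", "a;a;c"]  # made-up values for illustration
print(split_values(answers, output="categorical"))
print(split_values(answers, output="numerical"))
print(split_values(answers, output="counts"))
```

In this sketch the numerical and counts outputs differ only for the term `a`, which appears twice in the third value; if no term repeats within a value, the two outputs are identical.
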
Example
-------

The workflow uses data from the Orange survey taken in 2020. The survey was done in Google Forms, which joins the responses to multiple-choice questions into a single value.

To use this data in the analysis, we have to split it on the delimiter, a semicolon in this case. We do this with **Split**: we pass the data from the [File](../data/file.md) widget to Split and set the parameters.

Say we wish to observe and count the reasons users like Orange. We split on *Reason* and set the delimiter to a semicolon. The output is a set of categorical variables, which we can inspect in a [Data Table](../data/datatable.md).

![](images/Split-Example.png)

The advantage of Split is that we can now count each response individually. Say we want to see how the responses differ by the position of the respondent, using a [Box Plot](../visualize/boxplot.md). It seems that professors especially appreciate Orange for its visual programming approach.

![](images/Split-BoxPlot.png){width=500px}
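
The per-group comparison from the example can also be sketched in plain Python. The rows below are made up for illustration and are not taken from the survey; `count_by_group` is a hypothetical helper, not part of Orange.

```python
from collections import Counter, defaultdict

# Hypothetical survey rows: (position, semicolon-joined reasons).
responses = [
    ("Professor", "Visual programming;Add-ons"),
    ("Student", "Add-ons"),
    ("Professor", "Visual programming"),
]


def count_by_group(rows, delimiter=";"):
    """Tally how many respondents in each group picked each reason."""
    tallies = defaultdict(Counter)
    for group, joined in rows:
        # A set ensures a reason is counted once per respondent.
        reasons = {part.strip() for part in joined.split(delimiter) if part.strip()}
        tallies[group].update(reasons)
    return {group: dict(counter) for group, counter in tallies.items()}


print(count_by_group(responses))
```

Counting each reason once per respondent corresponds to the indicator outputs of the widget rather than to *Counts*.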
