Statistics Workbench is a tool for the visualization and analyzation of data. It offers multiple mathematical functions for finding the maximum, minimum, mean, mode, maximal deviation and more of a dataset, as well as multiple chart types, like barcharts and piecharts. There are multiple ways of creating datasets from different inputs outlined below.
Statistics workbench offers a large number of functionalities for displaying the different charts, however it is not complete yet and not all functionalities are supported by all chart types yet.
There are Coding Standards which can be found in the wiki which we got from our predecessors and have maintained and used to make creating constistent code a bit easier and we encourage you to keep using them.
Take a look at Issue #75 to get an overview of what features may be missing and some ideas for features that could be nice to implement.
The test base is very comprehensive for some chart types and almost non-existent for some others and could also be a good place to start and get to know the legacy project. Below you will get a short introduction into the most important features to get you started as well.
First install metacello using this guide. Then run the following in a workspace in your Squeak image. This will install only required packages.
Metacello new
baseline: 'StatisticsWorkbench';
repository: 'github://hpi-swa-teaching/StatisticsWorkbench:dev/packages';
load.
If you want to install every package (including Tests and Examples), run the following in your workspace:
Metacello new
baseline: 'StatisticsWorkbench';
repository: 'github://hpi-swa-teaching/StatisticsWorkbench:dev/packages';
load: #('all').
After that you are good to go.
We created multiple examples, in order to get you started with the existing project. You can find them in the StatisticsWorkbench-Examples package. They can be used by calling:
<ExampleName> open.
E.g.
SWCreateNamedChartExample open.
Feel free to take a look at the open
method of each example to see how creating data or diagrams works.
There are now several different ways to create datasets, each of which has its own example
SWDataFrom<...>Example
E.g.
SWDataFromStringExample open.
Data can be created from seperate collections, from a collection of single datapoints or from a string, which is the way it initially was implemented and which we have slowly been phasing out in favor of using squeak internal data types to create data:
data := SWDataLabeled fromXValues: {1 . 2 . 3 . 4} versusYValues: {22 . 23 . 15 . 6} withLabels: {'one' . 'two' . 'three' . 'four'}.
labeledData := SWDataLabeled fromDataPointCollections: #(#(1 1 'one') #(2 1 'two') #(3 3 'three') #(4 1 'four')).
unlabeledData := SWDataUnlabeled fromDataPointCollections: #(#(0 2) #(1 1) #(2 1) #(3 3) #(4 0)).
Labels can also be added to an unlabeled data object afterwards:
data setLabels: {'one' . 'two' . 'three' . 'four'}.
Then, you can set the dimensions in order to label the X and Y axes:
data setAllDimensionNames: #('City' 'PopulationInThousands').
Afterwards you can visualize it using Line-/Bar-/Pie-Charts or Scatterplots:
graph := SWDiagram new visualize: data with: SWBarChart.
graph := SWDiagram new visualize: data with: SWLineChart.
graph := SWDiagram new visualize: data with: SWPieChart.
graph := SWDiagram new visualize: data with: SWScatterPlot.
Finally you can open your chart in a window - that can also be labelled - by typing
graph openInWindow.
graph openInWindowLabeled: 'My Diagram'
If you want to adjust your diagram, you can do that as follows:
(graph charts first) barWidth: 40.
graph axisColor: Color red.
You can also choose another SWColorTheme by calling:
graph applyColorTheme: SWDarkTheme new.
graph applyColorTheme: SWHPITheme new.
Furthermore, you can display the mean value of your data:
graph showMean.
You can also interact with the diagram, for example by right-clicking on data points to delete them from the diagram. This functionality can be used to exclude outliers.
Charts for single y-axis:
- Piechart
- Barchart
- LineCharts
- AreaChart
- Scatterplot
- SpiderChart
- Normalized BarChart
- Normalized AreaChart
Charts that are capable of multiple y-axes:
- Barchart
- LineCharts
- AreaChart
- Scatterplot
- SpiderChart
- Normalized BarChart
- Normalized AreaChart
Here are two examples of how to create stacked and normalized charts for multiple y-axes:
(SWDiagram new
stacked: true;
visualizeAll: dataCollection with: SWBarChart) openInWindowLabeled: ''
(SWDiagram new
stacked: true;
normalized: true;
visualizeAll: dataCollection with: SWBarChart) openInWindowLabeled: ''
Here "dataCollection" is of type OrderedCollection with elemets from SWDataLabeled or SWDataUnlabeled.
Charts that suppot 3-dimensional data:
- BubblePlot
Here is an examples of how to create a bubbleplot for 3-dimensional data:
| data |
data := SWDataUnlabeled fromXValues: {1 . 2 . 3 . 4 . 5} versusYValues: {5 . 200 . 38 . 69 . 16} versusZValues: {17 . 14 . 5 . 8 . 3}.
(SWDiagram new visualize: data with: SWBubblePlot) openInWindowLabeled: ''
It is also possible to visualize more than one data series within a BubblePlot by using the method SWDiagram new visualizeAll: dataCollection with: SWBubblePlot
.
It is also possible to perform Linear Regression on a ScatterPlot.
| data scatterPlot |
data := SWDataUnlabeled fromXValues: {1 . 2 . 3 . 4 . 5} versusYValues: {22 . 110 . 64 . 211 . 35}.
scatterPlot := SWDiagram new visualize: data with: SWScatterPlot.
(SWLinearRegression newFromScatterPlot: (scatterPlot charts first) plotOn: scatterPlot).
scatterPlot openInWindowLabeled: ''
Another way of accesing the StatisticsWorkBench tool is the brand new User Interface. This can be opened by calling
SWMainformModel open.
Features:
- Import CSV
- Export Charts as PNG
- Edit X, Y axis and columns in the editor
- Choose different themes for your charts
- View all types of charts without writing commands
The statistics workbench is not just designed for charts with a single x and y-axis. You can add multiple y-axes using the editor or by importing them via CSV. All imported y-axis values will show up in the multiple text selector on the right side of the UI. You can also display multiple charts at the same time for the current x- and y-axes values by selecting different chart types.
If you want to remove or edit y-axis values, simply select the desired rows and click the corresponding button to make the changes.
If you only have one y-axis value in your UI field, you can visualize all charts. However, if you have multiple y-axis values, you can't visualize the pieChart and will get an error message because this chart type is not designed to have more y-axes. All other charts will be stacked as you would expect with additional y-axes.
Note that csv file you want to import must have the following format:
X;Y1;Y2;Label
1;1;2;one
2;1;3;two
3;3;4;three
4;1;2;four