wrapping up the docs
anadis504 committed May 22, 2023
1 parent ece82d6 commit 13e4d1e
Showing 7 changed files with 17 additions and 12 deletions.
4 changes: 4 additions & 0 deletions .github/workflows/data-parser-release.yml
@@ -72,3 +72,7 @@ jobs:
permissions:
contents: write


# TODO:
# codesign --force -s - target/release/tmc-langs-cli for the mac release
# add this mac sign to gh actions and run the compilation
19 changes: 9 additions & 10 deletions data-parser/README.md
@@ -1,22 +1,22 @@
# Parsing the collected submissions on courses.mooc.fi

The output of the data-parser is a .csv file containing only answers to the `DOGS FACTORIAL ANALYSIS SURVEY` exercise types. The file will contain answers submitted **after** 22.05.2023 due to the latest format of the answers.
The output of the data-parser is a .csv file containing only answers to the `DOGS FACTORIAL ANALYSIS SURVEY` exercise types. The file will contain answers submitted **after** 22.05.2023 due to the latest format. The separator used in the .csv file is the semicolon `;`.

## Dataset layout

The file contains the columns `user_id, name, email`, followed by one column per `questionLabel` existing in the course. Unanswered questions are left as empty fields.
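
As a rough illustration of the layout described above, the output can be read with Python's standard `csv` module using `;` as the delimiter. The sample data and the question labels `age` and `breed` are invented for this sketch and do not come from a real course:

```python
import csv
import io

# Hypothetical sample that mimics the described layout.
# The labels "age" and "breed" are invented for illustration only.
SAMPLE = (
    'user_id;name;email;age;breed\n'
    '1;"Ada";"ada@example.com";34;"collie"\n'
    '2;"Bo";"bo@example.com";;\n'
)


def read_submissions(text):
    """Parse semicolon-separated parser output into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter=';', quotechar='"')
    return list(reader)


rows = read_submissions(SAMPLE)

# Unanswered questions show up as empty strings.
unanswered = [row['user_id'] for row in rows if row['age'] == '']
print(unanswered)
```

For a real file, `open(path, newline='')` would take the place of `io.StringIO`.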

## Multiple-choice questions

Multiple-choice questions are an exception to the above format. They are represented in the dataset as one `questionLabel option` column per selectable option. The user's answer is then represented as 1 for a chosen option and 0 for an option not chosen. If the user has not answered the question at all, the fields are empty (null).
Multiple-choice questions are an exception to the above format. They are represented in the dataset as one `"questionLabel option"` column per selectable option. The user's answer is then represented as 1 for a chosen option and 0 for an option not chosen. If the user has not answered the question at all, the fields are empty (null).
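
The encoding above could be sketched as follows. This is not the parser's actual code; the function name `encode_multiple_choice` and the sample question are hypothetical, and `None` stands in for the empty (null) field:

```python
def encode_multiple_choice(question_label, options, chosen):
    """Return a {column header: value} mapping for one user's answer.

    `chosen` is the collection of selected options, or None when the
    user did not answer the question at all (hypothetical convention).
    """
    row = {}
    for option in options:
        header = f'{question_label} {option}'
        if chosen is None:
            row[header] = None  # unanswered: field stays empty
        else:
            row[header] = 1 if option in chosen else 0
    return row


answered = encode_multiple_choice('pets', ['dog', 'cat'], {'dog'})
skipped = encode_multiple_choice('pets', ['dog', 'cat'], None)
print(answered)
print(skipped)
```

An answered question with nothing selected would yield 0 in every column, which keeps it distinguishable from a question that was skipped.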

For submissions collected across different _language versions_ it is advised to `label` the multiple-choice options in the same manner as the questions. This makes it easier to combine datasets from the different language courses, since they share the same column headers. The format is `label ; option text`, where the text on the left-hand side of the semicolon `;` is used as the column header in the resulting dataset, while the text on the right is what is shown to the survey user. Only the first semicolon is used as a separator, so the option text may contain any number of semicolons if needed.
For submissions collected across different _language versions_ it is advised to `label` the multiple-choice options in the same manner as the questions. This makes it easier to combine datasets from the different language courses, since they share the same column headers. The format is `label ; option text`, where the text on the left-hand side of the semicolon `;` is used as the column header in the resulting dataset, while the text on the right is what is shown to the survey user. Only the first semicolon is used as a separator, so the option text may contain any number of semicolons if needed. If no semicolon is found, the full option text is used as the column header.
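
A minimal sketch of the first-semicolon rule, assuming surrounding whitespace is trimmed (the text above does not say whether the parser does this). The helper `split_option` is hypothetical:

```python
def split_option(option):
    """Split 'label ; option text' into (column_label, display_text).

    Only the first semicolon separates label from text. Without any
    semicolon the full option text serves as both label and text.
    Trimming whitespace is an assumption made for this sketch.
    """
    label, separator, text = option.partition(';')
    if not separator:
        return option, option
    return label.strip(), text.strip()


labelled = split_option('yes ; Kyllä; ehdottomasti')
unlabelled = split_option('No opinion')
print(labelled)
print(unlabelled)
```

`str.partition` splits at the first occurrence only, so any later semicolons remain part of the display text.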

## Using the parser

To parse the collected submissions you need to download the files from the main course management page on courses.mooc.fi. The links for downloading the files are shown at the bottom of the picture:

![picture of the main management page with link to downloading files in the bottom](../docs/imgs/data-parser/Download-files.png)
<img src="../docs/imgs/data-parser/Download-files.png" height=500>

The csv file for course instances is not used in the process and may be skipped. The needed files are:

@@ -28,9 +28,9 @@ To download the data-parser go to github release page https://github.com/rage/fa

The parser expects a folder named `data`, containing the downloaded .csv files, to be located in the same folder as the parser itself. This is the directory structure:

![directory structure](../docs/imgs/data-parser/dir-struct.png)
<img src="../docs/imgs/data-parser/dir-struct.png" width=500>

where the green `main` is the executable in question.
where the green `main` is the executable program in question (it will probably be called `main-[name of your OS]-latest`).
The parser will use the latest versions of the .csv files if there are several versions available in the `data` folder as in the above example.

Run the parser with
@@ -39,7 +39,7 @@ Run the parser with
from the directory. The parser will create a `parsed-outputs` folder with the resulting .csv file:

![result](../docs/imgs/data-parser/dir-with-output-dir.png)
<img src="../docs/imgs/data-parser/dir-with-output-dir.png" width=500>

## Executing on a Cubbli machine using VMware Horizon Client from your browser

@@ -55,14 +55,13 @@ Choose the `Cubbli Linux` desktop:

Download the files and the executable as explained above.

> Open a browser in the VMware Client in your browser; remember that you are accessing your Helsinki Cubbli desktop through your browser. If you are on a Mac with a Finnish keyboard layout, the client may for some reason interpret it as another layout (US or English). Try to navigate on your keyboard using http://www.macfreek.nl/memory/Mac_Keyboard_Layout if you encounter this issue.
> ![ISO keyboard](../docs/imgs/data-parser/ISO-keyboard.png)
> Open a browser in the VMware Client in your browser; remember that you are accessing your Helsinki Cubbli desktop through your browser. Your keyboard may also have a different layout than you are used to. Search for `Keyboard` in the menu and change the `Layout` to the one you want. (For the Finnish layout you may also just run the command `setxkbmap fi` in the Konsole.)
Choose the `main-ubuntu-latest` executable from the github release page:

![ubuntu executable](../docs/imgs/data-parser/binary-download.png)

Open up a `Konsole` (search for `Konsole` in the menu). Create a new folder where you are going to work with your files. Move the executable file to the folder. Additionall, create a subfolder named `data` and move all the downloaded .csv files there. In the `Konsole`, navigate to the folder with the executable file and the `data` folder using the `cd` (change directory) command:
Open up a `Konsole` (search for `Konsole` in the menu). Create a new folder where you are going to work with your files. Move the executable file to the folder. Additionally, create a subfolder named `data` and move all the downloaded .csv files there. In the `Konsole`, navigate to the folder with the executable file and the `data` folder using the `cd` (change directory) command:

![navigate to the given directory](../docs/imgs/data-parser/dir-navigate.png)

4 changes: 2 additions & 2 deletions data-parser/main.py
@@ -52,7 +52,7 @@ def flatten(xs):
try:
    submissions = (pl.read_csv(join('./data/', submission_files[0]))
                   # remove outdated format
                   .filter(pl.col('created_at') >= '2023-03-03'))
                   .filter(pl.col('created_at') >= '2023-05-22'))
except OSError as error:
    print(error)

@@ -190,4 +190,4 @@ def flatten(xs):

dt = datetime.now().strftime('%d-%m-%Y %H:%M:%S')
filename = f'./parsed-outputs/Submissions {dt}.csv'
user_details.write_csv(filename, has_header=True, quote='"', null_value='', separator=';')
user_details.write_csv(filename, has_header=True, quote='"', null_value='', separator=';')
Binary file modified docs/imgs/data-parser/Download-files.png
Binary file modified docs/imgs/data-parser/control-flow.png
Binary file modified docs/imgs/data-parser/dir-navigate.png
2 changes: 2 additions & 0 deletions docs/usermanual.md
@@ -84,6 +84,8 @@ Be careful with the no whitespaces between `"` (double-quote) and the preceding

By clicking the "duplicate item" button, a new survey item will be inserted below containing the same options and of the same answer-type. Once the question label is defined, the answer-selector will appear. You can freely switch between answer-types that contain options without losing your list of options. Choosing an answer type that does not contain options (text, number, date) will clear the list of options.

`multiple-choice` options can also be labelled the same way as questions (`option_label ; arbitrary option text`), see the [parser documentation](../data-parser/README.md#multiple-choice-questions).

<p>&nbsp;</p>

### Making questions render conditionally