This transformation job example takes the CSV files as an input and merge them together.
Operating mode: Data item (--in-directory
)
Parameters:
[
{
"name": "CSV files",
"type": "string",
"param": "files",
"value": "",
"help": "CSV files to merge, separated by coma"
},
{
"name": "Merge key",
"type": "string",
"param": "key",
"value": "",
"help": "CSV files to merge, separated by coma"
},
{
"name": "Keep all columns?",
"value": false,
"type": "boolean",
"param": "keep_all",
"help": "Keep all columns"
},
{
"name": "Columns",
"value": "x,y,z",
"type": "string",
"param": "columns",
"help": "Columns to keep, separated by comas",
"showIf": {
"parameter": "keep_all",
"operator": "eq",
"value": "false"
}
},
{
"name": "Join method",
"type": "select",
"valid": [
"outer",
"inner",
"left",
"right"
],
"param": "join",
"value": "outer",
"help": "How to join the files"
},
{
"name": "Define custom filename",
"value": false,
"type": "boolean",
"param": "rename",
"help": "By default, the first word of the folder will be used"
},
{
"name": "filename",
"value": "merged",
"type": "string",
"param": "filename",
"help": "filename without extension",
"showIf": {
"parameter": "rename",
"operator": "eq",
"value": "true"
}
}
]
The dataset used to test this transformation is accessible here: https://www.kaggle.com/datasets/luisomoreau/activity-detection.
Install the dependencies:
pip3 install -r requirement.txt
Run the script:
python transform.py --in-directory ../dataset/Cycling-2023-09-14_06-33-47 --files Accelerometer.csv,Gyroscope.csv --out-directory output --key time --join outer --rename False --keep_all false --columns x,y,z