chore(docs): update commands to use package name #537

Merged 1 commit on Nov 8, 2024
32 changes: 23 additions & 9 deletions model_training/cmd_instruction.md
@@ -1,5 +1,19 @@
# Manual Metric Collection and Training with Entrypoint

<!--toc:start-->

- [Manual Metric Collection and Training with Entrypoint](#manual-metric-collection-and-training-with-entrypoint)
- [1. Collect metrics](#1-collect-metrics)
- [1.1. By defining start time and end time](#11-by-defining-start-time-and-end-time)
- [1.2. By defining last interval from the execution time](#12-by-defining-last-interval-from-the-execution-time)
- [Output](#output)
- [2. Train models](#2-train-models)
- [3. Export models](#3-export-models)
- [3.1. Extracting collect date from benchmark file](#31-extracting-collect-date-from-benchmark-file)
- [3.2. Manually set collect-date](#32-manually-set-collect-date)

<!--toc:end-->

## 1. Collect metrics

Without benchmark/pipeline automation, Kepler metrics can be collected with the `query` function by setting `BENCHMARK`, `PROM_URL`, `COLLECT_ID`, and one of the following time options.
@@ -12,25 +26,25 @@ PROM_URL= # e.g., http://localhost:9090
COLLECT_ID= # any unique id e.g., machine name
```
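As a hypothetical illustration (these values are assumptions, not project defaults), the variables above might be filled in like this:

```shell
# Hypothetical example values; substitute your own
BENCHMARK=sample_benchmark        # any benchmark label
PROM_URL=http://localhost:9090    # Prometheus endpoint
COLLECT_ID=$(hostname)            # any unique id, e.g., machine name
echo "$BENCHMARK $COLLECT_ID"
```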

-### 1.1. by defining start time and end time
+### 1.1. By defining start time and end time

```bash
# time value setting
START_TIME= # format date +%Y-%m-%dT%H:%M:%SZ
END_TIME= # format date +%Y-%m-%dT%H:%M:%SZ

# query execution
-DATAPATH=/path/to/workspace python cmd/main.py query --benchmark $BENCHMARK --server $PROM_URL --output kepler_query --start-time $START_TIME --end-time $END_TIME --id $COLLECT_ID
+DATAPATH=/path/to/workspace kepler-model query --benchmark $BENCHMARK --server $PROM_URL --output kepler_query --start-time $START_TIME --end-time $END_TIME --id $COLLECT_ID
```
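One way to fill in these time values is with `date`; this is a sketch that assumes GNU `date` (the `-d` flag is not portable to BSD/macOS):

```shell
# Collect the last hour as an explicit window (GNU date; -d is not portable)
END_TIME=$(date -u +%Y-%m-%dT%H:%M:%SZ)
START_TIME=$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)
echo "$START_TIME -> $END_TIME"
```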

-### 1.2. by defining last interval from the execution time
+### 1.2. By defining last interval from the execution time

```bash
# time value setting
INTERVAL= # in seconds

# query execution
-DATAPATH=/path/to/workspace python cmd/main.py query --benchmark $BENCHMARK --server $PROM_URL --output kepler_query --interval $INTERVAL --id $COLLECT_ID
+DATAPATH=/path/to/workspace kepler-model query --benchmark $BENCHMARK --server $PROM_URL --output kepler_query --interval $INTERVAL --id $COLLECT_ID
```

> **Contributor:** can we also remove all references to cmd/main.py as well throughout the project (in a different PR)?
>
> **Contributor Author:** sure 💯
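For example, to query the last two hours, the interval in seconds can be computed with shell arithmetic:

```shell
# 2 hours expressed in seconds
INTERVAL=$((2 * 60 * 60))
echo "$INTERVAL"   # 7200
```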

### Output
@@ -49,7 +63,7 @@ PIPELINE_NAME= # any unique name for the pipeline (one pipeline can be accumulat

# train execution
# require COLLECT_ID from collect step
-DATAPATH=/path/to/workspace MODEL_PATH=/path/to/workspace python cmd/main.py train --pipeline-name $PIPELINE_NAME --input kepler_query --id $COLLECT_ID
+DATAPATH=/path/to/workspace MODEL_PATH=/path/to/workspace kepler-model train --pipeline-name $PIPELINE_NAME --input kepler_query --id $COLLECT_ID
```
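Before training, it can help to confirm that the collect step actually wrote its output into the workspace. This is only a sketch: the exact output filename and extension produced by `--output kepler_query` are assumptions here, so check your workspace for the actual name.

```shell
# Sketch: confirm the collect step produced output before training.
# The kepler_query* filename pattern is an assumption.
DATAPATH=/path/to/workspace
if ls "$DATAPATH"/kepler_query* >/dev/null 2>&1; then
  echo "query output found"
else
  echo "no query output; run the collect step first" >&2
fi
```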

## 3. Export models
@@ -61,18 +75,18 @@ EXPORT_PATH= # /path/to/kepler-model-db/models
PUBLISHER= # github account of publisher
```

-### 3.1. extracting collect date from benchmark file
+### 3.1. Extracting collect date from benchmark file

The benchmark file is created by the CPE operator or by the `query` function from step 1.

```bash
# export execution
# require BENCHMARK from collect step
# require PIPELINE_NAME from train step
-DATAPATH=/path/to/workspace MODEL_PATH=/path/to/workspace python cmd/main.py export --benchmark $BENCHMARK --pipeline-name $PIPELINE_NAME -o $EXPORT_PATH --publisher $PUBLISHER --zip=true
+DATAPATH=/path/to/workspace MODEL_PATH=/path/to/workspace kepler-model export --benchmark $BENCHMARK --pipeline-name $PIPELINE_NAME -o $EXPORT_PATH --publisher $PUBLISHER --zip=true
```

-### 3.2. manually set collect-date
+### 3.2. Manually set collect-date

If the data is collected by Tekton, no benchmark file is created, so you need to set the `--collect-date` parameter manually instead of `--benchmark`.

@@ -83,5 +97,5 @@ COLLECT_DATE= # collect date
# export execution
# require BENCHMARK from collect step
# require PIPELINE_NAME from train step
-DATAPATH=/path/to/workspace MODEL_PATH=/path/to/workspace python cmd/main.py export --pipeline-name $PIPELINE_NAME -o $EXPORT_PATH --publisher $PUBLISHER --zip=true --collect-date $COLLECT_DATE
+DATAPATH=/path/to/workspace MODEL_PATH=/path/to/workspace kepler-model export --pipeline-name $PIPELINE_NAME -o $EXPORT_PATH --publisher $PUBLISHER --zip=true --collect-date $COLLECT_DATE
```
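A hedged sketch for setting `COLLECT_DATE` from the current time; the exact date format expected by `--collect-date` is an assumption here, so verify it against the tool's help output before use.

```shell
# Date format is an assumption; confirm with: kepler-model export --help
COLLECT_DATE=$(date -u +%Y-%m-%d)
echo "$COLLECT_DATE"
```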