Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PyTorch module #2

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

PyTorch module #2

wants to merge 5 commits into from

Conversation

bsircelj
Copy link

Two major updates.

  • Selecting "torch" as a model will build predefined wrapper class for PyTorch linear ANN model which behaves as other sklearn models with fit and predict functions.
  • Time offset between now includes other possibilities than hours.

Copy link
Collaborator

@klemenkenda klemenkenda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great pull request making the code much more consistent. Some minor changes are requested.

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
| **retrain_period**| integer | None | A number of received samples after which the model will be re-trained. This is an optional parameter. If it is not specified no re-training will be done.|
| **samples_for_retrain**| integer | None | A number of samples that will be used for re-training. If retrain_period is not specified this parameter will be ignored. This is an optional parameter. If it is not specified (and retrain_period is) the re-train will be done on all samples received since the component was started.|
| **time_offset**| string | H | [String alias](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases) to define the data time offsets. The aliases used in training and topic names are lowercase for backwards compatibility.|
| **learning_rate**| float| 4E-5 | Learning rate for the torch model.|
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate PyTorch specific configuration parameters.

README.md Outdated
| **evaluation_split_point**| float | 0.8 | Define training and testing splitting point in the dataset, for model evaluation during learning phase (fit takes twice as long time).|
| **retrain_period**| integer | None | A number of received samples after which the model will be re-trained. This is an optional parameter. If it is not specified no re-training will be done.|
| **samples_for_retrain**| integer | None | A number of samples that will be used for re-training. If retrain_period is not specified this parameter will be ignored. This is an optional parameter. If it is not specified (and retrain_period is) the re-train will be done on all samples received since the component was started.|
| **time_offset**| string | H | [String alias](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases) to define the data time offsets. The aliases used in training and topic names are lowercase for backwards compatibility.|
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not very clear to me since it is new, does it affect ScikitLearn models?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just the unit for prediction_horizons? If so - move these descriptions closer together.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It is as you said a unit for prediction_horizons. I will also write about time_offset in prediction_horizons and move them together.

The sk-learn models can now also use different time offsets but if nothing is specified it defaults to hours for backwards compatibility.

@@ -94,7 +102,7 @@ Alternetively, process managers like `PM2` or `pman` would be a better fit for t

## Requirements

* Python 3.6+
* Python 3.9+
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 3.9?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3.9 is the version which I have installed and the version on which the code was tested. The sk-learn code should work on both versions as I didn't change that but I can't guarantee for any of new pip packages used. Do I find compatible versions of packages for 3.6 python and update this also along with requirements.txt?

"""

def __init__(self, algorithm, sensor, prediction_horizon, evaluation_periode, error_metrics, split_point,
retrain_period = None, samples_for_retrain = None, retrain_file_location = None):
def rmse(self, true, pred):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a file called regression_metrics.py. We should include this in there?

src/lib/predictive_model.py Outdated Show resolved Hide resolved
#print "Loaded model from", filename
# print "Loaded model from", filename

class TorchNetwork:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we put this in a separate file?

src/main.py Outdated
port= 3001
path= "/ping?id=5&secret=b9347c25aba4d3ba6e8f61d05fd1c011"
port = 3001
path = "/ping?id=5&secret=b9347c25aba4d3ba6e8f61d05fd1c011"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we put secret in some config file?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know the purpose of the watchdog in this module but I moved all the hardcoded parameters to the config file.

src/main.py Outdated
timestamp = rec['timestamp']
ftr_vector = rec['ftr_vector']
measurement = ftr_vector[0] # first feature is the target measurement
# try:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try-catch was there for a reason ... probably because of potential crash due to misformatted messages. Please add it back.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that was an oversight. I wanted full error stack while debugging so I just put the try block in comments and forgot to uncomment after will fix it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants