-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
M3SA - Multi-Meta-Model Simulation Analyzer #251
M3SA - Multi-Meta-Model Simulation Analyzer #251
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are several things that need to be updated for this PR to be accepted. I have divided this review into two parts. First, I will discuss things that need to be updated for me to accept this PR. Second, I will discuss some things I strongly suggest improving, but will not stop the PR from being accepted. I do however expect you to take a look at them, especially the easy to fix comments. I have also left many comments in the files.
Mandatory
- Rename opendc-m3sa to opendc-experiment-m3sa so it adheres to the overall naming scheme of OpenDC.
- [RN] Done
- There are multiple files that are changed in this PR that should not have been touched at all. For instance, all the package.json and package-lock.json files should not be changed. This is also why the docker is not building at the moment.
- [RN] Done
- ExperimentCli, and Scenarios should not be changed for the functionalities m3sa currently brings. m3sa should have its own Cli that can be executed separately. m3sa should create Scenarios to run, and execute them using ScenarioRunner.
- [RN] Done - not changed anymore. M3SACli and ExperimentCli are now, perhaps temporary, two separate entities.
- It should be much clearer how m3sa currently runs. I have been going through the Python code, and it is not clear to me what the order of execution is, where the actual simulation is being executed etc.
- [RN] Not clear to me what is the mandatory part regarding (4). I'll assume it's documentation, which is (already) done -- input guideline, readme of m3sa, and integration tutorial.
- Add a requirements.txt
- [RN] Done, you can find it in the root folder of the python code
- Add a README to the m3sa experiment in which you detail how to run m3sa
- [RN] (Already) Done
Suggestions
I will leave the Python files to you. However, I do have some suggestions:
- You are executing a lot of functions in the init of your classes. For instance, MetaModel computes everything and creates all outputs. This is not common in Python, and makes it difficult to understand what is happening.
- Not everything has to happen inside a class. This is not Java, you can make separate functions.
- Please take a look at Pandas and NumPy functions. There are multiple functions you have written yourself that are already implemented by either Pandas or NumPy. These will always be faster than whatever we can write in Python.
- Adding Typing could greatly improve your Python code. Currently, users either need a good understanding of the code to understand what is expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does experiment-base implement m3sa?
As far as I understand, m3sa uses base, not the other way around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
both m3sa and experiment-base use, in one way or another, each other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In which way does experiment-base use m3sa?
Also this would be circular importing, which is not possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, Experiment.kt should not be used by m3sa, and thus not changed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's see if we have a better alternative -- how would you suggest to make this change? what code to add in which file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[TOTALK#1]
...experiments-base/src/main/kotlin/org/opendc/experiments/base/scenario/ExperimentFactories.kt
Outdated
Show resolved
Hide resolved
settings.gradle.kts
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rename the module to opendc-experiment-m3sa, like the other moduiles
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
opendc-experiment-m3sa is inaccurate and redundant. similarly to opendc-experiment-base, opendc-experiment-faas, opendc-experiment-tf20.
these should all be changed and remove experiment.
=> opendc-tf20, opendc-base, opendc-faas, opendc-m3sa.
For the sake of consistency, I will rename to opendc-experiments-m3sa. Consider that redundant and inaccurate namings diminish the codebase readability and overall quality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[TODO#4]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that this might not be a somewhat older style of naming things (and thus redundant).
However, that is something to solve in the future (if we want).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's do that in the next PR. I'll handle.
@@ -79,6 +79,7 @@ public fun getExperiment(experimentSpec: ExperimentSpec): List<Scenario> { | |||
carbonTracePath = scenarioSpec.carbonTracePath, | |||
exportModelSpec = scenarioSpec.exportModel, | |||
outputFolder = outputFolder, | |||
m3saSetup = experimentSpec.m3saSetup, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is m3saSetup ever used when running a scenario?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, m3saSetup is used in scenario-post run. triggers and links the scenario with m3sa
|
||
self.max_model_len = min([len(model.raw_sim_data) for model in self.models]) | ||
|
||
def compute_windowed_aggregation(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pandas or NumPy should have functions that do this for you.
There will always be more performant than what you can write in Python
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, I see. How to adapt this?
numeric_values = model.raw_sim_data | ||
model.processed_sim_data = self.mean_of_chunks(numeric_values, self.window_size) | ||
|
||
def generate_plot(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would this function not be called with the plot type?
It is not clear to me where this.plot_type is set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I thought the idea was to automatically create multiple plots?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, not really. this depends on the user -- the plotting process takes time as well, right? Why wasting time on plotting if the user may want to only get the data of the metamodel?
see more in the paper §4, the Meta-Model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the plot_type is set by the user -- either time_series, cumulative, cumulative_time_series.
For more details, have a look at NFRs for §3 and §4 of the paper for better clarification. alternatively, have a look at the documentation.
the 'parse_trackr' method can be called to load additional experimental details from a corresponding JSON file. | ||
""" | ||
|
||
def __init__(self, raw_sim_data, id, path): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just turn this into a DataClass
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this class seems to be purely holding data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. How would we benefit from this, @DanteNiewenhuis?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[TODO#1] read about DataClasses
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Solved!
return input_json | ||
|
||
|
||
def find_root_dir(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fairly dangerous code.
The chance of this failing is very high.
For instance, what if I have multiple README.md files in my project?
What if I run m3sa from another folder?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- if you have multiple README.md files, in the same folder (i.e., the root folder), you break the law of operating systems :))
- I agree this can be improved -- what better ways to ensure we are in the root folder do we have?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[TOTALK#3]
✅ Good: "host", "simulation_data", "cats_predictions" | ||
❌ Wrong: "host.json", "opendc/folder_x/folder_y/data" | ||
""" | ||
SIMULATION_DATA_FILE = "host" # opendc outputs in file host.parquet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file should be removed.
Given that this is the integrated version of m3sa, it can by definition not be used by other simulators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
true. will remove, nice catch.
[TODO#2]
…rror the opendc master branch
…ulti CPUs are combined into one. This is for performance and explainability. (atlarge-research#255) (#37) Co-authored-by: Dante Niewenhuis <[email protected]>
…ulti CPUs are combined into one. This is for performance and explainability. (atlarge-research#255) (#38) Co-authored-by: Dante Niewenhuis <[email protected]>
…ulti CPUs are combined into one. This is for performance and explainability. (atlarge-research#255) (#39) Co-authored-by: Dante Niewenhuis <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m3saSetup in ExperimentSpec -- non mandatory to setup, but part of an experiment, used when parsing the JSON file. this is part of the JSON file parser.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR is better than the previous version, but I still have some issues.
First, you are still changing things in the ExperimentSpecs. This should not be needed.
Second, As far as I can see, the Python files have not been changed. This means that I will not understand what is happening.
Finally, it seems like there is no pre-processing being done. As far as I understand, the multi/meta model creates multiple models from the given input. I cannot find any such pre processing in the code at the moment.
@Serializable | ||
public data class ExperimentSpec( | ||
var id: Int = -1, | ||
var name: String = "", | ||
val outputFolder: String = "output", | ||
val initialSeed: Int = 0, | ||
val runs: Int = 1, | ||
val m3saSetup: String = "", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this defined in the ExperimentSpec?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can see, this file is not needed. You can just import runExperiment from experiment-base
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to do something different here at some point, you can create it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this have to be a different file than just the ExperimentFactory?
| failureModels | List[[FailureModel](#failuremodel)] | no | empty | List of failure models to simulate various types of failures. | | ||
| exportModels | List[[ExportModel](#exportmodel)] | no | empty | Specifications for exporting data from the simulation. | | ||
| carbonTracePaths | List[string] | no | null | Paths to carbon footprint trace files. | | ||
| m3saSetup | string | no | null | Path to the cofigurator file of M3SA. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I'll revert it to the current scenario, from the master branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These versions should not be changed, Could you please update this and push again?
site/package-lock.json
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you remove this file from your push?
eb50d80
to
82ca406
Compare
Summary
M3SA - Multi-Meta-Model Simulation Analysis. Full functionality implemented and integrated into OpenDC.
Multi-Model
Meta-Model
M3SA
-a
when running OpenDC, via
ScenarioCli.kt
.
Paper
M3SA is part of the research efforts of AtLarge Research Team, towards massivizing computer systems. We conduct scientific research and follow the AtLarge Design Vision. We design, analyze, and engineer the M3SA system, towards performance, accuracy, universality, and usability.
M3SA: Exploring the Performance and Climate Impact of Datacenters by Multi-Model Simulation and Analysis
Authors: Radu Nicolae (@Radu-Nicolae), Alexandru Iosup (@AlexandruIosup), Dante Niewenhuis (@DanteNiewenhuis)
External Dependencies 🍀
N/A
Breaking API Changes⚠️
N/A
Simply specify none (N/A) if not applicable.