@RomainGoussault @arthurPignet what do you think of the following:

- an `Experiment` object, which would essentially contain a list of `Scenario` instances (let's call it `scenarios_list` here)
- a `run()` method which loops over the `scenarios_list` and runs them one by one (or in parallel if someone knows how to parallelize computations... see Speed up computation with parallelization #49 if you have ideas about that)
- as currently done in `main.py`, it would add one level in the folders used to gather results, corresponding to the `Experiment` instance (e.g. `./experiments/my_experiment/scenario_n/...`)
- a `nb_repeats` parameter, as @RomainGoussault you had introduced
- a `name`
- a pre-validation of all scenarios, as we have in `main.py`
- a results file with all scenarios' results, as we have in `main.py`
- a notebook that offers default tables and graphs on the results

It would enable refactoring `main.py` properly, to be much closer to the way the library is supposed to be used in a notebook (for an individual scenario or a series of scenarios gathered in an `Experiment`).
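To make the proposal concrete, here is a minimal sketch of what such an `Experiment` object could look like. Everything here is an assumption based on the bullet points above — `scenarios_list`, `name`, `nb_repeats`, the folder layout, and the `validate()`/`run()` methods are hypothetical names, not the library's actual API:

```python
from pathlib import Path


class Experiment:
    """Hypothetical sketch of the proposed Experiment object (not the actual API)."""

    def __init__(self, scenarios_list, name="my_experiment", nb_repeats=1):
        self.scenarios_list = scenarios_list
        self.name = name
        self.nb_repeats = nb_repeats
        # One extra folder level per Experiment, e.g. ./experiments/my_experiment/...
        self.experiment_path = Path("./experiments") / self.name

    def validate(self):
        # Pre-validate every scenario before running any of them,
        # as main.py does today
        for scenario in self.scenarios_list:
            scenario.validate()

    def run(self):
        self.validate()
        results = []
        for repeat in range(self.nb_repeats):
            for i, scenario in enumerate(self.scenarios_list):
                # Each run gets its own sub-folder of the experiment folder
                save_path = self.experiment_path / f"scenario_{i}_repeat_{repeat}"
                results.append(scenario.run(save_path))
        return results  # could also be dumped into a single results file
```

The results file and the default notebook could then simply consume the list (or file) produced by `run()`.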
I think it is a great idea; benchmarking many methods will become easier.
For the pre-validation of scenarios, maybe it is time to move the split of the data between the partners into the scenario's init? The process would then be: init, validate, run.

This object could also be the place to implement methods to compare scenarios directly.
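As a rough illustration of that init → validate → run lifecycle, with the data split moved into the constructor — all names here, `amounts_per_partner` included, are hypothetical, not the library's real signatures:

```python
import numpy as np


class Scenario:
    """Sketch only: the split between partners happens in __init__,
    so validate() can inspect the actual per-partner datasets."""

    def __init__(self, dataset, amounts_per_partner):
        self.dataset = np.asarray(dataset)
        # Split the data between the partners at construction time,
        # at the cumulative proportions given by amounts_per_partner
        boundaries = np.cumsum(amounts_per_partner)[:-1] * len(self.dataset)
        self.partners_datasets = np.split(self.dataset, boundaries.round().astype(int))

    def validate(self):
        # Pre-validation can now check the concrete splits, not just the config
        if any(len(d) == 0 for d in self.partners_datasets):
            raise ValueError("A partner ended up with an empty dataset")

    def run(self):
        self.validate()
        return [len(d) for d in self.partners_datasets]
```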
Regarding parallelizing scenario computations, I already made a project using multiprocessing, and I see two issues.

The first issue is the handling of datasets, as each process needs its own memory space. Computing two scenarios at the same time will require twice the memory, so we would quickly run out of space. It could maybe work with a shared dataset, but honestly I don't know whether the partners deep-copy the dataset... Shared memory would imply that we only run, at the same time, scenarios that use the same dataset. In fact, the best thing to share may be the whole scenario, dataset and partners included (thus, we would parallelize the mpl runs, or the repetitions).
The second issue is parallelization on GPU. I have never done that, and I am wondering how it works.
Actually, maybe going one step further: if we implement this, do we still need to keep the old way of launching the script with a config file? What is its benefit compared to what you propose? I don't see any.