
Add an Experiment object to gather scenarios and run them in sequence #273

Closed
bowni opened this issue Nov 6, 2020 · 3 comments · Fixed by #275

Assignees: bowni
Labels: enhancement (New feature or request)

Comments

@bowni (Member) commented Nov 6, 2020

@RomainGoussault @arthurPignet what do you think of the following:

  • an Experiment object, which would essentially contain a list of Scenario instances (let's call it scenarios_list here); a rough sketch is given after this list
  • a run() method which loops over the scenarios_list and runs them one by one (or in parallel, if someone knows how to parallelize computations... see Speed up computation with parallelization #49 if you have ideas about that)
  • as currently done in main.py, it would add one level in the folder structure to gather results, corresponding to the Experiment instance (e.g. ./experiments/my_experiment/scenario_n/...)
  • additionally we could have:
    • a nb_repeats parameter, as introduced by @RomainGoussault
    • a name
    • a pre-validation of all scenarios, as we have in main.py
    • a results file gathering all scenarios' results, as we have in main.py
    • a notebook offering default tables and graphs on the results

It would enable us to refactor main.py properly, bringing it much closer to the way the library is supposed to be used in a notebook (for an individual scenario or a series of scenarios gathered in an Experiment).
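
To make this concrete, here is a minimal sketch of what such an object could look like. This is a sketch only: the Scenario methods and attributes used here (validate(), run(), save_folder) are assumptions based on this discussion, not the actual API.

```python
import os


class Experiment:
    """Gathers a list of Scenario instances and runs them in sequence."""

    def __init__(self, name, scenarios_list, nb_repeats=1):
        self.name = name
        self.scenarios_list = scenarios_list
        self.nb_repeats = nb_repeats
        # One extra folder level per Experiment, as main.py does today
        self.experiment_path = os.path.join("experiments", name)

    def validate(self):
        # Pre-validate all scenarios before running any of them
        for scenario in self.scenarios_list:
            scenario.validate()  # hypothetical Scenario method

    def run(self):
        self.validate()
        os.makedirs(self.experiment_path, exist_ok=True)
        for repeat in range(self.nb_repeats):
            for i, scenario in enumerate(self.scenarios_list):
                # e.g. ./experiments/my_experiment/scenario_0_repeat_0/...
                scenario.save_folder = os.path.join(
                    self.experiment_path, f"scenario_{i}_repeat_{repeat}"
                )
                scenario.run()
```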

@bowni added the enhancement label Nov 6, 2020
@arthurPignet (Collaborator) commented Nov 6, 2020

I think it is a great idea; benchmarking many methods will become easier.

For the pre-validation of scenarios, maybe it is time to move the split of the data between the partners into the scenario's init?
The process would then be: init, validate, run.
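
Something like this (the constructor arguments are hypothetical):

```python
scenario = Scenario(dataset, partners_count=3)  # data split between partners at init
scenario.validate()  # pre-validation, before any computation starts
scenario.run()
```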

This object could also be the place to implement methods to compare scenarios directly.

Regarding parallelizing scenario computations, I have already worked on a project using multiprocessing, and I see 2 issues (a rough sketch follows this list):

  • The first issue is the handling of datasets, as each process needs its own memory space. Computing 2 scenarios at the same time would require twice the memory, so we would quickly run out of space. It could maybe work with a shared dataset, but honestly I don't know whether the partners deep-copy the dataset...
    Shared memory would imply that we can only compute scenarios using the same dataset at the same time.
    In fact, the best thing to share may be the whole scenario, dataset and partners included (thus we would parallelize the multi-partner learning, or the repetitions).
  • The second issue is parallelization on GPU. I have never done that, and I wonder how it works.
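
For reference, a rough sketch of process-based parallelism with the standard library, assuming each Scenario is picklable. This is exactly where the memory issue bites, since each worker process gets its own copy of the scenario and its dataset (scenario.results is a hypothetical attribute):

```python
from multiprocessing import Pool


def run_scenario(scenario):
    # Runs in a worker process, on the worker's own copy of the scenario
    scenario.run()
    return scenario.results  # hypothetical attribute holding the results


def run_in_parallel(scenarios_list, max_workers=2):
    # Keep the pool small: 2 scenarios in flight means roughly
    # twice the dataset held in memory at once
    with Pool(processes=max_workers) as pool:
        return pool.map(run_scenario, scenarios_list)
```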

@RomainGoussault (Member) commented:

I totally agree with the idea.

Actually, maybe going one step further: if we implement that, do we still need to keep the old way of launching scripts with a config file? What is the benefit of it compared to what you propose? I don't see any.

@bowni (Member, Author) commented Nov 6, 2020

The benefit is the mechanism that takes the product of parameter values to define the list of scenarios.
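
For illustration, that mechanism is essentially a cartesian product over parameter lists, and something equivalent could be offered programmatically too (a sketch; the parameter names are hypothetical):

```python
from itertools import product

# A config-file-style grid of parameter values
grid = {
    "partners_count": [2, 3],
    "epoch_count": [10, 20],
}

# Expand the grid into one Scenario per combination (4 here)
keys = list(grid)
scenarios_list = [
    Scenario(**dict(zip(keys, values)))
    for values in product(*grid.values())
]
```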

@bowni self-assigned this Nov 6, 2020
@bowni mentioned this issue Nov 6, 2020