Next-gen batch runner #2321

Open · EwoutH opened this issue Sep 24, 2024 · 5 comments
@EwoutH (Member) commented Sep 24, 2024

Objective

The goal of this proposal is to redesign the Mesa batch runner into a modular, flexible system that separates the batch run process into three stages: Preparation, Running, and Processing. The focus will be on the Preparation stage, where different experimental designs can be used to generate run configurations. These configurations will be encapsulated in a dataclass that includes the model class and all relevant parameters, ensuring reusability in the Running stage.

Design Overview

  1. Preparation Stage:
    • Use a dataclass (RunConfiguration) to store the model class, run parameters, and configuration details (e.g., max_steps, data_collection_period).
    • Implement different configuration generators (e.g., full factorial, sparse grids, manual) to allow for flexible experiment designs.
  2. Running Stage:
    • The batch runner will execute all configurations using multiprocessing when necessary. It will take a list of RunConfiguration objects and execute each run independently.
    • Results will be collected during execution and processed after all runs are completed.
  3. Processing Stage:
    • Results from the batch run will be processed into a usable format (e.g., a list of dictionaries, pandas DataFrames) for further analysis.
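
For illustration, a minimal sketch of the Processing stage, assuming each run yields a flat dictionary that includes run_id and iteration (the function name is hypothetical):

import pandas as pd

def process_results(results: list[dict]) -> pd.DataFrame:
    # Turn the per-run dictionaries into one tidy DataFrame,
    # indexed by run_id and iteration for convenient groupby analysis.
    return pd.DataFrame(results).set_index(["run_id", "iteration"])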

Key Components

1. RunConfiguration Dataclass

This dataclass stores all the information required to run a single configuration of the experiment.

from dataclasses import dataclass
from typing import Any, Dict

from mesa import Model

@dataclass
class RunConfiguration:
    model_cls: type[Model]       # the Mesa model class to instantiate
    run_id: int                  # unique identifier of this run within the batch
    iteration: int               # replication index for this parameter combination
    parameters: Dict[str, Any]   # keyword arguments passed to the model's __init__
    max_steps: int               # maximum number of steps per run
    data_collection_period: int  # collect data every n steps (-1 = only at the end)

2. Configuration Generators

Provide different strategies for generating configurations:

  • Full Factorial: Generate all combinations of parameters.
  • Sparse Grid: Sample a subset of parameter space.
  • Base case: Start with a reference scenario and vary parameters from there.
  • Manual Configuration: Allow users to specify configurations explicitly.

Each generator will output a list of RunConfiguration objects.
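
For example, a minimal sketch of a full-factorial generator built on the RunConfiguration dataclass above (the function name and defaults are illustrative):

from itertools import product

def full_factorial(
    model_cls,
    parameter_values: dict,
    iterations: int = 1,
    max_steps: int = 1000,
    data_collection_period: int = -1,
) -> list[RunConfiguration]:
    # Cross all parameter value lists and repeat each combination
    # once per iteration, numbering runs sequentially.
    configurations = []
    run_id = 0
    names = list(parameter_values)
    for combination in product(*parameter_values.values()):
        parameters = dict(zip(names, combination))
        for iteration in range(iterations):
            configurations.append(
                RunConfiguration(
                    model_cls=model_cls,
                    run_id=run_id,
                    iteration=iteration,
                    parameters=dict(parameters),
                    max_steps=max_steps,
                    data_collection_period=data_collection_period,
                )
            )
            run_id += 1
    return configurations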

3. Batch Runner Class

The BatchRunner class will manage the execution of all runs using the RunConfiguration objects. It will handle multiprocessing, progress tracking, and result collection.

from multiprocessing import Pool
from typing import Any, Dict, List

class BatchRunner:
    def __init__(
        self,
        configurations: List[RunConfiguration],
        number_processes: int | None = 1,
        display_progress: bool = True,
    ):
        self.configurations = configurations
        self.number_processes = number_processes  # None = use all available cores
        self.display_progress = display_progress  # e.g., to drive a progress bar

    def run_all(self) -> List[Dict[str, Any]]:
        # Core logic: run serially for a single process, otherwise fan out over
        # a process pool. _run_single(config) executes one RunConfiguration and
        # returns its collected data (not shown here).
        if self.number_processes == 1:
            return [_run_single(config) for config in self.configurations]
        with Pool(self.number_processes) as pool:
            return pool.map(_run_single, self.configurations)
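
A hypothetical usage example combining the pieces above (MyModel is a placeholder):

configurations = full_factorial(
    MyModel,
    {"density": [0.5, 0.7], "width": [10, 20]},
    iterations=3,
)
runner = BatchRunner(configurations, number_processes=None)  # None = all cores
results = runner.run_all()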

@EwoutH (Member, Author) commented Sep 24, 2024

Experimental designs might be one of the most important new things to support. I encountered this library that might be useful:

@Corvince (Contributor) commented

Nice initiative! One thing to note is that the current batch_run function is already somewhat structured around these three stages. If you look at the code, it is composed of a "make_kwargs" function (corresponding to stage 1), a "run" function (stage 2.1), and a "collect" function (stage 2.2). Currently it just returns the collected data in a neutral format (dict), but I originally envisioned one or more further processing functions that do something useful with the result (stage 3).
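
Roughly, that decomposition looks like this (function names taken from the comment above; signatures simplified, not the actual mesa/batchrunner.py code):

def batch_run(model_cls, parameters, **kwargs):
    kwargs_list = make_kwargs(parameters)    # stage 1: build run configurations
    results = []
    for run_kwargs in kwargs_list:
        model = run(model_cls, run_kwargs)   # stage 2.1: execute one run
        results.extend(collect(model))       # stage 2.2: gather collected data
    return results                           # stage 3: currently just neutral dicts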

So I think your vision aligns nicely with the current structure. And I agree that the most important area of improvement is stage 1 and a clear "run configuration" definition.

@quaquel (Member) commented Sep 24, 2024

I like the conceptual design. I would, however, design it to be easy to extend and combine with whatever experimental design generator you want to use, rather than trying to cover all of that ourselves. The same applies to the subsequent stages.

The motivation for this is that doing large-scale computational experimentation is its own can of worms and not, in my view, the core of the MESA library. It is easy to go overboard trying to build all of this into MESA, making it less and less useful for others. To wit, last week I spoke with various people who use NetLogo and do large-scale uncertainty quantification. None of them use NetLogo's BehaviorSpace; all use other packages that interface with NetLogo via Java. So, in my view, it is more important to establish a clean API for running a single experiment on a MESA model than to design a very elaborate batch runner.

@Corvince (Contributor) commented

Agree with @quaquel, but I think this is somewhat in line with what @EwoutH was proposing, in my understanding. The RunConfiguration should be that interface. Which tools you use to generate it is up to you, but we provide some basic configuration generators.

Although maybe RunConfiguration should be split into ModelConfiguration and RunConfiguration. The former describes what an individual model should look like, and the latter how it is run. So we create a list(?) of model configurations and then pass that to the run configuration, which describes how stages 2 and 3 should be handled.
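
A sketch of that split (field names are illustrative, not a settled API):

from dataclasses import dataclass, field
from typing import Any

@dataclass
class ModelConfiguration:
    # What an individual model should look like.
    model_cls: type
    parameters: dict[str, Any] = field(default_factory=dict)

@dataclass
class RunConfiguration:
    # How the models should be run and processed (stages 2 and 3).
    model_configurations: list[ModelConfiguration]
    max_steps: int = 1000
    number_processes: int | None = 1
    data_collection_period: int = -1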

@quaquel (Member) commented Sep 24, 2024

I guess there is a distinction between the inputs used to create experiments and the individual experiments themselves. To start with the latter, this can be as simple as a dict of key-value pairs. Typically this will be passed directly to the __init__ of the model.

The other is more subtle, and I lack a good name for it. It is basically the parameter space and some density function over this space. In the simplest case, this space is bounded, the axes are orthogonal to one another (i.e., they are independent; there are no correlations), and you assume a uniform distribution over the space (so all points are equally likely). Each of these assumptions can be relaxed, but relaxing them makes your life increasingly difficult. Moreover, you have to specify how you want to sample points from this space (Monte Carlo, LHS, some factorial design, etc.) and how many points you want to sample. All this interacts in a messy way. For example, with a factorial design you normally specify the number of points on each dimension, while with a Monte Carlo sampler you specify how many points in total you want to sample.
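
One possible shape for this idea, in the simplest case described above (bounded, independent axes with a uniform density; all names are hypothetical):

from dataclasses import dataclass
import random

@dataclass
class UniformSpace:
    # Bounded, independent axes with a uniform density over the space.
    bounds: dict[str, tuple[float, float]]  # axis name -> (low, high)

    def sample_monte_carlo(self, n: int, seed: int | None = None) -> list[dict]:
        # Monte Carlo sampling: n points in total, drawn independently per axis.
        rng = random.Random(seed)
        return [
            {name: rng.uniform(lo, hi) for name, (lo, hi) in self.bounds.items()}
            for _ in range(n)
        ]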

Given all this, you end up with either a collection of experiments or an experiment generator. These you pass to the runner, which then executes them (potentially in parallel). Only this last task is properly the batch runner; the rest is the design of experiments.

An additional minor concern is that you typically want to run each experiment for multiple seeds. You can collapse the seed into the experiment, or delay it and let it be handled by the batch runner. Regardless, you of course need to track the seed of each experiment for replication purposes.
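
Collapsing the seeds into the experiments up front could look like this (a sketch; experiments is assumed to be a list of parameter dicts and n_replications the number of seeds per experiment):

experiments_with_seeds = [
    {**experiment, "seed": seed}
    for experiment in experiments
    for seed in range(n_replications)  # the seed is recorded for replication
]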
