Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reporting restructure #254

Open
wants to merge 28 commits into
base: main
Choose a base branch
from
Open

Reporting restructure #254

wants to merge 28 commits into from

Conversation

scarere
Copy link
Collaborator

@scarere scarere commented Oct 10, 2024

PR Type

Feature

Short Description

Clickup Ticket(s): Add Base Reporter Class

This PR restructures and unifies reporting. What was previously the metrics reporter is now JsonReporter. Both the JsonReporter and WandBReporter inherit from the same base class.

Servers and clients now accept any number of reporters and send data to them frequently (During setup, at the end of each round, each epoch and each batch). It is up to the reporter classes internal logic to determine how frequently to report data.

There is no longer any default reporters. Users will not have to pass None to the wandb reporter argument of clients and servers in order to turn wandb reporting off. Additionally if users want to generate the metrics json file frequently used for pytests, they must explicitly pass a JsonReporter to their server or client instance. Creating a JsonReporter instance with all its default arguments exactly mimics the behaviour of what was previously the metrics reporter (Except some of the keys have been changed). There is no longer any need to call dump to write the metrics, this has been moved into the server/client shutdown.

I also upgraded the wandb dependency.

I think there is some work to be done in terms of deciding how to structure what metrics and what to use for the key names. It be great if there was a bit of a more predictable pattern.

Tests Added

Had to modify a few tests to reflect the modified structure of the output metrics json. Also found two issues with the tests.

The MKMMD loss tests only work if your machine does not have a GPU. My temporary fix for this was to set devide to CPU. They still run in a few seconds. If GPU acceleration is needed then someone needs to go through and debug them. Created a ticket here

Tests that made use of both freeze_time and patch could not be run individually in some instances. It seems like somehow their imports were reliant on other tests in the test suite. This makes debugging tests a bit annoying because you have to run the entire test suite just to check whether or not you fixed one specific test that was failing. I created a ticket here

Copy link
Collaborator

@jewelltaylor jewelltaylor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All in all, the changes look great! I have a few minor suggestions, feel free to reply if any of them don't make sense. I will take one more pass through once you fix the merge conflicts and have a chance to go through the PR again. Then we can get this merged in early this week

@emersodb
Copy link
Collaborator

I'm not sure why these aren't failing the pre-commit checks, but it looks like there are a few places where the MetricsReporter class still exists on this branch. Namely:

  • fl4health/clients/model_merge_client.py
  • fl4health/server/model_merge_server.py
  • fl4health/server/adaptive_constraint_servers/ditto_server.py
  • fl4health/server/adaptive_constraint_servers/mrmtl_server.py
    It also appears to be mentioned in the comments of
  • tests/smoke_tests/run_smoke_test.py

Copy link
Collaborator

@emersodb emersodb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, really like the changes. Just a few suggestions and questions associated with the various components.

fl4health/reporting/__init__.py Show resolved Hide resolved


class WandBReporter(BaseReporter):
def __init__(self, step_type: StepType | str = StepType.ROUND, **kwargs: Any) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I'm not a huge fan of relying on kwargs. I know it adds some flexibility with adding additional options and configurations without limiting what you can do. The reason is that you need a deeper understanding of the code to use them because the arguments are sort of hidden away somewhere in docs or code.

I imagine W and B uses kwargs. So I understand the use here, but if there are arguments that we know we want like "dir" or run_id, I'd rather that we make them explicit optional arguments here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I agree but in this instance I thought it'd be fine since all the kwargs go straight to the wandb.init function detailed here. I want to expose all the arguments but didn't want to unnecessarily clutter the wandbreporter init method

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think you're right. Perhaps we can just limit ourselves to the ones we absolutely expect to be there and the rest can be in kwargs. Perhaps we can also put a note in the documentation to where a user might find the args for the WandB object?

@@ -78,9 +78,12 @@ def main(config: Dict[str, Any], server_address: str) -> None:
loss_weight_patience=config["proximal_weight_patience"],
)

wandb_reporter = ServerWandBReporter.from_config(config)
wandb_reporter = WandBReporter("round", **config["reporting_config"])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see why we're doing this with the kwargs implementation below. However, I'd recommend we limit the use of freeform arguments from unrolled dictionaries as much as possible if we can. It makes it harder for a user to know what to send along.

fl4health/clients/basic_client.py Show resolved Hide resolved
fl4health/clients/evaluate_client.py Outdated Show resolved Hide resolved
for r in self.reporters:
r.initialize(id=self.server_name)

def report_centralized_eval(self, history: History, num_rounds: int) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is perhaps something that might go into the ReporterManager as well along with the other state tracking that's being done below

fl4health/server/base_server.py Outdated Show resolved Hide resolved
tests/losses/test_mkmmd_loss.py Outdated Show resolved Hide resolved
DEFAULT_TOLERANCE = 0.0005


def assert_metrics_dict(metrics_to_assert: dict[str, Any], metrics_saved: dict[str, Any]) -> list[str]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a docstring here to help orient the user on what this is doing etc?

def assert_metrics_dict(metrics_to_assert: dict[str, Any], metrics_saved: dict[str, Any]) -> list[str]:
errors = []

def _assert(value: Any, saved_value: Any) -> Optional[str]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to make this an inline function rather than popping it out of here into it's own separate function above this method?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants