RAM-efficient Retrieval #23

Draft
wants to merge 2 commits into
base: main

Conversation

yasamanparhizkar
Collaborator

PR Type

Feature

Short Description

Reduces the RAM usage of the retrieval pipeline by:

  • Creating large matrices in mmlearn/modules/metrics/retrieval_recall.py in batches instead of all at once.
  • Using a single MetricCollection in mmlearn/tasks/zero_shot_retrieval.py for all modality pairs to avoid creating duplicate tensors.
  • Adding a progress bar to the retrieval recall@k calculation. The progress bar is useful because this calculation takes a considerable amount of time.
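To illustrate the batching idea in the first bullet, here is a minimal, dependency-free sketch. It is not the code in retrieval_recall.py: the function name `recall_at_k_batched`, its parameters, and the use of plain Python lists instead of tensors are all assumptions made for illustration. The point is only that scores are built one query batch at a time, so the full query-by-target matrix never exists in memory.

```python
import heapq
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def recall_at_k_batched(
    queries: List[Sequence[float]],
    targets: List[Sequence[float]],
    positives: List[int],
    k: int,
    batch_size: int,
) -> float:
    """Fraction of queries whose positive target ranks in the top k.

    positives[i] is the index in `targets` of the correct match for
    queries[i] (hypothetical layout, chosen for this sketch).
    """
    if not queries:
        raise ValueError("queries must not be empty")
    hits = 0
    for start in range(0, len(queries), batch_size):
        batch = queries[start:start + batch_size]
        # Only a (len(batch) x len(targets)) block of scores is alive here;
        # it is released before the next batch is scored.
        scores = [[dot(q, t) for t in targets] for q in batch]
        for offset, row in enumerate(scores):
            top_k = heapq.nlargest(k, range(len(row)), key=row.__getitem__)
            if positives[start + offset] in top_k:
                hits += 1
    return hits / len(queries)
```

The result does not depend on `batch_size`; a smaller batch lowers peak memory at the cost of more loop iterations, which is the trade-off the TODOs in this PR ask to measure.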

TODOs:

  • This RAM efficiency might come at the cost of longer runtimes; more investigation is needed to determine whether that is really the case.
  • This code directly changes functions in retrieval_recall.py and zero_shot_retrieval.py without providing an option to use the previous implementation. This option needs to be added.
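One way to approach the runtime question in the first TODO is a small timing harness that runs both code paths on the same inputs. The sketch below is an assumption, not part of this PR: `compare_runtimes` and `time_call` are hypothetical helpers, and the callables passed in would wrap the previous and the batched implementation once both are selectable.

```python
import time
from typing import Callable, Dict


def time_call(fn: Callable[[], object]) -> float:
    """Wall-clock seconds taken by a single call to fn."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def compare_runtimes(
    impls: Dict[str, Callable[[], object]], repeats: int = 3
) -> Dict[str, float]:
    """Best-of-`repeats` wall-clock time in seconds for each implementation."""
    return {
        name: min(time_call(fn) for _ in range(repeats))
        for name, fn in impls.items()
    }
```

For example, `compare_runtimes({"previous": run_previous, "batched": run_batched})` would return one timing per name, where `run_previous` and `run_batched` are placeholder zero-argument callables.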

Tests Added

You can run retrieval with:

mmlearn_run 'hydra.searchpath=[pkg://projects.med_benchmarking.configs]' +experiment=baseline experiment_name=test_eval job_type=eval [email protected]=ROCO datasets.test.split=test +datasets/[email protected]_fn.batch_processors.text=HFCLIPTokenizer +datasets/[email protected]=med_clip_vision_transform datasets.test.transform.job_type=eval dataloader.test.batch_size=32 dataloader.test.num_workers=4

However, W&B does not log RAM usage during the recall@k calculation. You need to add logging lines manually and compare the RAM usage of this implementation against the previous one.
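The manual logging mentioned above could look like the following sketch, which uses only the standard library. The names `peak_rss_mib` and `log_peak_ram` and the logger name are hypothetical. The sketch assumes a Unix system (the `resource` module is not available on Windows) and reports host RAM only, not GPU memory.

```python
import logging
import resource
import sys
from contextlib import contextmanager

logger = logging.getLogger("ram_usage")  # hypothetical logger name


def peak_rss_mib() -> float:
    """Peak resident set size of this process so far, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


@contextmanager
def log_peak_ram(label: str):
    """Log the process's peak RSS before and after the wrapped block."""
    before = peak_rss_mib()
    try:
        yield
    finally:
        after = peak_rss_mib()
        logger.info(
            "%s: peak RSS %.1f MiB (was %.1f MiB before)", label, after, before
        )
```

Wrapping the recall@k computation in `with log_peak_ram("recall@k"):` would log one line per run. Because peak RSS is a high-water mark for the whole process and never decreases, each implementation should be measured in a separate process for the comparison to be meaningful.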

@yasamanparhizkar yasamanparhizkar marked this pull request as draft October 5, 2024 01:50
@yasamanparhizkar yasamanparhizkar self-assigned this Oct 5, 2024