To save compute. Another hard issue :)
A sketch of how this could work.
Add an option in dpo_tune where, instead of using concatenated_forward, we run a plain forward pass for each, with an optional save of the logprobs.
open-instruct/open_instruct/dpo_tune.py, line 568 at commit 42c1fa3
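A minimal sketch of what the precomputation step could look like, assuming a PyTorch-style model and a dataloader with `chosen_*` / `rejected_*` fields; the helper `get_batch_logps`, the batch keys, and the function names here are illustrative assumptions, not the repo's actual API:

```python
import torch


def get_batch_logps(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Sum of per-token log-probs for each sequence, ignoring -100 labels."""
    labels = labels[:, 1:].clone()
    logits = logits[:, :-1, :]
    mask = labels != -100
    labels[labels == -100] = 0  # dummy index so gather is valid
    per_token = torch.gather(
        logits.log_softmax(-1), dim=2, index=labels.unsqueeze(2)
    ).squeeze(2)
    return (per_token * mask).sum(-1)


@torch.no_grad()
def precompute_reference_logps(ref_model, dataloader, device="cuda"):
    """Run the reference model once over the dataset and cache its log-probs."""
    ref_model.to(device).eval()
    chosen_logps, rejected_logps = [], []
    for batch in dataloader:
        # One plain forward per sequence set, instead of concatenated_forward.
        chosen_out = ref_model(
            batch["chosen_input_ids"].to(device),
            attention_mask=batch["chosen_attention_mask"].to(device),
        )
        rejected_out = ref_model(
            batch["rejected_input_ids"].to(device),
            attention_mask=batch["rejected_attention_mask"].to(device),
        )
        chosen_logps.append(
            get_batch_logps(chosen_out.logits, batch["chosen_labels"].to(device)).cpu()
        )
        rejected_logps.append(
            get_batch_logps(rejected_out.logits, batch["rejected_labels"].to(device)).cpu()
        )
    ref_model.to("cpu")  # reference model no longer needed on GPU once cached
    return torch.cat(chosen_logps), torch.cat(rejected_logps)
```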
Then you iterate over batches, compute the loss, and update the model.
Optional: logic to move one model into cuda at a time. Shouldn't be too hard.
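A sketch of the corresponding training step, reusing `get_batch_logps` from the snippet above; `dpo_loss` is the standard DPO objective, and `beta`, the optimizer handling, and the cached `ref_*_logps` batch fields are assumptions rather than the repo's actual code:

```python
import torch.nn.functional as F


def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Standard DPO: -log sigmoid(beta * (policy log-ratio - reference log-ratio))
    logits = (policy_chosen_logps - policy_rejected_logps) - (
        ref_chosen_logps - ref_rejected_logps
    )
    return -F.logsigmoid(beta * logits).mean()


def train_step(policy_model, batch, optimizer, beta=0.1, device="cuda"):
    # If the reference model were still needed here, it could be moved onto
    # the GPU only for its forward pass and back to CPU afterwards.
    chosen_out = policy_model(
        batch["chosen_input_ids"].to(device),
        attention_mask=batch["chosen_attention_mask"].to(device),
    )
    rejected_out = policy_model(
        batch["rejected_input_ids"].to(device),
        attention_mask=batch["rejected_attention_mask"].to(device),
    )
    policy_chosen = get_batch_logps(chosen_out.logits, batch["chosen_labels"].to(device))
    policy_rejected = get_batch_logps(rejected_out.logits, batch["rejected_labels"].to(device))
    loss = dpo_loss(
        policy_chosen,
        policy_rejected,
        batch["ref_chosen_logps"].to(device),    # precomputed and stored with the data
        batch["ref_rejected_logps"].to(device),
        beta=beta,
    )
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```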
Yeah, sounds about right. It's very easy to implement; I did it in EasyLM (although sharding issues mean it's broken there), but the logic should be right: https://github.com/hamishivi/EasyLM/blob/main/EasyLM/models/llama/llama_train_dpo.py#L372-L400