Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
adding fixes so transducer can work again #247
base: master
Are you sure you want to change the base?
adding fixes so transducer can work again #247
Changes from 3 commits
a54518b
6955bfb
9e6cded
7946077
a9a91ca
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why
Any
and notstr
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically the edit actions can accept any hashable symbol. So strings, ints, even tuples. So any is a better representation of its coverage given that maxwell is also symbol agnostic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we start an issue to document exactly what's going on here?
encode_actions
converts vocab intoAction
s and stored them in a separate vocabulary, right?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it nees to track all potential edit actions for a given symbol.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, but can you remove the hard line breaks here? That would cause problems with tools like Sphinx that can parse args lists in docstrings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be the best way to preserve formatting then? I want to have something along the lines of a case statement for readibility
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't within the arglist, so move it out of the arglist. Something like this:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a comment here or in the README about what behavior a 'dummy' expert entails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed dummy expert. Not important for this go around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a more general comment here before this for the unfamiliar user: this basically converts the dataset into a format that is usable by the Expert -- is that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We just want the raw symbols string. We don't want to worry about BOS and EOS being encoded by maxwell.
What comment would make sense to you? (I'm too familiar for a good user friendly doc string.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are exceptions elsewhere, why is this an assertion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is more of a sanity check thing to me. Makes more sense to write a single assert line than defining an error class, doing a coinditional check, then raising. Just an economy thing I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I understanding correctly that in order to use an existing sed, the user would specify
0
orNone
fororacle_em_epochs
? Avoiding rerunning em every time is a great feature, but I wonder if we can think through a more intuitive user interface for it, this feels a bit buried. Maybe you've already put some thought into this though so please let me know if you think this is already a good interface.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh sorry I see the comment in the method header. I missed it before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just added a bool to the train script that's triggered if a sed file already exists. I think this is a bit cleaner. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the "two steps" referred to here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's from Adam's PR, no comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, let's remove it then since we don't know what it means.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is copied from #233
iirc I meant that looping and stacking predictions, and calling
util.pad_tensor_after_eos
may do some redundant things that could be cleaned up at some point.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So should we leave the comment to clean up later or just remove now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So now we orchestrate the vocab sizes once in the trainer, which already has a handle on the initialized expert? This is much nicer than w/e I was trying to do before.
It still somehow feels clunky I think, but that is an effect of the single embeddings matrix updates not easily aligning to the expert.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it's still a bit clunky but at least the clunkiness is occuring in the train script (which is localized clumsiness). Probably a good next issue is to break up the train script some more so it's less of a monolith.