having trouble converting opus-mt model into mlmodel format #22

Open
harrylyf opened this issue Apr 25, 2021 · 0 comments

Hi,

I am working on a school project on machine translation on iOS using Core ML. The model I am using is opus-mt-en-zh. I have already tested it in a Python environment with this code:

import torch
from transformers import AutoTokenizer, AutoModelWithLMHead

# Load the pretrained English-to-Chinese tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-zh")
model = AutoModelWithLMHead.from_pretrained("Helsinki-NLP/opus-mt-en-zh", torchscript=True)

# Tokenize a small batch of English sentences into PyTorch tensors
test_case = ["My name is Wolfgang and I live in Berlin", "Hello world!", "This is interesting."]
encoded = tokenizer.prepare_seq2seq_batch(test_case, return_tensors='pt')

# Generate the translations and decode the token IDs back into text
translated = model.generate(**encoded)
tokenizer.batch_decode(translated, skip_special_tokens=True)
# Output: ['我叫沃尔夫冈 我住在柏林', '哈罗,世界好!', '这很有趣。']

I want to recreate the same process on iOS devices, like you did for question answering and text generation. However, I am having trouble converting the model into mlmodel format. According to the coremltools guide, converting a PyTorch model requires an "inputs" argument that describes the input tensors. Since the input length varies from sentence to sentence, I am not sure what to specify here.
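
One common way to sidestep variable sentence lengths is to pad or truncate every tokenized sentence to a single fixed length, so that one fixed-shape example input can be described to the converter. The following is only a minimal sketch in plain Python, not a tested conversion recipe: the helper name `pad_to_fixed_length`, the maximum length, and the pad ID of 0 are all assumptions made for illustration (the real pad ID should be read from `tokenizer.pad_token_id`). coremltools also documents flexible input shapes, which this sketch does not cover.

```python
from typing import List, Tuple


def pad_to_fixed_length(
    token_ids: List[int],
    max_len: int = 16,   # hypothetical fixed sequence length
    pad_id: int = 0,     # placeholder; use tokenizer.pad_token_id in practice
) -> Tuple[List[int], List[int]]:
    """Pad or truncate token IDs to exactly max_len.

    Returns (input_ids, attention_mask), where the mask is 1 for real
    tokens and 0 for padding.
    """
    # Naive truncation: this cuts the tail, including any
    # end-of-sequence token, if the sentence is too long.
    kept = token_ids[:max_len]
    n_pad = max_len - len(kept)
    input_ids = kept + [pad_id] * n_pad
    attention_mask = [1] * len(kept) + [0] * n_pad
    return input_ids, attention_mask


# Example with made-up token IDs
ids, mask = pad_to_fixed_length([101, 7, 42], max_len=5, pad_id=0)
print(ids)
print(mask)
```

With every sentence brought to the same length, the "inputs" description only has to cover one shape, at the cost of wasted computation on padding for short sentences.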

In addition, once the model has been converted, do I also need to write my own tokenizer class, or is there a more convenient way to handle tokenization? Alternatively, do you happen to have an already converted model, or know of any repositories that do the same thing?

Thank you in advance!
