
swift-coreml-transformers's Introduction

This repo is not actively maintained and has been archived. For an in-development replacement, please head over to swift-transformers!

Swift Core ML implementations of Transformers: GPT-2, DistilGPT-2, BERT, DistilBERT, more coming soon!

This repository contains:

  • For BERT and DistilBERT:
    • Pretrained Google BERT and Hugging Face DistilBERT models fine-tuned for question answering on the SQuAD dataset.
    • Swift implementations of the BERT tokenizer (BasicTokenizer and WordpieceTokenizer) and SQuAD dataset parsing utilities.
    • A neat demo question answering app.
  • For GPT-2 and DistilGPT-2:
    • A conversion script from PyTorch-trained GPT-2 models (see our transformers repo) to Core ML models.
    • The GPT-2 generation model itself, including decoding strategies (greedy and TopK are currently implemented) and the GPT-2 byte-pair encoder and decoder.
    • A neat demo app showcasing on-device text generation.
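The two decoding strategies named above can be illustrated with a minimal Python sketch. It is not the repo's Swift implementation: the function names, the example logits, and the `rng` parameter are assumptions made for illustration, and `k` is assumed to be at least 1.

```python
import math
import random

def greedy(logits):
    """Return the index of the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])

def top_k(logits, k, rng=random):
    """Sample a token index from the k highest-scoring tokens (k >= 1)."""
    # Keep the k best (index, logit) pairs.
    best = sorted(enumerate(logits), key=lambda p: p[1], reverse=True)[:k]
    # Softmax over the kept logits only, shifted by the max for stability.
    peak = best[0][1]
    weights = [math.exp(score - peak) for _, score in best]
    # random.choices draws one index proportionally to the weights.
    return rng.choices([i for i, _ in best], weights=weights, k=1)[0]

logits = [0.1, 2.5, -1.0, 1.7, 0.3]  # hypothetical scores for a 5-token vocabulary
print(greedy(logits))
print(top_k(logits, k=2, rng=random.Random(0)))
```

Greedy decoding always picks the same token for the same logits, while top-k trades some determinism for variety by sampling among the k most likely candidates.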

🦄 GPT-2 and DistilGPT-2

Unleash the full power of text generation with GPT-2 on device!!

[Image: demo]

🐸 BERT and DistilBERT

The BERTSQUADFP16 Core ML model was packaged by Apple and is linked from the main ML models page. It was demoed at WWDC 2019 as part of the Core ML 3 launch.

The DistilBERT Core ML models were converted from 🤗/transformers exports using the scripts in this repo.
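Both BERT and DistilBERT rely on WordPiece tokenization, which the repo implements in Swift as WordpieceTokenizer. As a rough illustration of the general greedy longest-match-first idea, here is a minimal Python sketch. The tiny vocabulary, the function name, and the default arguments are hypothetical, and the repo's Swift code may differ in details.

```python
def wordpiece(word, vocab, unk="[UNK]", prefix="##"):
    """Split one word into the longest vocabulary pieces, left to right."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Pieces that continue a word carry the continuation prefix.
                candidate = prefix + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece fits at this position: the whole word is unknown.
            return [unk]
        pieces.append(match)
        start = end
    return pieces

vocab = {"un", "##aff", "##able", "play", "##ing"}  # toy vocabulary
print(wordpiece("unaffable", vocab))  # ['un', '##aff', '##able']
print(wordpiece("playing", vocab))    # ['play', '##ing']
print(wordpiece("xyz", vocab))        # ['[UNK]']
```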

[Image: Core ML 3]

🦄 Demo Time 🔥

[Image: demo]

Apple demo at WWDC 2019

[Image: WWDC demo]

full video here

BERT Architecture (wwdc slide)

[Image: BERT architecture]

Notes

We use git-lfs to store large model files, and it is required to obtain some of the files the app needs to run. See how to install git-lfs on the installation page.
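When a repository is cloned or downloaded without git-lfs, large files are typically left as small text pointers instead of the real binaries. A minimal Python sketch for spotting such files, assuming the standard git-lfs pointer header; the helper names and the default `*.mlmodel` pattern are illustrative choices, not part of this repo:

```python
from pathlib import Path

# First line of a git-lfs pointer file, per the git-lfs pointer format.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"

def is_lfs_pointer(path):
    """Return True if the file looks like an un-fetched git-lfs pointer."""
    with open(path, "rb") as handle:
        return handle.read(len(LFS_HEADER)) == LFS_HEADER

def unfetched_models(folder, pattern="*.mlmodel"):
    """List model files under folder that are still git-lfs pointers."""
    return sorted(p.name for p in Path(folder).rglob(pattern) if is_lfs_pointer(p))
```

Calling `unfetched_models("Resources")` from the repository root would list any model files that still need to be fetched, for example with `git lfs pull`.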

swift-coreml-transformers's People

Contributors

bilal2vec, hollance, julien-c, lysandrejik, revanthkausikan, vaibhavs10


swift-coreml-transformers's Issues

having trouble converting opus-mt model into mlmodel format

Hi,

I am working on a school project about translation on iOS using Core ML. The model I am using is opus-mt-en-zh. I've already tested it in a Python environment with this code:

from transformers import AutoTokenizer, AutoModelWithLMHead
import torch
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-zh")
model = AutoModelWithLMHead.from_pretrained("Helsinki-NLP/opus-mt-en-zh", torchscript=True)
test_case = ["My name is Wolfgang and I live in Berlin", "Hello world!", "This is interesting."]
encoded = tokenizer.prepare_seq2seq_batch(test_case, return_tensors='pt')
translated = model.generate(**encoded)
tokenizer.batch_decode(translated, skip_special_tokens=True)
# Output: ['我叫沃尔夫冈 我住在柏林', '哈罗,世界好!', '这很有趣。']

I want to recreate the same process on iOS devices, as you did for question answering and text generation. However, I am having trouble converting the model to the mlmodel format. According to the coremltools guide, PyTorch conversion requires an "inputs" argument. Since the input size varies with sentence length, I am not sure how to specify it.
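One common way to handle variable-length sentences with a fixed-shape model input is to truncate or pad every sequence to a single length and pass a mask alongside it. The sketch below shows only that idea in plain Python. The helper name, the example token ids, and `pad_id=0` are placeholders (the real padding id depends on the tokenizer), and whether this suits a given seq2seq model is not something the original post settles. coremltools also documents flexible input shapes, which may be an alternative.

```python
def pad_to_length(token_ids, seq_len, pad_id=0):
    """Truncate or right-pad token_ids to exactly seq_len entries.

    Returns (ids, mask), where mask is 1 for real tokens and 0 for padding.
    """
    ids = list(token_ids[:seq_len])
    mask = [1] * len(ids)
    padding = seq_len - len(ids)
    return ids + [pad_id] * padding, mask + [0] * padding

# Hypothetical token ids for a short sentence, padded to 8 positions.
ids, mask = pad_to_length([101, 7592, 2088, 102], seq_len=8)
print(ids)   # [101, 7592, 2088, 102, 0, 0, 0, 0]
print(mask)  # [1, 1, 1, 1, 0, 0, 0, 0]
```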

In addition, once the model has been converted, do I also need to write my own tokenizer class, or is there a more convenient way? Do you happen to have an existing model, or know of any repositories that do the same thing?

Thank you in advance!

Using the "past" flag in CoreML?

I was just reading about the past flag, for caching state, and wondered whether it's possible to use that in CoreML?
Also, is this repo still active, or has CoreML 4 (and the new pytorch->CoreML conversion tools) kind of eclipsed the work being done here? I'm using your GPT2, trained on custom data, and would be curious about experimenting with XLNet, but activity here seems pretty quiet... ?
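For context, caching "past" state means keeping the attention keys and values of earlier tokens so that each generation step only computes them for the newest token. The toy Python sketch below shows that the cached and uncached paths give the same output for a single attention head. It says nothing about whether or how Core ML can express this. The identity `project` function stands in for learned projections, and all names are made up for illustration.

```python
import math

def attend(query, keys, values):
    """Scaled dot-product attention of one query over the given keys/values."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    width = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(width)]

def project(x):
    """Stand-in for the learned key/value projections (identity here)."""
    return list(x)

def full_recompute(inputs):
    """Without a cache: rebuild every key and value for each new token."""
    keys = [project(x) for x in inputs]
    values = [project(x) for x in inputs]
    return attend(inputs[-1], keys, values)

class PastCache:
    """With a cache: keep earlier keys/values and append one entry per step."""

    def __init__(self):
        self.keys = []
        self.values = []

    def step(self, x):
        self.keys.append(project(x))
        self.values.append(project(x))
        return attend(x, self.keys, self.values)

inputs = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]
cache = PastCache()
for x in inputs:
    cached = cache.step(x)
# The last cached output matches a full recomputation over all inputs.
print(all(abs(a - b) < 1e-12 for a, b in zip(cached, full_recompute(inputs))))
```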

distilbert-onnx-coreml.py Segmentation fault attempting MLM version

I've trained a version of DistilBertForMaskedLM and am trying to convert it to CoreML. The only way I adapted the script, so far, was to remove the custom_conversion_functions={"Softmax": _convert_softmax} option in the convert() function, since I just want the predictions out, but it exits with a Segmentation Fault. The last few lines of the trace appear to just indicate the completion of the onnx export:

%1107 : Float(1, 256, 512) = onnx::Add(%1106, %vocab_layer_norm.bias) # /home/james/anaconda3/envs/torch/lib/python3.7/site-packages/torch/nn/functional.py:1696:0
  %1108 : Float(512, 219) = onnx::Transpose[perm=[1, 0]](%distilbert.embeddings.word_embeddings.weight) # /home/james/anaconda3/envs/torch/lib/python3.7/site-packages/torch/nn/functional.py:1372:0
  %1109 : Float(1, 256, 219) = onnx::MatMul(%1107, %1108) # /home/james/anaconda3/envs/torch/lib/python3.7/site-packages/torch/nn/functional.py:1372:0
  %output_weights : Float(1, 256, 219) = onnx::Add(%1109, %vocab_projector.bias) # /home/james/anaconda3/envs/torch/lib/python3.7/site-packages/torch/nn/functional.py:1374:0
  return (%output_weights)

Segmentation fault (core dumped)

I'm guessing that maybe the final layers of DistilBertForMaskedLM need custom functions (vocab_transform, vocab_layer_norm, and vocab_projector)?

Any help/tips greatly appreciated.

UPDATE: Looking into DistilBertForMaskedLM, I decided to try just exporting the underlying model using model = model.distilbert, but I get the same Segmentation Fault.

UPDATE 2: Oh, gulp... Wait, I'm running this on my Ubuntu machine (where I do the training). Probably the issue... (Confirmed.)

GPT-2 low quality responses

I'm trying to develop an iOS app that uses your distilgpt2-64-6.mlmodel, but I'm getting strange answers to my questions.
I configured the model the same way as in the attached ViewController: strategy: .topK(40) and nTokens: 50.
I'm attaching some screenshots of my conversation with the model: my question is at the top (You) and the model's answer (Device) is right below.
What could be causing this behaviour?

IMG_1538
IMG_1537

Unable to deserialize object

I run Xcode 13.2.1. When I build and run, I get the following warnings, which result in errors.

coremlc: warning: unable to read document: /Users/name/Downloads/swift-coreml-transformers-master/Resources/gpt2-64-12.mlmodel
detail: validator error: unable to deserialize object
coremlc: error: unable to read document: /Users/name/Downloads/swift-coreml-transformers-master/Resources/gpt2-64-12.mlmodel
detail: validator error: unable to deserialize object

unable to read document: /Users/name/Downloads/swift-coreml-transformers-master/Resources/gpt2-64-12.mlmodel
unable to read document: /Users/name/Downloads/swift-coreml-transformers-master/Resources/distilgpt2-64-6.mlmodel

A few questions.

Hi Julien,
First of all, let me say thank you.
I'm really new to ML, I'm trying to piece together a lot of information right now, and I'm a little confused.
If I want to fine-tune a model on my own dataset and then use that model in an app like you're doing here, what's the way to achieve it?
If I got it right, I could write the model in a Colab notebook using your transformers library, train it, export it (how?), convert it (with one of the scripts in the model_generation folder?), and use it in Core ML?
Thanks again, and sorry for this very noob question.
Vincenzo

CoreML documents can not be deserialized

I freshly cloned the project and I seem to have a problem with the Core ML models. Whenever I try to open any of them I get an error: "validation error: unable to deserialize object". Other models I downloaded open without a problem. Is something broken with the files?

Is Text2Text going to be supported?

I'm working on a project using T5, and it would be amazing to do everything on device without a need for a server and API.

Thanks for all these amazing projects!

Better Documentation regarding app setup

Recently came across this project and found it fascinating! Is there a plan to have more direct and detailed documentation regarding setup of the app, to make getting the app up and running as frictionless as possible? I would be more than happy to contribute to such documentation.

Error when building the gpt2 app

The BERT app works fine, but it looks like there is something wrong with the .mlmodel files for gpt2.

Xcode is giving me this error when building the gpt2 app:

Showing All Issues
CoreMLModelCompile /Users/[user]/Library/Developer/Xcode/DerivedData/CoreMLBert-dgdnlbjylimukwhbaciwiwfltdzp/Build/Products/Debug-iphoneos/CoreMLGPT2.app/ /Users/[user]/Desktop/swift-coreml-transformers/Resources/gpt2-512.mlmodel (in target 'CoreMLGPT2' from project 'CoreMLBert')
    cd /Users/[user]/Desktop/swift-coreml-transformers
    /Applications/Xcode.app/Contents/Developer/usr/bin/coremlc compile /Users/[user]/Desktop/swift-coreml-transformers/Resources/gpt2-512.mlmodel /Users/[user]/Library/Developer/Xcode/DerivedData/CoreMLBert-dgdnlbjylimukwhbaciwiwfltdzp/Build/Products/Debug-iphoneos/CoreMLGPT2.app/ --output-partial-info-plist /Users/[user]/Library/Developer/Xcode/DerivedData/CoreMLBert-dgdnlbjylimukwhbaciwiwfltdzp/Build/Intermediates.noindex/CoreMLBert.build/Debug-iphoneos/CoreMLGPT2.build/gpt2-512-CoreMLPartialInfo.plist

coremlc: Error: Error reading protobuf spec. validator error: Bias layer '0_block_attn_afterbias' cannot be 2 dimensional. Must be 1D or 3D.

Command CoreMLModelCompile failed with a nonzero exit code

iOS version: 13.1 public beta 3
Xcode version: 11

coreml model based on transformer for translation purpose

Hi,

I am wondering if it is possible for you to make a transformer-based model for translation purposes in Core ML for this repository? I had a trained PyTorch model but failed to convert it to ONNX. Are there any other existing translation models out there that can be easily converted to .mlmodel format?

Thanks,
Yufan

COREML BERT Crashing on long text

For documents with many words, BERT crashes with the following error:
Fatal error: 'try!' expression unexpectedly raised an error: App.TokenizerError.tooLong("Token indices sequence length is longer than the specified maximum\nsequence length for this BERT model (784 > 512. Running this\nsequence through BERT will result in indexing errors\".format(len(ids), self.max_len)")

How do you solve this, or is BERT only usable for paragraphs with fewer words? Can we increase maxLen to 1024 or even 2048, or would that not work?
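Standard BERT checkpoints learn position embeddings for at most 512 positions, so raising the limit alone is unlikely to work without retraining. A common workaround is to split a long document into overlapping windows and run the model on each one. The Python sketch below shows only the windowing step; the function name and the stride of 256 are illustrative assumptions, and in a question answering setup the question and special tokens would also take up part of each window.

```python
def sliding_windows(tokens, max_len, stride):
    """Split tokens into overlapping windows of at most max_len tokens.

    Consecutive windows start `stride` tokens apart, so they share
    max_len - stride tokens of context.
    """
    if not 0 < stride <= max_len:
        raise ValueError("stride must be between 1 and max_len")
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break
        start += stride
    return windows

tokens = list(range(784))  # stand-in for the 784 token ids in the error above
windows = sliding_windows(tokens, max_len=512, stride=256)
print([len(w) for w in windows])  # [512, 512, 272]
```

Each window can then be scored separately, keeping the answer with the highest score across windows.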

distilbert-onnx-coreml.py "works" for BERT, but I get "Error computing NN outputs." when predicting

Hi,

I used distilbert-onnx-coreml.py to convert a custom PyTorch BertForSequenceClassification model to CoreML. The conversion finishes without error.

However I can't use the resulting CoreML model for prediction. The following code fails:

import coremltools
import numpy as np

model = coremltools.models.MLModel("./path/to/model/model.mlmodel")

input_ids = np.zeros((1,64))
d = {}
d['input_ids'] = input_ids

predictions = model.predict(d, True)

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-29-1c38a7b07949> in <module>
----> 1 predictions = model.predict(d, True)

~/anaconda3/lib/python3.7/site-packages/coremltools/models/model.py in predict(self, data, useCPUOnly, **kwargs)
    328 
    329         if self.__proxy__:
--> 330             return self.__proxy__.predict(data, useCPUOnly)
    331         else:
    332             if _macos_version() < (10, 13):

RuntimeError: {
    NSLocalizedDescription = "Error computing NN outputs.";
}

Note, my input dim is 64:

spec.description.input

[name: "input_ids"
type {
  multiArrayType {
    shape: 1
    shape: 64
    dataType: INT32
  }
}
]

When I try to substitute my model into the DistilBERT demo app, I get the following error in Xcode when predicting:

CoreMLBert.bert_transactions_64Input
2020-01-07 10:12:58.271435+1300 CoreMLBert[1044:35882] [espresso] [Espresso::handle_ex_plan] exception=Espresso exception: "Invalid state": Cannot squeeze a dimension whose value is not 1: shape[1]=64 status=-5
2020-01-07 10:12:58.272716+1300 CoreMLBert[1044:35882] [coreml] Error computing NN outputs -5

The only hint that something might have gone wrong in the onnx->coreml conversion is a note about a deleted node, however I'm struggling to find out whether this is just a red herring:

[Core ML Pass] 1 disconnected constants nodes deleted
Translation to CoreML spec completed. Now compiling the CoreML model.
Model Compilation done.

Are there any particular layers that need custom conversion when converting BERT to Core ML? Any suggestions on further debugging?

Thanks.
