
ericfillion / happy-transformer


Happy Transformer makes it easy to fine-tune and perform inference with NLP Transformer models.

Home Page: http://happytransformer.com

License: Apache License 2.0

Language: Python 100.00%

Topics: language-models, artificial-intelligence, ai, question-answering, bert, roberta, nlp, machine-learning, text-classification, deep-learning, transformers, python, natural-language-processing

happy-transformer's Introduction


Happy Transformer

Documentation and news: happytransformer.com

Join our Discord server: Support Server

HappyTransformer

Happy Transformer makes it easy to fine-tune and perform inference with NLP Transformer models.

3.0.0

  1. DeepSpeed for training
  2. Apple's MPS for training and inference
  3. WandB to track training runs
  4. Data supplied for training is automatically split into training and evaluation portions (see the sketch below)
  5. Push models directly to Hugging Face's Model Hub

Read about the full 3.0.0 update including breaking changes here.
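
For example, a minimal fine-tuning run might look like this (a sketch; the exact training arguments and evaluation-split behaviour are configurable, so check the documentation for your task):

from happytransformer import HappyWordPrediction

happy_wp = HappyWordPrediction()  # distilbert-base-uncased by default
happy_wp.train("train.txt")       # in 3.0.0 the supplied data is split into training and evaluation portions automatically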

Tasks

Each task below supports inference; most also support training:

  • Text Generation
  • Text Classification
  • Word Prediction
  • Question Answering
  • Text-to-Text
  • Next Sentence Prediction
  • Token Classification

Quick Start

pip install happytransformer
from happytransformer import HappyWordPrediction
#--------------------------------------#
happy_wp = HappyWordPrediction()  # default uses distilbert-base-uncased
result = happy_wp.predict_mask("I think therefore I [MASK]")
print(result)  # [WordPredictionResult(token='am', score=0.10172799974679947)]
print(result[0].token)  # am

Maintainers

Tutorials

Text generation with training (GPT-Neo)

Text classification (training)

Text classification (hate speech detection)

Text classification (sentiment analysis)

Word prediction with training (DistilBERT, RoBERTa)

Top T5 Models

Grammar Correction

Fine-tune a Grammar Correction Model

happy-transformer's People

Contributors

adamcyber1, davidcoallier, dependabot[bot], ericfillion, loganroth, sjcantor, superjcd, ted537, ugokalp, ujwal-narayan, willmacd, yuchenlin


happy-transformer's Issues

Fix files failing pylint

pylint reports failures across a number of files. To enable it in the GitHub Actions workflow, we first need to fix all of the pylint errors throughout the files. After this is done, the "pylint" line in the pythonapp.yml file can be uncommented so it runs again.

Fine tuning XLNet

Perform research on how we could add methods to HappyBERT that enable fine-tuning. Post a 200-word paragraph to the report section of our Google Drive about your findings. Include potential resources we could use.

Any plans on other recent models like ALBERT?

Hi Eric,

Thanks for your wonderful work! I was wondering if you plan to integrate some other recent models, like ALBERT, in the near future. That would make experiments using happy-transformer more comprehensive!

Begin the implementation of GleefulTransformer

Use the outputs of the "k option in predict_mask" feature as features for a final model that optimally combines them.

Do research on the structure of the model, including specifics on the features and the output. Ideally the model would also predict the top k options instead of only the top 1.

Turn finetuned model to happytransformer model

The fine-tuning method for language models returns the fine-tuned model.

We need a way of going from the base model object to a Happy Transformer model. model.from_pretrained() requires a directory, but we only have a model object.
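
One possible workaround (a sketch, not part of the library's API): save the fine-tuned model object to a directory with save_pretrained, then load it from that directory. The model and tokenizer names below are placeholders for the objects produced by fine-tuning.

import tempfile
from transformers import BertForMaskedLM, BertTokenizerFast

# Stand-ins for the objects produced by fine-tuning; in practice these come from the training code.
finetuned_model = BertForMaskedLM.from_pretrained("bert-base-uncased")
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

save_dir = tempfile.mkdtemp()
finetuned_model.save_pretrained(save_dir)  # writes the config and weights to a directory
tokenizer.save_pretrained(save_dir)        # keep the matching tokenizer files alongside the model

reloaded = BertForMaskedLM.from_pretrained(save_dir)  # from_pretrained can now be given a directory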

Fine tuning BERT

Perform research on how we could add methods to HappyBERT that enable fine-tuning. Post a 200-word paragraph to the report section of our Google Drive about your findings. Include potential resources we could use.

XLNet-base-uncased not found

Hi, I tried to run the example script from the README and it gives me an error that xlnet-base-uncased was not found:

Model name 'xlnet-base-uncased' was not found in tokenizers model name list (xlnet-base-cased, xlnet-large-cased). We assumed 'xlnet-base-uncased' was a path, a model identifier, or url to a directory containing vocabulary files named ['spiece.model'] but couldn't find such vocabulary files at this path or url.

Thank you

Clean up predict_mask for all child classes

A lot of code is duplicated across the child classes' predict_mask implementations. Create methods within the parent class that perform the common tasks, so the code in the child classes becomes shorter and easier to understand.

Create a testing class for the Winograd Schema Challenge

Use "predict_mask_with_options" for each child class to generate results for the WSC273. Within the WSCTesting class, create a method that performs the test for each child class. Also create a single method that calls each of these tests at once and outputs the results.

mask_lm_labels should be -100 instead of -1?

Hi,

We were running the fine-tuning MWP examples and hit an error saying "Target -1 is out of bound". After debugging, we found that the docstring of BertForMaskedLM says:

masked_lm_labels (torch.LongTensor of shape (batch_size, sequence_length), optional, defaults to None):
    Labels for computing the masked language modeling loss.
    Indices should be in [-100, 0, ..., config.vocab_size] (see the input_ids docstring).
    Tokens with indices set to -100 are ignored (masked); the loss is only computed for tokens with labels.

Thus, we modified the mask_tokens function with labels[~masked_indices] = -100  # We only compute loss on masked tokens, and it works now. Not sure if it is correct; could you please double-check?
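
For reference, a simplified sketch of a mask_tokens-style function using the -100 convention (the names are illustrative; the real implementation also replaces some selected tokens with random ids or leaves them unchanged):

import torch

def mask_tokens(inputs, tokenizer, mlm_probability=0.15):
    # inputs: LongTensor of token ids; every selected position becomes [MASK] in this sketch.
    labels = inputs.clone()
    probability_matrix = torch.full(labels.shape, mlm_probability)
    masked_indices = torch.bernoulli(probability_matrix).bool()
    labels[~masked_indices] = -100  # ignored by the loss; only masked tokens contribute
    inputs[masked_indices] = tokenizer.convert_tokens_to_ids(tokenizer.mask_token)
    return inputs, labels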

Sentence perplexity for BERT

Create a sentence perplexity method for BERT. Perhaps this method can be placed in the parent class; that way, all children can use it.
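
One way to approximate this (a sketch using the underlying Transformers model rather than an existing Happy Transformer method): mask each token in turn and average the negative log-likelihoods, i.e. a "pseudo-perplexity".

import math
import torch
from transformers import BertForMaskedLM, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased").eval()

def pseudo_perplexity(sentence):
    ids = tokenizer.encode(sentence, return_tensors="pt")[0]
    nll, count = 0.0, 0
    for i in range(1, len(ids) - 1):  # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        nll -= torch.log_softmax(logits, dim=-1)[ids[i]].item()
        count += 1
    return math.exp(nll / count)

print(pseudo_perplexity("The cat sat on the mat."))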

Predict k masked words

Create a transformer method to predict the top k candidates for a masked word and return them as a list in descending order of softmax score.
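
In the released library this corresponds to predict_mask's top_k argument; a usage sketch:

from happytransformer import HappyWordPrediction

happy_wp = HappyWordPrediction()
results = happy_wp.predict_mask("I think therefore I [MASK]", top_k=3)
for result in results:  # WordPredictionResult objects, highest score first
    print(result.token, result.score)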

Predict multiple masked words

Is there a way to predict multiple masked words? For example, for the sentence:

"[MASK] have a [MASK] dog and I love [MASK] so much"

Thank you so much!
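
One way to do this with the underlying Transformers model (a sketch; each mask is filled independently, so the predictions do not condition on one another):

import torch
from transformers import BertForMaskedLM, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased").eval()

text = "[MASK] have a [MASK] dog and I love [MASK] so much"
inputs = tokenizer(text, return_tensors="pt")
mask_positions = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]

with torch.no_grad():
    logits = model(**inputs).logits[0]

for pos in mask_positions:  # most likely token for each [MASK], left to right
    print(tokenizer.decode([logits[pos].argmax().item()]))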

The sentence separator in TextDataset.__init__ seems wrong

    def __init__(self, tokenizer, file_path, block_size=512):
        assert os.path.isfile(file_path)
        with open(file_path, encoding="utf-8") as f:
            text = f.read()

        tokenized_text = tokenizer.encode(
            text, add_special_tokens=True)  # Get ids from text
        self.examples = []
        # Truncate examples to a max blocksize
        for i in range(0, len(tokenized_text) - block_size + 1, block_size):
            self.examples.append(tokenized_text[i:i + block_size])

Here it seems the file at file_path is split by block_size rather than by \n? As a result, the provided example in the README cannot be trained.
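
If one example per line is the intent, a sketch of the alternative (an assumption about the desired behaviour, not the project's current code) would tokenize each line separately:

self.examples = []
for line in text.splitlines():
    line = line.strip()
    if line:
        ids = tokenizer.encode(line, add_special_tokens=True)
        self.examples.append(ids[:block_size])  # still cap each example at block_size tokens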

lr always 0 when fine-tuning mlm

Hi,

I found that the fine-tuned model did not do better at masked word prediction on the training corpus, so I debugged the code by printing the learning rate in the train function as follows:

if global_step % logging_steps == 0:
    # tb_writer.add_scalar("lr", scheduler.get_lr()[0], global_step)
    print()
    print("\t lr:", scheduler.get_lr()[0])
    print("\t avg loss:", (tr_loss - logging_loss) / logging_steps, global_step)
    logging_loss = tr_loss

However, I found that the lr is always 0.
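
For what it's worth, a zero learning rate from scheduler.get_lr() usually means the scheduler's step budget does not match the training loop. A typical setup looks roughly like this (a sketch with placeholder values, not the project's actual training code):

import torch
from torch.optim import AdamW
from transformers import get_linear_schedule_with_warmup

model = torch.nn.Linear(4, 2)         # placeholder; in practice, the masked LM being fine-tuned
steps_per_epoch, num_epochs = 100, 3  # placeholder loop sizes

optimizer = AdamW(model.parameters(), lr=5e-5)
num_training_steps = steps_per_epoch * num_epochs  # must match how many times scheduler.step() is called
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=0, num_training_steps=num_training_steps)
# If scheduler.step() runs more often than num_training_steps (or warmup is misconfigured),
# get_lr() decays to 0 for the remainder of training.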

Finetuning MWP with custom masks?

Hi Eric,

Thanks for the MWP fine-tuning feature. I am wondering if we can use our own customized masking strategy instead of random sampling?

Simply put, I have a list of sentences where the masks are already in place, and we know the associated word for each mask. Can we fine-tune HappyBERT/HappyROBERTA on them?

Thanks!
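
One way to express pre-made masks outside the random-masking path (a sketch, not part of the library's API; it assumes each answer is a single wordpiece):

import torch
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

def encode_pre_masked(masked_sentence, answers):
    # masked_sentence already contains [MASK] tokens; answers lists the target word for each mask, in order.
    input_ids = tokenizer(masked_sentence, return_tensors="pt")["input_ids"][0]
    labels = torch.full_like(input_ids, -100)  # every position is ignored by the loss by default
    mask_positions = (input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    for pos, word in zip(mask_positions, answers):
        labels[pos] = tokenizer.convert_tokens_to_ids(word)  # loss only on the pre-chosen masks
    return input_ids, labels

ids, labels = encode_pre_masked("I [MASK] to the store yesterday.", ["went"])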

Create a README

Include the purpose of HappyTransformer. Mention our future goals. Finally, write documentation on how to create child classes and use the methods we have completed thus far.

Add some continuous integration

Use one of the various continuous integration (CI) tools to report the build status and test coverage (when there are tests) of the project.

Travis CI seems like a quick and easy option.

Finetuning Debugging

I uploaded finetuning.ipynb on my branch; the error I keep getting is RuntimeError: CUDA error: device-side assert triggered.

I need someone experienced to debug this bad boy.

predict_mask_with_options RoBERTa

Create a predict_mask_with_options method for RoBERTa. If the Fairseq library does not provide enough functionality to accomplish the task, then convert the RoBERTa class to the standard Transformers library implementation.
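
With the Transformers implementation, constraining predictions to a candidate list is roughly what the fill-mask pipeline's targets argument does (a sketch; RoBERTa's byte-level vocabulary usually expects a leading space on target words):

from transformers import pipeline

fill = pipeline("fill-mask", model="roberta-base")
results = fill("I think therefore I <mask>", targets=[" am", " will"])
for result in results:  # one dict per candidate, highest score first
    print(result["token_str"], result["score"])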
