attilanagy234 / neural-punctuator

Complementary code for our paper "Automatic punctuation restoration with BERT models"

License: MIT License

Python 0.35% Jupyter Notebook 99.65%

bert punctuation-restoration transformer

neural-punctuator's Issues

Return dict must be forced to false to ensure tensor return

On line 16 of src/neural_punctuator/models/BertPunctuator.py, add return_dict=False to the from_pretrained() call. Otherwise the model returns a dict, and the code picks up the name of an output entry (a string) rather than the tensor itself.

(This will fix the bug with a traceback of dropout expecting a tensor rather than a string)
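
A rough illustration of why the missing flag ends up handing a string to dropout: unpacking a dict-like object yields its keys, not its values. The sketch below uses a plain OrderedDict as a stand-in for the model output, so it runs without downloading a model. The key names and placeholder values are assumptions for illustration and are not taken from the repository.

```python
from collections import OrderedDict

# Stand-in for a dict-like model output; a real output holds tensors, not strings.
dict_output = OrderedDict(
    last_hidden_state="<hidden-state tensor>",
    pooler_output="<pooled tensor>",
)

# Stand-in for the plain tuple returned when return_dict=False is set.
tuple_output = tuple(dict_output.values())

# Unpacking a dict iterates over its keys, so both names end up as strings.
hidden_from_dict, pooled_from_dict = dict_output

# Unpacking a tuple gives the values themselves.
hidden_from_tuple, pooled_from_tuple = tuple_output

print(hidden_from_dict)   # the key name
print(hidden_from_tuple)  # the value
```

With the real model, passing return_dict=False as suggested above, or indexing the returned output by key, should both avoid feeding a key name into later layers.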

Random seed

create random seed
save with model for reproducibility
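
One possible shape for this, as a minimal sketch rather than the repository's implementation: draw a seed, apply it to whichever random number generators are installed, and write it to a small JSON file next to the checkpoint. The function names and the seed.json file name are hypothetical.

```python
import json
import os
import random
import tempfile


def set_seed(seed=None):
    """Seed the available random number generators and return the seed used."""
    if seed is None:
        # Draw a fresh 32-bit seed from the operating system's entropy source.
        seed = int.from_bytes(os.urandom(4), "little")
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed.
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass  # PyTorch not installed; nothing to seed.
    return seed


def save_seed(seed, model_dir, filename="seed.json"):
    """Write the seed next to the model checkpoint (file name is hypothetical)."""
    path = os.path.join(model_dir, filename)
    with open(path, "w") as f:
        json.dump({"seed": seed}, f)
    return path


def load_seed(path):
    """Read back a seed written by save_seed."""
    with open(path) as f:
        return json.load(f)["seed"]


# Usage: seed, save, then restore the seed and check that the stream repeats.
model_dir = tempfile.mkdtemp()  # stand-in for the real checkpoint directory
seed = set_seed()
first_draw = random.random()
seed_path = save_seed(seed, model_dir)
set_seed(load_seed(seed_path))
repeated_draw = random.random()
```

Seeding alone may not make GPU training fully deterministic, so the saved seed is best treated as one part of a reproducibility record.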

Fix number tokenization with albert

Running Error

Excuse me, when I train the model on a GPU, I get this error. Could you help me?
Traceback (most recent call last):
  File "main.py", line 6, in <module>
    pipe.train()
  File "/workspace/neural-punctuator-main/src/neural_punctuator/wrappers/BertPunctuatorWrapper.py", line 17, in train
    self._trainer.train()
  File "/workspace/neural-punctuator-main/src/neural_punctuator/trainers/BertPunctuatorTrainer.py", line 106, in train
    mask = ((targets == 0) & (np.random.rand(*targets.shape) < .1)) | (targets > 0)
TypeError: __and__() received an invalid combination of arguments - got (numpy.ndarray), but expected one of:
 * (Tensor other)
      didn't match because some of the arguments have invalid types: (numpy.ndarray)
 * (Number other)
      didn't match because some of the arguments have invalid types: (numpy.ndarray)
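
The message suggests that targets is a torch tensor while the random comparison produces a NumPy array, and the tensor's & operator does not accept that array. A possible workaround, sketched under that assumption and not confirmed by the maintainers, is to draw the random values with torch on the same device as targets. The toy targets tensor and the keep_empty name are made up for illustration.

```python
import torch

# Toy batch of targets: 0 = no punctuation, >0 = a punctuation class.
targets = torch.tensor([[0, 0, 1, 0, 3], [2, 0, 0, 0, 0]])

# Draw the random numbers with torch, on the same device as targets,
# so that both operands of & are boolean tensors.
keep_empty = torch.rand(targets.shape, device=targets.device) < 0.1

# Keep roughly 10% of the empty positions and every punctuated position.
mask = ((targets == 0) & keep_empty) | (targets > 0)
```

Wrapping the NumPy array with torch.from_numpy(...) and moving it to the same device as targets would be another route to the same result.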

id2target and target2id

How did you decide on id2target for converting predictions back to the original punctuation?
id2target = {-1: 0,
             9: 1,    # .
             60: 2,   # ?
             15: 3,   # ,
             -2: -1,  # will be masked
             }

In my data, class 1 is ",", class 2 is "?", class 3 is "." and class 4 is "!", with 0 for everything else.
How should I use this?
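
Judging by its comments, the dictionary above appears to map punctuation ids to class indices. That is an interpretation, not something the thread confirms. Going from a predicted class back to a punctuation mark then needs a lookup keyed by class index. A minimal sketch using the class scheme from the question follows. The names target2punct and restore_punctuation are hypothetical, and it assumes one prediction per word, which may not match the model's subword-level outputs.

```python
# Class index -> punctuation mark, following the scheme described in the question.
target2punct = {0: "", 1: ",", 2: "?", 3: ".", 4: "!"}


def restore_punctuation(words, predictions):
    """Append the predicted punctuation mark (if any) to each word."""
    if len(words) != len(predictions):
        raise ValueError("words and predictions must have the same length")
    return " ".join(w + target2punct[p] for w, p in zip(words, predictions))


words = ["hello", "how", "are", "you"]
predictions = [1, 0, 0, 2]
text = restore_punctuation(words, predictions)
print(text)  # hello, how are you?
```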

Create baseline for Hungarian

Need to show that transformers actually bring improvement.
Train a simpler model (LSTM/1d conv, etc.)

Thanks,
Camille

Validation accuracy more than train

Hi, I am getting validation accuracy higher than training accuracy, and validation loss lower than training loss. Can you tell me why? Could it be due to class imbalance? I am using a different dataset.