Comments (22)

gireek commented on May 17, 2024

@yuyan2do Can we get the Colab you mentioned earlier? It would be extremely helpful for PyTorch newbies exploring ProphetNet. Thanks!

yuyan2do commented on May 17, 2024

@gireek Creating a Colab tutorial is in our backlog. We will prioritize this work if more people ask for it.

yuyan2do commented on May 17, 2024

It missed a preprocessing step, which caused many tokens to be replaced by [UNK].

@qiweizhen Could you add a tutorial about "training/inference on own data"?

bertagknowles commented on May 17, 2024

I too am overwhelmed by so many scripts... a working Colab notebook with the scripts in the correct order would make the task very easy to follow. About six users have already asked for this. Thanks for prioritizing it :)

yuyan2do commented on May 17, 2024

Would it be helpful if we provide an example using Colab?

marcelbra commented on May 17, 2024

Yes, that would be amazing!!

marcelbra commented on May 17, 2024

I managed to run the inference script (however, evaluation is still throwing an error). You can find my workspace here to have a look. My question is, how can I now pass a new input to the model? I guess evaluation is not that important for now, I just want to see what the model's output looks like for my type of text. The input parameter --input did not work. Thanks!

yuyan2do commented on May 17, 2024

I took a quick look. The error occurs during evaluation, which you said is not important for now.
The inference output should be in "cnndm/sort_hypo$SUFFIX.txt"

SUFFIX=_ck9_pelt1.2_test_beam5
BEAM=5
LENPEN=1.2
CHECK_POINT=cnndm/finetune_cnndm_checkpoints/checkpoint9.pt
OUTPUT_FILE=cnndm/output$SUFFIX.txt
SCORE_FILE=cnndm/score$SUFFIX.txt
INPUT=input/test.txt

fairseq-generate cnndm/processed --path $CHECK_POINT --user-dir prophetnet --task translation_prophetnet --batch-size 32 --gen-subset test --beam $BEAM --num-workers 4 --min-len 45 --max-len-b 110 --no-repeat-ngram-size 3 --lenpen $LENPEN 2>&1 > $OUTPUT_FILE

grep ^H $OUTPUT_FILE | cut -c 3- | sort -n | cut -f3- | sed "s/ ##//g" > cnndm/sort_hypo$SUFFIX.txt

To avoid the error (FileNotFoundError: [Errno 2] No such file or directory: 'cnndm/original_data/test.summary'), remove the line below, which runs evaluation and computes the ROUGE score:

python cnndm/eval/postprocess_cnn_dm.py --generated cnndm/sort_hypo$SUFFIX.txt --golden cnndm/original_data/test.summary > $SCORE_FILE
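
If you prefer Python to the shell pipeline above, a rough equivalent of the grep/cut/sort/sed step is sketched below (a hypothetical helper, not part of the repo; it assumes the usual fairseq-generate "H-<id><TAB><score><TAB><tokens>" log format):

# Hypothetical Python equivalent of: grep ^H | cut -c 3- | sort -n | cut -f3- | sed "s/ ##//g"
def collect_hypotheses(output_file, hypo_file):
    hypos = []
    with open(output_file, encoding="utf-8") as f:
        for line in f:
            if not line.startswith("H-"):
                continue  # keep only hypothesis lines
            idx, _score, tokens = line.rstrip("\n").split("\t", 2)
            hypos.append((int(idx[2:]), tokens.replace(" ##", "")))  # undo WordPiece splits
    hypos.sort(key=lambda pair: pair[0])  # restore original sample order
    with open(hypo_file, "w", encoding="utf-8") as f:
        for _, text in hypos:
            f.write(text + "\n")

# e.g. collect_hypotheses("cnndm/output_ck9_pelt1.2_test_beam5.txt",
#                         "cnndm/sort_hypo_ck9_pelt1.2_test_beam5.txt")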

marcelbra commented on May 17, 2024

I managed to run inference on the command line using this fairly simple command:
fairseq-interactive cnndm/processed --path ../../checkpoint9.pt --user-dir prophetnet --task translation_prophetnet. It will prompt you to input some text.

I have two questions.

When creating the binary files, a large portion of tokens was replaced by [UNK].

| [src] Dictionary: 30521 types
| [src] cnndm/prophetnet_tokenized/train.src: 287113 sents, 268357288 tokens, 16.9% replaced by [UNK]
| [src] Dictionary: 30521 types
| [src] cnndm/prophetnet_tokenized/valid.src: 13368 sents, 12065326 tokens, 16.3% replaced by [UNK]
| [src] Dictionary: 30521 types
| [src] cnndm/prophetnet_tokenized/test.src: 11490 sents, 10518620 tokens, 16.3% replaced by [UNK]
| [tgt] Dictionary: 30521 types
| [tgt] cnndm/prophetnet_tokenized/train.tgt: 287113 sents, 19659024 tokens, 16.2% replaced by [UNK]
| [tgt] Dictionary: 30521 types
| [tgt] cnndm/prophetnet_tokenized/valid.tgt: 13368 sents, 1019609 tokens, 16.9% replaced by [UNK]
| [tgt] Dictionary: 30521 types
| [tgt] cnndm/prophetnet_tokenized/test.tgt: 11490 sents, 833967 tokens, 16.8% replaced by [UNK]
| Wrote preprocessed data to cnndm/processed

Is that correct?

Also, when predicting, it seems like important words are being replaced by [UNK].

For example:

This paragraph

We investigate how perceived job riskiness and individual attitudes impact the vocational choice of business graduates. The hypotheses are tested with a sample of 182 similarly qualified students at two European business schools. Participants are randomly allocated to two conditions under which they receive a job-description that highlights job security or job risk. The findings indicate that risk negatively affects employer attractiveness and the inclination to apply. Besides that, the subjective person-job fit has a positive direct impact on employer attractiveness and the inclination to apply. Contrary to the expectations, risk had no significantly stronger effect on women.

is turned into

[UNK] investigate how perceived job [UNK] and individual attitudes impact the vocational choice of business [UNK] [UNK] [UNK] are tested with a sample of 182 similarly qualified students at two [UNK] business [UNK] [UNK] are randomly allocated to two conditions under which they receive a [UNK] that highlights job security or job [UNK] [UNK] findings indicate that risk negatively affects employer [UNK] and the inclination to [UNK] [UNK] [UNK] the subjective [UNK] fit has a positive direct impact on employer [UNK] and the inclination to [UNK] [UNK] to the [UNK] risk had no significantly stronger effect on [UNK]

yielding the hypothesis

students are randomly allocated to receive a [UNK] that highlights job security or job [UNK] . [X_SEP] the subjective fit has a positive direct impact on employer [UNK] . [X_SEP] the inclination to [UNK] [UNK] to the [UNK] risk had no significantly stronger effect on [UNK] .

This looks quite good; however, the frequency of these [UNK] tokens seems weird. I guess it's related to the UniLM data, but I'm quite unsure how to proceed here.

marcelbra commented on May 17, 2024

Okay, I figured it out. In the UniLM data you linked there were only dev.src/dev.tgt. When creating the binaries it threw an error that valid.src/valid.tgt were missing, so I renamed dev.src/dev.tgt to valid.src/valid.tgt, since those were the only files named differently. Was that correct?

I just checked preprocessing without applying any changes to the provided data. The high [UNK] percentage is still there, plus the error due to the naming mismatch.

I might have missed running the python script before. I will try that later and tell you what happened.

yuyan2do commented on May 17, 2024

You need to run the script below to do the conversion, instead of renaming the files:
https://github.com/microsoft/ProphetNet/blob/master/src/cnndm/preprocess_cnn_dm.py

preocess('cnndm/original_data/dev.article', 'cnndm/prophetnet_tokenized/valid.src', keep_sep=False)
preocess('cnndm/original_data/dev.summary', 'cnndm/prophetnet_tokenized/valid.tgt', keep_sep=True)
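
For reference, the tokenization step roughly amounts to the sketch below (an illustrative approximation only, assuming a bert-base-uncased-style WordPiece tokenizer from the transformers library; see preprocess_cnn_dm.py for the actual details, including the keep_sep handling of the summary sentence separators):

# Rough sketch of the WordPiece preprocessing step; the real script may differ.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

def preocess(in_path, out_path, keep_sep=False):
    # keep_sep (sentence separators in the .summary files) is ignored in this sketch
    with open(in_path, encoding="utf-8") as fin, open(out_path, "w", encoding="utf-8") as fout:
        for line in fin:
            fout.write(" ".join(tokenizer.tokenize(line.strip())) + "\n")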

marcelbra commented on May 17, 2024

Yes, I did that last night! Preprocessing now replaces 0.0%, so everything seems fine. But when running inference on many texts there still seem to be replacements of (sometimes unusual, but sometimes quite common) words. Is that intentional?

yuyan2do commented on May 17, 2024

Yes, I did that last night! Preprocessing now replaces 0.0%, so everything seems fine. But when running inference on many texts there still seem to be replacements of (sometimes unusual, but sometimes quite common) words. Is that intentional?

Could you give some examples for this?

marcelbra commented on May 17, 2024
  1. Downloaded data from here
  2. Used preprocess_cnn_dm.py to create train/test/valid + src/tgt
  3. Created binaries using
    fairseq-preprocess --user-dir ./prophetnet --task translation_prophetnet --source-lang src --target-lang tgt --trainpref cnndm/prophetnet_tokenized/train --validpref cnndm/prophetnet_tokenized/valid --testpref cnndm/prophetnet_tokenized/test --destdir cnndm/processed --srcdict ./vocab.txt --tgtdict ./vocab.txt --workers 20
  4. Downloaded the checkpoint from here
  5. Ran the inference script
    fairseq-interactive cnndm/processed --path ../../checkpoint9.pt --user-dir prophetnet --task translation_prophetnet
  6. Two paragraphs I just tried:

I encountered this story—which is about Taylor Swift clones—when it won the Gulf Coast Barthelme Prize a couple of years ago. The judge was Steve Almond, who wrote, “I tried quite hard to resist choosing “Taylor Swift” as the winner of this year’s Barthelme Award. Why? Because all the stories I received were worthy and many were more technically ambitious when it came to language and form, by which I guess I mean experimental. . . . But what the hell. In the end, I just wanted to read this thing again and again.” Which is exactly right. Whatever you think of the actual Taylor Swift, this story is just plain fun.
S-0 [UNK] encountered this [UNK] is about [UNK] [UNK] [UNK] it won the [UNK] [UNK] [UNK] [UNK] a couple of years [UNK] [UNK] judge was [UNK] [UNK] who [UNK] [UNK] tried quite hard to resist choosing [UNK] [UNK] as the winner of this [UNK] [UNK] [UNK] [UNK] [UNK] all the stories [UNK] received were worthy and many were more technically ambitious when it came to language and [UNK] by which [UNK] guess [UNK] mean [UNK] . . . [UNK] what the [UNK] [UNK] the [UNK] [UNK] just wanted to read this thing again and [UNK] [UNK] is exactly [UNK] [UNK] you think of the actual [UNK] [UNK] this story is just plain [UNK]
H-0 -0.5018416047096252 the [UNK] [UNK] [UNK] [UNK] [UNK] won the [UNK] [UNK] [UNK] [UNK] a couple of years ago . [X_SEP] the [UNK] [UNK] [UNK] [UNK] judge was [UNK] [UNK] who tried quite hard to resist choosing [UNK] [UNK] as the winner .
P-0 -2.3202 -0.8823 -0.1841 -1.0270 -0.4661 -0.7258 -2.6165 -0.2648 -0.2291 -0.1468 -0.1247 -0.1055 -0.8802 -0.0800 -0.1379 -0.0884 -0.1111 -0.3555 -0.2556 -1.7888 -0.9244 -0.1134 -0.3385 -0.4988 -1.5235 -0.2080 -0.4371 -0.2541 -0.3722 -0.3488 -0.1160 -0.0461 -0.1520 -0.1284 -0.4877 -0.7384 -0.2469 -0.1625 -0.0919 -0.0594 -0.4204 -0.6187

Transformers have a potential of learning longer-term dependency, but are limited by a fixed-length context in the setting of language modeling. We propose a novel neural architecture Transformer-XL that enables learning dependency beyond a fixed length without disrupting temporal coherence. It consists of a segment-level recurrence mechanism and a novel positional encoding scheme. Our method not only enables capturing longer-term dependency, but also resolves the context fragmentation problem.
S-0 [UNK] have a potential of learning [UNK] [UNK] but are limited by a [UNK] context in the setting of language [UNK] [UNK] propose a novel neural architecture [UNK] that enables learning dependency beyond a fixed length without [UNK] temporal [UNK] [UNK] consists of a [UNK] [UNK] mechanism and a novel [UNK] encoding [UNK] [UNK] method not only enables capturing [UNK] [UNK] but also [UNK] the context fragmentation [UNK]
H-0 -0.3602469563484192 a novel neural architecture [UNK] that enables learning dependency beyond a fixed length without [UNK] temporal [UNK] [UNK] consists of a [UNK] [UNK] mechanism and a novel [UNK] encoding [UNK] [UNK] method .
P-0 -2.6825 -1.4001 -0.0427 -0.0278 -0.4756 -0.5836 -0.1250 -0.0397 -0.1123 -0.0313 -0.0801 -0.0057 -0.0625 -0.0624 -0.2568 -0.1473 -0.6857 -0.4404 -0.5872 -0.1196 -0.1033 -0.3800 -0.2844 -0.3009 -0.1307 -0.1259 -0.2720 -0.3956 -0.1504 -0.3195 -0.3839 -0.0469 -0.2630 -1.1234

qiweizhen commented on May 17, 2024

It looks like you didn't tokenize the provided text into word pieces. Tokenizing whole words into word pieces is commonly used to alleviate vocabulary problems; you may refer to here.
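
In other words, fairseq-interactive expects input that is already split into word pieces, just like the binarized training data. Below is a minimal sketch of preparing one raw paragraph before pasting it into the prompt, assuming the bert-base-uncased WordPiece tokenizer from the transformers library matches the provided vocab.txt:

# Tokenize raw text into word pieces before feeding it to fairseq-interactive.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

raw = "Transformers have a potential of learning longer-term dependency, but are limited by a fixed-length context."
print(" ".join(tokenizer.tokenize(raw)))
# Paste the printed (lower-cased, word-piece) line into the fairseq-interactive prompt instead of the raw text.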

marcelbra commented on May 17, 2024

@qiweizhen as you said, after

  1. Downloaded data from here
  2. Used preprocess_cnn_dm.py to create train/test/valid + src/tgt

I ran this and the output is 100% identical.

It replaces words, but I feel like the summarizations look ok. Another paragraph is below:

I'm noticing 2 types of replacements:

  1. Beginning of sentence
  2. Any type of punctuation

My wild guess: word+punctuation is observed -> not found in the dict -> replaced.
For the beginning of sentences, I don't know.

Merkel was educated at Karl Marx University, Leipzig, where she studied physics from 1973 to 1978. While a student, she participated in the reconstruction of the ruin of the Moritzbastei, a project students initiated to create their own club and recreation facility on campus. Such an initiative was unprecedented in the GDR of that period, and initially resisted by the University; however, with backing of the local leadership of the SED party, the project was allowed to proceed. At school she learned to speak Russian fluently, and was awarded prizes for her proficiency in Russian and mathematics. She was the best in her class in mathematics and Russian, and completed her school education with the best possible average Abitur grade 1.0.
S-0 [UNK] was educated at [UNK] [UNK] [UNK] [UNK] where she studied physics from 1973 to [UNK] [UNK] a [UNK] she participated in the reconstruction of the ruin of the [UNK] a project students initiated to create their own club and recreation facility on [UNK] [UNK] an initiative was unprecedented in the [UNK] of that [UNK] and initially resisted by the [UNK] [UNK] with backing of the local leadership of the [UNK] [UNK] the project was allowed to [UNK] [UNK] school she learned to speak [UNK] [UNK] and was awarded prizes for her proficiency in [UNK] and [UNK] [UNK] was the best in her class in mathematics and [UNK] and completed her school education with the best possible average [UNK] grade [UNK]
H-0 -0.604485273361206 she studied physics from 1973 to [UNK] [UNK] a [UNK] . [X_SEP] she was the best in her class in mathematics and [UNK] .
P-0 -1.8327 -1.4006 -0.3280 -1.2938 -0.0892 -0.1472 -0.9993 -0.6959 -0.1060 -1.2664 -0.9566 -0.1618 -0.6522 -1.1117 -1.3683 -0.1477 -0.1564 -0.0790 -0.0344 -0.1136 -0.4401 -0.1032 -0.7496 -0.2032 -0.6752
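
A quick way to check the word+punct guess above would be to look the suspicious tokens up in the provided vocab.txt (hypothetical snippet; it assumes vocab.txt is the same one-token-per-line dictionary passed as --srcdict above, with an optional count after each token):

# Hypothetical sanity check of the "word+punct not in dict" guess.
with open("vocab.txt", encoding="utf-8") as f:
    vocab = {line.split()[0] for line in f if line.strip()}

for token in ["university", "university;", ";", "Merkel", "merkel"]:
    print(token, token in vocab)
# If "university;" is missing while "university" and ";" are present, untokenized
# punctuation explains most of the [UNK]s; capitalized sentence-initial forms like
# "Merkel" will also miss in a lower-cased vocabulary.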

qiweizhen commented on May 17, 2024

Hi, I tried your input sentence, whose tokenized text should be like this:
[image: screenshot of the tokenized input text]

cpipi commented on May 17, 2024

Hi, has anyone faced a problem with numpy while generating the binary training files?
I have:

  • Windows 10
  • Python 3.7
  • Pip 20.0.2
  • numpy 1.18.4
  • torch 1.5.0

I am running this script, as described here:

fairseq-preprocess --user-dir prophetnet --task translation_prophetnet --source-lang src --target-lang tgt --trainpref gigaword/prophetnet_tokenized/train --validpref gigaword/prophetnet_tokenized/dev --testpref gigaword/prophetnet_tokenized/test --destdir gigaword/processed --srcdict vocab.txt --tgtdict vocab.txt --workers 20

While generating the files I get warnings if I use numpy 1.18 and errors if I use numpy 1.17.

With numpy 1.18, execution never ends and only shows warnings:

(virtenv) C:\..\..\..\prophetnet\src>bash binary.sh
c:\..\..\anaconda3\lib\site-packages\numpy\_distributor_init.py:32: UserWarning: loaded more than 1 DLL from .libs:
c:\..\..\anaconda3\lib\site-packages\numpy\.libs\libopenblas.IPBC74C7KURV7CB2PKT5Z5FNR3SIBV4J.gfortran-win_amd64.dll
c:\..\..\anaconda3\lib\site-packages\numpy\.libs\libopenblas.PYQHXLVVQ7VESDPUVUADXEVJOBGHJPAY.gfortran-win_amd64.dll
  stacklevel=1)
Namespace(align_suffix=None, alignfile=None, bpe=None, cpu=False, criterion='cross_entropy', dataset_impl='mmap', destdir='gigaword/processed', empty_cache_freq=0, fp16=False, fp16_init_scale=128, fp16_scale_tolerance=0.0, fp16_scale_window=None, joined_dictionary=False, log_format=None, log_interval=1000, lr_scheduler='fixed', memory_efficient_fp16=False, min_loss_scale=0.0001, no_progress_bar=False, nwordssrc=-1, nwordstgt=-1, only_source=False, optimizer='nag', padding_factor=8, seed=1, source_lang='src', srcdict='vocab.txt', target_lang='tgt', task='translation_prophetnet', tensorboard_logdir='', testpref='gigaword/prophetnet_tokenized/test', tgtdict='vocab.txt', threshold_loss_scale=None, thresholdsrc=0, thresholdtgt=0, tokenizer=None, trainpref='gigaword/prophetnet_tokenized/train', user_dir='prophetnet', validpref='gigaword/prophetnet_tokenized/dev', workers=20)
| [src] Dictionary: 30521 types
c:\..\..\anaconda3\lib\site-packages\numpy\_distributor_init.py:32: UserWarning: loaded more than 1 DLL from .libs:
c:\..\..\anaconda3\lib\site-packages\numpy\.libs\libopenblas.IPBC74C7KURV7CB2PKT5Z5FNR3SIBV4J.gfortran-win_amd64.dll
c:\..\..\anaconda3\lib\site-packages\numpy\.libs\libopenblas.PYQHXLVVQ7VESDPUVUADXEVJOBGHJPAY.gfortran-win_amd64.dll
  stacklevel=1)
OMP: Error #110: Memory allocation failed.
c:\users\\anaconda3\lib\site-packages\numpy\_distributor_init.py:32: UserWarning: loaded more than 1 DLL from .libs:
c:\users\\anaconda3\lib\site-packages\numpy\.libs\libopenblas.IPBC74C7KURV7CB2PKT5Z5FNR3SIBV4J.gfortran-win_amd64.dll
c:\users\\anaconda3\lib\site-packages\numpy\.libs\libopenblas.PYQHXLVVQ7VESDPUVUADXEVJOBGHJPAY.gfortran-win_amd64.dll
  stacklevel=1)
It gets stuck there until I interrupt it from the keyboard.
In fact, it did process some files into the gigaword/processed directory, but as far as I can tell, not all of them.

When I try numpy 1.17, it explicitly shows errors:

(virtenv) C:......\prophetnet\src>bash binary.sh
c:....\anaconda3\lib\site-packages\numpy\_distributor_init.py:32: UserWarning: loaded more than 1 DLL from .libs:
c:....\anaconda3\lib\site-packages\numpy\.libs\libopenblas.IPBC74C7KURV7CB2PKT5Z5FNR3SIBV4J.gfortran-win_amd64.dll
c:....\anaconda3\lib\site-packages\numpy\.libs\libopenblas.TXA6YQSD3GCQQC22GEQ54J2UDCXDXHWN.gfortran-win_amd64.dll
  stacklevel=1)
Namespace(align_suffix=None, alignfile=None, bpe=None, cpu=False, criterion='cross_entropy', dataset_impl='mmap', destdir='gigaword/processed', empty_cache_freq=0, fp16=False, fp16_init_scale=128, fp16_scale_tolerance=0.0, fp16_scale_window=None, joined_dictionary=False, log_format=None, log_interval=1000, lr_scheduler='fixed', memory_efficient_fp16=False, min_loss_scale=0.0001, no_progress_bar=False, nwordssrc=-1, nwordstgt=-1, only_source=False, optimizer='nag', padding_factor=8, seed=1, source_lang='src', srcdict='vocab.txt', target_lang='tgt', task='translation_prophetnet', tensorboard_logdir='', testpref='gigaword/prophetnet_tokenized/test', tgtdict='vocab.txt', threshold_loss_scale=None, thresholdsrc=0, thresholdtgt=0, tokenizer=None, trainpref='gigaword/prophetnet_tokenized/train', user_dir='prophetnet', validpref='gigaword/prophetnet_tokenized/dev', workers=20)
| [src] Dictionary: 30521 types
c:....\anaconda3\lib\site-packages\numpy\_distributor_init.py:32: UserWarning: loaded more than 1 DLL from .libs:
c:....\anaconda3\lib\site-packages\numpy\.libs\libopenblas.IPBC74C7KURV7CB2PKT5Z5FNR3SIBV4J.gfortran-win_amd64.dll
c:....\anaconda3\lib\site-packages\numpy\.libs\libopenblas.TXA6YQSD3GCQQC22GEQ54J2UDCXDXHWN.gfortran-win_amd64.dll
  stacklevel=1)
Process SpawnPoolWorker-11:
Traceback (most recent call last):
  File "c:\users\anaconda3\lib\multiprocessing\process.py", line 297, in _bootstrap
    self.run()
  File "c:\users\anaconda3\lib\multiprocessing\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "c:\users\anaconda3\lib\multiprocessing\pool.py", line 110, in worker
    task = get()
  File "c:\users\anaconda3\lib\multiprocessing\queues.py", line 354, in get
    return ForkingPickler.loads(res)
  File "c:\users\anaconda3\lib\site-packages\fairseq_cli\preprocess.py", line 13, in <module>
    from fairseq import options, tasks, utils
  File "c:\users\anaconda3\lib\site-packages\fairseq\__init__.py", line 9, in <module>
    import fairseq.criterions # noqa
  File "c:\users\anaconda3\lib\site-packages\fairseq\criterions\__init__.py", line 10, in <module>
    from fairseq.criterions.fairseq_criterion import FairseqCriterion
  File "c:\users\anaconda3\lib\site-packages\fairseq\criterions\fairseq_criterion.py", line 6, in <module>
    from torch.nn.modules.loss import _Loss
  File "c:\users\anaconda3\lib\site-packages\torch\__init__.py", line 136, in <module>
    from torch._C import *
ImportError: numpy.core.multiarray failed to import

I tried to go further with the first case, but at inference I got some errors again, and I thought it might be because of this step. Also, I need numpy 1.17 to run another tool.

If you have questions, please ask. Any help would be appreciated!

gouldju1 commented on May 17, 2024

I think a Colab tutorial would be really valuable. There were a few unknowns, unanswered questions, and hurdles I had to work through to get things running.

anish-newzera commented on May 17, 2024

I also think a Colab tutorial would be really helpful for using the model, as the exact steps that need to be performed are slightly unclear.

alexgaskell10 commented on May 17, 2024

I would also like a Colab tutorial please!

ryzhik22 commented on May 17, 2024

Is there any news about the Colab tutorial? It would be really helpful! =)
