Comments (3)
@PranjalChitale @VarunGumma
Also, even if I bypass this issue, fairseq-interactive isn't reading the input file.
This is the command I am running:
fairseq-interactive ${ckpt_dir}/final_bin \
    --distributed-world-size 1 --memory-efficient-fp16 \
    --path ${ckpt_dir}/models/checkpoint_best.pt \
    --task translation \
    --source-lang SRC --target-lang TGT \
    --batch-size 256 --buffer-size 2500 --beam 5 \
    --num-workers 24 \
    --skip-invalid-size-inputs-valid-test \
    --input $outfname.bpe > $outfname.log 2>&1
In the command above, --input points to a valid outfname.bpe file, but the logs do not show that file as the input. Here is cfg.interactive as logged by the script; input should not be '-'.
"interactive":{ "_name":"None", "buffer_size":2500, "input":"-", "force_override_max_positions":"None"}
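For readers checking the same thing: in fairseq-interactive, `--input` defaults to "-", which means "read from standard input", so a logged value of "-" suggests the flag never reached the argument parser. The sketch below shows one way to test a logged fragment for this. The helper name `reads_from_stdin` is hypothetical, and the fragment is wrapped in outer braces only so that it parses as JSON.

```python
import json


def reads_from_stdin(interactive_cfg: dict) -> bool:
    """Return True when the logged `input` value is "-" (stdin)."""
    # A missing key is treated like the default, which is also "-".
    return interactive_cfg.get("input", "-") == "-"


# Fragment copied from the log above, wrapped in braces to make valid JSON.
logged = (
    '{"interactive": {"_name": "None", "buffer_size": 2500, '
    '"input": "-", "force_override_max_positions": "None"}}'
)
cfg = json.loads(logged)

if reads_from_stdin(cfg["interactive"]):
    print("fairseq-interactive is configured to read from stdin, not a file")
else:
    print("input file:", cfg["interactive"]["input"])
```

If the check reports stdin even though `--input` was passed, the command line itself is worth inspecting, for example for a line continuation that did not survive copy and paste. That is a guess, not a confirmed cause.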
from indictrans2.
The issue described above is due to the IndicNLP resources not being installed and the path not being set correctly.
Please refer to this link for guidance.
Additionally, because the preprocessing failed, the outfname.bpe file was not created successfully.
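A quick pre-flight check can catch a missing or empty preprocessing output before fairseq-interactive is launched. The following is a minimal sketch, not part of the IndicTrans2 scripts. The helper name `count_nonempty_lines` and the example path `sample.bpe` are hypothetical.

```python
from pathlib import Path


def count_nonempty_lines(path: str) -> int:
    """Count non-blank lines in `path`; return 0 if the file does not exist."""
    p = Path(path)
    if not p.is_file():
        return 0
    with p.open(encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


if __name__ == "__main__":
    # Hypothetical path; substitute the real preprocessing output file.
    bpe_file = "sample.bpe"
    n = count_nonempty_lines(bpe_file)
    if n == 0:
        print(f"{bpe_file} is missing or empty; preprocessing probably failed")
    else:
        print(f"{bpe_file} contains {n} non-empty lines")
```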
Regarding the README for the distillation branch, it may have been accidentally removed in the previous commit.
An up-to-date README will be added soon.
@PranjalChitale As mentioned above, I was able to solve the initial IndicNLP issue. After generating a correct outfname.bpe (I verified the file), I ran into the problem described above, where fairseq-interactive does not pick up the given input file.