knnlm's Issues

Interpretability/Appendix A from paper

Hello!

I'm working with kNN-LMs and am interested in reproducing the kind of analysis you show in Appendix A of this repo's paper (i.e. the retrieved contexts and their probabilities). Any tips on how to do that easily? I couldn't find any hints in the paper.

Thank you!
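For context, the probability a retrieved context receives (the quantity tabulated in Appendix A) falls out of the kNN distribution itself: a softmax over negative neighbor distances, aggregated per target token. A minimal sketch of that computation, illustrative only and not the repo's code (the function name and signature are assumptions):

```python
import numpy as np

def knn_probs(distances, neighbor_ids, vocab_size, temperature=1.0):
    """Turn retrieved-neighbor distances into a distribution over the
    vocabulary: softmax over negative distances, summing the mass of
    neighbors that share the same target token."""
    weights = np.exp(-np.asarray(distances, dtype=np.float64) / temperature)
    weights /= weights.sum()
    p_knn = np.zeros(vocab_size)
    for tok, w in zip(neighbor_ids, weights):
        p_knn[tok] += w
    return p_knn

# Toy example: 3 retrieved contexts, two of which share target token 7,
# so token 7 accumulates the mass of both retrievals.
p = knn_probs(distances=[1.0, 1.0, 2.0], neighbor_ids=[7, 7, 3], vocab_size=10)
```

Logging `neighbor_ids` alongside the original training-set positions of the retrieved keys would then let you print the surrounding context next to each probability.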

Worse test set perplexity than expected.

What is your question?

Using the provided snippet, I am only seeing 17.80 PPL on the Wikitext-103 test set using faiss approximate distance, whereas the paper reports 16.50 PPL. Am I using the correct command?

Code

python eval_lm.py data-bin/wikitext-103 \
    --path wt103_checkpoint_best.pt \
    --sample-break-mode complete --max-tokens 3072 \
    --context-window 2560 --softmax-batch 1024 \
    --gen-subset test --dstore-filename dstore_train/dstore \
    --indexfile dstore_train/knn.index  \
    --model-overrides "{'knn_keytype': 'last_ffn_input'}" \
    --k 1024 --lmbda 0.25 --dstore-size 103225485 --knn-keytype last_ffn_input \
    --probe 32 --knnlm --knn-sim-func "do_not_recomp_l2" --no-load-keys
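For reference, the `--lmbda 0.25` flag in the command above sets the interpolation weight between the base LM and the kNN distribution, p(y|x) = λ·p_kNN(y|x) + (1−λ)·p_LM(y|x). A toy sketch of that mixture (illustrative only; the distributions here are made up):

```python
import numpy as np

lmbda = 0.25                       # matches --lmbda 0.25 above
p_lm  = np.array([0.7, 0.2, 0.1])  # toy LM distribution over 3 tokens
p_knn = np.array([0.1, 0.8, 0.1])  # toy kNN distribution over the same tokens

# Linear interpolation; the result is still a valid distribution
# because both inputs sum to 1.
p = lmbda * p_knn + (1.0 - lmbda) * p_lm
```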

What have you tried?

I have rebuilt the knn index twice already, but I will try again.

What's your environment?

  • fairseq Version (e.g., 1.0 or master): [email protected]:urvashik/knnlm.git@1c0a4e0ee29fc037b53a1449e7724af0e07dcc41#egg=fairseq
  • PyTorch Version: 1.7.1
  • OS (e.g., Linux): Linux
  • How you installed fairseq: pip install -e .
  • Build command you used (if compiling from source): Followed instructions in README
  • Python version: 3.6
  • CUDA/cuDNN version: Titan X (Pascal) and CUDA 10.1 (from nvcc -V)
  • GPU models and configuration: n/a
  • Any other relevant information: n/a

Instructions for creating the wiki 3B index?

I successfully replicated the instructions from the README, but am struggling to scale to larger data sizes. I was wondering whether you have access to the wiki 3B index (or a path on the FAIR cluster), or instructions for creating it?
Thanks!

Book Corpus train/valid/test split method

Hi, thanks for open-sourcing the inspiring knn-LM model!
I'm trying to reproduce your results on the Book Corpus dataset, but I found there are no standard train/valid/test splits. Could you please describe how you split the Book Corpus dataset?

Dataset link has expired

Hi!

I am replicating kNN-LM and was wondering whether it would be possible to share dict.txt or a link to the original dataset. The original link has expired, so it is no longer possible to train on the dataset used in the experiments.

Thank you very much for your kind help!

Is sampling without building FAISS Index possible?

Hi - I went through your whitepaper and am really excited about the potential applications it could enable.

I followed the README up until building the index, because I'm testing and evaluating on Colab, which sadly doesn't have 400 GB of storage. I was able to load the checkpoint and evaluate the loss and perplexity without issues.

Is it possible to sample outputs given an initial context, as described in Figure 6 of the whitepaper?

overflow of int16

๐Ÿ› Bug

The int16 dtype (used when --dstore-fp16 is activated) causes an overflow issue, since int16 cannot represent all token ids in a vocabulary as large as Wikitext-103's. This makes the vocab ids of many words negative and leads to incorrect perplexities. I therefore recommend changing all the int16 usages to int, given that the token ids do not take much space anyway.
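The wraparound is easy to demonstrate with NumPy. This sketch (illustrative only, not the repo's datastore code) downcasts token ids the way an int16 store would; ids above 32767 silently become negative:

```python
import numpy as np

# Wikitext-103's vocabulary has roughly 267k types, while int16 can only
# represent values up to 32767. Casting larger token ids down to int16
# wraps them around into negative numbers.
ids = np.array([5, 40000, 100000], dtype=np.int64)
stored = ids.astype(np.int16)  # what an int16 datastore would actually hold
print(stored)  # small ids survive; large ids wrap to negative values
```

Any id that comes back negative would then index the vocabulary incorrectly at evaluation time, which matches the corrupted perplexities described above.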
