Comments (33)
@kaustumbh7 Use TensorFlow version 1.12.2 and, if tensorflow-estimator is installed, uninstall it.
from active-qa.
- Extract the .zip file to $PRETRAINED_DIR
- Write the checkpoint path in the files $REFORMULATOR_DIR/checkpoint and $REFORMULATOR_DIR/initial_checkpoint.txt, like this:
echo "model_checkpoint_path: \"$PRETRAINED_DIR/translate.ckpt-6156696\"" > checkpoint
cp -f checkpoint $REFORMULATOR_DIR
cp -f checkpoint $REFORMULATOR_DIR/initial_checkpoint.txt
- Start training:
python -m px.nmt.reformulator_and_selector_training \
  --environment_server_address=localhost:10000 \
  --hparams_path=px/nmt/example_configs/reformulator.json \
  --enable_reformulator_training=true \
  --enable_selector_training=false \
  --train_questions=$SQUAD_DIR/train-questions.txt \
  --train_annotations=$SQUAD_DIR/train-annotation.txt \
  --train_data=data/squad/data_train.json \
  --dev_questions=$SQUAD_DIR/dev-questions.txt \
  --dev_annotations=$SQUAD_DIR/dev-annotation.txt \
  --dev_data=data/squad/data_dev.json \
  --glove_path=$GLOVE_DIR/glove.6B.100d.txt \
  --out_dir=$REFORMULATOR_DIR \
  --tensorboard_dir=$OUT_DIR/tensorboard
Thanks @rodrigonogueira4 for the response.
But I don't want to train, just want to do inference.
In this case, run this in the terminal:
echo "model_checkpoint_path: \"$PRETRAINED_DIR/translate.ckpt-6156696\"" > checkpoint
cp -f checkpoint $REFORMULATOR_DIR
cp -f checkpoint $REFORMULATOR_DIR/initial_checkpoint.txt
And use the following code (you only need to replace 'path/to/reformulator_dir' with your $REFORMULATOR_DIR):
from px.nmt import reformulator
from px.proto import reformulator_pb2

questions = ['question 1', 'question 2']

reformulator_instance = reformulator.Reformulator(
    hparams_path='px/nmt/example_configs/reformulator.json',
    source_prefix='<en> <2en> ',
    out_dir='path/to/reformulator_dir',
    environment_server_address='localhost:10000')

# Change from GREEDY to BEAM if you want 20 rewrites instead of one.
responses = reformulator_instance.reformulate(
    questions=questions,
    inference_mode=reformulator_pb2.ReformulatorRequest.GREEDY)

# Since we are using the greedy decoder, keep only the first rewrite.
reformulations = [r[0].reformulation for r in responses]
print reformulations
This is great. One more question: environment_server_address is not required for inference, right?
That is right, you can use a dummy address or environment_server_address=None.
The reformulator with checkpoint 1460356 is giving better results than with checkpoint 6156696. Why is this happening?
For the questions: ['how can i apply for nsa?', 'what is the minimum working hours required for a day?']
The 1460356 ckpt model gives the following results:
whereas the 6156696 ckpt model gives:
The model trained on the machine translation task (checkpoint 1460356) produces sentences that are more grammatically correct, but the model trained with RL on the Q&A task (checkpoint 6156696) produces sentences that have a better F1 score despite not being grammatically correct.
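The trade-off is easy to see with the SQuAD-style token-overlap F1 that the RL training optimizes. A simplified, stdlib-only sketch (real SQuAD evaluation also normalizes articles and punctuation; the example strings here are made up):

```python
from collections import Counter

def f1_score(prediction, ground_truth):
    """Simplified SQuAD-style token-overlap F1 (no text normalization)."""
    pred_tokens = prediction.lower().split()
    gold_tokens = ground_truth.lower().split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = float(num_same) / len(pred_tokens)
    recall = float(num_same) / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

# An ungrammatical rewrite can still overlap the gold answer heavily:
# 3 of 5 tokens match, full recall, F1 = 0.75.
print(f1_score('nsa apply how application form', 'nsa application form'))
```

Because the reward only counts token overlap with the gold answer, the RL checkpoint is free to drift toward keyword-like, ungrammatical rewrites that score well.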
Where can I download checkpoint 1460356?
The link is in the README:
https://storage.googleapis.com/pretrained_models/translate.ckpt-1460356.zip
Has anyone had the error below while just doing inference?
from px.proto import aqa_pb2
ImportError: cannot import name 'aqa_pb2'
The model trained on the machine translation task (checkpoint 1460356) produces sentences that are more grammatically correct, but the model trained with RL on the Q&A task (checkpoint 6156696) produces sentences that have a better F1 score despite not being grammatically correct.
hey how did you get this working?
I am seeing the error below:
ImportError: cannot import name 'aqa_pb2'
I am using Python 2, and in compile_protos.sh I changed python to python2,
and the versions are the following:
grpc==0.3.post19
grpcio==1.16.1
grpcio-tools==1.16.1
protobuf==3.6.1
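For what it's worth, that ImportError usually means the generated protobuf stubs are missing rather than a version mismatch: compile_protos.sh must have produced aqa_pb2.py (and its siblings) inside px/proto/. A quick, stdlib-only check (the stub file names are assumed from the import paths in this thread):

```python
import os.path

# Stubs that compile_protos.sh is expected to generate from the .proto files.
expected_stubs = ['px/proto/aqa_pb2.py', 'px/proto/reformulator_pb2.py']
missing = [p for p in expected_stubs if not os.path.isfile(p)]
if missing:
    print('Missing stubs, re-run compile_protos.sh: %s' % missing)
```

If the files are missing, re-running compile_protos.sh from the repository root (with the same Python you use for inference) should regenerate them.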
I have run the selector model on the dev data, and the reformulations and scores produced by the model with checkpoint 1460356 are better than those from 6156696.
Reformulations and scores for the 1460356 ckpt model:
Reformulations and scores for the 6156696 ckpt model:
Does the gRPC server have to be running? I am getting a grpc.FutureTimeoutError:
python2 reformulate.py
Num encoder layer 2 is different from num decoder layer 4, so set pass_hidden_state to False
# hparams:
src=source
tgt=target
train_prefix=None
dev_prefix=None
test_prefix=None
train_annotations=None
dev_annotations=None
test_annotations=None
out_dir=/tmp/active-qa/reformulator
# Vocab file data/spm2/spm.unigram.16k.vocab.nocount.notab.source exists
using source vocab for target
# Use the same embedding for source and target
Traceback (most recent call last):
  File "reformulate.py", line 10, in <module>
    environment_server_address='localhost:10000')
  File "/root/active-qa/px/nmt/reformulator.py", line 130, in __init__
    use_placeholders=True)
  File "/root/active-qa/px/nmt/model_helper.py", line 171, in create_train_model
    trie=trie)
  File "/root/active-qa/px/nmt/gnmt_model.py", line 56, in __init__
    trie=trie)
  File "/root/active-qa/px/nmt/attention_model.py", line 65, in __init__
    trie=trie)
  File "/root/active-qa/px/nmt/model.py", line 137, in __init__
    hparams.environment_server, mode=hparams.environment_mode))
  File "/root/active-qa/px/nmt/environment_client.py", line 152, in make_environment_reward_fn
    grpc.channel_ready_future(channel).result(timeout=30)
  File "/root/active-qa/venv/local/lib/python2.7/site-packages/grpc/_utilities.py", line 134, in result
    self._block(timeout)
  File "/root/active-qa/venv/local/lib/python2.7/site-packages/grpc/_utilities.py", line 84, in _block
    raise grpc.FutureTimeoutError()
grpc.FutureTimeoutError
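The traceback shows the Reformulator constructor blocking in grpc.channel_ready_future while opening a channel to environment_server_address, so the environment server from the README has to be running first (or the address set to None, as noted earlier in this thread). A stdlib-only sketch for checking the port before constructing the Reformulator (the function name is illustrative, not part of the repo):

```python
import socket

def server_is_up(address, timeout=3.0):
    """Return True if something is accepting TCP connections at host:port."""
    host, port = address.rsplit(':', 1)
    try:
        conn = socket.create_connection((host, int(port)), timeout=timeout)
        conn.close()
        return True
    except (OSError, socket.error):
        return False

# The Reformulator constructor will block (and eventually raise
# grpc.FutureTimeoutError) if nothing answers at this address:
print(server_is_up('localhost:10000'))
```

This only verifies that the port is open; a non-gRPC listener on the same port would still fail later, but it catches the common case of the environment server simply not having been started.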
The reformulator with the checkpoint 1460356 is giving better results than with the checkpoint 6156696. Why is this happening?
For questions : ['how can i apply for nsa?', 'what is the minimum working hours required for a day?']
@graviraja Hey! I am trying to do the same thing but I get invalid reformulations like this:
My questions were: ['how can i apply for nsa?', 'what is the minimum working hours required for a day?']
@kaustumbh7 which GloVe file are you using?
@graviraja I just followed these instructions. GloVe, do you mean for the server? I simply followed all the steps as given in the active-qa readme.
@kaustumbh7 Yes, the GloVe file for the server. In that case, the pretrained model is probably not being loaded correctly. Can you cross-check whether the checkpoint is loading properly?
@graviraja The GloVe file for the server is the same as mentioned in the readme.
It does show: loaded train model parameters from data/pretrained/translate.ckpt-1460356, time 0.88s.
Did you run the same code as I did? #9 (comment)
Did you run the code in Python 2?
Is any pre-processing of questions = ['question 1', 'question 2'] required?
@kaustumbh7 I ran the reformulator_and_selector_training.py code with --enable_reformulator_training=false, and modified it a bit by adding questions = ['question 1', 'question 2'] in the code.
Yes, I ran the code with Python 2.
@graviraja Okay. I will try that out. Thanks for your help!
@kaustumbh7 I am running into the same problem, only getting completely nonsensical reformulations. Have you managed to solve the issue?
@konrajak Not yet. But I am working on it.
@graviraja I tried this as well but still got the same result.
Did you download the selector weights as well?
Did you set --enable_selector_training=false in reformulator_and_selector_training.py?
Can you please share your modified reformulator_and_selector_training.py file?
Hi @kaustumbh7
Below are the steps I have done:
- After the pre-processing step, run the environment.
- In the "Training the reformulator" section of the readme, I did the checkpoint-copying steps.
- I ran the same command from that section; the only change is setting --enable_reformulator_training=false and --enable_selector_training=false.
Code changes:
- The code is the same up to the read-data part of the main method:
# Read data.
questions, annotations, docid_2_answer = read_data(
    questions_file=FLAGS.train_questions,
    annotations_file=FLAGS.train_annotations,
    answers_file=FLAGS.train_data,
    preprocessing_mode=FLAGS.mode)
dev_questions, dev_annotations, dev_docid_2_answer = read_data(
    questions_file=FLAGS.dev_questions,
    annotations_file=FLAGS.dev_annotations,
    answers_file=FLAGS.dev_data,
    preprocessing_mode=FLAGS.mode,
    max_lines=FLAGS.max_dev_examples)
- After that, I did the following:
custom_questions = ['How can i apply for nsa?']
responses = reformulator_instance.reformulate(
    questions=custom_questions,
    inference_mode=reformulator_pb2.ReformulatorRequest.GREEDY)
# Discard answers.
custom_reformulations = [[rf.reformulation for rf in rsp] for rsp in responses]
- And commented out all the later code.
Hope this helps!
@graviraja It didn't work. I still get nonsensical reformulations. I don't understand what the problem is.
Thank you very much for your help.
@kaustumbh7 Which pretrained reformulator model are you using? If it is the 6156696 checkpoint, the reformulations are not good.
@graviraja I am using 1460356, provided here: https://storage.googleapis.com/pretrained_models/translate.ckpt-1460356.zip
@kaustumbh7 Which TensorFlow version are you using?
@graviraja My TensorFlow version is 1.13.1.
@graviraja Thank you very much! It worked!
from px.nmt import reformulator
from px.proto import reformulator_pb2
Hi, when I try to import this it says:
from px.proto import aqa_pb2
ImportError: cannot import name 'aqa_pb2' from 'px.proto' (C:\Anaconda\lib\site-packages\px\proto\__init__.py)
As there is no .py file in the proto folder, please help.