adarank's People

Contributors

rueycheng

adarank's Issues

Display ranking

Hi @rueycheng

I have finally managed to get the nDCG scores for k = 1 to k = 20.

Now, I would like to display a ranking with the results.

I have tried this:

# Return ranking
docno = load_docno(test_file, letor=True)
print_trec_run(qid_test, docno, pred, output=open(ranking_file, 'wb'))

However, no docnos can be found in my test_file (the docno array is empty).

In print_trec_run:
format(qid=qid[i], docno=docno[i], rank=rank, sim=pred[i], run_id=run_id))

I get this error:
IndexError: index 7 is out of bounds for axis 0 with size 0

I have seen that docnos follow one of these patterns:

docno_pattern = re.compile(r'#\s*docid\s*=\s*(\S+)')
docno_pattern = re.compile(r'#\s*(\S+)')

However, I think there are no docnos in my test file :(

It looks like this:

0 qid:1 65:0.8635398447491647 88:0.5042806128839266
0 qid:1 0:0.4336122872341365 1:0.5701433653876129 5:0.464003602765715 39:0.4336122872341365 68:0.2044479985767591 81:0.2044479985767591
0 qid:1 0:0.4116103074959742 4:0.5412136437389908 39:0.4116103074959742 67:0.5412136437389908 68:0.1940740750173357 81:0.1940740750173357
0 qid:1 60:0.351932125005294 68:0.2570511854639458 74:0.5833888176004248 76:0.6353350635618945 81:0.2570511854639458
0 qid:1 20:0.2779414512496159 24:0.5269511067979753 37:0.5269511067979753 63:0.2995265310626487 92:0.5269511067979753
0 qid:1 20:0.5645782010909166 94:0.8253795822849901

How can I get a ranking with this test file? Alternatively, what could I do to change the test file format?

Thanks in advance.
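
A possible workaround, sketched under the assumption that the row position in the file is an acceptable document identifier: append a `# docid = N` comment to every line so that the first pattern quoted above matches. The helper name `add_docid_comments` is hypothetical and not part of the library.

```python
import re

# Same pattern as the first one quoted in the issue above.
DOCNO_PATTERN = re.compile(r'#\s*docid\s*=\s*(\S+)')


def add_docid_comments(lines):
    """Append '# docid = <n>' to SVMlight lines that lack a docid comment.

    Assumption: the 0-based row position is used as the document id.
    """
    out = []
    for n, line in enumerate(lines):
        line = line.rstrip('\n')
        if line.strip() and not DOCNO_PATTERN.search(line):
            line = '{} # docid = {}'.format(line.rstrip(), n)
        out.append(line)
    return out


sample = [
    '0 qid:1 65:0.8635398447491647 88:0.5042806128839266',
    '0 qid:1 20:0.5645782010909166 94:0.8253795822849901',
]
annotated = add_docid_comments(sample)
docnos = [DOCNO_PATTERN.search(line).group(1) for line in annotated]
print(docnos)  # ['0', '1']
```

Writing the annotated lines back to a new file and loading docnos from that file should then give one docno per row. If the documents have real identifiers elsewhere, using those instead of the row position would make the ranking easier to read.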

ValueError

Hi,

I am trying to use this AdaRank implementation, which seems amazing.
However, when I try to train the model, I get a ValueError.

My code is:

X, y, qid = load_svmlight_file(train_file, query_id=True)
print('X shape:', X.shape)
print('y shape:', y.shape)
print('qid shape:', qid.shape)

X_test, y_test, qid_test = load_svmlight_file(test_file, query_id=True)

scorer = NDCGScorer(k=10)

# Run AdaRank for 100 iterations optimizing for NDCG@10.
# When no improvement is made within the previous 10 iterations,
# the algorithm will stop.
model = AdaRank(max_iter=100, estop=10, scorer=scorer).fit(X, y, qid)

The shapes of my data are:

X shape: (140, 105)
y shape: (140,)
qid shape: (140,)

The error happens on this line:
model = AdaRank(max_iter=100, estop=10, scorer=scorer).fit(X, y, qid)

The error is:
ValueError: shapes (3,) and (91,) not aligned: 3 (dim 0) != 91 (dim 0)

Specifically, the error is in this line of the library:
weighted_average = np.dot(weights, score)

When I debug inside the library, I get these values:

Number of queries: 3
Weights shape: (3,)
Number of weak ranker scores: 105

And inside the loop for fid, score in enumerate(weak_ranker_score):

fid: 0
Score shape: (91,)

So there seems to be a size mismatch between the weights and the scores, but I don't know what's going on.

I have 3 different qids and 140 training documents, each with 105 features.

Hope you can help me,
Thanks in advance
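
One thing that may be worth checking, offered as an unconfirmed guess rather than a diagnosis: with weights of shape (3,) and scores of shape (91,), the 91 could be the number of contiguous runs of identical qid in the training file rather than the number of distinct queries. That would happen if rows of the same query are interleaved with rows of other queries. The sketch below counts the runs and reorders the lines so each query is contiguous; the helper names are hypothetical and work on raw file lines, not on the library's internals.

```python
from itertools import groupby


def qid_of(line):
    """Return the value of the 'qid:<id>' token of one SVMlight/LETOR line."""
    for token in line.split():
        if token.startswith('qid:'):
            return token[len('qid:'):]
    raise ValueError('no qid in line: {!r}'.format(line))


def count_runs(lines):
    """Count contiguous runs of identical qid."""
    return sum(1 for _ in groupby(lines, key=qid_of))


def group_by_qid(lines):
    """Make rows of each query contiguous.

    sorted() is stable, so the order of rows inside a query is kept.
    The qid is compared as a string, which is enough for grouping.
    """
    return sorted(lines, key=qid_of)


lines = [
    '1 qid:1 0:0.5',
    '0 qid:2 0:0.1',
    '0 qid:1 1:0.3',
    '1 qid:3 2:0.9',
    '0 qid:2 1:0.4',
]
distinct = len(set(qid_of(line) for line in lines))
print(count_runs(lines), distinct)                 # 5 runs, 3 distinct qids
print(count_runs(group_by_qid(lines)), distinct)   # 3 runs, 3 distinct qids
```

If the run count on the real training file differs from the number of distinct qids, writing the regrouped lines to a new file and training on that would be a cheap experiment.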

Example data

I have been trying to use the algorithm, but I am not sure what the dataset should look like. Could you provide an example dataset for training and testing? Or, at least, could you explain what types X, y and qid are?

Thank you in advance <3
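
For reference, a minimal sketch of the input format, based on the calls used in the other issues on this page: the files are in SVMlight/LETOR format, and scikit-learn's `load_svmlight_file(..., query_id=True)` returns a sparse feature matrix X, a label array y and a query-id array qid, with one row per document. The parser below is a standard-library illustration only, and the sample values are made up.

```python
# Each line: <relevance label> qid:<query id> <feature index>:<value> ...
SAMPLE = """\
2 qid:1 0:0.43 1:0.57 5:0.46
0 qid:1 20:0.56 94:0.83
1 qid:2 4:0.54 39:0.41
0 qid:2 60:0.35 68:0.26
"""


def parse_letor(text):
    """Parse LETOR-style lines into (X, y, qid).

    X is a list of {feature_index: value} dicts (a stand-in for a sparse row),
    y holds the relevance labels, qid the query id of each row.
    """
    X, y, qid = [], [], []
    for line in text.splitlines():
        line = line.split('#', 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        label, qid_token, *features = line.split()
        y.append(float(label))
        qid.append(int(qid_token.split(':', 1)[1]))
        X.append({int(k): float(v)
                  for k, v in (f.split(':', 1) for f in features)})
    return X, y, qid


X, y, qid = parse_letor(SAMPLE)
print(len(X), y, qid)  # 4 [2.0, 0.0, 1.0, 0.0] [1, 1, 2, 2]
```

So X has one row per (query, document) pair, y gives the relevance of that document for that query, and qid says which query the row belongs to.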

Given a query, return a ranking

Hi @rueycheng,

After doing some research I think print_trec_run is not what I'm looking for. Perhaps you can help me with this:

Once I have the AdaRank model trained on a set of documents and queries in SVMlight format, I would like to return a ranking based on a query entered by the user. That is, I am not interested in generating a ranking from a test set of different documents, but rather in a ranking with scores for the documents the model was trained on. I hope that makes sense.

That is, after training the model:

X, y, qid = load_svmlight_file(svm_file, query_id=True)
model = AdaRank(max_iter=100, estop=10, scorer=NDCGScorer(k=10)).fit(X, y, qid)

I want to calculate the scores for the same documents in X given a qid_test (or more than one) and, with those scores, return a ranking like this:

qid_test docno   rank   score
----------------------------
1        3       1      0.7
1        5       2      0.23
1        6       3      0
2        6       1      1
2        5       2      0
2        3       3      0

Is it possible to get something like this?

Thanks in advance.
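
A minimal sketch of the last step, turning scores into a table like the one above. It assumes one score per (query, document) row is already available, for example from the trained model; how those scores are obtained is not covered here. The function name `rank_by_query` and the score values are made up for illustration.

```python
from collections import defaultdict


def rank_by_query(qids, docnos, scores):
    """Return (qid, docno, rank, score) rows, best score first within each query."""
    by_query = defaultdict(list)
    for q, d, s in zip(qids, docnos, scores):
        by_query[q].append((d, s))
    rows = []
    for q in sorted(by_query):
        # Stable sort: documents with equal scores keep their input order.
        ordered = sorted(by_query[q], key=lambda pair: pair[1], reverse=True)
        for rank, (d, s) in enumerate(ordered, start=1):
            rows.append((q, d, rank, s))
    return rows


# Hypothetical scores, one per (query, document) row.
qids = [1, 1, 1, 2, 2, 2]
docnos = [3, 5, 6, 3, 5, 6]
scores = [0.7, 0.23, 0.0, 0.1, 0.4, 1.0]

rows = rank_by_query(qids, docnos, scores)
print('qid_test\tdocno\trank\tscore')
for row in rows:
    print('{}\t{}\t{}\t{}'.format(*row))
```

Note that a score for a new user query would still require a feature row for every (query, document) pair, since the model scores feature vectors rather than raw query text.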
