berozain / lucenecisi

Information retrieval with Lucene and the CISI dataset. Index documents and search them with the IB, DFR, BM-25, TF-IDF, Boolean, Axiomatic, and LM-Dirichlet similarities, then calculate Recall, Precision, MAP (Mean Average Precision), and F-Measure.

Home Page: https://berozain.com

License: Apache License 2.0

Java 100.00%
cisi information-retrieval lucene axiomatics bm-25 dfr f-measure mean-average-precision precision recall

Lucene and CISI dataset

Information Retrieval with Lucene and CISI dataset

This project is an example of using Lucene for information retrieval on the CISI dataset. It indexes the documents and searches them with the IB, DFR, BM-25, TF-IDF, Boolean, Axiomatic, and LM-Dirichlet similarities. You can enable or disable the stemmer and set custom stop words. The project uses Lucene version 9.5.0. Before running it, change the paths inside the code to match your computer. You can open and run the project in Eclipse.

Query

You can write a simple query such as Lending book, or, for advanced search, use the format docTitle="" docContent="" docAuthors="" to get the best results.
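
To illustrate the advanced format, here is a minimal sketch of how such a query string could be split into per-field values before it is handed to Lucene. The class name FieldQueryParser and its parse method are hypothetical and are not taken from the project's source; the sketch only assumes the three field names shown above and uses the Java standard library. An empty result is meant to signal a simple query such as Lending book.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of the project): splits an advanced query
// of the form docTitle="..." docContent="..." docAuthors="..." into fields.
final class FieldQueryParser {
    // Matches one of the three known field names followed by ="value".
    private static final Pattern FIELD =
            Pattern.compile("\\b(docTitle|docContent|docAuthors)=\"([^\"]*)\"");

    private FieldQueryParser() {}

    // Returns the non-empty field values in the order they appear.
    // An empty map means the input did not use the advanced format.
    static Map<String, String> parse(String query) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = FIELD.matcher(query);
        while (m.find()) {
            String value = m.group(2).trim();
            if (!value.isEmpty()) {
                fields.put(m.group(1), value); // skip fields left as ""
            }
        }
        return fields;
    }
}
```

With this sketch, parsing docTitle="Lending book" docContent="" docAuthors="" would yield a single entry for docTitle, and each entry could then be turned into a clause on the matching index field.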

Evaluation

The CISI dataset includes 111 queries, each with its most relevant results listed in order of relevance. In the evaluation step, we check how closely our results match these reference results by calculating Recall, Precision, MAP (Mean Average Precision), and F-Measure over all queries.
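
The measures above can be sketched with the standard textbook definitions. This is a minimal illustration, not the project's actual evaluation code: the class name RetrievalMetrics and its method names are hypothetical, document IDs are assumed to be integers, and each ranked list is assumed to contain no duplicate IDs.

```java
import java.util.List;
import java.util.Set;

// Hypothetical helper (not part of the project): standard IR measures
// computed from a ranked result list and a set of relevant document IDs.
final class RetrievalMetrics {
    private RetrievalMetrics() {}

    // Number of retrieved documents that are relevant.
    static int hits(List<Integer> ranked, Set<Integer> relevant) {
        int count = 0;
        for (int id : ranked) {
            if (relevant.contains(id)) count++;
        }
        return count;
    }

    // Precision = relevant retrieved / retrieved.
    static double precision(List<Integer> ranked, Set<Integer> relevant) {
        return ranked.isEmpty() ? 0.0 : (double) hits(ranked, relevant) / ranked.size();
    }

    // Recall = relevant retrieved / relevant.
    static double recall(List<Integer> ranked, Set<Integer> relevant) {
        return relevant.isEmpty() ? 0.0 : (double) hits(ranked, relevant) / relevant.size();
    }

    // F-Measure (F1) = harmonic mean of precision and recall.
    static double fMeasure(double p, double r) {
        return (p + r) == 0.0 ? 0.0 : 2.0 * p * r / (p + r);
    }

    // Average precision: mean of precision@k over the ranks k that hold a
    // relevant document, divided by the total number of relevant documents.
    static double averagePrecision(List<Integer> ranked, Set<Integer> relevant) {
        if (relevant.isEmpty()) return 0.0;
        int found = 0;
        double sum = 0.0;
        for (int i = 0; i < ranked.size(); i++) {
            if (relevant.contains(ranked.get(i))) {
                found++;
                sum += (double) found / (i + 1);
            }
        }
        return sum / relevant.size();
    }

    // MAP: mean of the average precision over all queries.
    static double meanAveragePrecision(List<List<Integer>> rankings,
                                       List<Set<Integer>> judgments) {
        if (rankings.isEmpty()) return 0.0;
        double total = 0.0;
        for (int q = 0; q < rankings.size(); q++) {
            total += averagePrecision(rankings.get(q), judgments.get(q));
        }
        return total / rankings.size();
    }
}
```

For example, with the ranked list [3, 7, 1, 9] and the relevant set {3, 1, 5}, two of the four results are relevant, so precision is 1/2, recall is 2/3, F-Measure is 4/7, and average precision is (1/1 + 2/3) / 3 = 5/9.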

Resources

  1. Lucene
  2. CISI dataset

Developed by

  1. Behrouz Amoushahi
  2. Dr. Mehdi Jabalameli
