queryexpansionsystem's Introduction

Apriori-based query expansion system for Chinese IR

This is a Python prototype of a Chinese query expansion system. It uses the CKIP eHownet dictionary and the Apriori association mining algorithm to suggest additional query options for users searching on Google. Because this is an initial prototype, the interface, HTTP server, and database rely on plain SQLite and CGI scripts. It does not integrate a web framework such as Django, and it does not use multiprocessing or threading to speed up computation. In addition, full access to eHownet through SQL is not free, so the number of terms available in the current dictionary for expanding a user query is limited.

See the slides for more detail.

Quick-start

  1. Install Jieba in advance:

     easy_install jieba

     or

     pip install jieba

  2. Run simple_httpd.py to start the HTTP development server.
  3. Open index.html in your preferred browser to enter the system.
  4. Check logs\server_info.log to see the server status.
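
Before starting the server, it can help to confirm that Jieba is importable and segments text as expected. The snippet below is a minimal sketch, not part of the project: the `segment` helper and the sample sentence are hypothetical, and the per-character fallback exists only so the script still runs when Jieba is missing. It is written for Python 3, which may differ from the interpreter the prototype targets.

```python
try:
    import jieba  # third-party segmenter installed in the first step
except ImportError:
    jieba = None  # assumption: fall back to a crude splitter for illustration


def segment(text):
    """Split a Chinese query into tokens (hypothetical helper)."""
    if jieba is not None:
        # jieba.lcut returns the segments as a list of strings
        return jieba.lcut(text)
    # Fallback: one token per character, only so the sketch stays runnable
    return list(text)


tokens = segment("我來到北京清華大學")
print("/".join(tokens))
```

If the printed tokens are single characters only, Jieba was most likely not found and the fallback was used.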

Preview

Demo

Technical Overview

  • Uses the Google Web Search API to fetch web snippets from the Google index server
  • Uses bag-of-words and TF-IDF for feature extraction
  • Uses the Apriori algorithm to mine association rules in a web page, and uses eHownet with a simple weighting scheme to prioritize the rules
  • Introduction to utilizing CKIP eHownet
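
The association mining step can be illustrated with a small, self-contained Apriori sketch. It is not the project's implementation: the function names `apriori` and `rules_for`, the thresholds, and the sample snippets are all assumptions made for illustration. Each snippet is treated as a bag of words (a set of already segmented terms), and TF-IDF filtering and eHownet weighting are left out.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every distinct single term
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count the transactions that contain each candidate itemset
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def rules_for(query_term, frequent, min_confidence):
    """Rank rules {query_term} -> {other term} by confidence."""
    base = frozenset([query_term])
    rules = []
    for itemset, support in frequent.items():
        if len(itemset) == 2 and query_term in itemset:
            # confidence = support(pair) / support(query term)
            confidence = support / frequent[base]
            if confidence >= min_confidence:
                (other,) = itemset - base
                rules.append((other, round(confidence, 2)))
    return sorted(rules, key=lambda r: (-r[1], r[0]))


# Hypothetical segmented snippets, one set of terms per snippet
snippets = [
    {"蘋果", "手機", "價格"},
    {"蘋果", "手機", "電池"},
    {"蘋果", "水果", "價格"},
    {"蘋果", "手機"},
]
frequent = apriori(snippets, min_support=0.5)
print(rules_for("蘋果", frequent, min_confidence=0.5))
# prints [('手機', 0.75), ('價格', 0.5)]
```

Terms on the right-hand side of a high-confidence rule are candidates for expanding the query. A weighting scheme such as the one mentioned above would then reorder them.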

Use case diagram

Demo1

Flow chart

Demo2

Retrieval based on two dimensional system

Demo3

License

The MIT License (MIT)

Copyright (c) 2013 Yang Yao-Nien

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Contributors

paulyang0125
